As a guideline for our actors' improvisation, we provided them with 7 scenarios, each of which could evolve in 4 different ways ("subscenarios"). A complete list of all scenarios and subscenarios can be found here.
Each of the 8 pairs of actors we recorded performed all (sub)scenarios. Thus, the dataset consists of 224 sequences (7 scenarios × 4 subscenarios × 8 pairs). There are 8 viewpoints for every sequence, resulting in 1792 video files. We provide the videos in archives separated by viewpoint. If you want to get an impression of the interactions, it is best to start with viewpoint 2, as it gives a good overview of the environment. The audio in the videos is a simple downmix of all 4 recorded audio channels to mono. We will release the raw audio files soon.
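The counts above follow directly from the recording setup. As a quick sanity check, for example after downloading all archives, the arithmetic can be sketched as follows (the constant and function names are illustrative and not part of the dataset):

```python
# Recording setup as described above.
N_SCENARIOS = 7
N_SUBSCENARIOS = 4   # per scenario
N_ACTOR_PAIRS = 8
N_VIEWPOINTS = 8


def dataset_counts():
    """Return (number of sequences, number of video files)."""
    sequences = N_SCENARIOS * N_SUBSCENARIOS * N_ACTOR_PAIRS
    videos = sequences * N_VIEWPOINTS
    return sequences, videos


if __name__ == "__main__":
    sequences, videos = dataset_counts()
    print(sequences, videos)  # 224 1792
```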
The structure inside one archive is the following:
<id of scenario>_<id of subscenario>_A<id of actor starting in kitchen>_B<id of actor starting outside kitchen>.avi
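A file name of this form can be split into its four ids with a regular expression. The following is a minimal sketch, assuming that none of the ids contains an underscore; the exact id format is not specified here, and the example file name used below is hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SequenceName(NamedTuple):
    scenario: str
    subscenario: str
    actor_a: str  # actor starting in the kitchen
    actor_b: str  # actor starting outside the kitchen


# Assumption: ids are non-empty and contain no underscore.
_NAME_PATTERN = re.compile(
    r"^(?P<scenario>[^_]+)_(?P<subscenario>[^_]+)"
    r"_A(?P<actor_a>[^_]+)_B(?P<actor_b>[^_]+)\.avi$"
)


def parse_sequence_name(filename: str) -> Optional[SequenceName]:
    """Return the ids encoded in a file name, or None if it does not match."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        return None
    return SequenceName(**match.groupdict())


if __name__ == "__main__":
    # Hypothetical file name, used only to illustrate the pattern.
    print(parse_sequence_name("1_2_A3_B4.avi"))
```

Because the file names do not encode the viewpoint, the viewpoint has to be taken from the archive a file was extracted from.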
view1, view2, view3, view4, view5, view6, view7, view8 (~4.5 GB each).
Supplementary material with a list of all scenarios and subscenarios and a table of similar datasets.
Raw annotations from all 5 annotators.
Contact: Dominike Thomas,
The data is only to be used for non-commercial scientific purposes. If you use this dataset in a scientific publication, please cite the following papers:
Emotion recognition from embedded bodily expressions and speech during dyadic interactions
Proc. International Conference on Affective Computing and Intelligent Interaction (ACII),
The dataset was recorded with a camera system from 4D View Solutions.
The authors would like to thank Johannes Tröger for working as a director in our recordings, as well as all involved actors and annotators.