RECOLA Database

Modules

Recordings, features, annotations, timings of events (face or speech detection) and metadata are provided in separate modules for the 23 participants from the training and development partitions. The data repository contains 12 modules in total, split over 5 main folders: Annotation (1), Audio (4), Biosignals (2), Metadata (1) and Video (4). A summary of the content of these modules is given below; for further details on the computation of features, the reader is referred to the following article: Pattern Recognition Letters, 2015. In addition, two other sets of features, further developed for the AVEC'16 and AVEC'18 Challenges, are available in the data repository. Details regarding the data collection and annotation are given in the article introducing the RECOLA dataset, presented at Face & Gestures 2013. Details regarding the annotation of the laughter events are given in the paper presented at ICMI'18.

Annotation

This module contains the annotations of socio-affective behaviors performed by the six assistants (three males, three females) using the ANNEMO web-based annotation tool. Data are provided separately for each participant and each assistant, with one value every 40 ms (25 Hz) for the affective behaviors (arousal and valence). Annotations of laughter are also provided for different types of events: breath (unvoiced) laughter (BL), (voiced) laughter (L), speech (S), and speech laughter (SL).
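Since each assistant's ratings are provided separately, a common first step is to combine them into a single trace per participant. The sketch below is only an illustration: it assumes the ratings of one participant have already been loaded as equally long lists (one per assistant), and it uses a plain per-frame mean, which is not necessarily how any reference gold standard was computed. The function names are hypothetical.

```python
from statistics import mean

FRAME_STEP_S = 0.040  # 40 ms between consecutive annotation values (from the text)


def frame_times(n_frames, step=FRAME_STEP_S):
    """Return the timestamp (in seconds) of each annotation frame."""
    return [round(i * step, 3) for i in range(n_frames)]


def average_ratings(ratings_per_annotator):
    """Average per-frame ratings over annotators.

    ratings_per_annotator: list of equally long lists, one per assistant.
    Assumption: a simple mean is used here purely for illustration.
    """
    lengths = {len(r) for r in ratings_per_annotator}
    if len(lengths) != 1:
        raise ValueError("all annotators must cover the same number of frames")
    # zip(*...) groups the values of all annotators frame by frame
    return [mean(frame) for frame in zip(*ratings_per_annotator)]
```

For example, `average_ratings([[0.0, 1.0], [1.0, 1.0]])` combines two annotators over two frames, and `frame_times(2)` gives the matching time axis.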

Audio

This folder contains 4 modules: (1) audio recordings, (2) timings (start/stop) of utterances, (3) probability of voice activity detection (VAD), and (4) acoustic features.
(1) Unidirectional headset microphone, external sound card (44.1 kHz, 16 bits), WAV file (PCM).
(2) Timings of spoken utterances (1308 in total) provided as start and stop timecodes in a CSV file.
(3) VAD probabilities stored in an ARFF file, with one value every 40 ms (25 Hz).
(4) Acoustic features (ComParE/eGeMAPS - openSMILE) provided as ARFF files.
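The utterance timecodes of module (2) and the frame-level VAD probabilities of module (3) can be aligned with simple arithmetic at 25 Hz. The following is a minimal sketch, assuming that frame i covers the interval from i x 40 ms to (i + 1) x 40 ms and that timecodes are given in seconds; the function name is hypothetical and the exact frame alignment used in the repository is not stated here.

```python
import math

VAD_RATE_HZ = 25  # one VAD probability every 40 ms (from the text)


def utterance_to_frames(start_s, stop_s, rate_hz=VAD_RATE_HZ):
    """Return the range of frame indices overlapping [start_s, stop_s).

    Assumption: frame i spans [i / rate_hz, (i + 1) / rate_hz) seconds.
    """
    if stop_s < start_s:
        raise ValueError("stop time must not precede start time")
    first = math.floor(start_s * rate_hz)  # frame containing the start
    last = math.ceil(stop_s * rate_hz)     # first frame after the stop
    return range(first, last)
```

An utterance from 0.5 s to 1.0 s, for instance, maps to `range(12, 25)`, which can then be used to slice the VAD probability sequence.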

Biosignals

This folder contains 2 modules: (1) physiological recordings, and (2) features.
(1) Electrocardiogram (ECG) and electrodermal activity (EDA), recorded with a Biopac MP150 unit and stored in a CSV file.
(2) Features given separately for the ECG and the EDA signals.

Metadata

Various data are provided in this module: age, gender and mother tongue of the participants; communication tool used (with or without emotional feedback from the teammate); self-reports (mood and social behaviour); and the questionnaire given to the participants.

Video

This folder contains 4 modules: (1) video recordings, (2) timing of each video frame, (3) probability of face detection, and (4) visual features.
(1) Logitech webcam; 1080x720 pixels; YUV; variable 30 fps; MPEG-4 compression: H.264, q=25, 'constant' 25 fps.
(2) Timecode of each video frame given in a csv file (frame number ; time in seconds).
(3) Probability of face detection provided for each frame as an ARFF file.
(4) Probability of 15 emotion-related facial action units, movements of the face in the X, Y and Z directions, and both mean and standard deviation of the optical flow in the region of the face, provided for each frame in an ARFF file.
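The per-frame timecodes of module (2) can be used to place the frame-level values of modules (3) and (4) on a time axis. Below is a minimal sketch of reading such a file, assuming two semicolon-separated columns (frame number ; time in seconds) and an optional header row; the actual delimiter and header of the files in the repository should be checked, and the function name and sample content are made up for illustration.

```python
import csv
import io


def read_frame_times(text, delimiter=";"):
    """Parse 'frame number ; time in seconds' rows into a dict.

    Assumption: rows whose first two fields are not numeric
    (e.g. a header line) are skipped rather than treated as errors.
    """
    times = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        try:
            frame = int(row[0].strip())
            seconds = float(row[1].strip())
        except ValueError:
            continue  # non-numeric row, most likely a header
        times[frame] = seconds
    return times


# Made-up sample in the assumed layout, not taken from the dataset
sample = "frame;time\n1;0.00\n2;0.04\n3;0.08\n"
frame_times_by_index = read_frame_times(sample)
```

To read a real file, its content can be passed as `read_frame_times(open(path).read())`, where `path` points to the timecode file of one recording.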