IM2.DI Document-centric meeting recordings.

Straight to the data

The meeting room installed in Fribourg is designed for recording meetings in which documents play a central role, either because they are discussed or because they are in visual focus (on the projection screen or on the table). This so-called "document-centric meeting room" is therefore tailored to capture all phenomena related to documents. It is equipped with 10 cameras (8 close-ups, one per participant, and 2 overviews), 8 microphones, a video projector, 1 camera for capturing the projection screen, and several cameras for capturing documents on the table. The equipment is lightweight (PCs with FireWire webcams) and non-intrusive. Synchronization of each camera and microphone pair is guaranteed on a per-computer basis. Because of the high bandwidth of each camera (8.8 MB/s for 640x480 at 15 fps), we could not connect all the cameras to a single PC. We therefore use 4 PCs that are remotely controlled and synchronized, through monitors and sockets, by a master PC, so that recordings start simultaneously. The architecture is thus scalable, since further capture PCs can be placed under the same master.
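The per-camera bandwidth figure can be checked with a short calculation. The sketch below assumes uncompressed video at 2 bytes per pixel (for example a YUV 4:2:2 format) and reads "MB" as 2^20 bytes; neither assumption is stated in the text, but together they reproduce the quoted 8.8 MB/s. The function name is illustrative only.

```python
MIB = 1024 * 1024  # assumption: "MB" in the text means 2**20 bytes


def camera_bandwidth(width, height, fps, bytes_per_pixel=2):
    """Raw video bandwidth in bytes per second.

    bytes_per_pixel=2 is an assumption (e.g. uncompressed YUV 4:2:2);
    the text does not name the pixel format.
    """
    return width * height * fps * bytes_per_pixel


per_camera = camera_bandwidth(640, 480, 15)
print(per_camera)                   # 9216000 bytes/s
print(round(per_camera / MIB, 1))   # 8.8

# Aggregate rate for the 10 close-up and overview cameras alone.
total = 10 * per_camera
print(round(total / MIB, 1))        # 87.9
```

Under these assumptions the ten participant and overview cameras alone produce close to 88 MB/s of raw video, which illustrates why the cameras are spread over several PCs.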
A meeting capture application running on the master PC pilots the slave PCs and their devices. It provides a user-friendly interface to start, pause and stop recording, to control post-processing operations such as compression (for streaming and archiving), and to control file transfers to a server. This application is part of a more general Organizer tool, which is used to specify the cameras and microphones to be used, the participants' positions, the cameras' frame rate, etc. The Organizer tool also assists users in the preparation, management and archiving of a meeting, including services for registering meeting participants and for gathering documents and related information.
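How a master might pilot its slaves over sockets can be sketched as follows. This is only an illustration, not the actual capture application: the command string "START", the "ACK" reply, and the function names are hypothetical, and the slaves are simulated by threads on the local machine. The master opens all connections first and only then sends the command, so that the slaves receive it at nearly the same time.

```python
import socket
import threading
import time


def run_slave(server_sock, log):
    """Hypothetical slave: wait for one command and note when it arrived."""
    conn, _ = server_sock.accept()
    with conn:
        command = conn.recv(16).decode("ascii")
        log.append((command, time.monotonic()))
        conn.sendall(b"ACK")


def broadcast(command, addresses):
    """Hypothetical master: connect to every slave, then send the command."""
    conns = [socket.create_connection(addr) for addr in addresses]
    try:
        for conn in conns:
            conn.sendall(command.encode("ascii"))
        return [conn.recv(16).decode("ascii") for conn in conns]
    finally:
        for conn in conns:
            conn.close()


def demo(n_slaves=4):
    """Start n simulated slaves on localhost and send them a START command."""
    log, workers, addresses = [], [], []
    for _ in range(n_slaves):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        srv.listen(1)
        addresses.append(srv.getsockname())
        thread = threading.Thread(target=run_slave, args=(srv, log))
        thread.start()
        workers.append((thread, srv))
    replies = broadcast("START", addresses)
    for thread, srv in workers:
        thread.join()
        srv.close()
    return replies, log
```

Calling `demo()` returns one acknowledgement per simulated slave together with the arrival time of the command on each of them; in a real setup the spread of those arrival times bounds how simultaneously the recordings start.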
There are various types of document-centric meetings: agenda-driven structured meetings, lectures given by a professor with a class book, reading clubs, administrative meetings for drafting legal texts, etc. We are currently concentrating our efforts on press reviews, i.e. meetings where participants discuss the cover page and the content of the newspapers of the day. Newspapers can be matched with the speech transcript through citation, reference and thematic alignment, and they contain short articles that are easy to segment. Press reviews thus follow a structured agenda that should lend itself well to temporal alignment of documents, obtained by aligning document content with speech transcripts.
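The idea of deriving a temporal alignment from content alignment can be illustrated with a deliberately simple stand-in: each time-stamped utterance is linked to the article with which it shares the most words (Jaccard overlap). This is not the alignment method used in the project, only a minimal sketch of the principle; the article identifiers, texts and timestamps below are invented toy data.

```python
import re


def tokens(text):
    """Lower-cased set of alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align(utterances, articles):
    """Map each (start_time, text) utterance to its best-matching article id.

    Utterances sharing no word with any article are mapped to None.
    """
    article_tokens = {aid: tokens(text) for aid, text in articles.items()}
    result = []
    for start, text in utterances:
        utt = tokens(text)
        best_id, best_score = None, 0.0
        for aid, toks in article_tokens.items():
            score = jaccard(utt, toks)
            if score > best_score:
                best_id, best_score = aid, score
        result.append((start, best_id))
    return result


# Toy data, invented for illustration only.
articles = {
    "a1": "Parliament votes on the new transport budget",
    "a2": "Local football club wins the regional cup",
}
utterances = [
    (0.0, "the transport budget vote is on the cover page"),
    (12.5, "and the football club won the cup"),
    (20.0, "okay next"),
]
print(align(utterances, articles))
# [(0.0, 'a1'), (12.5, 'a2'), (20.0, None)]
```

Because each utterance carries a start time, the resulting pairs indicate when each article was under discussion. A realistic system would at least remove stop words and handle citations and references separately.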

 

We have already recorded and annotated 20 meetings of about 15 minutes each. For each meeting, a directory has been created on our media file server, containing:

  • The meeting global descriptor (an XML file holding general information about the meeting: participants' names and positions, A/V setup, location, date, etc.);
  • SMIL files to play all audio and video streams together;
  • The following directories:
    • Audio: audio files for each participant;
    • Video: one head-and-shoulder movie for each participant, one movie for each of the two overview cameras, and the movie of the projected slides;
    • Document: the PDF file of each discussed and/or projected document and, for each of those files, its linear textual version and an XML file holding the manually annotated logical structure;
    • Speech: the manual speech transcript and various other speech annotations;
    • Image: one keyframe for each video.
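A local copy of a meeting directory can be checked against the layout listed above with a few lines of code. The sketch below assumes that the subdirectories are named exactly as in the list (Audio, Video, Document, Speech, Image); the actual names on the media file server may differ, and the function names are illustrative.

```python
import tempfile
from pathlib import Path

# Assumption: subdirectory names match the labels used in the list above.
EXPECTED_SUBDIRS = ("Audio", "Video", "Document", "Speech", "Image")


def missing_subdirs(meeting_dir):
    """Return the expected subdirectories that are absent from meeting_dir."""
    root = Path(meeting_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def demo_layout():
    """Build an incomplete meeting directory in a temp folder and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("Audio", "Video", "Document"):
            (Path(tmp) / name).mkdir()
        return missing_subdirs(tmp)


print(demo_layout())  # ['Speech', 'Image']
```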