IM2.DI: Document Integration Project (2002-now)


In this project, which is part of the IM2 NCCR, the various types of documents used during a meeting, whether distributed in paper form or projected on a screen, are analyzed and compared with video and speech data in order to link the different modalities. In other words, the main goal is to bridge the gap between non-temporal documents and temporal media. Current research concentrates on two major aspects of document alignment: thematic alignment and image matching. In the future, the smart meeting room installed in Fribourg will be enhanced to support real-time interaction with documents, and the document alignment techniques will be improved.

In that context, various browsers are being developed:
  • FriDoc (2003-now), a document-based multimedia meeting browser (financed by IM2): The FriDoc browser treats static documents as structured, thematic entry points into multimedia archives. It is based on the assumption that in a large proportion of multimedia applications (e.g. lectures and meetings), printable documents play a central role in structuring the discussion. The main purpose of the FriDoc browser is to assess the usefulness of document alignments for multimedia browsing. Our hypothesis is that finding links between documents and multimodal annotations of the audio and video streams will permit the design of user interfaces that improve retrieval tasks. A user evaluation conducted with 8 users on 22 meetings showed that users browse more efficiently when making full use of documents.

  • FaericWorld (Maurizio Rigamonti's PhD thesis), interactive visualizations for multimedia digital libraries: design and implementation of a framework that allows 1) browsing in a world composed of heterogeneous multimedia documents; 2) visualizing and validating document-derived data, links and annotations; 3) creating summaries of regions of the world. The documents in this world consist of media (static documents, video, etc.), multimedia documents (e.g. slide shows), sets of strictly related documents (i.e. clusters and events such as conferences and meetings), and finally persons. Search and navigation will be based on the document-derived data, in order to retrieve and present relevant information to the user. The validation task will allow 1) end users to navigate a consistent world when retrieving information and 2) researchers to correct and refine algorithms for multimodal data analysis. Finally, summary creation targets the production of new documents and links, associating existing data with new data. Many applications of this research are envisioned, such as tools for learning centers, document server browsers, etc.

More details are available on the IM2.DI webpage and the IM2 webpage.