School of the 3eme cycle romand in Informatics
October 8th and 9th, 2007
Location: Perolles II (F7 on map), building 21
Monday: room G140, Tuesday: room 002 Joseph Deiss
Program of the school
With the emergence of multimedia archives, the expansion of infotainment and the democratisation of multimedia acquisition devices, multimedia search engines are becoming crucial for managing both personal data and large multimedia archives. Further, media (images, videos, sounds, documents, etc.) are often related, and thus cross-media mining methods are fundamental not only to enrich multimedia indexing methods but also to extend current querying and search strategies. The major goal of this seminar is to present the state of the art in current research and to bring out the major challenges related to the creation of multimedia search engines.
Content of Tutorials
Semantic Indexing & Retrieval of Video (Dr. Marcel Worring, University of Amsterdam) [handouts]
The semantic gap between the low-level information that can be derived from the visual data and the conceptual view the user has of the same data is a major bottleneck in video retrieval systems. It has dictated that solutions to image and video indexing could only be applied in narrow domains using specific concept detectors, e.g., “sunset” or “face”. This leads to lexica of at most 10-20 concepts. The use of multimodal indexing, advances in machine learning, and the availability of some large, annotated information sources, e.g., the TRECVID benchmark, have paved the way to increase lexicon size by orders of magnitude (now 100 concepts, in a few years 1,000). This brings it within reach of research in ontology engineering, i.e., creating and maintaining large structured sets of shared concepts, typically 10,000 or more. Once this goal is reached, we could search for videos in our home collection or on the web based on their semantic content, develop semantic video editing tools, or develop tools that monitor various video sources and trigger alerts based on semantic events. This tutorial lays the foundation for these exciting new horizons. It will cover: different methods for semantic video indexing; semantic retrieval; interactive access to the data; evaluation of indexing and interactive access in TRECVID; the challenges ahead and how to meet them.
Video processing for indexing and retrieval (Dr. Georges Quénot, Laboratoire d'Informatique de Grenoble - CNRS) [handouts]
Indexing the content of video documents requires many consecutive steps to go from the raw binary content up to its semantic interpretation. In this course, we will focus on the first stages of content analysis, which correspond to the extraction of low- to intermediate-level information. This includes: low-level visual feature extraction; shot boundary detection, keyframe selection and shot or keyframe clustering; camera motion indexing and mobile object segmentation and tracking; story segmentation and video structuring or summarization; speaker identification and emotion indexing. The evaluation of video indexing and retrieval systems will also be addressed and illustrated in the context of the TREC/TRECVID campaigns.
Music Information Retrieval (Prof. Andreas Rauber, Vienna University of Technology) [handouts]
In this course we will take a closer look at the various areas, tasks, and methods that together form the field of music information retrieval (MIR). We will start by considering the various types of data that are relevant for MIR activities, ranging from symbolic and acoustic music data, via textual data, up to image and video data. This will be followed by a brief overview of the large number of tasks and challenges in MIR, to convey the scope of the problem domain and its interdisciplinary nature. The core part of the course will then address a number of selected topics. Specifically, we will focus on various techniques for feature extraction from music, and their use for tasks such as retrieval, genre classification, chord detection, and others. We will also analyze and discuss the benefits of combining different modalities, such as textual and acoustic information, as well as the use of web information for these tasks. Last, but not least, we will take a closer look at a few applications, such as the PlaySOM and PocketSOM, that assist users in organizing their music collections and creating playlists on desktop computers as well as mobile phones.
Content of Talks
"How solid are the foundations of speech-driven information retrieval?" (Prof. Fabio Crestani, Faculty of Informatics, University of Lugano, CH) [handouts]
Mobility is changing information access applications and information retrieval is not immune to that. In this talk I will discuss some of the issues related to research on mobile information retrieval and in particular to speech-driven information retrieval. I will report on some work at the foundations of this area of research, studying the differences between spoken and written queries and the differences in the perceptions of relevance of spoken and written documents retrieved in response to a query.
Making sense of people in audio recordings: social sciences in multimedia analysis. (Dr. Alessandro Vinciarelli, IDIAP) [handouts]
Multimedia data rarely contain anything other than people involved
in social interactions. This presentation shows how algorithms inspired by
social sciences, namely sociology and social psychology, can be used to
extract from multimedia data information that can be difficult, if not
impossible, to extract by other means.
Interactive Media Retrieval in Mobile Communication (Dr. Robert van Kommer, Swisscom Innovation) [handouts]
All-in-one mobile phones have changed our social communication behaviors and infotainment habits. For people on the move, accessing media content presents new challenges and different use cases: on the one hand, a mobile phone's display and keyboard are much smaller than those of a regular PC; on the other hand, these devices are always on, personalized and carried around. In this context, the following topics are addressed: how can users' access be enhanced by cross-media indexing, and how can search and retrieval performance be improved with a "human in the loop" personalization algorithm? Both topics will be illustrated through an interactive media application tailored to the mobile user experience.
Indexing and visual browsing of multimedia collections thanks to links between documents (Maurizio Rigamonti, Fribourg) [handouts]
The indexing of multimedia data requires semantic information that is
sometimes difficult to extract from isolated media in an automatic manner.
In this talk, we present a technology that tries to overcome this problem
using indexes based on the explicit and implicit correlations existing
between multimedia documents. More precisely, the talk presents how the
system elicits these relationships and how users can browse a collection of
multimedia documents using the discovered links.
Human-Centered Perspectives in Image Retrieval (Dr. Alex Jaimes) [handouts]
Image retrieval is a human-centered task: images are created by people and
ultimately accessed and used by people for human-related activities. In
designing image retrieval systems and algorithms, or measuring their
performance, it is therefore imperative to consider the conditions that
surround both the indexing of image content and the retrieval (e.g.,
different levels of interpretation, possible search strategies, image
uses, etc.). It is also important to consider the role of culture, memory,
and personal context. In this talk I will outline important factors in
image retrieval from a human-centered perspective. In particular, I
will discuss different levels of description, types of users, search
strategies, image uses, and issues such as human memory, context,
and subjectivity, and their role in image retrieval system development and
Important dates:
September 15th, 2007: Registration deadline
October 8th and 9th, 2007: Date of the school
This workshop is part of the 3eme cycle romand in Informatics
For more information, please refer to the conference website at:
http://diuf.unifr.ch/3e-cycle/ or http://www.cuso.ch/3e-cycle/bienvenue.html