Visualizing Multimedia Structure

Wednesday, October 31, 2007 - 17:30
TH 331
Matthew Cooper, FXPAL
In this talk we review a general framework for visualizing multimedia structure using inter-sample similarity. The approach is unsupervised and readily adaptable to various modalities, feature representations, and similarity measures. The resulting visualizations suggest several approaches for automatically characterizing the temporal structure of media streams. We consider two examples. The first is a system for identifying repetitive structure in music and audio. We use it to detect chorus segments in popular music for use as summaries. The approach provides a complete structural characterization of the audio stream to enable adaptable summary design. The second example is a system for automatically clustering digital photo collections into time-based events. We consider visualizations of temporal and content-based inter-photo similarity at multiple scales to construct a hierarchical partition of the time interval during which the photos were taken. We use various clustering criteria to select a final partitioning of the photos. This system is integrated into an application that also allows users to browse the hierarchical segmentation and organize their photos semi-automatically.
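
As a rough illustration of the inter-sample similarity idea applied to photo timestamps, the minimal sketch below builds a pairwise temporal similarity matrix and splits a sorted sequence of capture times wherever the similarity between neighboring photos falls below a threshold. It is not the system described in the talk: the exponential similarity measure, the threshold rule, the function names, and the sample timestamps are all assumptions chosen for brevity.

```python
import math


def similarity_matrix(timestamps, scale):
    """Pairwise temporal similarity (assumed form): 1.0 for equal times,
    decaying toward 0 as the time gap grows relative to `scale`."""
    return [[math.exp(-abs(a - b) / scale) for b in timestamps]
            for a in timestamps]


def segment_events(timestamps, scale, threshold=0.5):
    """Split sorted timestamps into events wherever the similarity between
    adjacent photos drops below `threshold` (a hypothetical criterion)."""
    if not timestamps:
        return []
    sim = similarity_matrix(timestamps, scale)
    events = [[timestamps[0]]]
    for i in range(1, len(timestamps)):
        if sim[i - 1][i] < threshold:
            events.append([])  # start a new event at this boundary
        events[-1].append(timestamps[i])
    return events


# Hypothetical capture times, in minutes from the first photo.
times = [0, 2, 5, 120, 123, 125, 600]

fine = segment_events(times, scale=10)
medium = segment_events(times, scale=200)
coarse = segment_events(times, scale=1000)
```

Evaluating the same timestamps at a small, medium, and large scale yields progressively coarser groupings, which is one simple way to picture how a multi-scale analysis could give rise to a hierarchical partition. A content-based similarity could be substituted for the temporal one without changing the structure of the sketch.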

Matthew Cooper is a Senior Research Scientist at FX Palo Alto Laboratory, where he works in the Interactive Media group. He received BS, MS, and DSc degrees in Electrical Engineering from Washington University in St. Louis in 1993, 1994, and 1999, respectively. His primary research focus is developing content analysis techniques to enable multimedia management and information retrieval applications. His research interests include multimedia analysis, information retrieval, statistical inference, information theory, and computer vision.