A Multilingual Video Transcriptor and Annotation-based Video Transcoding

Shigeki OHIRA
Information Technology Center, Nagoya University
Mitsuhiro YONEOKA
Katashi NAGAO
Graduate School of Information Science, Nagoya University


This paper proposes a tool for multimedia annotation and describes its applications. The annotation tool allows users to easily create annotation data, including video transcripts, video scene descriptions, and visual/auditory object descriptions. The video transcriptor is capable of multilingual speech identification and recognition. The annotation data enables users to retrieve and transform multimedia content according to their preferences. A video scene description consists of semi-automatically detected key frames of each scene in a video clip, together with their time codes. A visual object description is created by automatic tracking and interactive naming of people and objects in video frames. Auditory objects are also detected semi-automatically. The annotation data is described in XML (Extensible Markup Language). We call annotation-based content transformation "semantic transcoding" because it operates on the semantic features of the content. This paper also introduces several examples of annotation-based video transcoding: video summarization, video-to-document transformation, and video translation.
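To make the idea of XML scene annotations and video-to-document transformation concrete, the following is a minimal sketch only. The element and attribute names (`annotation`, `scene`, `begin`, `end`, `keyframe`, `transcript`, `lang`), the file names, and the helper functions are all hypothetical and chosen for illustration; they are not the schema or the implementation of the tool described in this paper. The sketch assumes each scene carries a key frame, time codes in seconds, and one transcript per language.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation data: the schema below is an assumption made
# for illustration, not the format used by the tool in this paper.
ANNOTATION_XML = """
<annotation video="lecture.mpg">
  <scene begin="0.0" end="12.5" keyframe="kf_0001.jpg">
    <transcript lang="en">Welcome to the lecture.</transcript>
    <transcript lang="ja">講義へようこそ。</transcript>
  </scene>
  <scene begin="12.5" end="40.0" keyframe="kf_0002.jpg">
    <transcript lang="en">Today we discuss annotation.</transcript>
  </scene>
</annotation>
"""


def parse_scenes(xml_text):
    """Read scene descriptions (time codes, key frame, transcripts) from XML."""
    root = ET.fromstring(xml_text.strip())
    scenes = []
    for scene in root.findall("scene"):
        # Map each language code to its transcript text.
        transcripts = {
            t.get("lang"): (t.text or "").strip()
            for t in scene.findall("transcript")
        }
        scenes.append({
            "begin": float(scene.get("begin")),
            "end": float(scene.get("end")),
            "keyframe": scene.get("keyframe"),
            "transcripts": transcripts,
        })
    return scenes


def format_timecode(seconds):
    """Format a time in seconds as HH:MM:SS, dropping fractions of a second."""
    whole = int(seconds)
    return f"{whole // 3600:02d}:{whole % 3600 // 60:02d}:{whole % 60:02d}"


def video_to_document(scenes, lang):
    """Render one text line per scene that has a transcript in `lang`."""
    lines = []
    for scene in scenes:
        text = scene["transcripts"].get(lang)
        if text is None:
            continue  # No transcript in the requested language for this scene.
        start = format_timecode(scene["begin"])
        lines.append(f"[{start}] ({scene['keyframe']}) {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(video_to_document(parse_scenes(ANNOTATION_XML), "en"))
```

Selecting the transcript by language code is a simple stand-in for tailoring the output to user preferences; it does not perform translation or summarization, which would need additional annotation data.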