Video Annotation with Pictorially Enriched Ontologies

Abstract: Video annotation is typically performed by classifying video elements according to a pre-defined ontology of the video content domain. Ontologies are defined by establishing relationships between linguistic terms that specify domain concepts at different abstraction levels. However, although linguistic terms are appropriate for distinguishing event and object categories, they are inadequate for describing specific patterns of events or video entities. In these cases, pattern specifications are better expressed through visual prototypes that capture the essence of the event or entity. Pictorially enriched ontologies, which include visual concepts together with linguistic keywords, are therefore needed to support video annotation up to the level of detail of pattern specification. This paper presents pictorially enriched ontologies and provides a solution for their implementation in the soccer video domain. The pictorially enriched ontology is used both to directly assign multimedia objects to concepts, providing a more meaningful definition than the linguistic terms, and to extend the initial knowledge of the domain by adding subclasses of highlights or new highlight classes that were not defined in the linguistic ontology. Automatic annotation of soccer clips up to the pattern specification level using a pictorially enriched ontology is discussed.


Torniai, C.; Del Bimbo, A.; Cucchiara, R.; Bertini, M. "Video Annotation with Pictorially Enriched Ontologies", Proceedings of ICME 2005, Amsterdam, The Netherlands, pp. 1428-1431, 6-8 July 2005

