
[AUDITORY] Hannah: dense audio-visual person annotation in "Hannah and her sisters"



Dear list,

 

[Sorry for cross-posting]

 

We have created and made publicly available a dense, person-oriented audio-visual ground-truth annotation of a feature film (100 minutes long): “Hannah and her sisters” by Woody Allen.

 

The annotation includes

 

• Face tracks in video (densely annotated, i.e., in every frame, and person-labeled)

• Speech segments in audio (person-labeled)

• Shot boundaries in video

 

The annotation can be useful for evaluating

 

• Person-oriented video-based tasks (e.g., face tracking, automatic character naming)

• Person-oriented audio-based tasks (e.g., speaker diarization or recognition)

• Person-oriented multimodal tasks (e.g., audio-visual character naming)

 

Details on the Hannah dataset and access to it are available here:

https://research.technicolor.com/rennes/hannah-home/

https://research.technicolor.com/rennes/hannah-download/

 

Acknowledgments:

This work is supported by the AXES EU project: http://www.axes-project.eu/

 

Best regards,

 

Alexey Ozerov, Jean-Ronan Vigouroux, Louis Chevallier and Patrick Pérez


Alexey Ozerov
Technicolor Research & Innovation

Alexey.Ozerov@xxxxxxxxxxxxxxx