
[AUDITORY] Hannah: dense audio-visual person annotation in "Hannah and her sisters"

Dear list,


[Sorry for cross-posting]


We have created and made publicly available a dense, person-oriented audio-visual ground-truth annotation of a feature film (100 minutes long): “Hannah and her sisters” by Woody Allen.


The annotation includes


• Face tracks in video (densely annotated, i.e., in every frame, and person-labeled)

• Speech segments in audio (person-labeled)

• Shot boundaries in video


The annotation can be useful for evaluating


• Person-oriented video-based tasks (e.g., face tracking or automatic character naming)

• Person-oriented audio-based tasks (e.g., speaker diarization or recognition)

• Person-oriented multimodal tasks (e.g., audio-visual character naming)


Details on the Hannah dataset and access to it can be obtained here:





This work is supported by the AXES EU project: http://www.axes-project.eu/


Best regards,


Alexey Ozerov, Jean-Ronan Vigouroux, Louis Chevallier and Patrick Pérez




Alexey Ozerov
Technicolor Research & Innovation