Pandora
Head and Shoulder Pose Estimation from Images
CVPR 2017
Guido Borghi, Marco Venturelli, Roberto Vezzani, Rita Cucchiara
University of Modena and Reggio Emilia, Italy
Abstract
Fast and accurate upper-body and head pose estimation is a key task for automatic monitoring of driver attention, a challenging context characterized by severe illumination changes, occlusions and extreme poses. In this work, we present a new deep learning framework for head localization and pose estimation on depth images. The core of the proposal is a regressive neural network, called POSEidon, which is composed of three independent convolutional nets followed by a fusion layer, specially conceived for understanding the pose from depth. In addition, to recover the intrinsic value of face appearance for understanding head position and orientation, we propose a new Face-from-Depth model for learning image faces from depth. Results in face reconstruction are qualitatively impressive. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Results show that our method outperforms all recent state-of-the-art works, running in real time at more than 30 frames per second.
Dataset features
The Pandora dataset has the following features:
- Deep Learning-oriented: Pandora contains more than 250k full-resolution RGB (1920x1080 pixels) and depth (512x424 pixels) images with the corresponding annotations, organized in 110 annotated sequences recorded with 10 male and 12 female actors. Each subject was recorded five times.
- Time-of-Flight data: depth data were acquired with a Microsoft Kinect One device, which gives better quality than other datasets created with the first Kinect version, as reported in the paper.
- Data annotations: each frame of the dataset comes with the RGB appearance image, the corresponding depth map, and the 3D coordinates of the upper-body skeleton joints, including the head center and the shoulder positions. For convenience, the 2D coordinates of the joints on both the color and depth frames are also provided, together with the head and shoulder pose angles with respect to the camera reference frame. Shoulder angles are obtained by converting the corresponding rotation matrix to Euler angles.
Click here to download the readme of the dataset.
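As an illustration of the rotation-matrix-to-Euler conversion mentioned in the data annotations, the following is a minimal sketch in plain Python. The angle convention is an assumption, not taken from the dataset: the matrix is assumed to factor as R = Rz(yaw) · Ry(pitch) · Rx(roll), with angles in radians. The function names `euler_to_matrix` and `matrix_to_euler` are hypothetical helpers; check the dataset readme for the convention and units actually used.

```python
import math


def euler_to_matrix(roll, pitch, yaw):
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll) (assumed convention); radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def matrix_to_euler(R):
    """Recover (roll, pitch, yaw) in radians from a 3x3 rotation matrix."""
    # Clamp to guard against numerical overshoot slightly outside [-1, 1].
    sp = max(-1.0, min(1.0, -R[2][0]))
    pitch = math.asin(sp)
    if abs(sp) < 1.0 - 1e-9:
        roll = math.atan2(R[2][1], R[2][2])
        yaw = math.atan2(R[1][0], R[0][0])
    else:
        # Gimbal lock (pitch = +/-90 degrees): roll and yaw are coupled,
        # so roll is fixed to 0 and the remaining rotation goes into yaw.
        roll = 0.0
        yaw = math.atan2(-R[0][1], R[1][1])
    return roll, pitch, yaw
```

A round trip such as `matrix_to_euler(euler_to_matrix(0.1, -0.3, 0.5))` should return the input angles up to floating-point error. Use `math.degrees` on the result if degrees are needed.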
Legal notice
By downloading the dataset, you agree to the following statement: You are hereby given permission to copy this data in electronic or hardcopy form for your own scientific use and to distribute it for scientific use to colleagues within your research group. Inclusion of rendered images or video made from this data in a scholarly publication (printed or electronic) is also permitted. In this case, credit must be given to the publication. However, the data may not be included in the electronic version of a publication, nor placed on the Internet. These restrictions apply to any representations (other than images or video) derived from the data, including but not limited to simplifications, remeshing, and the fitting of smooth surfaces. The making of physical replicas of this data is prohibited, and the data may not be distributed to students in connection with a class. For any other use, including distribution outside your research group, written permission is required. Any commercial use of the data is prohibited. Commercial use includes but is not limited to sale of the data, derivatives, replicas, images, or video, inclusion in a product for sale, or inclusion in advertisements (printed or electronic), on commercially-oriented web sites, or in trade shows.