GAN4Surveillance: Generative Adversarial Networks for Attribute Classification
Security is of fundamental importance in a world where terrorist attacks are steadily increasing. Governments and agencies face these realities every day, but the means at their disposal are not always sufficient to prevent such attacks effectively. Security draws on many fields of science and engineering and offers many areas of study. This research activity focuses on classifying the attributes (such as age, sex, etc.) and carried items (backpacks, bags, etc.) of people seen through security cameras. Deep learning techniques for computer vision, together with generative models, are exploited to address this problem automatically. We explore the generalization capability of adversarial networks to enhance the resolution of images of people and to hallucinate occluded body parts.
Generative Adversarial Models for People Attribute Recognition in Surveillance
In this work we propose a deep architecture for detecting people attributes (e.g. gender, race, clothing ...) in surveillance contexts. Our proposal explicitly deals with the poor resolution and occlusion issues that often occur in surveillance footage by enhancing the images by means of Deep Convolutional Generative Adversarial Networks (DCGAN). Experiments show that by combining our Generative Reconstruction and Deep Attribute Classification Network we can effectively extract attributes even when resolution is poor and in the presence of strong occlusions covering up to 80% of the whole person figure.
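To make the two degradations mentioned above concrete, the following is a minimal sketch of how a poorly resolved or heavily occluded person crop could be simulated for experimentation. It is an assumption for illustration only and not the procedure used in the paper: the function names `occlude_bottom` and `downsample`, the bottom-up rectangular occlusion, the fill value and the 128x64 crop size are all hypothetical.

```python
import numpy as np


def occlude_bottom(image: np.ndarray, fraction: float, fill: float = 0.0) -> np.ndarray:
    """Return a copy of an H x W x C image with the bottom `fraction` of rows covered.

    Hypothetical helper: a flat rectangle is the simplest possible occluder.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    out = image.copy()
    height = image.shape[0]
    n_rows = int(round(fraction * height))  # number of rows to hide
    if n_rows > 0:
        out[height - n_rows:] = fill
    return out


def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Simulate poor resolution by keeping every `factor`-th pixel (hypothetical helper)."""
    return image[::factor, ::factor]


# Random stand-in for a person crop (assumed size, values in [0, 1)).
rng = np.random.default_rng(0)
person = rng.random((128, 64, 3))

occluded = occlude_bottom(person, 0.8)  # hide roughly 80% of the figure
low_res = downsample(person, 4)         # 4x lower resolution per side
```

Degraded images of this kind would be the input to a reconstruction network, whose output is then passed to the attribute classifier.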
Can Adversarial Networks Hallucinate Occluded People With a Plausible Aspect?
When you see a person in a crowd, occluded by other people, you miss visual information that could be used to recognize, re-identify or simply classify him or her. You can only imagine the person's appearance based on your experience, nothing more. Similarly, AI solutions can try to hallucinate the missing information with specific deep learning architectures, suitably trained on images of people with and without occlusions. The goal of this work is to generate a complete image of a person, given an occluded version as input, that should be a) free of occlusion, b) similar at pixel level to a completely visible person shape, and c) able to preserve the visual attributes (e.g. male/female) of the original one.
For this purpose we propose a new approach that integrates state-of-the-art neural network architectures, namely U-nets and GANs, as well as discriminative attribute classification nets, into an architecture specifically designed to de-occlude people shapes. The network is trained to optimize a loss function that takes the aforementioned objectives into account. We also propose two datasets for testing our solution: the first, occluded RAP, is created automatically by occluding real shapes from the RAP dataset \cite{rap} (which also collects attributes of people's appearance); the second, AiC, is a large synthetic dataset generated in computer graphics with data extracted from the GTA video game, which contains 3D data of occluded objects by construction. In our experiments the approach outperforms previous proposals. This result could be an initial step toward further research on recognizing people and their behavior in an open, crowded world.
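As an illustration of how a single loss can account for the three objectives, the sketch below combines an adversarial term, a pixel-level term and an attribute term as a weighted sum. This is an assumed form, not the loss defined in the paper: the choice of binary cross-entropy and mean absolute error, the weights `w_adv`, `w_pix`, `w_attr`, and all function names are hypothetical.

```python
import numpy as np


def binary_cross_entropy(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between probabilities `pred` and 0/1 labels `target`."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))


def generator_loss(generated, visible, disc_score, attr_pred, attr_true,
                   w_adv=1.0, w_pix=1.0, w_attr=1.0) -> float:
    """Weighted sum of three terms (assumed form, hypothetical weights)."""
    # (a) adversarial term: the generator is rewarded when the discriminator
    #     scores its output as a real, occlusion-free image (label 1).
    adv = binary_cross_entropy(disc_score, np.ones_like(disc_score))
    # (b) pixel-level term: mean absolute error against the fully visible image.
    pix = float(np.mean(np.abs(generated - visible)))
    # (c) attribute term: attributes predicted on the generated image should
    #     match the attributes of the original person.
    attr = binary_cross_entropy(attr_pred, attr_true)
    return w_adv * adv + w_pix * pix + w_attr * attr


# Toy values standing in for network outputs.
visible = np.full((4, 4, 3), 0.5)
generated = visible + 0.1
loss = generator_loss(generated, visible,
                      disc_score=np.array([0.9]),
                      attr_pred=np.array([0.8, 0.2]),
                      attr_true=np.array([1.0, 0.0]))
```

In a weighted sum of this kind, the weights set the trade-off between visual realism, pixel fidelity and attribute consistency.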
Publications
1 | Fulgeri, F.; Fabbri, Matteo; Alletto, Stefano; Calderara, S.; Cucchiara, R. "Can adversarial networks hallucinate occluded people with a plausible aspect?" COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 182, pp. 71-80, 2019 | DOI: 10.1016/j.cviu.2019.03.007 Journal |
2 | Fabbri, Matteo; Calderara, Simone; Cucchiara, Rita "Generative Adversarial Models for People Attribute Recognition in Surveillance" Proceedings of the 14th IEEE International Conference on Advanced Video and Signal based Surveillance, Lecce, Italy, 29th August - 1st September, 2017 Conference |