Prediction of activities and Events by Vision in an Urban Environment
The PREVUE (Predicting activities and Events by Vision in an Urban Environment) project plans to investigate modern Artificial Intelligence approaches to video analysis and event prediction in urban scenarios. Specifically, we claim that two significant Computer Vision topics, video surveillance and autonomous driving, are now technologically mature enough to be rethought jointly in a single framework. Using both mobile cameras (e.g., mounted on vehicles) and fixed cameras (i.e., installed in a smart city) simultaneously, we will analyse the urban environment (context), the behaviour of humans and moving agents (e.g., autonomous vehicles, bikes, social robots), and their mutual interactions.
We believe that the effort in computer vision should go beyond scene segmentation and target detection and tracking, towards a predictive ability. We will explore the capabilities and test the limits of algorithms and novel solutions in visual artificial intelligence for predicting different types of anomalies and potentially dangerous events in the city, such as suspicious human behaviour, panic in individuals and in crowds, and potential collisions. Moreover, we will predict the effect of collaborative interactions between mobile agents (e.g., how a moving car can be aware of the presence of another moving car) and between moving agents, context and humans (e.g., a smart robot or car understanding the intention of people to cross the road), in order to improve safety and efficiency in urban life. To this end, we will also improve general context awareness, for instance by refining weather forecasting with on-line visual data (towards “weather nowcasting” solutions) or by supporting the reasoning with additional knowledge extracted from visual data, such as the recognition of people-related attributes and biometric traits in cluttered scenes (with low resolution, blur and occlusions).
Finally, we aim at building a general framework for urban event early detection/prediction which can be used both for surveillance purposes and for providing auxiliary data to autonomous vehicles.
From the scientific point of view, we will investigate hot research topics in deep learning and computer vision. First, considering that modern deep learning techniques depend heavily on the availability of training data, we will collect different fully annotated, partially annotated and synthetic (simulated) datasets for predictive tasks. We will also address the need to optimize acquisition parameters in camera networks in order to obtain the best possible acquisition, both for dataset collection and for real-time processing.
Moreover, we will investigate the frontier of deep learning in challenging settings, such as transfer learning and domain adaptation, semi-supervised learning and few-shot learning, whose solutions are crucial for adapting prediction systems to new scenarios.
State-of-the-art deep networks, such as Generative Adversarial Networks, Recurrent Neural Networks and Autoregressive Autoencoders, will be used together with standard Convolutional Networks in order to define an effective predictive visual intelligence for new images and temporal reasoning.
Finally, thanks to the industrial stakeholders of the project (who have already expressed their endorsement) and the support of public bodies, development and evaluation will be performed at a large scale, using massive processing resources and two different large-scale urban evaluation areas.
The results of the project will have a disruptive impact on the scientific community, improving the Italian presence in international rankings, also thanks to the datasets we will collect for real needs and in real contexts and to the open-source solutions, as well as a direct impact on the society of smart inclusive cities and on the Italian automotive and IT industry.
Computer Vision is a fast-growing area with a huge impact on our daily lives. The scientific community is confident that tomorrow’s Computer Vision will provide artificial systems with “visual intelligence”, defined by psychologists as the human ability to reason with images in order to predict near-future events and situations using visual information.
Modern computer vision systems can successfully detect urban agents (cars, motorbikes); as humans, we can also infer that scooters are likely to zigzag between cars, performing potentially risky manoeuvres. Consider, for instance, a mother crossing the street with her kids: as drivers, we know that a kid might stray from the crossing, and so the kid becomes the focus of our attention. This predictive capability is what modern technologies still lack. Indeed, computers equipped with visual intelligence will be capable of detecting abnormal behaviours and potentially dangerous manoeuvres early, as well as predicting other agents’ interactions. They could also inform all approaching vehicles and/or the driver and, if necessary, actively brake. PREVUE targets these emerging scenarios, focusing on the next generation of Computer Vision (CV) solutions empowered by Deep Learning (DL) and applied in an urban context where people, vehicles and mobile robots move, interact and detect predictable or anomalous situations.
We aim to create new, scientifically disruptive algorithms, develop software prototypes that predict near-future events, actions and situations, test them in real open-world scenarios, and build the key components of new service platforms for smart cities concerning intelligent mobility.
List of the Research Units:
|Associated Investigator||University / Research Institution|
|CUCCHIARA Rita|| Università degli Studi di MODENA e
|SEBE Niculae||Università degli Studi di TRENTO|
|BALLAN Lamberto||Università degli Studi di PADOVA|
|NAPPI Michele||Università degli Studi di SALERNO|
We received letters/protocols of intent from the following industrial stakeholders:
1. Comune di Modena
2. Magneti Marelli
3. AD Consulting
4. Cluster Trasporti
5. Ferrari SPA
The project starts in September 2019 and it is expected to last 36 months.