
Continual Learning in Realistic Scenarios: from Natural to Specialized Domains
Abstract: Modern Deep Neural Networks (DNNs) suffer from severe performance degradation when trained on continuously evolving and diverse data distributions. This phenomenon is formally known as catastrophic forgetting: as the model's parameters are tuned according to only the latest data, previously learned knowledge is overwritten and lost. As a consequence, current Deep Learning systems cannot be updated without expensive re-training procedures on all previously seen data, limiting their lifespan and contributing to a larger environmental footprint. Such a limitation is not present in humans; while they endure some forgetting, its effect is far less catastrophic than in DNNs. To mitigate forgetting in DNNs, the most successful Continual Learning (CL) approaches build upon (I) leveraging data acquired in past tasks or (II) using a strong initialization of the parameters for downstream tasks (pretraining). Methods based on (I) rely on a small memory buffer to store a limited set of examples from previous tasks. However, their effectiveness depends entirely on the content of the buffer, making them vulnerable when faced with real-world constraints such as incomplete or erroneous annotations or rapidly changing distributions. With the goal of making rehearsal-based methods more reliable in practical scenarios, this thesis begins by introducing new strategies to: incorporate novel information regarding past data discovered as the tasks progress (X-DER), mitigate the overfitting of the buffer (LiDER), bridge the gap between rehearsal-based methods and pretraining (TwF), and design more efficient Self-Supervised regularizers for the single-epoch CL scenario (CLER). The thesis then extends rehearsal-based approaches to handle noisy labels during training (AER & ABS) and scenarios with incomplete supervision (CCIC). Building on the second category (II), the thesis explores CL methods that rely on pretraining on large datasets before fine-tuning on the downstream tasks.
While this practice has been shown to be effective in improving the stability of the model, real-world scenarios are often characterized by high variance and rapidly changing trends. To address these challenges, the thesis explores specialized domains such as those involving satellite data, fine-grained classification, or medical imaging, which pose a distinct challenge as they involve a substantial domain shift from the pretraining dataset. It then introduces a novel approach to extend the zero-shot capabilities of multi-modal models to these specialized domains (CGIL) and a two-stage approach to mitigate the instabilities of current parameter-efficient fine-tuning strategies (STAR-Prompt). Finally, the thesis broadens its scope to applications outside the realm of Continual Learning, focusing on the use of satellite imagery and graph neural networks to monitor the spread of the West Nile Virus and its primary vector, the Culex pipiens mosquito (MAGAT). These works emphasize the importance of adapting machine learning methods to specialized, real-world challenges. The work presented herein provides a comprehensive exploration of the current state-of-the-art in Continual Learning, extending its application to more realistic and specialized domains. Overall, the aim is to contribute towards the development of more robust, adaptive, and efficient AI systems that can thrive in complex and dynamic environments.
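The rehearsal mechanism underlying approaches of category (I) can be illustrated with a minimal sketch: a fixed-capacity memory buffer populated via reservoir sampling, from which past examples are drawn and mixed into training on the current task. The class name and its methods are hypothetical and do not correspond to any specific method in the thesis; they only convey the general idea.

```python
import random

class ReplayBuffer:
    """Minimal rehearsal buffer (illustrative sketch, not a specific method).

    Reservoir sampling keeps a uniform sample over the whole stream,
    so the buffer content is not biased toward the latest task.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.examples = []   # stored (input, label) pairs
        self.seen = 0        # total number of stream examples observed

    def add(self, example):
        # Reservoir sampling: the i-th example is kept with
        # probability capacity / i once the buffer is full.
        self.seen += 1
        if len(self.examples) < self.capacity:
            self.examples.append(example)
        else:
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.examples[idx] = example

    def sample(self, k):
        # Draw a rehearsal mini-batch to interleave with current-task data.
        return random.sample(self.examples, min(k, len(self.examples)))
```

In a training loop, each incoming batch would be pushed into the buffer while a rehearsal batch sampled from it is replayed alongside the new data, which is the basic defense against catastrophic forgetting that the strategies above (X-DER, LiDER, AER & ABS) refine.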
Citation:
Bonicelli, Lorenzo. "Continual Learning in Realistic Scenarios: from Natural to Specialized Domains." 2025.