Advanced Discretization Techniques in the Era of Deep Learning
Abstract: Deep learning has transformed how we tackle complex tasks, but challenges persist, particularly in lifelong learning and in producing reliable predictions in dynamic environments. This thesis investigates advanced discretization techniques for two such domains: Continual Learning (CL) and trajectory forecasting. Both pose distinct challenges in managing latent spaces and ensuring long-term adaptability. Discretization plays a pivotal role in handling graph structures and in latent space quantization: it simplifies complex, continuous data by structuring it in a more analyzable, model-friendly format. In graph structures, discretization captures relationships between entities, making these connections more interpretable and manageable. Latent space discretization, in turn, transforms continuous latent variables into discrete ones, improving the interpretability and efficiency of machine learning models. This is particularly advantageous in tasks such as clustering, representation learning, and generative modeling, where clear, discrete categories in latent space allow models to generalize more effectively and produce more robust predictions.

The first part of the thesis investigates catastrophic forgetting in Artificial Neural Networks (ANNs) during Continual Learning. Unlike biological intelligence, which integrates new knowledge throughout life without losing prior understanding, ANNs struggle when the training data distribution is non-stationary. CaSpeR-IL is introduced as a geometric regularizer that enhances the stability of rehearsal-based CL methods by enforcing spectral constraints on the latent space. Specifically, it mitigates the disruption caused by class interference during data replay, promoting a better partitioning of the latent space.
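The spectral idea behind such a regularizer can be sketched as follows: build a k-nearest-neighbour similarity graph over a batch of latent features and penalize the smallest eigenvalues of its normalized Laplacian, which encourages the graph to split into well-separated class-wise clusters. All names and details below are illustrative assumptions, not the exact CaSpeR-IL objective.

```python
import torch

def spectral_regularizer(z, labels, k=5):
    """Illustrative graph-spectral penalty on a batch of latent features.

    Builds a k-NN similarity graph over the latents `z`, computes the
    normalized graph Laplacian, and returns the sum of its c smallest
    eigenvalues (c = number of classes in the batch). Driving these
    toward zero pushes the latent graph toward c disconnected,
    class-wise components. A hedged sketch, not the thesis code.
    """
    # Pairwise cosine similarities between latent vectors
    z = torch.nn.functional.normalize(z, dim=1)
    sim = z @ z.T
    # Keep only each point's k nearest neighbours (+1: each point is its own NN)
    topk = sim.topk(k + 1, dim=1).indices
    adj = torch.zeros_like(sim)
    adj.scatter_(1, topk, 1.0)
    # Symmetrize the mask and weight edges by non-negative similarity
    adj = ((adj + adj.T) > 0).float() * sim.clamp(min=0)
    adj.fill_diagonal_(0)
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    deg = adj.sum(1)
    d_inv_sqrt = deg.clamp(min=1e-8).rsqrt()
    lap = torch.eye(len(z)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # Eigenvalues in ascending order; sum the c smallest
    eigvals = torch.linalg.eigvalsh(lap)
    c = int(labels.unique().numel())
    return eigvals[:c].sum()
```

In practice a term like this would be added, with a small weight, to the rehearsal loss computed on the replay buffer.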
CaSpeR-IL improves the state-of-the-art performance of CL models on standard benchmarks by maintaining more consistent predictions, even under memory constraints.

The second part addresses trajectory forecasting, a key component of fields such as video surveillance and sports analytics. Forecasting the future movements of agents, such as basketball players interacting in real time, requires a deep understanding of their intentions. Here, Vector Quantized Variational Autoencoders (VQ-VAEs) are exploited: their discrete latent space prevents posterior collapse while capturing diverse future trajectories. The thesis proposes a novel adaptation mechanism based on low-rank updates to the latent codebook, enabling instance-level customization of latent representations. Past motion patterns and contextual information thus dynamically shape the latent space, leading to more accurate and diverse trajectory predictions. Combining this approach with a diffusion-based predictive model is empirically shown to achieve state-of-the-art performance on multiple trajectory forecasting benchmarks.

Overall, this work comprehensively studies discretization techniques in deep learning, showcasing the power of geometric and latent-space regularization strategies in continual learning and trajectory forecasting.
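The two latent-space mechanisms described above, VQ-VAE quantization and a low-rank codebook correction, can be sketched in a few lines. The function below snaps each encoder output to its nearest codebook entry (with the standard straight-through gradient trick) and optionally shifts the shared codebook by a rank-r product A @ B, which an instance-conditioned network could predict from past motion and context. Shapes, names, and the exact placement of the low-rank update are assumptions for illustration, not the thesis implementation.

```python
import torch

def quantize(z_e, codebook, A=None, B=None):
    """Nearest-neighbour VQ-VAE quantization with an optional low-rank,
    per-instance codebook update (codebook + A @ B). Hedged sketch.

    z_e:      (N, D) continuous encoder outputs
    codebook: (K, D) shared discrete latent codebook
    A, B:     (K, r) and (r, D) low-rank factors, r << D
    """
    # Optionally adapt the shared codebook with a low-rank correction
    # predicted from the current instance's context
    if A is not None and B is not None:
        codebook = codebook + A @ B  # (K, r) @ (r, D) -> (K, D)
    # Nearest codebook entry for each encoder output (L2 distance)
    dist = torch.cdist(z_e, codebook)  # (N, K)
    idx = dist.argmin(dim=1)           # discrete latent codes
    z_q = codebook[idx]                # quantized latents
    # Straight-through estimator: gradients bypass the argmin
    z_q = z_e + (z_q - z_e).detach()
    return z_q, idx
```

A downstream predictor, such as the diffusion model mentioned above, would then condition on the discrete codes `idx` or the quantized latents `z_q`.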
Citation:
Benaglia, Riccardo. "Tecniche Avanzate di Discretizzazione nell'Era del Deep Learning." 2025.