
Virtual Try-On
Virtual try-on is the task of generating realistic images of a person wearing a target garment, without requiring physical trials. In our research, we focus on image-based virtual try-on methods that take as input a photo of a person and an image of a clothing item, and produce a photo-realistic output of the person wearing that item.

LaDI-VTON
Publication:
"LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On"
D. Morelli, A. Baldrati, G. Cartella, M. Cornia, M. Bertini, R. Cucchiara
Proceedings of the ACM International Conference on Multimedia (ACM MM), 2023
Abstract. The rapidly evolving fields of e-commerce and metaverse continue to seek innovative approaches to enhance the consumer experience. At the same time, recent advancements in the development of diffusion models have enabled generative networks to create remarkably realistic images. In this context, image-based virtual try-on, which consists of generating a novel image of a target model wearing a given in-shop garment, has yet to capitalize on the potential of these powerful generative solutions. This work introduces LaDI-VTON, the first Latent Diffusion textual Inversion-enhanced model for the Virtual Try-ON task. The proposed architecture relies on a latent diffusion model extended with a novel additional autoencoder module that exploits learnable skip connections to enhance the generation process, preserving the model’s characteristics. To effectively maintain the texture and details of the in-shop garment, we propose a textual inversion component that can map the visual features of the garment to the CLIP token embedding space and thus generate a set of pseudo-word token embeddings capable of conditioning the generation process. Experimental results on Dress Code and VITON-HD datasets demonstrate that our approach outperforms the competitors by a consistent margin, achieving a significant milestone for the task.
Keywords: Dataset, Latent Diffusion Models, Generative AI
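The textual inversion idea described in the abstract can be illustrated with a toy sketch: a garment feature vector is projected into the token embedding space to yield a few pseudo-word embeddings, which then take the place of a placeholder token in the prompt. This is only an illustration under stated assumptions. The linear projections, the dimensions, and every function name below are hypothetical stand-ins, not the architecture or code of LaDI-VTON, and the random weights stand in for a trained module.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Random weights standing in for a trained projection (hypothetical)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def apply_linear(weights, vec):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def garment_to_pseudo_tokens(garment_features, adapters):
    """Map one garment feature vector to one pseudo-word embedding per adapter."""
    return [apply_linear(weights, garment_features) for weights in adapters]

def build_conditioning(prompt_embeddings, placeholder_index, pseudo_tokens):
    """Replace the placeholder token embedding with the pseudo-word embeddings."""
    return (prompt_embeddings[:placeholder_index]
            + pseudo_tokens
            + prompt_embeddings[placeholder_index + 1:])

rng = random.Random(0)
FEATURE_DIM, TOKEN_DIM, NUM_PSEUDO_TOKENS = 8, 4, 3  # toy sizes, assumed for illustration

adapters = [make_linear(FEATURE_DIM, TOKEN_DIM, rng) for _ in range(NUM_PSEUDO_TOKENS)]
garment_features = [rng.random() for _ in range(FEATURE_DIM)]  # stand-in for visual features
prompt_embeddings = [[0.0] * TOKEN_DIM for _ in range(5)]      # 5 prompt tokens, index 3 is the placeholder

pseudo_tokens = garment_to_pseudo_tokens(garment_features, adapters)
conditioning = build_conditioning(prompt_embeddings, 3, pseudo_tokens)
print(len(conditioning), len(conditioning[0]))
```

In a real system the resulting sequence would be passed through the text encoder and used to condition the denoising network; here it is only assembled to show how the pseudo-word embeddings slot into the prompt.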
Dress Code
Publication:
"Dress Code: High-Resolution Multi-Category Virtual Try-On"
D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara
Proceedings of the European Conference on Computer Vision (ECCV), 2022
Abstract. Image-based virtual try-on strives to transfer the appearance of a clothing item onto the image of a target person. Existing literature focuses mainly on upper-body clothes (e.g., t-shirts, shirts, and tops) and neglects full-body or lower-body items. This shortcoming arises from a primary factor: current publicly available datasets for image-based virtual try-on do not account for this variety, thus limiting progress in the field. In this research activity, we introduce Dress Code, a novel dataset that contains images of multi-category clothes. Dress Code is more than 3x larger than publicly available datasets for image-based virtual try-on and features high-resolution paired images (1024x768) with front-view, full-body reference models. We propose to learn fine-grained discriminating features to generate HD try-on images with high visual quality and rich details. Specifically, we leverage a semantic-aware discriminator that makes predictions at the pixel level instead of the image patch level.
Keywords: Virtual Try-on Dataset, GAN, Generative AI
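To make the idea of pixel-level, semantic-aware discrimination concrete, the sketch below computes a per-pixel classification loss. It assumes one common formulation, in which the discriminator predicts, for every pixel, one of N semantic classes or an extra "fake" class. The abstract does not spell out this formulation, so the class layout, the toy label map, and all names here are assumptions for illustration only. A patch-level discriminator would instead output a single real/fake score per image patch.

```python
import math

NUM_SEMANTIC_CLASSES = 3            # toy value, assumed for illustration
FAKE_CLASS = NUM_SEMANTIC_CLASSES   # extra class index reserved for generated pixels

def pixel_targets(label_map, is_real):
    """Per-pixel targets: the semantic label for real images, FAKE_CLASS for generated ones."""
    return [[label if is_real else FAKE_CLASS for label in row] for row in label_map]

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixel_cross_entropy(logit_map, target_map):
    """Mean cross-entropy over all pixels; logit_map[y][x] is a list of class logits."""
    losses = []
    for logit_row, target_row in zip(logit_map, target_map):
        for logits, target in zip(logit_row, target_row):
            losses.append(-math.log(softmax(logits)[target]))
    return sum(losses) / len(losses)

# A 2x4 toy semantic label map (e.g. 0 = background, 1 = garment, 2 = skin).
label_map = [[0, 0, 1, 1],
             [0, 2, 2, 1]]

# An untrained discriminator is mimicked by all-zero logits at every pixel.
uniform_logits = [[[0.0] * (NUM_SEMANTIC_CLASSES + 1) for _ in row] for row in label_map]

real_loss = pixel_cross_entropy(uniform_logits, pixel_targets(label_map, is_real=True))
fake_targets = pixel_targets(label_map, is_real=False)
print(real_loss)
```

Because every pixel carries its own target, the generator receives feedback that is localized both spatially and semantically, which is the intuition behind preferring pixel-level over patch-level predictions.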
Publications
1 | Morelli, Davide; Baldrati, Alberto; Cartella, Giuseppe; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita "LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On" MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, Canada, pp. 8580-8589, October 29-November 3, 2023 | DOI: 10.1145/3581783.3612137 Conference |
2 | Morelli, Davide; Fincato, Matteo; Cornia, Marcella; Landi, Federico; Cesari, Fabio; Cucchiara, Rita "Dress Code: High-Resolution Multi-Category Virtual Try-On" Proceedings of the European Conference on Computer Vision (Lecture Notes in Computer Science), vol. 13668, Tel Aviv, pp. 345-362, October 23-27, 2022 | DOI: 10.1007/978-3-031-20074-8_20 Conference |
3 | Fenocchi, Emanuele; Morelli, Davide; Cornia, Marcella; Baraldi, Lorenzo; Cesari, Fabio; Cucchiara, Rita "Dual-Branch Collaborative Transformer for Virtual Try-On" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, vol. 2022-, New Orleans, Louisiana, pp. 2246-2250, June 19-24, 2022 | DOI: 10.1109/CVPRW56347.2022.00246 Conference |
4 | Morelli, Davide; Fincato, Matteo; Cornia, Marcella; Landi, Federico; Cesari, Fabio; Cucchiara, Rita "Dress Code: High-Resolution Multi-Category Virtual Try-On" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, vol. 2022-, New Orleans, Louisiana, pp. 2230-2234, June 19-24, 2022 | DOI: 10.1109/CVPRW56347.2022.00243 Conference |
5 | Fincato, Matteo; Cornia, Marcella; Landi, Federico; Cesari, Fabio; Cucchiara, Rita "Transform, Warp, and Dress: A New Transformation-Guided Model for Virtual Try-On" ACM Transactions on Multimedia Computing, Communications and Applications, vol. 18, pp. 1-23, 2022 | DOI: 10.1145/3491226 Journal |
6 | Fincato, Matteo; Landi, Federico; Cornia, Marcella; Cesari, Fabio; Cucchiara, Rita "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations" Proceedings of the 25th International Conference on Pattern Recognition, Milan, Italy, pp. 7669-7676, 10-15 January 2021 | DOI: 10.1109/ICPR48806.2021.9412052 Conference |