Research on Fashion AI

Multimodal Image Editing

Multimodal-conditioned fashion image editing aims to generate realistic images of a person wearing new garments by conditioning on multiple types of input. Unlike standard text-to-image generation, this task combines several modalities, such as human pose, garment sketches, and textual descriptions, to guide image synthesis. Our research explores how to integrate these diverse constraints effectively, enabling fine-grained control over garment appearance and fit while preserving the person’s identity and pose.

Virtual Try-On

Virtual try-on is the task of generating realistic images of a person wearing a target garment, without the person having to try it on physically. Our research focuses on image-based virtual try-on methods, which take a photo of a person and an image of a clothing item as input and produce a photorealistic image of the person wearing that item.