Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs
Citation:
Caffagni, Davide; Cocchi, Federico; Moratelli, Nicholas; Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita. "Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, Jun 17-21, 2024.