Papers

2,781 papers found
Analyzing Finetuning Representation Shift for Multimodal LLMs Steering
Pegah Khayatan, Mustafa Shukor, Jayneel Parekh et al.
2025 ICCV
Social Debiasing for Fair Multi-modal LLMs
Harry Cheng, Yangyang Guo, Qingpei Guo et al.
2025 ICCV
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
Chunwei Wang, Guansong Lu, Junwei Yang et al.
2025 ICCV
Controlling Multimodal LLMs via Reward-guided Decoding
Oscar Mañas, Pierluca D'Oro, Koustuv Sinha et al.
2025 ICCV
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
2025 ICCV
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.
2025 ICCV
MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
Erik Daxberger, Nina Wenzel, David Griffiths et al.
2025 ICCV
AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.
2025 ICCV
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.
2025 ICCV
Token Activation Map to Visually Explain Multimodal LLMs
Yi Li, Hualiang Wang, Xinpeng Ding et al.
2025 ICCV
ARGUS: Hallucination and Omission Evaluation in Video-LLMs
Ruchit Rawal, Reza Shirkavand, Heng Huang et al.
2025 ICCV