Papers

2,781 papers found
LLMs are Good Action Recognizers
Haoxuan Qu, Yujun Cai, Jun Liu
2024 CVPR
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed et al.
2024 CVPR
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Shengbang Tong, Zhuang Liu, Yuexiang Zhai et al.
2024 CVPR
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Yuechen Zhang, Shengju Qian, Bohao Peng et al.
2024 CVPR
Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs
Lin Song, Yukang Chen, Shuai Yang et al.
2024 CVPR
Link-Context Learning for Multimodal LLMs
Yan Tai, Weichen Fan, Zhao Zhang et al.
2024 CVPR
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura, Antoine Yang, Cordelia Schmid et al.
2025 CVPR
Empowering LLMs to Understand and Generate Complex Vector Graphics
Ximing Xing, Juncheng Hu, Guotao Liang et al.
2025 CVPR
SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs
Junsheng Wang, Nieqing Cao, Yan Ding et al.
2025 CVPR