Mu Cai
15 papers · 2021–2025 · 8 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (24)
Interdisciplinary Bridge
Conference Polyglot (8)
Renaissance Researcher (5)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (11)
Prolific Year (5)
Century Club (15)
Unstoppable (5)
Keyword Collector (51)
Conferences
ICCV (3)
ICLR (3)
CVPR (2)
ECCV (2)
WACV (2)
ACL (1)
EMNLP (1)
NeurIPS (1)
Keywords
large language model (3)
large multimodal model (3)
visual question answering (3)
multimodal learning (3)
vision-language model (3)
image generation (2)
visual understanding (2)
visual grounding (1)
domain generalization (1)
benchmark evaluation (1)
visual reasoning (1)
image translation (1)
vision language model (1)
efficient computing (1)
image captioning (1)
semantic embedding (1)
text generation (1)
generative model (1)
robot manipulation (1)
knowledge distillation (1)
Papers
Magma: A Foundation Model for Multimodal AI Agents
CVPR 2025
An Investigation on LLMs' Visual Understanding Ability using SVG for Image-Text Bridging
WACV 2025
Matryoshka Multimodal Models
ICLR 2025
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
ICLR 2025
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
ICCV 2025
Yo'LLaVA: Your Personalized Language and Vision Assistant
NeurIPS 2024
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples
ACL 2024
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
CVPR 2024
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
ECCV 2024
VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
EMNLP 2024
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance
ICCV 2023
Out-of-Distribution Detection via Frequency-Regularized Generative Models
WACV 2023
VOS: Learning What You Don't Know by Virtual Outlier Synthesis
ICLR 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
ECCV 2022
Frequency Domain Image Translation: More Photo-Realistic, Better Identity-Preserving
ICCV 2021