Yuxin Song
8 papers · 2023–2025 · 4 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (24)
Renaissance Researcher (5)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Conference Polyglot (4)
Cross-Pollinator (12)
The Questioner
Conferences
ICCV (3)
NIPS (3)
CVPR (1)
ICML (1)
Keywords
multimodal large language model (3)
video understanding (2)
vision-language model (2)
vision transformer (1)
feature extraction (1)
object detection (1)
uncertainty modeling (1)
preference learning (1)
direct preference optimization (1)
video captioning (1)
video classification (1)
attention mechanism (1)
text generation (1)
preference optimization (1)
image captioning (1)
multimodal learning (1)
visual question answering (1)
visual grounding (1)
reinforcement learning from human feedback (1)
temporal modeling (1)
Papers
DistinctAD: Distinctive Audio Description Generation in Contexts
CVPR 2025
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization
ICML 2025
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
ICCV 2025
Automated Multi-level Preference for MLLMs
NIPS 2024
Dense Connector for MLLMs
NIPS 2024
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding
NIPS 2024
What Can Simple Arithmetic Operations Do for Temporal Modeling?
ICCV 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
ICCV 2023