Yuanxin Liu

25 papers · 2018–2026 · 11 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5)

🗺️ Taxonomy Completionist (52) 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (12) 🏆 Keyword Champion (2) 🧬 Topic Evolution 🏆 Grand Slam ⚡ Prolific Year (5) 🗃️ Keyword Collector (88) 📈 Trend Setter 💎 Century Club (23) 🔥 Unstoppable (5) ❓ The Questioner

Conferences

EMNLP (7) ACL (4) NIPS (3) AAAI (2) IJCAI (2) IJCNLP (2) CONLL (1) ECCV (1) ICLR (1) ICML (1) NAACL (1)

Top co-authors

Zheng Lin (12) Xu Sun (11) Weiping Wang (11) Jie Zhou (8) Fandong Meng (8) Peng Fu (6) Lei Li (6) Shicheng Li (5) Yanan Cao (5) Shuhuai Ren (5)

Keywords

model compression (6) image captioning (4) visual question answering (4) knowledge distillation (3) multimodal learning (3) dataset bias (2) neural network pruning (2) video large language model (2) domain generalization (2) visual attention (2) transfer learning (2) sampling strategy (2) vision-language model (2) multimodal large language model (2) neural network optimization (2) model pruning (2) supervised fine-tuning (2) bert compression (2) question answering (2) hidden state knowledge (2)

Papers

Investigating Cross-Modal Skill Injection: Scenarios, Methods, and Hyperparameters · ACL 2026
TEMPLE: Incentivizing Temporal Understanding of Video Large Language Models via Progressive Pre-SFT Alignment · AAAI 2026
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction · EMNLP 2025
PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension · ACL 2025
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning · ICML 2025
Temporal Reasoning Transfer from Text to Video · ICLR 2025
VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models · ECCV 2024
TempCompass: Do Video LLMs Really Understand Videos? · ACL 2024
Compressing and Debiasing Vision-Language Pre-Trained Models for Visual Question Answering · EMNLP 2023
FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation · NIPS 2023
Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA · EMNLP 2022
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models · NIPS 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models · EMNLP 2022
Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning · EMNLP 2022
Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training · NAACL 2022
Learning Class-Transductive Intent Representations for Zero-shot Intent Detection · IJCAI 2021
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation · ACL 2021
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation · IJCNLP 2021
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques · AAAI 2021
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations · NIPS 2019
Self-Adaptive Scaling for Learnable Residual Structure · CONLL 2019
Exploring and Distilling Cross-Modal Information for Image Captioning · IJCAI 2019
Ranking and Sampling in Open-Domain Question Answering · IJCNLP 2019
Ranking and Sampling in Open-Domain Question Answering · EMNLP 2019
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions · EMNLP 2018