Meng Cao

57 papers · 2019–2026 · 15 conferences · across top CS/AI conferences

Achievements


🐣 Hot Topic Early Bird 🌍 Conference Polyglot (15) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (6)

🤝 Dynamic Duo (11) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (24) 🔬 Deep Specialist (11) 🏆 Keyword Champion (2) 🚀 Conference Pioneer ⚡ Prolific Year (5) 🗃️ Keyword Collector (221) 📈 Trend Setter 💎 Century Club (52) 🔥 Unstoppable (7) ❓ The Questioner (3)

Conferences

EMNLP (13) AAAI (9) ACL (7) ICLR (6) CVPR (4) ICCV (4) ECCV (3) NIPS (3) ICML (2) IJCAI (1) IJCNLP (1) JMLR (1) MICCAI (1) NAACL (1) WACV (1)

Top co-authors

Jiulong Shan (11) Yuexian Zou (8) Haoping Bai (8) Can Zhang (7) Zhengfeng Lai (7) Jackie Chi Kit CHEUNG (7) Ping Huang (6) Jackie CK Cheung (6) Xiaodan Liang (5) Haotian Zhang (4)

Keywords

large language model (7) contrastive learning (5) abstractive summarization (4) neural network (4) video grounding (4) video understanding (3) text generation (3) temporal localization (3) vision-language model (3) benchmark evaluation (3) weakly-supervised learning (2) text summarization (2) semantic alignment (2) message passing (2) diffusion model (2) continual learning (2) domain generalization (2) reinforcement learning (2) question answering (2) video generation (2)

Papers

Video Spatial Reasoning with Object-Centric 3D Rollout AAAI 2026
Bring Your Dreams to Life: Continual Text-to-Video Customization AAAI 2026
Beyond Observations: Reconstruction Error-Guided Irregularly Sampled Time Series Representation Learning AAAI 2026
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models AAAI 2026
Putting Captions to the Test: Evaluating Video Caption Quality through Multiple-Choice Question Answering ACL 2026
Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation EMNLP 2025
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation EMNLP 2025
MR. Judge: Multimodal Reasoner as a Judge EMNLP 2025
Where Did That Come From? Sentence-Level Error-Tolerant Attribution EMNLP 2025
MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval AAAI 2025
Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs ACL 2025
See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models ACL 2025
The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction ACL 2025
AnyTalk: Multi-modal Driven Multi-domain Talking Head Generation AAAI 2025
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains NAACL 2025
TimeCHEAT: A Channel Harmony Strategy for Irregularly Sampled Multivariate Time Series Analysis AAAI 2025
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers JMLR 2025
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights ICLR 2025
Contrastive Localized Language-Image Pre-Training ICML 2025
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models ICLR 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation ICCV 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering ICCV 2025
EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images ICCV 2025
Enhancing Reinforcement Learning with Dense Rewards from Language Model Critic EMNLP 2024
How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? NIPS 2024
Mixup-Induced Domain Extrapolation for Domain Generalization AAAI 2024
Exploiting Auxiliary Caption for Video Grounding AAAI 2024
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation ACL 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter ACL 2024
Real-Time Exposure Correction via Collaborative Transformations and Adaptive Sampling CVPR 2024
VeCLIP: Improving CLIP Training via Visual-enriched Captions ECCV 2024
Uncertainty-aware sign language video retrieval with probability distribution modeling ECCV 2024
Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations EMNLP 2024
Efficient ConvBN Blocks for Transfer Learning and Beyond ICLR 2024
Successor Features for Efficient Multi-Subject Controlled Text Generation ICML 2024
Textual Inversion and Self-supervised Refinement for Radiology Report Generation MICCAI 2024
Empowering Unsupervised Domain Adaptation With Large-Scale Pre-Trained Vision-Language Models WACV 2024
Analyzing Multi-Sentence Aggregation in Abstractive Summarization via the Shapley Value EMNLP 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory ICCV 2023
Iterative Proposal Refinement for Weakly-Supervised Video Grounding CVPR 2023
RGI: robust GAN-inversion for mask-free image inpainting and unsupervised pixel-wise anomaly detection ICLR 2023
Systematic Rectification of Language Models via Dead-end Analysis ICLR 2023
Responsible AI Considerations in Text Summarization Research: A Review of Current Practices EMNLP 2023
Learning with Rejection for Abstractive Text Summarization EMNLP 2022
Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels ICLR 2022
Unsupervised Pre-Training for Temporal Action Localization Tasks CVPR 2022
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization ACL 2022
LocVTP: Video-Text Pre-training for Temporal Localization ECCV 2022
RIM: Reliable Influence-based Active Learning on Graphs NIPS 2021
BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer NIPS 2021
CoLA: Weakly-Supervised Temporal Action Localization With Snippet Contrastive Learning CVPR 2021
On Pursuit of Designing Multi-modal Transformer for Video Grounding EMNLP 2021
RR-Net: Injecting Interactive Semantics in Human-Object Interaction Detection IJCAI 2021
Factual Error Correction for Abstractive Summarization Models EMNLP 2020
TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion EMNLP 2020
Referring Expression Generation Using Entity Profiles IJCNLP 2019
Referring Expression Generation Using Entity Profiles EMNLP 2019