Manling Li

53 papers · 2019–2026 · 10 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🏃 Academic Marathon (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (7)

🌈 Renaissance Researcher (10) 🐣 Hot Topic Early Bird 👥 Mega-Team (34) 🏆 Grand Slam 🤝 Dynamic Duo (39) 🔬 Deep Specialist (15) 🧬 Topic Evolution 🏆 Keyword Champion (3) 📈 Trend Setter 🗃️ Keyword Collector (234) ⚡ Prolific Year (11) 🔥 Unstoppable (7) 💎 Century Club (51) ❓ The Questioner (2)

Conferences

ACL (14) EMNLP (9) NAACL (8) AAAI (5) NIPS (5) CVPR (4) ICML (4) ICLR (2) COLING (1) IJCNLP (1)

Top co-authors

Heng Ji (39) Shih-fu Chang (14) Xudong Lin (12) Sha Li (10) Pengfei Yu (6) Chi Han (6) Jiajun Wu (6) Zhenhailong Wang (6) Clare Voss (5) Kathleen McKeown (5)

Research topics

Education (1)

Keywords

multimodal learning (12) large language model (9) video understanding (8) event extraction (8) knowledge graph (6) zero-shot learning (5) relation extraction (5) information extraction (5) event schema (4) vision-language model (4) event prediction (4) language model (4) few-shot learning (3) visual reasoning (3) multimedia event extraction (3) contrastive learning (3) knowledge extraction (3) coreference resolution (3) schema induction (3) event schema induction (3)

Papers

WorldAgen: Unified State-Action Prediction with Test-Time World Model Training · AAAI 2026
Trajectory2Task: Training Robust Tool-Calling Agents with Synthesized Yet Verifiable Data for Complex User Intents · ACL 2026
SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering · ICML 2025
From Large Language Models to Large Action Models: Reasoning and Planning with Physical World Knowledge · AAAI 2025
The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination · ACL 2025
LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World · ACL 2025
The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination · ACL 2025
LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models · CVPR 2025
Re-thinking Temporal Search for Long-Form Video Understanding · CVPR 2025
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models · ICLR 2025
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · ICML 2025
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas · ICML 2025
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents · ICML 2025
Foundation Models Meet Embodied Agents · NAACL 2025
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos · NIPS 2024
HourVideo: 1-Hour Video-Language Understanding · NIPS 2024
Training-free Deep Concept Injection Enables Language Models for Video Question Answering · EMNLP 2024
Why Does New Knowledge Create Messy Ripple Effects in LLMs? · EMNLP 2024
Word Embeddings Are Steers for Language Models · ACL 2024
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making · NIPS 2024
Non-Sequential Graph Script Induction via Multimedia Grounding · ACL 2023
Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification · ACL 2023
Multimedia Generative Script Learning for Task Planning · ACL 2023
A Language-First Approach for Procedure Planning · ACL 2023
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-Channel Video-Language Retrieval · CVPR 2023
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation · EMNLP 2023
Defining a New NLP Playground · EMNLP 2023
Learning to Decompose Visual Features with Latent Textual Prompts · ICLR 2023
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting · NIPS 2023
Video Event Extraction via Tracking Visual States of Arguments · AAAI 2023
ADEPT: A DEbiasing PrompT Framework · AAAI 2023
MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding · AAAI 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners · NIPS 2022
Event Schema Induction with Double Graph Autoencoders · NAACL 2022
CLIP-Event: Connecting Text and Images With Event Structures · CVPR 2022
Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning · COLING 2022
COVID-19 Claim Radar: A Structured Claim Extraction and Tracking System · ACL 2022
RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios · NAACL 2022
New Frontiers of Information Extraction · NAACL 2022
Joint Multimedia Event Extraction from Video and Article · EMNLP 2021
Event-Centric Natural Language Processing · IJCNLP 2021
RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System · NAACL 2021
COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation · NAACL 2021
Event-Centric Natural Language Processing · ACL 2021
GENE: Global Event Network Embedding · NAACL 2021
The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction · EMNLP 2021
Timeline Summarization based on Event Graph Compression via Time-Aware Optimal Transport · EMNLP 2021
Coreference by Appearance: Visually Grounded Event Coreference Resolution · EMNLP 2021
GAIA: A Fine-grained Multimedia Knowledge Extraction System · ACL 2020
Cross-media Structured Common Space for Multimedia Event Extraction · ACL 2020
Connecting the Dots: Event Graph Schema Induction with Path Language Modeling · EMNLP 2020
Multilingual Entity, Relation, Event and Human Value Extraction · NAACL 2019
Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization · ACL 2019