Xudong Lin
34 papers · 2018–2025 · 8 top CS/AI conferences
Achievements
Renaissance Researcher
(7)
Interdisciplinary Bridge
Academic Marathon
(7)
Conference Polyglot
(8)
Taxonomy Completionist
(72)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo
(24)
Topic Evolution
Deep Specialist
(15)
Mega-Team
(34)
Unstoppable
(8)
Century Club
(34)
Prolific Year
(7)
Conference Pioneer
Keyword Collector
(151)
Conferences
CVPR (11)
EMNLP (7)
AAAI (4)
NAACL (4)
ECCV (3)
ICLR (3)
ACL (1)
NeurIPS (1)
Keywords
multimodal learning
(15)
video understanding
(7)
zero-shot learning
(4)
event extraction
(4)
video question answering
(3)
contrastive learning
(3)
video grounding
(3)
few-shot learning
(3)
video captioning
(3)
semantic alignment
(2)
video retrieval
(2)
event coreference
(2)
visual grounding
(2)
coreference resolution
(2)
transfer learning
(2)
unsupervised learning
(2)
visual question answering
(2)
weakly supervised learning
(2)
vision transformer
(2)
self-supervised learning
(2)
Papers
PuzzleGPT: Emulating Human Puzzle-Solving Ability for Time and Location Prediction
NAACL 2025
LOFT: Scalable and More Realistic Long-Context Evaluation
NAACL 2025
BLINK: Multimodal Large Language Models Can See but Not Perceive
ECCV 2024
Training-free Deep Concept Injection Enables Language Models for Video Question Answering
EMNLP 2024
VIEWS: Entity-Aware News Video Captioning
EMNLP 2024
Personalized Video Comment Generation
EMNLP 2024
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses
EMNLP 2024
SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos
ICLR 2024
Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities
AAAI 2024
TempCLR: Temporal Alignment Representation with Contrastive Learning
ICLR 2023
Learning to Decompose Visual Features with Latent Textual Prompts
ICLR 2023
Video-Text Pre-training with Learned Regions for Retrieval
AAAI 2023
Video Event Extraction via Tracking Visual States of Arguments
AAAI 2023
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-Channel Video-Language Retrieval
CVPR 2023
All in One: Exploring Unified Video-Language Pre-Training
CVPR 2023
Non-Sequential Graph Script Induction via Multimedia Grounding
ACL 2023
Supervised Masked Knowledge Distillation for Few-Shot Transformers
CVPR 2023
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
NeurIPS 2022
MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
AAAI 2022
Object-Aware Video-Language Pre-Training for Retrieval
CVPR 2022
Learning To Recognize Procedural Activities With Distant Supervision
CVPR 2022
CLIP-Event: Connecting Text and Images With Event Structures
CVPR 2022
Weakly-Supervised Temporal Article Grounding
EMNLP 2022
RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios
NAACL 2022
Joint Multimedia Event Extraction from Video and Article
EMNLP 2021
Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos
CVPR 2021
Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
CVPR 2021
RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System
NAACL 2021
Coreference by Appearance: Visually Grounded Event Coreference Resolution
EMNLP 2021
Context-Gated Convolution
ECCV 2020
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition
CVPR 2019
Deep Variational Metric Learning
ECCV 2018
GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning
CVPR 2018
Deep Adversarial Metric Learning
CVPR 2018