Lei Ji
32 papers · 2019–2026 · 12 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (6)
Renaissance Researcher (9)
Conference Polyglot (12)
Taxonomy Completionist (45)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (18)
Grand Slam
Topic Evolution
Prolific Year (9)
Keyword Collector (113)
Century Club (30)
Unstoppable (7)
Conferences
ACL (9)
IJCNLP (4)
AAAI (3)
NIPS (3)
CVPR (2)
ECCV (2)
EMNLP (2)
ICLR (2)
NAACL (2)
ICML (1)
IJCAI (1)
WACV (1)
Research topics
Keywords
video understanding (7)
multimodal learning (4)
instructional video (3)
large language model (3)
contrastive learning (3)
video-level context (2)
attention guidance (2)
visual reasoning (2)
local context (2)
attention mechanism (2)
global context (2)
chest x-ray (2)
image captioning (2)
video captioning (2)
multi-modal learning (2)
vision-language model (2)
dense captioning (2)
video question answering (2)
sentiment classification (1)
chain-of-thought reasoning (1)
Papers
Too Long, Do Re-weighting for Efficient LLM Reasoning Compression
ACL 2026
Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training
ACL 2026
Explore the Reasoning Capability of LLMs in the Chess Testbed
NAACL 2025
Generative Prompt Internalization
NAACL 2025
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling
ICML 2025
ToolGen: Unified Tool Retrieval and Calling via Generation
ICLR 2025
AssistGUI: Task-Oriented PC Graphical User Interface Automation
CVPR 2024
Voila-A: Aligning Vision-Language Models with User's Gaze Attention
NIPS 2024
HORIZON: High-Resolution Semantically Controlled Panorama Synthesis
AAAI 2024
Exploring Diffusion Time-steps for Unsupervised Representation Learning
ICLR 2024
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering
CVPR 2023
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
ACL 2023
KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization
ACL 2023
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
NIPS 2023
Trace Controlled Text to Image Generation
ECCV 2022
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
ECCV 2022
Learning Temporal Video Procedure Segmentation From an Automatically Collected Large Dataset
WACV 2022
Learning from Inside: Self-driven Siamese Sampling and Reasoning for Video Question Answering
NIPS 2021
Control Image Captioning Spatially and Temporally
ACL 2021
Hierarchical Context-aware Network for Dense Video Event Captioning
IJCNLP 2021
Control Image Captioning Spatially and Temporally
IJCNLP 2021
Hashing based Efficient Inference for Image-Text Matching
IJCNLP 2021
GEM: A General Evaluation Benchmark for Multimodal Tasks
IJCNLP 2021
Hierarchical Context-aware Network for Dense Video Event Captioning
ACL 2021
GEM: A General Evaluation Benchmark for Multimodal Tasks
ACL 2021
Hashing based Efficient Inference for Image-Text Matching
ACL 2021
A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos
EMNLP 2020
Functionality Discovery and Prediction of Physical Objects
AAAI 2020
Segment-Then-Rank: Non-Factoid Question Answering on Instructional Videos
AAAI 2020
GRACE: Gradient Harmonized and Cascaded Labeling for Aspect-based Sentiment Analysis
EMNLP 2020
Dense Procedure Captioning in Narrated Instructional Videos
ACL 2019
Knowledge Aware Semantic Concept Expansion for Image-Text Matching
IJCAI 2019