Long Chen
108 papers · 2015–2026 · 16 top CS/AI conferences
Achievements
🏃 Academic Marathon (10)
🐝 Cross-Pollinator (12)
🌍 Conference Polyglot (16)
🧭 Keyword Pioneer
🌈 Renaissance Researcher (10)
🏠 Conference Loyalist (20)
🤝 Dynamic Duo (29)
🔬 Deep Specialist (25)
🏆 Keyword Champion (5)
🏆 Grand Slam
👑 Triple Crown
🗃️ Keyword Collector (398)
🚀 Conference Pioneer
💎 Century Club (99)
🔥 Unstoppable (11)
📈 Trend Setter
⚡ Prolific Year (15)
Conferences
AAAI (20)
CVPR (18)
ICLR (12)
EMNLP (11)
ACL (8)
ECCV (8)
NIPS (7)
ICML (5)
IJCAI (5)
ICCV (4)
ACML (3)
IJCNLP (2)
INTERSPEECH (2)
AACL (1)
CORL (1)
JMLR (1)
Research topics
Keywords
multimodal learning (9)
vision-language model (8)
large language model (6)
video localization (5)
graph neural network (5)
autonomous driving (5)
video grounding (4)
vision language model (4)
attention mechanism (4)
scene graph generation (4)
video understanding (4)
visual grounding (4)
diffusion model (4)
temporal localization (4)
weakly supervised learning (4)
knowledge distillation (4)
reinforcement learning (4)
object detection (4)
few-shot learning (3)
visual question answering (3)
Papers
Enhancing Diffusion Policies with Distribution-Matching Generator in Offline Reinforcement Learning
AAAI 2026
Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image
AAAI 2026
Heterogeneous Uncertainty-Guided Composed Image Retrieval with Fine-Grained Probabilistic Learning
AAAI 2026
Relation-R1: Progressively Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relation Comprehension
AAAI 2026
VILTA: A VLM-in-the-Loop Adversary for Enhancing Driving Policy Robustness
AAAI 2026
LAS: Loss-less ANN-SNN Conversion for Fully Spike-Driven Large Language Models
AAAI 2026
Spatial-Frequency Spiking Neural Network for Underwater Object Detection
AAAI 2026
What You See Is What You Reach: Towards Spatial Navigation with High-Level Human Instructions
AAAI 2026
Think before Go: Hierarchical Reasoning for Image-goal Navigation
ACL 2026
Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning
ICCV 2025
RED: Unleashing Token-Level Rewards from Holistic Feedback via Reward Redistribution
EMNLP 2025
Modeling Uncertainty in Composed Image Retrieval via Probabilistic Embeddings
ACL 2025
Event-Customized Image Generation
ICML 2025
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
ICML 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
ICLR 2025
Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer
ACL 2025
Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation
ICCV 2025
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
AAAI 2025
Learning Causal Transition Matrix for Instance-dependent Label Noise
AAAI 2025
Open-World Multimodal Understanding and Generation with Efficiently Finetuned Foundation Models
AAAI 2025
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
ICLR 2025
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
ICLR 2025
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
CVPR 2025
SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment
CVPR 2025
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
CVPR 2025
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
CVPR 2025
Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification
CVPR 2025
Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning
CVPR 2025
Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization
ICLR 2025
Multi-Resolution Decomposable Diffusion Model for Non-Stationary Time Series Anomaly Detection
ICLR 2025
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
ECCV 2024
LLMs Can Evolve Continually on Modality for $\mathbb{X}$-Modal Reasoning
NIPS 2024
$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
NIPS 2024
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
AAAI 2024
Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities
AAAI 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter
ACL 2024
Chain Association-based Attacking and Shielding Natural Language Processing Systems
ACML 2024
UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory
CVPR 2024
Distributionally Generative Augmentation for Fair Facial Attribute Classification
CVPR 2024
View-Consistent 3D Editing with Gaussian Splatting
ECCV 2024
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
ECCV 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
ECCV 2024
Generative End-to-End Autonomous Driving
ECCV 2024
LingoQA: Video Question Answering for Autonomous Driving
ECCV 2024
MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding
EMNLP 2024
Optimizing Language Models with Fair and Stable Reward Composition in Reinforcement Learning
EMNLP 2024
SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos
ICLR 2024
Towards efficient deep spiking neural networks construction with spiking activity based pruning
ICML 2024
ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces
IJCAI 2024
IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models
EMNLP 2023
Enhanced Chart Understanding via Visual Language Pre-training on Plot Table Pairs
ACL 2023
Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
NIPS 2023
Compositional Feature Augmentation for Unbiased Scene Graph Generation
ICCV 2023
Progressive Deep Multi-View Comprehensive Representation Learning
AAAI 2023
Two Heads are Better Than One: A Simple Exploration Framework for Efficient Multi-Agent Reinforcement Learning
NIPS 2023
Fairness-aware Contrastive Learning with Partially Annotated Sensitive Attributes
ICLR 2023
TempCLR: Temporal Alignment Representation with Contrastive Learning
ICLR 2023
Transformer Meets Boundary Value Inverse Problems
ICLR 2023
Video Scene Graph Generation from Single-Frame Weak Supervision
ICLR 2023
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
ICLR 2023
Iterative Proposal Refinement for Weakly-Supervised Video Grounding
CVPR 2023
Discrepancy-Guided Reconstruction Learning for Image Forgery Detection
IJCAI 2023
Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond
EMNLP 2023
Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models
EMNLP 2023
Rethinking Data Augmentation for Robust Visual Question Answering
ECCV 2022
Explicit Image Caption Editing
ECCV 2022
Rethinking Multi-Modal Alignment in Multi-Choice VideoQA from Feature and Sample Perspectives
EMNLP 2022
Weakly-Supervised Temporal Article Grounding
EMNLP 2022
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
ICLR 2022
A Frame-Based Model of Inherent Polysemy, Copredication and Argument Coercion
AACL 2022
Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
ICML 2022
Respecting Transfer Gap in Knowledge Distillation
NIPS 2022
Rethinking the Two-Stage Framework for Grounded Situation Recognition
AAAI 2022
AutoMine: An Unmanned Mine Dataset
CVPR 2022
Few-Shot Object Detection With Fully Cross-Transformer
CVPR 2022
The Devil Is in the Labels: Noisy Label Correction for Robust Scene Graph Generation
CVPR 2022
Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification
INTERSPEECH 2022
Classification-Then-Grounding: Reformulating Video Scene Graphs As Temporal Bipartite Graphs
CVPR 2022
FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
NIPS 2021
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
AAAI 2021
Boundary Proposal Network for Two-stage Natural Language Video Localization
AAAI 2021
On Pursuit of Designing Multi-modal Transformer for Video Grounding
EMNLP 2021
Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles
CVPR 2021
Natural Language Video Localization with Learnable Moment Proposals
EMNLP 2021
Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
ICML 2021
Graph-Based Label Propagation for Semi-Supervised Speaker Identification
INTERSPEECH 2021
Question-Driven Purchasing Propensity Analysis for Recommendation
AAAI 2020
Distinguish Confusing Law Articles for Legal Judgment Prediction
ACL 2020
Rethinking the Bottom-Up Framework for Query-Based Video Localization
AAAI 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
CVPR 2020
Deep Dynamic Boosted Forest
ACML 2020
One Thousand and One Hours: Self-driving Motion Prediction Dataset
CORL 2020
Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100 FPS
CVPR 2020
Trading Personalization for Accuracy: Data Debugging in Collaborative Filtering
NIPS 2020
DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization
IJCNLP 2019
DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization
EMNLP 2019
Answer Identification from Product Reviews for User Questions by Multi-Task Attentive Networks
AAAI 2019
MR-GNN: Multi-Resolution and Dual Graph Neural Network for Predicting Structured Entity Interactions
IJCAI 2019
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
ICCV 2019
Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data
ACL 2019
Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks
CVPR 2018
Maximum Principle Based Algorithms for Deep Learning
JMLR 2018
ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection
ACML 2018
Tag-based Weakly-supervised Hashing for Image Retrieval
IJCAI 2018
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning
CVPR 2017
Weakly-Supervised Deep Learning for Customer Review Sentiment Classification
IJCAI 2016
Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification
IJCNLP 2015
Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification
ACL 2015