Yan Lu
86 papers · 2015–2026 · 11 top CS/AI conferences
Achievements
Academic Marathon (10)
Conference Polyglot (11)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (6)
Renaissance Researcher (10)
Taxonomy Completionist (120)
Conference Loyalist (32)
Dynamic Duo (15)
Grand Slam
Keyword Champion (3)
Triple Crown
Deep Specialist (16)
Prolific Year (15)
Conference Pioneer
Keyword Collector (384)
Unstoppable (8)
Trend Setter
Century Club (81)
Conferences
CVPR (32)
AAAI (13)
ICCV (11)
NIPS (8)
ACL (7)
INTERSPEECH (5)
ECCV (4)
ICLR (3)
EMNLP (1)
ICML (1)
IJCAI (1)
Keywords
representation learning (7)
diffusion model (6)
multimodal learning (6)
feature extraction (6)
video understanding (5)
self-supervised learning (5)
reinforcement learning (5)
neural network (5)
contrastive learning (4)
temporal context (4)
person re-identification (4)
video compression (4)
attention mechanism (4)
video generation (4)
transformer network (4)
metric learning (3)
uncertainty modeling (3)
object detection (3)
generative model (3)
semantic segmentation (3)
Papers
Closing the Modality Reasoning Gap for Speech Large Language Models
ACL 2026
SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation
ACL 2026
InfiniteWeb: Scalable Web Environment Synthesis for GUI Agent Training
ACL 2026
CSPO: Alleviating Reward Ambiguity for Structured Table-to-LaTeX Generation
ACL 2026
From Off-Policy to On-Policy: Enhancing GUI Agents via Bi-level Expert-to-Policy Assimilation
ACL 2026
I2VGuard: Safeguarding Images against Misuse in Diffusion-based Image-to-Video Models
CVPR 2025
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
CVPR 2025
PICD: Versatile Perceptual Image Compression with Diffusion Rendering
CVPR 2025
Towards Anytime Retrieval: A Benchmark for Anytime Person Re-Identification
IJCAI 2025
Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video
ICCV 2025
DLF: Extreme Image Compression with Dual-generative Latent Fusion
ICCV 2025
StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams
ICCV 2025
TrInk: Ink Generation with Transformer Network
EMNLP 2025
Towards Practical Real-Time Neural Video Compression
CVPR 2025
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping
CVPR 2025
MEATRD: Multimodal Anomalous Tissue Region Detection Enhanced with Spatial Transcriptomics
AAAI 2025
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training
ACL 2025
UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis
ACL 2025
Implicit Motion Function
CVPR 2024
Slot-VLM: Object-Event Slots for Video-Language Modeling
NIPS 2024
Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement
NIPS 2024
Arbitrary-Scale Video Super-resolution Guided by Dynamic Context
AAAI 2024
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators
AAAI 2024
Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification
AAAI 2024
Hierarchical Intra-modal Correlation Learning for Label-free 3D Semantic Segmentation
CVPR 2024
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
CVPR 2024
Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis
CVPR 2024
QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition
CVPR 2024
Generative Latent Coding for Ultra-Low Bitrate Image Compression
CVPR 2024
Neural Video Compression with Feature Modulation
CVPR 2024
Long-term Temporal Context Gathering for Neural Video Compression
ECCV 2024
Mask-Based Modeling for Neural Radiance Fields
ICLR 2024
Breaking through the learning plateaus of in-context learning in Transformer
ICML 2024
Unifying Layout Generation With a Decoupled Diffusion Model
CVPR 2023
Efficient View Synthesis with Neural Radiance Distribution Field
ICCV 2023
Neural Video Compression With Diverse Contexts
CVPR 2023
VideoTrack: Learning To Track Objects via Video Transformer
CVPR 2023
Crossing the Gap: Domain Generalization for Image Captioning
CVPR 2023
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
ICCV 2023
Robust Referring Video Object Segmentation with Cyclic Structural Consensus
ICCV 2023
Adaptive Frequency Filters As Efficient Global Token Mixers
ICCV 2023
Two-Shot Video Object Segmentation
CVPR 2023
Motion Information Propagation for Neural Video Compression
CVPR 2023
EVC: Towards Real-Time Neural Image Compression with Mask Decay
ICLR 2023
Deep Frequency Filtering for Domain Generalization
CVPR 2023
High-Fidelity and Freely Controllable Talking Head Video Generation
CVPR 2023
ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression
INTERSPEECH 2023
Masked Audio Modeling with CLAP and Multi-Objective Learning
INTERSPEECH 2023
Learning Trajectories are Generalization Indicators
NIPS 2023
DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models
NIPS 2023
Versatile Neural Processes for Learning Implicit Neural Representations
ICLR 2023
Multi-View Domain Adaptive Object Detection on Camera Networks
AAAI 2023
Active Token Mixer
AAAI 2023
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
CVPR 2023
Mask-based Latent Reconstruction for Reinforcement Learning
NIPS 2022
Towards Error-Resilient Neural Speech Coding
INTERSPEECH 2022
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
INTERSPEECH 2022
Reliable Propagation-Correction Modulation for Video Object Segmentation
AAAI 2022
Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation
AAAI 2022
Neural Capture of Animatable 3D Human from Monocular Video
ECCV 2022
Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
ECCV 2022
Visual Concepts Tokenization
NIPS 2022
Semantic-Aligned Fusion Transformer for One-Shot Object Detection
CVPR 2022
Alignment-guided Temporal Attention for Video Action Recognition
NIPS 2022
Multi-Modal Multi-Correlation Learning for Audio-Visual Speech Separation
INTERSPEECH 2022
Neural Compression-Based Feature Learning for Video Restoration
CVPR 2022
Self-Supervised Image Representation Learning With Geometric Set Consistency
CVPR 2022
Rethinking Minimal Sufficient Representation in Contrastive Learning
CVPR 2022
T-Net: Effective Permutation-Equivariant Network for Two-View Correspondence Learning
ICCV 2021
Joint Color-irrelevant Consistency Learning and Identity-aware Modality Adaptation for Visible-infrared Cross Modality Person Re-identification
AAAI 2021
Interactive Speech and Noise Modeling for Speech Enhancement
AAAI 2021
Weakly-supervised Temporal Action Localization by Uncertainty Modeling
AAAI 2021
Deep Contextual Video Compression
NIPS 2021
SSAN: Separable Self-Attention Network for Video Representation Learning
CVPR 2021
Geometry Uncertainty Projection Network for Monocular 3D Object Detection
ICCV 2021
Self-Supervised Video Representation Learning With Meta-Contrastive Network
ICCV 2021
Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking
CVPR 2020
Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
CVPR 2020
Triangulation Learning Network: From Monocular to Stereo 3D Object Detection
CVPR 2019
MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image
AAAI 2019
Relational Knowledge Distillation
CVPR 2019
MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization
AAAI 2019
Affinity Derivation and Graph Merge for Instance Segmentation
ECCV 2018
Local Descriptors Optimized for Average Precision
CVPR 2018
Feature Selective Networks for Object Detection
CVPR 2018
Robust RGB-D Odometry Using Point and Line Features
ICCV 2015