Li Yuan
84 papers · 2019–2026 · 13 top CS/AI conferences
Achievements
Conference Polyglot (13)
Keyword Pioneer
Taxonomy Completionist (11)
Interdisciplinary Bridge
Academic Marathon (6)
Cross-Pollinator (9)
Renaissance Researcher (10)
Keyword Champion
Topic Evolution
Dynamic Duo (17)
Deep Specialist (17)
Grand Slam
Keyword Collector (346)
Century Club (75)
Unstoppable (7)
The Questioner (2)
Prolific Year (24)
Conference Pioneer
Conferences
CVPR (18)
AAAI (11)
NIPS (11)
ICCV (10)
ECCV (8)
ACL (5)
ICLR (5)
COLING (4)
EMNLP (4)
ICML (4)
IJCAI (2)
AACL (1)
SEMEVAL (1)
Research topics
Keywords
large language model (8)
multimodal learning (7)
diffusion model (6)
image classification (6)
semantic segmentation (5)
vision-language model (5)
object detection (4)
model compression (4)
energy efficiency (3)
video understanding (3)
spiking neural network (3)
contrastive learning (3)
unsupervised learning (3)
video generation (3)
image generation (3)
vision transformer (3)
few-shot learning (3)
representation learning (3)
3d reconstruction (3)
attention mechanism (3)
Papers
MavenCoder: Competitive Code Generation via Model Adaptive Planning Strategies and Multi-Perspective Verification Enhancement
ACL 2026
SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning
ACL 2026
Look-Back: Implicit Visual Re-focusing in MLLM Reasoning
AAAI 2026
360Explorer: Exploring 4D Controllable World in Panoramic Videos
AAAI 2026
Truth or Sophistry? LoFa: A Benchmark for LLM Robustness Against Logical Fallacies
ACL 2026
NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations
AAAI 2026
AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin
AAAI 2026
Hybrid-DMKG: A Hybrid Reasoning Framework over Dynamic Multimodal Knowledge Graphs for Multimodal Multihop QA with Knowledge Editing
AAAI 2026
Next Patch Prediction for AutoRegressive Visual Generation
AAAI 2026
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
ICCV 2025
Rethinking Text-based Protein Understanding: Retrieval or LLM?
EMNLP 2025
AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scenes
AAAI 2025
Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle
AAAI 2025
RuleEdit: Towards Rule-Level Knowledge Generalization to Mitigate Over-Editing in Large Language Models
ACL 2025
Is Parameter Collision Hindering Continual Learning in LLMs?
COLING 2025
Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction
IJCAI 2025
Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
ICML 2025
MoH: Multi-Head Attention as Mixture-of-Head Attention
ICML 2025
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
ICLR 2025
Epona: Autoregressive Diffusion World Model for Autonomous Driving
ICCV 2025
PiCO: Peer Review in LLMs based on Consistency Optimization
ICLR 2025
LLaVA-CoT: Let Vision Language Models Reason Step-by-Step
ICCV 2025
LangBridge: Interpreting Image as a Combination of Language Embeddings
ICCV 2025
EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images
ICCV 2025
Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning
CVPR 2025
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
CVPR 2025
RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing
CVPR 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
CVPR 2025
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
CVPR 2025
Regressor-Segmenter Mutual Prompt Learning for Crowd Counting
CVPR 2024
Spiking Transformer with Experts Mixture
NIPS 2024
QKFormer: Hierarchical Spiking Transformer using Q-K Attention
NIPS 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
NIPS 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
NIPS 2024
DF40: Toward Next-Generation Deepfake Detection
NIPS 2024
VLMimic: Vision Language Models are Visual Imitation Learner for Fine-grained Actions
NIPS 2024
Parallel Vertex Diffusion for Unified Visual Grounding
AAAI 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter
ACL 2024
A Logical Pattern Memory Pre-trained Model for Entailment Tree Generation
COLING 2024
Grounded Multimodal Procedural Entity Recognition for Procedural Documents: A New Dataset and Baseline
COLING 2024
SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement
CVPR 2024
GraCo: Granularity-Controllable Interactive Segmentation
CVPR 2024
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
CVPR 2024
FreestyleRet: Retrieving Images from Style-Diversified Queries
ECCV 2024
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable Repainting
ECCV 2024
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
ECCV 2024
HiFi-123: Towards High-fidelity One Image to 3D Content Generation
ECCV 2024
Learning Pseudo 3D Guidance for View-consistent Texturing with 2D Diffusion
ECCV 2024
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
EMNLP 2024
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models
EMNLP 2024
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
EMNLP 2024
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
ICLR 2024
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
ICLR 2024
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation
NIPS 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
IJCAI 2023
Spike-driven Transformer
NIPS 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
NIPS 2023
ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation
CVPR 2023
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation
ICCV 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
ICCV 2023
Joint Multimodal Entity-Relation Extraction Based on Edge-Enhanced Graph Alignment Network and Word-Pair Relation Tagging
AAAI 2023
Rethinking Point Cloud Registration as Masking and Reconstruction
ICCV 2023
Learning With Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning
CVPR 2023
Video-Text As Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
CVPR 2023
Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
CVPR 2023
Spikformer: When Spiking Neural Network Meets Transformer
ICLR 2023
PointGPT: Auto-regressively Generative Pre-training from Point Clouds
NIPS 2023
Locality Guidance for Improving Vision Transformers on Tiny Datasets
ECCV 2022
Improving Vision Transformers by Revisiting High-Frequency Components
ECCV 2022
Masked Autoencoders for Point Cloud Self-Supervised Learning
ECCV 2022
DynaMixer: A Vision MLP Architecture with Dynamic Mixing
ICML 2022
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
ICML 2021
Tokens-to-Token ViT: Training Vision Transformers From Scratch on ImageNet
ICCV 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
NIPS 2021
Continual Learning via Bit-Level Information Preserving
CVPR 2021
PnP-DETR: Towards Efficient Visual Analysis With Transformers
ICCV 2021
Graph Attention Network with Memory Fusion for Aspect-level Sentiment Analysis
AACL 2020
Central Similarity Quantization for Efficient Image and Video Retrieval
CVPR 2020
Revisiting Knowledge Distillation via Label Smoothing Regularization
CVPR 2020
YNU-HPCC at SemEval-2020 Task 8: Using a Parallel-Channel Model for Memotion Analysis
SEMEVAL 2020
YNU-HPCC at SemEval-2020 Task 8: Using a Parallel-Channel Model for Memotion Analysis
COLING 2020
Cycle-SUM: Cycle-Consistent Adversarial LSTM Networks for Unsupervised Video Summarization
AAAI 2019
Distilling Object Detectors With Fine-Grained Feature Imitation
CVPR 2019
Few-Shot Adaptive Faster R-CNN
CVPR 2019