Li Yuan

84 papers · 2019–2026 · 13 top CS/AI conferences

Achievements

🌍 Conference Polyglot (13) 🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (11) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (6)

🐝 Cross-Pollinator (9) 🌈 Renaissance Researcher (10) 🏆 Keyword Champion 🧬 Topic Evolution 🤝 Dynamic Duo (17) 🔬 Deep Specialist (17) 🏆 Grand Slam 🗃️ Keyword Collector (346) 💎 Century Club (75) 🔥 Unstoppable (7) ❓ The Questioner (2) ⚡ Prolific Year (24) 🚀 Conference Pioneer

Conferences

CVPR (18) AAAI (11) NIPS (11) ICCV (10) ECCV (8) ACL (5) ICLR (5) COLING (4) EMNLP (4) ICML (4) IJCAI (2) AACL (1) SEMEVAL (1)

Top co-authors

Peng Jin (18) Jie Chen (11) Yonghong Tian (10) Chang Liu (10) Kehan Li (10) Tao Wang (9) Jiashi Feng (9) Bin Lin (9) Xinhua Cheng (9) Zesen Cheng (9)

Research topics

Computer Vision (1) Robotics (1)

Keywords

large language model (8) multimodal learning (7) diffusion model (6) image classification (6) semantic segmentation (5) vision-language model (5) object detection (4) model compression (4) energy efficiency (3) video understanding (3) spiking neural network (3) contrastive learning (3) unsupervised learning (3) video generation (3) image generation (3) vision transformer (3) few-shot learning (3) representation learning (3) 3d reconstruction (3) attention mechanism (3)

Papers

MavenCoder: Competitive Code Generation via Model Adaptive Planning Strategies and Multi-Perspective Verification Enhancement ACL 2026
SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning ACL 2026
Look-Back: Implicit Visual Re-focusing in MLLM Reasoning AAAI 2026
360Explorer: Exploring 4D Controllable World in Panoramic Videos AAAI 2026
Truth or Sophistry? LoFa: A Benchmark for LLM Robustness Against Logical Fallacies ACL 2026
NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations AAAI 2026
AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin AAAI 2026
Hybrid-DMKG: A Hybrid Reasoning Framework over Dynamic Multimodal Knowledge Graphs for Multimodal Multihop QA with Knowledge Editing AAAI 2026
Next Patch Prediction for AutoRegressive Visual Generation AAAI 2026
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses ICCV 2025
Rethinking Text-based Protein Understanding: Retrieval or LLM? EMNLP 2025
AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scenes AAAI 2025
Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle AAAI 2025
RuleEdit: Towards Rule-Level Knowledge Generalization to Mitigate Over-Editing in Large Language Models ACL 2025
Is Parameter Collision Hindering Continual Learning in LLMs? COLING 2025
Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction IJCAI 2025
Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection ICML 2025
MoH: Multi-Head Attention as Mixture-of-Head Attention ICML 2025
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts ICLR 2025
Epona: Autoregressive Diffusion World Model for Autonomous Driving ICCV 2025
PiCO: Peer Review in LLMs based on Consistency Optimization ICLR 2025
LLaVA-CoT: Let Vision Language Models Reason Step-by-Step ICCV 2025
LangBridge: Interpreting Image as a Combination of Language Embeddings ICCV 2025
EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images ICCV 2025
Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning CVPR 2025
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model CVPR 2025
RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing CVPR 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation CVPR 2025
Identity-Preserving Text-to-Video Generation by Frequency Decomposition CVPR 2025
Regressor-Segmenter Mutual Prompt Learning for Crowd Counting CVPR 2024
Spiking Transformer with Experts Mixture NIPS 2024
QKFormer: Hierarchical Spiking Transformer using Q-K Attention NIPS 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions NIPS 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation NIPS 2024
DF40: Toward Next-Generation Deepfake Detection NIPS 2024
VLMimic: Vision Language Models are Visual Imitation Learner for Fine-grained Actions NIPS 2024
Parallel Vertex Diffusion for Unified Visual Grounding AAAI 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter ACL 2024
A Logical Pattern Memory Pre-trained Model for Entailment Tree Generation COLING 2024
Grounded Multimodal Procedural Entity Recognition for Procedural Documents: A New Dataset and Baseline COLING 2024
SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement CVPR 2024
GraCo: Granularity-Controllable Interactive Segmentation CVPR 2024
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding CVPR 2024
FreestyleRet: Retrieving Images from Style-Diversified Queries ECCV 2024
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable Repainting ECCV 2024
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation ECCV 2024
HiFi-123: Towards High-fidelity One Image to 3D Content Generation ECCV 2024
Learning Pseudo 3D Guidance for View-consistent Texturing with 2D Diffusion ECCV 2024
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection EMNLP 2024
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models EMNLP 2024
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference EMNLP 2024
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment ICLR 2024
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts ICLR 2024
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation NIPS 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment IJCAI 2023
Spike-driven Transformer NIPS 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs NIPS 2023
ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation CVPR 2023
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation ICCV 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model ICCV 2023
Joint Multimodal Entity-Relation Extraction Based on Edge-Enhanced Graph Alignment Network and Word-Pair Relation Tagging AAAI 2023
Rethinking Point Cloud Registration as Masking and Reconstruction ICCV 2023
Learning With Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning CVPR 2023
Video-Text As Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning CVPR 2023
Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation CVPR 2023
Spikformer: When Spiking Neural Network Meets Transformer ICLR 2023
PointGPT: Auto-regressively Generative Pre-training from Point Clouds NIPS 2023
Locality Guidance for Improving Vision Transformers on Tiny Datasets ECCV 2022
Improving Vision Transformers by Revisiting High-Frequency Components ECCV 2022
Masked Autoencoders for Point Cloud Self-Supervised Learning ECCV 2022
DynaMixer: A Vision MLP Architecture with Dynamic Mixing ICML 2022
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization ICML 2021
Tokens-to-Token ViT: Training Vision Transformers From Scratch on ImageNet ICCV 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers NIPS 2021
Continual Learning via Bit-Level Information Preserving CVPR 2021
PnP-DETR: Towards Efficient Visual Analysis With Transformers ICCV 2021
Graph Attention Network with Memory Fusion for Aspect-level Sentiment Analysis AACL 2020
Central Similarity Quantization for Efficient Image and Video Retrieval CVPR 2020
Revisiting Knowledge Distillation via Label Smoothing Regularization CVPR 2020
YNU-HPCC at SemEval-2020 Task 8: Using a Parallel-Channel Model for Memotion Analysis SEMEVAL 2020
YNU-HPCC at SemEval-2020 Task 8: Using a Parallel-Channel Model for Memotion Analysis COLING 2020
Cycle-SUM: Cycle-Consistent Adversarial LSTM Networks for Unsupervised Video Summarization AAAI 2019
Distilling Object Detectors With Fine-Grained Feature Imitation CVPR 2019
Few-Shot Adaptive Faster R-CNN CVPR 2019