Yifan Yang

79 papers · 2016–2026 · 16 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (11) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (15)

🏃 Academic Marathon (10) 🐝 Cross-Pollinator (12) 🔬 Deep Specialist (12) 🏆 Keyword Champion (2) 🏆 Grand Slam 🚀 Conference Pioneer 🔥 Unstoppable (7) ❓ The Questioner 📈 Trend Setter 💎 Century Club (68) 🗃️ Keyword Collector (335) ⚡ Prolific Year (17)

Conferences

AAAI (12) ACL (12) ICCV (9) NIPS (8) EMNLP (7) CVPR (5) ECCV (5) INTERSPEECH (5) ICLR (4) IJCAI (3) NAACL (3) WACV (2) EACL (1) ICML (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Xie Chen (9) Zheng Zhang (7) Xinyang Jiang (7) Ziyang Ma (7) Yuqing Yang (7) Lili Qiu (6) Dongsheng Li (6) Kaiyi Ji (5) Yice Zhang (5) Guorong Li (5)

Research topics

Computer Vision (1) Privacy (1)

Keywords

large language model (9) diffusion model (6) automatic speech recognition (4) representation learning (4) reinforcement learning (4) contrastive learning (4) relation extraction (4) model compression (4) zeroth-order optimization (3) federated learning (3) multimodal learning (3) parameter-efficient fine-tuning (3) low-rank adaptation (3) convolutional neural network (3) transfer learning (3) memory efficiency (3) bilevel optimization (3) semantic segmentation (2) connectionist temporal classification (2) few-shot learning (2)

Papers

LLM2CLIP: Powerful Language Model Unlocks Richer Cross-Modality Representation AAAI 2026
Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training ACL 2026
FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression EACL 2026
HumanGuideNet: Adapter-Based Alignment of Deep Neural Networks with Human Similarity Judgments WACV 2026
WebSynthesis: World Model-Guided Monte Carlo Tree Search for Efficient WebAgent Trajectory Synthesis ACL 2026
Evaluating the Expressive Appropriateness of Speech in Rich Contexts ACL 2026
SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation ACL 2026
MageBench: Bridging Large Multimodal Models to Agents WACV 2026
Towards Explainable Video Camouflaged Object Detection: SAM2 with Eventstream-Inspired Data AAAI 2026
Learning Systems Expansion with Efficient Heterogeneity-aware Knowledge Transfer AAAI 2026
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models AAAI 2026
Aligning Cross-View Visual Geometries in LVLMs Through Human-Like Reasoning Learning AAAI 2026
StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model AAAI 2026
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning ACL 2025
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training ACL 2025
Wanda++: Pruning Large Language Models via Regional Gradients ACL 2025
Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models ACL 2025
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey NAACL 2025
HygMap: Representing All Types of Map Entities via Heterogeneous Hypergraph IJCAI 2025
DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation ICLR 2025
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis ICLR 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models EMNLP 2025
MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models EMNLP 2025
ProLongVid: A Simple but Strong Baseline for Long-context Video Instruction Tuning EMNLP 2025
Graph Assisted Offline-Online Deep Reinforcement Learning for Dynamic Workflow Scheduling ICLR 2025
SP2T: Sparse Proxy Attention for Dual-stream Point Transformer ICCV 2025
Lark: Low-Rank Updates After Knowledge Localization for Few-shot Class-Incremental Learning ICCV 2025
StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition ICCV 2025
REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents ICCV 2025
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models ICCV 2025
First-Order Federated Bilevel Learning AAAI 2025
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration AAAI 2025
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement ACL 2025
Online Video Quality Enhancement with Spatial-Temporal Look-up Tables ECCV 2024
Is Your HD Map Constructor Reliable under Sensor Corruptions? NIPS 2024
First-Order Minimax Bilevel Optimization NIPS 2024
Understanding and Improving Training-free Loss-based Diffusion Guidance NIPS 2024
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game ACL 2024
HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models CVPR 2024
HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction CVPR 2024
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images CVPR 2024
Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy ECCV 2024
Unified Medical Image Pre-training in Language-Guided Common Semantic Space ECCV 2024
AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning EMNLP 2024
LoRASC: Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning EMNLP 2024
Zipformer: A faster and better encoder for automatic speech recognition ICLR 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization INTERSPEECH 2024
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer INTERSPEECH 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR INTERSPEECH 2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models NAACL 2024
Towards Inference Efficient Deep Ensemble Learning AAAI 2023
Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score ICML 2023
Similarity Distribution Based Membership Inference Attack on Person Re-identification AAAI 2023
Target-to-Source Augmentation for Aspect Sentiment Triplet Extraction EMNLP 2023
Attentive Mask CLIP ICCV 2023
Achieving $\mathcal{O}(\epsilon^{-1.5})$ Complexity in Hessian/Jacobian-free Stochastic Bilevel Optimization NIPS 2023
SimFBO: Towards Simple, Flexible and Communication-efficient Federated Bilevel Learning NIPS 2023
Masked Retraining Teacher-Student Framework for Domain Adaptive Object Detection ICCV 2023
Cross-Ray Neural Radiance Fields for Novel-View Synthesis from Unconstrained Image Collections ICCV 2023
An Empirical Study of Sentiment-Enhanced Pre-Training for Aspect-Based Sentiment Analysis ACL 2023
Delay-penalized CTC Implemented Based on Finite State Transducer INTERSPEECH 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer INTERSPEECH 2023
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation NIPS 2023
C-Disentanglement: Discovering Causally-Independent Generative Factors under an Inductive Bias of Confounder NIPS 2023
Adaptive Data Debiasing through Bounded Exploration NIPS 2022
Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction EMNLP 2022
HITSZ-HLT at SemEval-2022 Task 10: A Span-Relation Extraction Framework for Structured Sentiment Analysis SEMEVAL 2022
Directional Self-Supervised Learning for Heavy Image Augmentations CVPR 2022
HITSZ-HLT at SemEval-2022 Task 10: A Span-Relation Extraction Framework for Structured Sentiment Analysis NAACL 2022
Exploiting Sample Correlation for Crowd Counting With Multi-Expert Network ICCV 2021
PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction IJCNLP 2021
PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction ACL 2021
Looking Wider for Better Adaptive Representation in Few-Shot Learning AAAI 2021
Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations ECCV 2020
Release the Power of Online-Training for Robust Visual Tracking AAAI 2020
Reverse Perspective Network for Perspective-Aware Object Counting CVPR 2020
The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking ECCV 2018
Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty IJCAI 2018
Fast Laplace Approximation for Sparse Bayesian Spike and Slab Models IJCAI 2016