Wei Xue

58 papers · 2013–2026 · 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (15) 🤝 Dynamic Duo (26) 👑 Triple Crown 🌱 Topic Pioneer 🏆 Grand Slam 👥 Mega-Team (32) 🔥 Unstoppable (5) ⚡ Prolific Year (17) 🚀 Conference Pioneer 🗃️ Keyword Collector (200) 💎 Century Club (53)

Conferences

ACL (10) INTERSPEECH (8) AAAI (7) ICLR (7) ICML (6) CVPR (3) EMNLP (3) IJCAI (3) NIPS (3) ECCV (2) ICCV (2) AACL (1) IJCNLP (1) NSDI (1) UAI (1)

Top co-authors

Yike Guo (29) Qifeng Liu (15) Lujun Li (10) Zhen Ye (8) Chi-Min Chan (7) Wenhan Luo (7) Xu Tan (6) Shanghang Zhang (6) Ruibin Yuan (6) Sirui Han (5)

Keywords

music generation (4) diffusion model (3) audio generation (3) large language model (3) neural network (3) model compression (3) multi-task learning (2) sound source localization (2) low-rank adaptation (2) convolutional neural network (2) sentiment analysis (2) language model (2) speech dereverberation (2) weakly supervised learning (2) multimodal learning (2) reward model (2) video understanding (2) speech enhancement (2) information extraction (2) music understanding (2)

Papers

Inference-time Scaling for Diffusion-based Audio Super-resolution · AAAI 2026
Benchmarking Fine-Grained Error Detection in Multimodal Reasoning · ACL 2026
VMChill: A Dataset for Fine-Grained Visual-Musical Synergy · AAAI 2026
Omni-RewardBench: Toward a Comprehensive Evaluation of Generative Reward Models Across Modalities · ACL 2026
WenetSpeech-Yue: A Large-Scale Cantonese Speech Corpus with Multi-dimensional Annotation · AAAI 2026
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation · ICLR 2025
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs · ICLR 2025
FlashAudio: Rectified Flow for Fast and High-Fidelity Text-to-Audio Generation · ACL 2025
It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems · ACL 2025
Empowering World Models with Reflection for Embodied Video Prediction · ICML 2025
Delta Decompression for MoE-based LLMs Compression · ICML 2025
MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition · ICML 2025
OmniAudio: Generating Spatial Audio from 360-Degree Video · ICML 2025
BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios · ACL 2025
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA · ACL 2025
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model · AAAI 2025
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing · CVPR 2025
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling · CVPR 2025
Graceful Forgetting in Generative Language Models · EMNLP 2025
AIRA: Activation-Informed Low-Rank Adaptation for Large Models · ICCV 2025
Importance Weighting Can Help Large Language Models Self-Improve · AAAI 2025
Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation · ICCV 2025
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement · ACL 2025
MuPT: A Generative Symbolic Music Pretrained Transformer · ICLR 2025
Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion · ICLR 2025
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology · ICLR 2025
Cross-Linguistic Intelligibility of Non-Compositional Expressions in Spoken Context · INTERSPEECH 2024
Information Re-Organization Improves Reasoning in Large Language Models · NIPS 2024
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models · NIPS 2024
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection · AAAI 2024
Insert or Attach: Taxonomy Completion via Box Embedding · ACL 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM · ACL 2024
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation · CVPR 2024
AttnZero: Efficient Attention Discovery for Vision Transformers · ECCV 2024
Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search · ECCV 2024
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain · EMNLP 2024
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate · ICLR 2024
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation · ICLR 2024
DetKDS: Knowledge Distillation Search for Object Detectors · ICML 2024
Towards a Self-contained Data-driven Global Weather Forecasting Framework · ICML 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation · IJCAI 2024
Towards a better understanding of receptive multilingualism: listening conditions and priming effects · INTERSPEECH 2024
Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP · UAI 2024
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation · IJCAI 2023
MARBLE: Music Audio Representation Benchmark for Universal Evaluation · NIPS 2023
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System · AAAI 2023
Enhancing Emotion Recognition in Conversation via Multi-view Feature Alignment and Memorization · EMNLP 2023
Speech Intelligibility of Dysarthric Speech: Human Scores and Acoustic-Phonetic Features · INTERSPEECH 2021
metaCAT: A Metadata-based Task-oriented Chatbot Annotation Tool · AACL 2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning · INTERSPEECH 2020
The JD AI Speaker Verification System for the FFSVC 2020 Challenge · INTERSPEECH 2020
SkipConvNet: Skip Convolutional Neural Network for Speech Dereverberation Using Optimally Smoothed Spectral Mapping · INTERSPEECH 2020
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation · INTERSPEECH 2019
End-to-end I/O Monitoring on a Leading Supercomputer · NSDI 2019
Aspect Based Sentiment Analysis with Gated Convolutional Networks · ACL 2018
MTNA: A Neural Multi-task Model for Aspect Category Classification and Aspect Term Extraction On Restaurant Reviews · IJCNLP 2017
Multilingual i-Vector Based Statistical Modeling for Music Genre Classification · INTERSPEECH 2017
Probabilistic Multi-Label Classification with Sparse Feature Learning · IJCAI 2013