Wei Xue
58 papers · 2013–2026 · 15 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (13)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (15)
Dynamic Duo (26)
Triple Crown
Topic Pioneer
Grand Slam
Mega-Team (32)
Unstoppable (5)
Prolific Year (17)
Conference Pioneer
Keyword Collector (200)
Century Club (53)
Conferences
ACL (10)
INTERSPEECH (8)
AAAI (7)
ICLR (7)
ICML (6)
CVPR (3)
EMNLP (3)
IJCAI (3)
NIPS (3)
ECCV (2)
ICCV (2)
AACL (1)
IJCNLP (1)
NSDI (1)
UAI (1)
Keywords
music generation (4)
diffusion model (3)
audio generation (3)
large language model (3)
neural network (3)
model compression (3)
multi-task learning (2)
sound source localization (2)
low-rank adaptation (2)
convolutional neural network (2)
sentiment analysis (2)
language model (2)
speech dereverberation (2)
weakly supervised learning (2)
multimodal learning (2)
reward model (2)
video understanding (2)
speech enhancement (2)
information extraction (2)
music understanding (2)
Papers
Inference-time Scaling for Diffusion-based Audio Super-resolution
AAAI 2026
Benchmarking Fine-Grained Error Detection in Multimodal Reasoning
ACL 2026
VMChill: A Dataset for Fine-Grained Visual-Musical Synergy
AAAI 2026
Omni-RewardBench: Toward a Comprehensive Evaluation of Generative Reward Models Across Modalities
ACL 2026
WenetSpeech-Yue: A Large-Scale Cantonese Speech Corpus with Multi-dimensional Annotation
AAAI 2026
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
ICLR 2025
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
ICLR 2025
FlashAudio: Rectified Flow for Fast and High-Fidelity Text-to-Audio Generation
ACL 2025
It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems
ACL 2025
Empowering World Models with Reflection for Embodied Video Prediction
ICML 2025
Delta Decompression for MoE-based LLMs Compression
ICML 2025
MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition
ICML 2025
OmniAudio: Generating Spatial Audio from 360-Degree Video
ICML 2025
BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios
ACL 2025
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
ACL 2025
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
AAAI 2025
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing
CVPR 2025
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
CVPR 2025
Graceful Forgetting in Generative Language Models
EMNLP 2025
AIRA: Activation-Informed Low-Rank Adaptation for Large Models
ICCV 2025
Importance Weighting Can Help Large Language Models Self-Improve
AAAI 2025
Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation
ICCV 2025
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
ACL 2025
MuPT: A Generative Symbolic Music Pretrained Transformer
ICLR 2025
Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
ICLR 2025
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
ICLR 2025
Cross-Linguistic Intelligibility of Non-Compositional Expressions in Spoken Context
INTERSPEECH 2024
Information Re-Organization Improves Reasoning in Large Language Models
NIPS 2024
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
NIPS 2024
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
AAAI 2024
Insert or Attach: Taxonomy Completion via Box Embedding
ACL 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
ACL 2024
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
CVPR 2024
AttnZero: Efficient Attention Discovery for Vision Transformers
ECCV 2024
Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search
ECCV 2024
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain
EMNLP 2024
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
ICLR 2024
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
ICLR 2024
DetKDS: Knowledge Distillation Search for Object Detectors
ICML 2024
Towards a Self-contained Data-driven Global Weather Forecasting Framework
ICML 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
IJCAI 2024
Towards a better understanding of receptive multilingualism: listening conditions and priming effects
INTERSPEECH 2024
Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP
UAI 2024
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation
IJCAI 2023
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
NIPS 2023
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System
AAAI 2023
Enhancing Emotion Recognition in Conversation via Multi-view Feature Alignment and Memorization
EMNLP 2023
Speech Intelligibility of Dysarthric Speech: Human Scores and Acoustic-Phonetic Features
INTERSPEECH 2021
metaCAT: A Metadata-based Task-oriented Chatbot Annotation Tool
AACL 2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning
INTERSPEECH 2020
The JD AI Speaker Verification System for the FFSVC 2020 Challenge
INTERSPEECH 2020
SkipConvNet: Skip Convolutional Neural Network for Speech Dereverberation Using Optimally Smoothed Spectral Mapping
INTERSPEECH 2020
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation
INTERSPEECH 2019
End-to-end I/O Monitoring on a Leading Supercomputer
NSDI 2019
Aspect Based Sentiment Analysis with Gated Convolutional Networks
ACL 2018
MTNA: A Neural Multi-task Model for Aspect Category Classification and Aspect Term Extraction On Restaurant Reviews
IJCNLP 2017
Multilingual i-Vector Based Statistical Modeling for Music Genre Classification
INTERSPEECH 2017
Probabilistic Multi-Label Classification with Sparse Feature Learning
IJCAI 2013