Xiaofei Wang
43 papers · 2016–2026 · 13 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist
(17)
Keyword Pioneer
Renaissance Researcher
(5)
Interdisciplinary Bridge
Conference Polyglot
(12)
Academic Marathon
(10)
Dynamic Duo
(11)
Deep Specialist
(10)
Topic Evolution
Keyword Champion
(2)
Prolific Year
(11)
Keyword Collector
(176)
Century Club
(40)
Unstoppable
(9)
Conference Pioneer
Conferences
INTERSPEECH (18)
AAAI (7)
MICCAI (3)
NIPS (3)
CVPR (2)
ECCV (2)
ICML (2)
ACL (1)
CORL (1)
EMNLP (1)
IJCAI (1)
JMLR (1)
WACV (1)
Keywords
automatic speech recognition
(6)
speaker diarization
(4)
speech enhancement
(4)
speaker counting
(3)
adversarial attack
(3)
zero-shot learning
(3)
speech synthesis
(3)
speech recognition
(3)
speaker identification
(3)
serialized output training
(2)
end-to-end model
(2)
word error rate
(2)
acoustic model
(2)
medical imaging
(2)
reinforcement learning
(2)
flow matching
(2)
point cloud
(2)
attention mechanism
(2)
non-negative matrix factorization
(2)
attention-based encoder-decoder
(2)
Papers
Less Is More: Sparse and Cooperative Perturbation for Point Cloud Attacks
AAAI 2026
Shanks: Simultaneous Hearing and Thinking for Spoken Language Models
ACL 2026
Stratos: An End-to-End Distillation Pipeline for Customized LLMs Under Distributed Cloud Environments
AAAI 2026
Zero-Shot Audio-Visual Editing via Cross-Modal Delta Denoising
WACV 2026
Enhancing Statistical Validity and Power in Hybrid Controlled Trials: A Randomization Inference Approach with Conformal Selective Borrowing
ICML 2025
AdvGrasp: Adversarial Attacks on Robotic Grasping from a Physical Perspective
IJCAI 2025
Adaptive Spatial Transcriptomics Interpolation via Cross-modal Cross-slice Modeling
MICCAI 2025
Imperceptible 3D Point Cloud Attacks on Lattice-based Barycentric Coordinates
AAAI 2025
ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering
AAAI 2025
Audio-Aware Large Language Models as Judges for Speaking Styles
EMNLP 2025
Extracting Rare Dependence Patterns via Adaptive Sample Reweighting
ICML 2025
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
NIPS 2024
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS
INTERSPEECH 2024
Total-Duration-Aware Duration Modeling for Text-to-Speech Systems
INTERSPEECH 2024
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
NIPS 2024
FLAT: Flux-aware Imperceptible Adversarial Attacks on 3D Point Clouds
ECCV 2024
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
ECCV 2024
Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics
MICCAI 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
INTERSPEECH 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
INTERSPEECH 2024
EasyTS: The Express Lane to Long Time Series Forecasting
AAAI 2024
Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning
MICCAI 2024
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach
INTERSPEECH 2023
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation
INTERSPEECH 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
INTERSPEECH 2022
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
INTERSPEECH 2022
End-to-End Speaker-Attributed ASR with Transformer
INTERSPEECH 2021
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
CORL 2021
Deep Multi-Task Learning for Diabetic Retinopathy Grading in Fundus Images
AAAI 2021
Saliency-Guided Image Translation
CVPR 2021
Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement
INTERSPEECH 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
INTERSPEECH 2021
Reinforcement Learning with Latent Flow
NIPS 2021
Serialized Output Training for End-to-End Overlapped Speech Recognition
INTERSPEECH 2020
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers
INTERSPEECH 2020
Learning Mixed Latent Tree Models
JMLR 2020
D2D-LSTM: LSTM-Based Path Prediction of Content Diffusion Tree in Device-to-Device Social Networks
AAAI 2020
Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model
CVPR 2019
Exploring Methods for the Automatic Detection of Errors in Manual Transcription
INTERSPEECH 2019
Stream Attention for Distributed Multi-Microphone Speech Recognition
INTERSPEECH 2018
A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement
INTERSPEECH 2016
A Robust Dual-Microphone Speech Source Localization Algorithm for Reverberant Environments
INTERSPEECH 2016
Adaptive Group Sparsity for Non-Negative Matrix Factorization with Application to Unsupervised Source Separation
INTERSPEECH 2016