conftrace_

Yanmin Qian

63 papers · 2016–2026 · 5 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (29) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird

🏃 Academic Marathon (9) 🐝 Cross-Pollinator (7) 🏠 Conference Loyalist (56) 🔬 Deep Specialist (15) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🌱 Topic Pioneer 🤝 Dynamic Duo (13) 📈 Trend Setter 🚀 Conference Pioneer ⚡ Prolific Year (10) 🔥 Unstoppable (10) ❓ The Questioner 🗃️ Keyword Collector (126) 💎 Century Club (61)

Conferences

INTERSPEECH (56) NIPS (3) ACL (2) AAAI (1) IJCAI (1)

Top co-authors

Zhengyang Chen (13) Shuai Wang (12) Kai Yu (11) Wangyou Zhang (10) Chenda Li (9) Bei Liu (9) Bing Han (7) Xuankai Chang (6) Michael Zeng (4) Yao Qian (4)

Keywords

speaker verification (19) automatic speech recognition (9) speaker embedding (8) speech recognition (8) speaker recognition (7) speech separation (6) embedding learning (6) permutation invariant training (5) model compression (5) convolutional neural network (4) adversarial training (4) domain adaptation (4) knowledge distillation (4) end-to-end model (4) speech enhancement (4) attention mechanism (4) self-supervised learning (4) multi-talker speech recognition (4) connectionist temporal classification (3) cocktail party problem (3)

Papers

A Data-Centric Approach to Generalizable Speech Deepfake Detection ACL 2026
USE: A Unified Model for Universal Sound Separation and Extraction AAAI 2026
SpeechFake: A Large-Scale Multilingual Speech Deepfake Dataset Incorporating Cutting-Edge Generation Methods ACL 2025
InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models IJCAI 2024
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement INTERSPEECH 2024
SparseWAV: Fast and Accurate One-Shot Unstructured Pruning for Large Speech Foundation Models INTERSPEECH 2024
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems INTERSPEECH 2024
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction INTERSPEECH 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement INTERSPEECH 2024
Contextual Biasing Speech Recognition in Speech-enhanced Large Language Model INTERSPEECH 2024
AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection INTERSPEECH 2024
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation NIPS 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations NIPS 2024
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation NIPS 2023
Adaptive Neural Network Quantization For Lightweight Speaker Verification INTERSPEECH 2023
Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor INTERSPEECH 2023
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition INTERSPEECH 2023
Overlap Aware Continuous Speech Separation without Permutation Invariant Training INTERSPEECH 2023
Text Only Domain Adaptation with Phoneme Guided Data Splicing for End-to-End Speech Recognition INTERSPEECH 2023
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022 INTERSPEECH 2023
ECAPA++: Fine-grained Deep Embedding Learning for TDNN Based Speaker Verification INTERSPEECH 2023
Reversible Neural Networks for Memory-Efficient Speaker Verification INTERSPEECH 2023
UniSplice: Universal Cross-Lingual Data Splicing for Low-Resource ASR INTERSPEECH 2023
Fast and Efficient Multilingual Self-Supervised Pre-training for Low-Resource Speech Recognition INTERSPEECH 2023
Extremely Low Bit Quantization for Mobile Speaker Verification Systems Under 1MB Memory INTERSPEECH 2023
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers INTERSPEECH 2023
Dual Path Embedding Learning for Speaker Verification with Triplet Attention INTERSPEECH 2022
Enroll-Aware Attentive Statistics Pooling for Target Speaker Verification INTERSPEECH 2022
MSDWild: Multi-modal Speaker Diarization Dataset in the Wild INTERSPEECH 2022
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregessive Speech Recognition INTERSPEECH 2022
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction INTERSPEECH 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding INTERSPEECH 2022
Separating Long-Form Speech with Group-wise Permutation Invariant Training INTERSPEECH 2022
Attentive Feature Fusion for Robust Speaker Verification INTERSPEECH 2022
DF-ResNet: Boosting Speaker Verification Performance with Depth-First Design INTERSPEECH 2022
The SJTU System for Short-Duration Speaker Verification Challenge 2021 INTERSPEECH 2021
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition INTERSPEECH 2021
Knowledge Distillation from Multi-Modality to Single-Modality for Person Verification INTERSPEECH 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition INTERSPEECH 2021
Audio-Visual Multi-Talker Speech Recognition in a Cocktail Party INTERSPEECH 2021
Bi-Encoder Transformer Network for Mandarin-English Code-Switching Speech Recognition Using Mixture of Experts INTERSPEECH 2020
Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network INTERSPEECH 2020
Multi-Modality Matters: A Performance Leap on VoxCeleb INTERSPEECH 2020
Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation INTERSPEECH 2020
Dual-Adversarial Domain Adaptation for Generalized Replay Attack Detection INTERSPEECH 2020
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming INTERSPEECH 2020
Learning Contextual Language Embeddings for Monaural Multi-Talker Speech Recognition INTERSPEECH 2020
Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech INTERSPEECH 2019
Cross-Domain Replay Spoofing Attack Detection Using Domain Adversarial Training INTERSPEECH 2019
Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking INTERSPEECH 2019
Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System INTERSPEECH 2019
Joint Decoding of CTC Based Systems for Speech Recognition INTERSPEECH 2019
Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification INTERSPEECH 2019
On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction INTERSPEECH 2019
The SJTU Robust Anti-Spoofing System for the ASVspoof 2019 Challenge INTERSPEECH 2019
Knowledge Distillation for Sequence Model INTERSPEECH 2018
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks INTERSPEECH 2018
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures INTERSPEECH 2018
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation INTERSPEECH 2018
Recognizing Multi-Talker Speech with Permutation Invariant Training INTERSPEECH 2017
What Does the Speaker Embedding Encode? INTERSPEECH 2017
Binary Deep Neural Networks for Speech Recognition INTERSPEECH 2017
Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC INTERSPEECH 2016