conftrace_

Dongchao Yang

24 papers · 2021–2026 · 7 top CS/AI conferences

Achievements

🌍 Conference Polyglot (7)
🐝 Cross-Pollinator (9)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (5)
🌈 Renaissance Researcher (7)
🐣 Hot Topic Early Bird
🤝 Dynamic Duo (10)
👑 Triple Crown
🏆 Grand Slam
🧬 Topic Evolution
🏆 Keyword Champion (2)
🗃️ Keyword Collector (97)
Prolific Year (9)
The Questioner
🔥 Unstoppable (5)
💎 Century Club (22)

Conferences

INTERSPEECH (10) ICML (5) ACL (4) AAAI (2) EMNLP (1) ICLR (1) NIPS (1)

Papers

UniSRM: A Unified Speech Reward Model for Reasoning-Based Fine-grained Assessment · ACL 2026
DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models · AAAI 2026
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training · ACL 2025
Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs · EMNLP 2025
ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors · ACL 2025
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling · ICML 2025
UniAudio: Towards Universal Audio Generation with Large Language Models · ICML 2024
UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner · NIPS 2024
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head · AAAI 2024
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners · ACL 2024
PromptTTS 2: Describing and Generating Voices with Text Prompt · ICLR 2024
InstructSpeech: Following Speech Editing Instructions via Large Language Models · ICML 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models · ICML 2024
CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction · INTERSPEECH 2024
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models · INTERSPEECH 2024
Background-aware Modeling for Weakly Supervised Sound Event Detection · INTERSPEECH 2023
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS · INTERSPEECH 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models · ICML 2023
Improving Target Sound Extraction with Timestamp Information · INTERSPEECH 2022
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction · INTERSPEECH 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches · INTERSPEECH 2022
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification · INTERSPEECH 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection · INTERSPEECH 2022
Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification · INTERSPEECH 2021