Dan Su

53 papers · 2018–2025 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (23) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (11)

🏃 Academic Marathon (7) 🏠 Conference Loyalist (30) 🤝 Dynamic Duo (26) 👥 Mega-Team (48) 🗃️ Keyword Collector (75) 📈 Trend Setter 🔥 Unstoppable (8) ⚡ Prolific Year (9) 🚀 Conference Pioneer 💎 Century Club (53)

Conferences

INTERSPEECH (30) ACL (8) EMNLP (4) AAAI (2) EACL (2) IJCNLP (2) AACL (1) ICLR (1) ICML (1) IJCAI (1) MICCAI (1)

Top co-authors

Dong Yu (26) Pascale Fung (14) Helen Meng (11) Tiezheng Yu (8) Meng Yu (8) Jun Wang (7) Yan Xu (7) Zhiyong Wu (7) Wenliang Dai (6) Lianwu Chen (6)

Keywords

text-to-speech synthesis (5) speech separation (5) speech recognition (4) speech synthesis (4) attention mechanism (4) speaker similarity (3) voice conversion (3) speaker verification (3) end-to-end speech recognition (3) question answering (3) machine reading comprehension (2) reading comprehension (2) source separation (2) multi-task learning (2) automatic speech recognition (2) speech enhancement (2) extractive summarization (2) benchmark evaluation (2) text generation (2) knowledge distillation (2)

Papers

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset ACL 2025
MM-LLMs: Recent Advances in MultiModal Large Language Models ACL 2024
Prompt-guided Precise Audio Editing with Diffusion Models ICML 2024
ORCGT: Ollivier-Ricci Curvature-based Graph Model for Lung STAS Prediction MICCAI 2024
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer ACL 2024
NusaCrowd: Open Source Initiative for Indonesian NLP Resources ACL 2023
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training EACL 2023
Context Generation Improves Open Domain Question Answering EACL 2023
Multi-mode Neural Speech Coding Based on Deep Generative Networks INTERSPEECH 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation INTERSPEECH 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity IJCNLP 2023
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis AAAI 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity AACL 2023
Compressed MoE ASR Model Based on Knowledge Distillation and Quantization INTERSPEECH 2023
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension ACL 2022
Read before Generate! Faithful Long Form Question Answering with Machine Reading ACL 2022
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters ACL 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis ICLR 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis IJCAI 2022
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings INTERSPEECH 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers INTERSPEECH 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion INTERSPEECH 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis INTERSPEECH 2022
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis INTERSPEECH 2022
Improve Query Focused Abstractive Summarization by Incorporating Answer Relevance ACL 2021
Controllable Context-Aware Conversational Speech Synthesis INTERSPEECH 2021
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio INTERSPEECH 2021
Improve Query Focused Abstractive Summarization by Incorporating Answer Relevance IJCNLP 2021
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect AAAI 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition INTERSPEECH 2021
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation INTERSPEECH 2021
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts INTERSPEECH 2021
Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis INTERSPEECH 2021
Dimsum @LaySumm 20 EMNLP 2020
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management EMNLP 2020
End-to-End Multi-Look Keyword Spotting INTERSPEECH 2020
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition INTERSPEECH 2020
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification INTERSPEECH 2020
DurIAN: Duration Informed Attention Network for Speech Synthesis INTERSPEECH 2020
Audio-Visual Multi-Channel Recognition of Overlapped Speech INTERSPEECH 2020
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks INTERSPEECH 2020
Transferring Source Style in Non-Parallel Voice Conversion INTERSPEECH 2020
Multi-hop Question Generation with Graph Convolutional Network EMNLP 2020
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition INTERSPEECH 2019
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT INTERSPEECH 2019
Generalizing Question Answering System with Pre-trained Language Model Fine-tuning EMNLP 2019
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information INTERSPEECH 2019
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures INTERSPEECH 2018
Deep Discriminative Embeddings for Duration Robust Speaker Verification INTERSPEECH 2018
Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection INTERSPEECH 2018
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis INTERSPEECH 2018
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation INTERSPEECH 2018
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition INTERSPEECH 2018