Dan Su
53 papers · 2018–2025 · 11 top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (23)
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (11)
Academic Marathon (7)
Conference Loyalist (30)
Dynamic Duo (26)
Mega-Team (48)
Keyword Collector (75)
Trend Setter
Unstoppable (8)
Prolific Year (9)
Conference Pioneer
Century Club (53)
Conferences
INTERSPEECH (30)
ACL (8)
EMNLP (4)
AAAI (2)
EACL (2)
IJCNLP (2)
AACL (1)
ICLR (1)
ICML (1)
IJCAI (1)
MICCAI (1)
Keywords
text-to-speech synthesis (5)
speech separation (5)
speech recognition (4)
speech synthesis (4)
attention mechanism (4)
speaker similarity (3)
voice conversion (3)
speaker verification (3)
end-to-end speech recognition (3)
question answering (3)
machine reading comprehension (2)
reading comprehension (2)
source separation (2)
multi-task learning (2)
automatic speech recognition (2)
speech enhancement (2)
extractive summarization (2)
benchmark evaluation (2)
text generation (2)
knowledge distillation (2)
Papers
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
ACL 2025
MM-LLMs: Recent Advances in MultiModal Large Language Models
ACL 2024
Prompt-guided Precise Audio Editing with Diffusion Models
ICML 2024
ORCGT: Ollivier-Ricci Curvature-based Graph Model for Lung STAS Prediction
MICCAI 2024
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer
ACL 2024
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
ACL 2023
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
EACL 2023
Context Generation Improves Open Domain Question Answering
EACL 2023
Multi-mode Neural Speech Coding Based on Deep Generative Networks
INTERSPEECH 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
INTERSPEECH 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
IJCNLP 2023
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis
AAAI 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
AACL 2023
Compressed MoE ASR Model Based on Knowledge Distillation and Quantization
INTERSPEECH 2023
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension
ACL 2022
Read before Generate! Faithful Long Form Question Answering with Machine Reading
ACL 2022
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
ACL 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
ICLR 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
IJCAI 2022
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
INTERSPEECH 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
INTERSPEECH 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
INTERSPEECH 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
INTERSPEECH 2022
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis
INTERSPEECH 2022
Improve Query Focused Abstractive Summarization by Incorporating Answer Relevance
ACL 2021
Controllable Context-Aware Conversational Speech Synthesis
INTERSPEECH 2021
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio
INTERSPEECH 2021
Improve Query Focused Abstractive Summarization by Incorporating Answer Relevance
IJCNLP 2021
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect
AAAI 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
INTERSPEECH 2021
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation
INTERSPEECH 2021
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts
INTERSPEECH 2021
Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis
INTERSPEECH 2021
Dimsum @LaySumm 20
EMNLP 2020
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
EMNLP 2020
End-to-End Multi-Look Keyword Spotting
INTERSPEECH 2020
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition
INTERSPEECH 2020
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification
INTERSPEECH 2020
DurIAN: Duration Informed Attention Network for Speech Synthesis
INTERSPEECH 2020
Audio-Visual Multi-Channel Recognition of Overlapped Speech
INTERSPEECH 2020
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks
INTERSPEECH 2020
Transferring Source Style in Non-Parallel Voice Conversion
INTERSPEECH 2020
Multi-hop Question Generation with Graph Convolutional Network
EMNLP 2020
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition
INTERSPEECH 2019
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT
INTERSPEECH 2019
Generalizing Question Answering System with Pre-trained Language Model Fine-tuning
EMNLP 2019
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information
INTERSPEECH 2019
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures
INTERSPEECH 2018
Deep Discriminative Embeddings for Duration Robust Speaker Verification
INTERSPEECH 2018
Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection
INTERSPEECH 2018
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis
INTERSPEECH 2018
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation
INTERSPEECH 2018
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition
INTERSPEECH 2018