Nima Mesgarani

22 papers · 2016–2026 · 7 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (11) 🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🗃️ Keyword Collector (114) 🚀 Conference Pioneer 💎 Century Club (21) ⚡ Prolific Year (6)

Conferences

INTERSPEECH (12) ACL (3) NAACL (2) NIPS (2) AAAI (1) EMNLP (1) ICML (1)

Top co-authors

Yi Luo (7) Yinghao Aaron Li (5) Xilin Jiang (5) Cong Han (5) Linyang He (3) Tasha Nagamine (3) Helmut Schmid (2) Ercong Nie (2) Jonathan Brennan (2) Zhuo Chen (2)

Keywords

neural network (6) source separation (4) deep neural network (4) speech separation (3) feature space (2) representation learning (2) minimal pair (2) phoneme classification (2) speech recognition (2) large language model (2) speech enhancement (2) diffusion model (2) speech synthesis (1) voice conversion (1) domain adaptation (1) knowledge distillation (1) self-supervised learning (1) zero-shot learning (1) embedding learning (1) syntactic analysis (1)

Papers

DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis (AAAI 2026)
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding (ACL 2025)
Large Language Models as Neurolinguistic Subjects: Discrepancy between Performance and Competence (ACL 2025)
XCOMPS: A Multilingual Benchmark of Conceptual Minimal Pairs (ACL 2025)
Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations (EMNLP 2025)
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion (NAACL 2025)
Quantifying Semantic Functional Specialization in the Brain Using Encoding Models of Natural Language (NAACL 2025)
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes (INTERSPEECH 2023)
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models (NIPS 2023)
Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems (NIPS 2021)
StarGANv2-VC: A Diverse, Unsupervised, Non-Parallel Framework for Natural-Sounding Voice Conversion (INTERSPEECH 2021)
Continuous Speech Separation Using Speaker Inventory for Long Recording (INTERSPEECH 2021)
Implicit Filter-and-Sum Network for End-to-End Multi-Channel Speech Separation (INTERSPEECH 2021)
Empirical Analysis of Generalized Iterative Speech Separation Networks (INTERSPEECH 2021)
Binaural Speech Separation of Moving Speakers With Preserved Spatial Cues (INTERSPEECH 2021)
Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss (INTERSPEECH 2020)
Music Source Activity Detection and Separation Using Deep Attractor Network (INTERSPEECH 2018)
Speech Processing in the Human Brain Meets Deep Learning (INTERSPEECH 2018)
Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network (INTERSPEECH 2018)
Understanding the Representation and Computation of Multilayer Perceptrons: A Case Study in Speech Recognition (ICML 2017)
On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models (INTERSPEECH 2016)
Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations (INTERSPEECH 2016)