Ashish Seth

15 papers · 2021–2026 · 6 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (12) · 🌍 Conference Polyglot (5) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🌈 Renaissance Researcher (6) · 🤝 Dynamic Duo (10) · 👥 Mega-Team (27) · 🔥 Unstoppable (5) · 💎 Century Club (14) · ⚡ Prolific Year (6) · 🗃️ Keyword Collector (62) · ❓ The Questioner

Conferences

EMNLP (5) · INTERSPEECH (4) · ICLR (2) · NAACL (2) · ACL (1) · CVPR (1)

Top co-authors

Dinesh Manocha (11) · Sreyan Ghosh (10) · Sonal Kumar (9) · Ramani Duraiswami (6) · Ramaneswaran Selvakumar (6) · Utkarsh Tyagi (6) · Nishit Anand (4) · S Sakshi (4) · Chirag Agarwal (3) · Oriol Nieto (3)

Keywords

multimodal learning (5) · automatic speech recognition (4) · benchmark evaluation (3) · audio-language model (3) · zero-shot learning (2) · vision-language model (2) · vision language model (2) · representation learning (2) · contrastive learning (2) · hallucination benchmark (2) · hallucination detection (2) · large language model (2) · speech analysis (1) · progressive learning (1) · machine translation (1) · multimodal interaction (1) · self-supervised learning (1) · video understanding (1) · bias detection (1) · visual speech recognition (1)

Papers

FIGMA: Towards FIne-Grained Music retrievAl · ACL 2026
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark · ICLR 2025
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding · EMNLP 2025
MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions · EMNLP 2025
HALLUCINOGEN: Benchmarking Hallucination in Implicit Reasoning within Large Vision Language Models · EMNLP 2025
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification · NAACL 2025
Do Audio-Language Models Understand Linguistic Variations? · NAACL 2025
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models · ICLR 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition · INTERSPEECH 2024
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities · EMNLP 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning · EMNLP 2024
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages · INTERSPEECH 2023
DeAR: Debiasing Vision-Language Models With Additive Residuals · CVPR 2023
Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi · INTERSPEECH 2022
Dual Script E2E Framework for Multilingual and Code-Switching ASR · INTERSPEECH 2021