Ashish Seth
15 papers · 2021–2026 · 6 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (12)
🌍 Conference Polyglot (5)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌈 Renaissance Researcher (6)
🤝 Dynamic Duo (10)
👥 Mega-Team (27)
🔥 Unstoppable (5)
💎 Century Club (14)
⚡ Prolific Year (6)
🗃️ Keyword Collector (62)
❓ The Questioner
Conferences
EMNLP (5)
INTERSPEECH (4)
ICLR (2)
NAACL (2)
ACL (1)
CVPR (1)
Top co-authors
Keywords
multimodal learning (5)
automatic speech recognition (4)
benchmark evaluation (3)
audio-language model (3)
zero-shot learning (2)
vision-language model (2)
vision language model (2)
representation learning (2)
contrastive learning (2)
hallucination benchmark (2)
hallucination detection (2)
large language model (2)
speech analysis (1)
progressive learning (1)
machine translation (1)
multimodal interaction (1)
self-supervised learning (1)
video understanding (1)
bias detection (1)
visual speech recognition (1)
Papers
FIGMA: Towards FIne-Grained Music retrievAl
ACL 2026
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
ICLR 2025
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding
EMNLP 2025
MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions
EMNLP 2025
HALLUCINOGEN: Benchmarking Hallucination in Implicit Reasoning within Large Vision Language Models
EMNLP 2025
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification
NAACL 2025
Do Audio-Language Models Understand Linguistic Variations?
NAACL 2025
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
ICLR 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
INTERSPEECH 2024
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
EMNLP 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
EMNLP 2024
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
INTERSPEECH 2023
DeAR: Debiasing Vision-Language Models With Additive Residuals
CVPR 2023
Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi
INTERSPEECH 2022
Dual Script E2E Framework for Multilingual and Code-Switching ASR
INTERSPEECH 2021