Sonal Kumar

27 papers · 2021–2026 · 10 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (14) 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (5)

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (40) 🧬 Topic Evolution 🤝 Dynamic Duo (26) 👥 Mega-Team (34) 🔬 Deep Specialist (11) 🔥 Unstoppable (5) 🗃️ Keyword Collector (105) 💎 Century Club (26) ❓ The Questioner (5) ⚡ Prolific Year (9)

Conferences

EMNLP (6) NAACL (5) ACL (4) ICLR (4) ICML (2) INTERSPEECH (2) AAAI (1) CVPR (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Sreyan Ghosh (27) Dinesh Manocha (23) Utkarsh Tyagi (15) S Sakshi (9) Ashish Seth (9) Ramani Duraiswami (8) Chandra Kiran Reddy Evuru (6) Ramaneswaran Selvakumar (6) Ramaneswaran S (5) Nishit Anand (4)

Keywords

data augmentation (5) multimodal learning (5) transformer model (4) sequence tagging (3) benchmark evaluation (3) large language model (3) toxic span detection (3) dependency parsing (3) audio-language model (3) biaffine attention (3) biaffine model (2) contrastive learning (2) visual cue (2) span extraction (2) text classification (2) low-resource setting (2) automatic speech recognition (2) text generation (1) named entity recognition (1) transfer learning (1)

Papers

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence · AAAI 2026
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark · ICLR 2025
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding · EMNLP 2025
MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions · EMNLP 2025
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs · ICLR 2025
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data · ICLR 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities · ICML 2025
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification · NAACL 2025
ProSE: Diffusion Priors for Speech Enhancement · NAACL 2025
Do Audio-Language Models Understand Linguistic Variations? · NAACL 2025
Do Vision-Language Models Understand Compound Nouns? · NAACL 2024
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models · ICLR 2024
CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP · NAACL 2024
A Closer Look at the Limitations of Instruction Tuning · ICML 2024
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions · ACL 2024
ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations · ACL 2024
AV-RIR: Audio-Visual Room Impulse Response Estimation · CVPR 2024
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities · EMNLP 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning · EMNLP 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition · INTERSPEECH 2024
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network · EMNLP 2023
DALE: Generative Data Augmentation for Low-Resource Legal NLP · EMNLP 2023
ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER · ACL 2023
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances · INTERSPEECH 2022
Cisco at SemEval-2021 Task 5: What’s Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments · SEMEVAL 2021
Cisco at SemEval-2021 Task 5: What’s Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments · ACL 2021
Cisco at SemEval-2021 Task 5: What’s Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments · IJCNLP 2021