conftrace_

Sreyan Ghosh

35 papers · 2020–2026 · 12 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (10) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (11) 🏃 Academic Marathon (5) 🐝 Cross-Pollinator (14) 🏆 Keyword Champion (3) 🤝 Dynamic Duo (26) 👥 Mega-Team (34) 🔬 Deep Specialist (15) 🧬 Topic Evolution 💎 Century Club (32) 🗃️ Keyword Collector (130) 🔥 Unstoppable (6) ❓ The Questioner (5) ⚑ Prolific Year (10)

Conferences

ACL (7) EMNLP (6) INTERSPEECH (5) NAACL (5) ICLR (4) ICML (2) AAAI (1) COLING (1) CVPR (1) ICCV (1) IJCNLP (1) SEMEVAL (1)

Papers

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence AAAI 2026
FIGMA: Towards FIne-Grained Music retrievAl ACL 2026
Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception ACL 2026
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data ICLR 2025
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation ACL 2025
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding EMNLP 2025
MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions EMNLP 2025
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs ICLR 2025
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark ICLR 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities ICML 2025
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification NAACL 2025
ProSE: Diffusion Priors for Speech Enhancement NAACL 2025
Do Audio-Language Models Understand Linguistic Variations? NAACL 2025
Do Vision-Language Models Understand Compound Nouns? NAACL 2024
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions ACL 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition INTERSPEECH 2024
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ICLR 2024
ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations ACL 2024
A Closer Look at the Limitations of Instruction Tuning ICML 2024
CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP NAACL 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning EMNLP 2024
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities EMNLP 2024
AV-RIR: Audio-Visual Room Impulse Response Estimation CVPR 2024
MMER: Multimodal Multi-task Learning for Speech Emotion Recognition INTERSPEECH 2023
ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER ACL 2023
AdVerb: Visually Guided Audio Dereverberation ICCV 2023
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network EMNLP 2023
DALE: Generative Data Augmentation for Low-Resource Legal NLP EMNLP 2023
Span Extraction Aided Improved Code-mixed Sentiment Classification COLING 2022
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances INTERSPEECH 2022
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances INTERSPEECH 2022
Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments SEMEVAL 2021
Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments ACL 2021
Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments IJCNLP 2021
End-to-End Named Entity Recognition from English Speech INTERSPEECH 2020