Anurag Kumar

23 papers · 2016–2025 · 7 top CS/AI conferences

Achievements

🌍 Conference Polyglot (7) 🏃 Academic Marathon (9) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12) 🐣 Hot Topic Early Bird 🧬 Topic Evolution 👥 Mega-Team (85) 🏆 Keyword Champion 🔥 Unstoppable (7) ⚡ Prolific Year (5) 🚀 Conference Pioneer 💎 Century Club (23) ❓ The Questioner 🗃️ Keyword Collector (109)

Conferences

INTERSPEECH (10) CVPR (5) IJCAI (3) NIPS (2) COLING (1) ECCV (1) ICML (1)

Top co-authors

Vamsi Krishna Ithapu (5) Buye Xu (5) Pranay Manocha (4) Ruohan Gao (3) Paul Calamia (3) Chenliang Xu (3) Bhiksha Raj (3) Chao Huang (3) Ke Tan (2) Ishwarya Ananthabhotla (2)

Keywords

speech enhancement (5) multimodal learning (5) transfer learning (4) spatial audio (4) deep neural network (3) audio-visual learning (3) speech quality assessment (2) neural network (2) weakly supervised learning (2) non-matching reference (2) room impulse response (2) neural radiance field (2) mean opinion score (2) speaker separation (2) time-domain audio (2) multi-modal learning (1) egocentric vision (1) self-supervised learning (1) matrix factorization (1) domain generalization (1)

Papers

Learning to Highlight Audio by Watching Movies CVPR 2025
Hearing Anywhere in Any Environment CVPR 2025
Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts COLING 2025
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos ECCV 2024
Cross-Talk Reduction IJCAI 2024
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark CVPR 2024
AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis NIPS 2023
Egocentric Audio-Visual Object Localization CVPR 2023
Spatialization Quality Metric for Binaural Speech INTERSPEECH 2023
Rethinking Complex-Valued Deep Neural Networks for Monaural Speech Enhancement INTERSPEECH 2023
Time-domain Transformer-based Audiovisual Speaker Separation INTERSPEECH 2023
SAQAM: Spatial Audio Quality Assessment Metric INTERSPEECH 2022
Speech Quality Assessment through MOS using Non-Matching References INTERSPEECH 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video CVPR 2022
Improving Speech Enhancement through Fine-Grained Speech Characteristics INTERSPEECH 2022
Time-domain Ad-hoc Array Speech Enhancement Using a Triple-path Network INTERSPEECH 2022
NORESQA: A Framework for Speech Quality Assessment using Non-Matching References NIPS 2021
Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning INTERSPEECH 2021
Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data IJCAI 2020
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition ICML 2020
Learning Sound Events from Webly Labeled Data IJCAI 2019
Audio Content Based Geotagging in Multimedia INTERSPEECH 2017
Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks INTERSPEECH 2016