Vineet Gandhi

15 papers · 2013–2025 · 10 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10) 🏃 Academic Marathon (12) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (36) 🧬 Topic Evolution 📈 Trend Setter 💎 Century Club (15) 🔥 Unstoppable (7) 🚀 Conference Pioneer 🗃️ Keyword Collector (80)

Conferences

CVPR (3) INTERSPEECH (2) NAACL (2) WACV (2) ACL (1) EACL (1) EMNLP (1) ICLR (1) IJCAI (1) NIPS (1)

Top co-authors

Makarand Tapaswi (3) Saiteja Kosgi (3) Shyamgopal Karthik (3) Shubham Toshniwal (2) Anil Nelakanti (2) Neil Shah (2) Kawshik Manikantan (2) Kanishk Jain (2) Sarath Sivaprasad (2) Shirish Karande (1)

Keywords

speech synthesis (3) self-supervised learning (3) domain generalization (2) text-to-speech synthesis (2) emotional prosody (2) multimodal learning (2) large language model (2) video understanding (2) coreference resolution (2) named entity recognition (1) cross-lingual transfer (1) entity linking (1) few-shot learning (1) semantic segmentation (1) visual grounding (1) benchmark evaluation (1) referring expression (1) object detection (1) multi-modal learning (1) test-time adaptation (1)

Papers

VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment · CVPR 2025
IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs · NAACL 2025
TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction · CVPR 2025
Towards Improving NAM-to-Speech Synthesis Intelligibility using Self-Supervised Speech Models · INTERSPEECH 2024
Major Entity Identification: A Generalizable Alternative to Coreference Resolution · EMNLP 2024
Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras From Wide-Angle Monocular Video Recordings · WACV 2024
ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations · EACL 2024
Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification · NIPS 2023
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems · NAACL 2022
Comprehensive Multi-Modal Interactions for Referring Image Segmentation · ACL 2022
No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks · ICLR 2021
Emotional Prosody Control for Speech Generation · INTERSPEECH 2021
Exploring 3 R's of Long-term Tracking: Redetection, Recovery and Reliability · WACV 2020
Learning Unsupervised Visual Grounding Through Semantic Self-Supervision · IJCAI 2019
Detecting and Naming Actors in Movies Using Generative Appearance Models · CVPR 2013