Vineet Gandhi
15 papers · 2013–2025 · 10 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Conference Polyglot (10)
Academic Marathon (12)
Renaissance Researcher (7)
Taxonomy Completionist (36)
Topic Evolution
Trend Setter
Century Club (15)
Unstoppable (7)
Conference Pioneer
Keyword Collector (80)
Conferences
CVPR (3)
INTERSPEECH (2)
NAACL (2)
WACV (2)
ACL (1)
EACL (1)
EMNLP (1)
ICLR (1)
IJCAI (1)
NIPS (1)
Keywords
speech synthesis (3)
self-supervised learning (3)
domain generalization (2)
text-to-speech synthesis (2)
emotional prosody (2)
multimodal learning (2)
large language model (2)
video understanding (2)
coreference resolution (2)
named entity recognition (1)
cross-lingual transfer (1)
entity linking (1)
few-shot learning (1)
semantic segmentation (1)
visual grounding (1)
benchmark evaluation (1)
referring expression (1)
object detection (1)
multi-modal learning (1)
test-time adaptation (1)
Papers
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
CVPR 2025
IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs
NAACL 2025
TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction
CVPR 2025
Towards Improving NAM-to-Speech Synthesis Intelligibility using Self-Supervised Speech Models
INTERSPEECH 2024
Major Entity Identification: A Generalizable Alternative to Coreference Resolution
EMNLP 2024
Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras From Wide-Angle Monocular Video Recordings
WACV 2024
ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations
EACL 2024
Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification
NIPS 2023
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems
NAACL 2022
Comprehensive Multi-Modal Interactions for Referring Image Segmentation
ACL 2022
No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
ICLR 2021
Emotional Prosody Control for Speech Generation
INTERSPEECH 2021
Exploring 3 R's of Long-term Tracking: Redetection, Recovery and Reliability
WACV 2020
Learning Unsupervised Visual Grounding Through Semantic Self-Supervision
IJCAI 2019
Detecting and Naming Actors in Movies Using Generative Appearance Models
CVPR 2013