Vibhav Vineet

35 papers · 2013–2025 · 10 top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🏃 Academic Marathon (12) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (7)

🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (4) 🧬 Topic Evolution 🏆 Keyword Champion 🔥 Unstoppable (6) ⚡ Prolific Year (9) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (35) ❓ The Questioner (2) 🗃️ Keyword Collector (165)

Conferences

CVPR (8) NIPS (7) ICCV (4) ACL (3) ECCV (3) EMNLP (3) ICLR (3) CORL (2) CLEAR (1) WACV (1)

Top co-authors

Neel Joshi (6) Xin Wang (4) Harkirat Behl (4) Yale Song (4) Shuai Zheng (3) Sai Vemprala (3) Yogesh Rawat (3) Yash Jain (3) Hadi Salman (3) Yogesh Singh Rawat (2)

Keywords

multimodal learning (4) transfer learning (3) convolutional neural network (3) semantic segmentation (3) vision language model (3) object detection (3) image segmentation (2) conditional random field (2) robustness analysis (2) benchmark dataset (2) vision-language model (2) representation learning (2) domain generalization (2) video understanding (2) domain adaptation (2) model robustness (2) knowledge distillation (2) spatial reasoning (2) zero-shot learning (2) video action recognition (2)

Papers

Exposing the Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning ACL 2025
Grounding Task Assistance with Multimodal Cues from a Single Demonstration ACL 2025
RiTTA: Modeling Event Relations in Text-to-Audio Generation EMNLP 2025
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding CVPR 2025
Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames EMNLP 2025
DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation ICLR 2025
Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models ICLR 2025
Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models NIPS 2024
PEEKABOO: Interactive Video Generation via Masked-Diffusion CVPR 2024
Navigating Hallucinations for Reasoning of Unintentional Activities EMNLP 2024
Exploring the Sim2Real Gap Using Digital Twins ICCV 2023
Scaling Novel Object Detection With Weakly Supervised Detection Transformers WACV 2023
PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining CORL 2023
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes NIPS 2023
Revealing the unseen: Benchmarking video action recognition under occlusion NIPS 2023
Efficiently Robustify Pre-Trained Models ICCV 2023
DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets NIPS 2023
A Large-Scale Robustness Analysis of Video Action Recognition Models CVPR 2023
Learning To Align Sequential Actions in the Wild CVPR 2022
3DB: A Framework for Debugging Computer Vision Models NIPS 2022
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations NIPS 2022
Image Retrieval from Contextual Descriptions ACL 2022
CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning CLEAR 2022
Robust Contrastive Learning Against Noisy Views CVPR 2022
Neural-Sim: Learning to Generate Training Data with NeRF ECCV 2022
MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning ECCV 2022
Missingness Bias in Model Debugging ICLR 2022
Taskography: Evaluating robot task planning over large 3D scene graphs CORL 2021
AutoSimulate: (Quickly) Learning Synthetic Data Generation ECCV 2020
Dense Monocular Depth Estimation in Complex Dynamic Scenes CVPR 2016
Feature Space Optimization for Semantic Video Segmentation CVPR 2016
Conditional Random Fields as Recurrent Neural Networks ICCV 2015
Dense Semantic Image Segmentation with Objects and Attributes CVPR 2014
Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation NIPS 2013
Efficient Salient Region Detection with Soft Image Abstraction ICCV 2013