Abhinav Shrivastava

95 papers · 2013–2026 · 11 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (13) 🌍 Conference Polyglot (11) 🌈 Renaissance Researcher (8) 🗺️ Taxonomy Completionist (117) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (33) 🏆 Keyword Champion (4) 🔬 Deep Specialist (13) 🤝 Dynamic Duo (14) 💎 Century Club (95) 📈 Trend Setter 🔥 Unstoppable (14) 🗃️ Keyword Collector (335) ❓ The Questioner (2) ⚡ Prolific Year (23) 🚀 Conference Pioneer

Conferences

CVPR (33) ECCV (18) ICCV (16) WACV (11) ICLR (5) AAAI (4) NIPS (4) CORL (1) EACL (1) EMNLP (1) NAACL (1)

Top co-authors

Kamal Gupta (14) Ser-Nam Lim (12) Larry S. Davis (10) Bo He (8) Saksham Suri (8) Hanyu Wang (7) Sharath Girish (7) Larry Davis (7) Abhinav Gupta (7) Hao Chen (6)

Research topics

Computer Vision (1) Core AI (1)

Keywords

object detection (9) representation learning (7) video compression (7) convolutional neural network (6) video understanding (6) multimodal learning (6) zero-shot learning (6) implicit neural representation (6) contrastive learning (5) self-supervised learning (5) vision-language model (4) semi-supervised learning (4) action recognition (4) neural representation (4) vision transformer (3) generative adversarial network (3) pose estimation (3) generative model (3) unsupervised learning (3) diffusion model (3)

Papers

How to Design and Train Your Implicit Neural Representation for Video Compression WACV 2026
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior ICLR 2025
CoLLM: A Large Language Model for Composed Image Retrieval CVPR 2025
Unified Framework for Open-World Compositional Zero-Shot Learning WACV 2025
Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition ICCV 2025
Imagine, Verify, Execute: Memory-guided Agentic Exploration with Vision-Language Models CORL 2025
A Video is Worth 10000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval WACV 2025
Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement WACV 2024
Multimodality-Guided Image Style Transfer Using Cross-Modal GAN Inversion WACV 2024
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization WACV 2024
Video Decomposition Prior: Editing Videos Layer by Layer ICLR 2024
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models EMNLP 2024
Investigating Style Similarity in Diffusion Models ECCV 2024
EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS ECCV 2024
Do text-free diffusion models learn discriminative visual representations? ECCV 2024
Fast Encoding and Decoding for Implicit Video Representation ECCV 2024
Trajectory-aligned Space-time Tokens for Few-shot Action Recognition ECCV 2024
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models ECCV 2024
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation ECCV 2024
Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics ECCV 2024
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors ECCV 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding CVPR 2024
Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning CVPR 2024
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos NIPS 2024
GRIT: GAN Residuals for Paired Image-to-Image Translation WACV 2024
Content-Aware Image Color Editing With Auxiliary Color Restoration Tasks WACV 2024
Composing Object Relations and Attributes for Image-Text Matching CVPR 2024
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions CVPR 2024
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes CVPR 2024
MaGGIe: Masked Guided Gradual Human Instance Matting CVPR 2024
SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations ICCV 2023
Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements NIPS 2023
Towards Scalable Neural Representation for Diverse Videos CVPR 2023
NIRVANA: Neural Implicit Representations of Videos With Adaptive Networks and Autoregressive Patch-Wise Modeling CVPR 2023
FlexNeRF: Photorealistic Free-Viewpoint Rendering of Moving Humans From Sparse Views CVPR 2023
Align and Attend: Multimodal Summarization With Dual Contrastive Losses CVPR 2023
Teaching Matters: Investigating the Role of Supervision in Vision Transformers CVPR 2023
SimpSON: Simplifying Photo Cleanup With Single-Click Distracting Object Segmentation Network CVPR 2023
HNeRV: A Hybrid Neural Representation for Videos CVPR 2023
COVID-VTS: Fact Extraction and Verification on Short Video Platforms EACL 2023
Chop & Learn: Recognizing and Generating Object-State Compositions ICCV 2023
ASIC: Aligning Sparse in-the-wild Image Collections ICCV 2023
MOST: Multiple Object Localization with Self-Supervised Transformers for Object Discovery ICCV 2023
BT^2: Backward-compatible Training with Basis Transformation ICCV 2023
SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining ICCV 2023
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification ICLR 2023
Disentangling Visual Embeddings for Attributes and Objects CVPR 2022
ObjectFormer for Image Manipulation Detection and Localization CVPR 2022
Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning CVPR 2022
Pose and Joint-Aware Action Recognition WACV 2022
Burn after Reading: Online Adaptation for Cross-Domain Streaming Data ECCV 2022
Dual-Key Multimodal Backdoors for Visual Question Answering CVPR 2022
ASM-Loc: Action-Aware Segment Modeling for Weakly-Supervised Temporal Action Localization CVPR 2022
Improving Closed and Open-Vocabulary Attribute Prediction Using Transformers ECCV 2022
Learning Semantic Correspondence with Sparse Annotations ECCV 2022
Neural Space-Filling Curves ECCV 2022
Rethinking Pseudo Labels for Semi-supervised Object Detection AAAI 2022
Towards Discovery and Attribution of Open-World GAN Generated Images ICCV 2021
LayoutTransformer: Layout Generation and Completion With Self-Attention ICCV 2021
Deep Co-Training With Task Decomposition for Semi-Supervised Domain Adaptation ICCV 2021
The Pursuit of Knowledge: Discovering and Localizing Novel Categories Using Dual Memory ICCV 2021
PatchGame: Learning to Signal Mid-level Patches in Referential Games NIPS 2021
NeRV: Neural Representations for Videos NIPS 2021
Towards a Unifying Framework for Formal Theories of Novelty AAAI 2021
Diverse Video Generation using a Gaussian Process Trigger ICLR 2021
StEP: Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis CVPR 2021
Knowledge Evolution in Neural Networks CVPR 2021
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition CVPR 2021
The Lottery Ticket Hypothesis for Object Recognition CVPR 2021
Learning To Predict Visual Attributes in the Wild CVPR 2021
Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions CVPR 2021
Learning Graphs for Knowledge Transfer With Limited Labels CVPR 2021
Learned Spatial Representations for Few-Shot Talking-Head Synthesis ICCV 2021
Quantization Guided JPEG Artifact Correction ECCV 2020
A Generic Visualization Approach for Convolutional Neural Networks ECCV 2020
PatchVAE: Learning Local Latent Codes for Recognition CVPR 2020
Generate, Segment, and Refine: Towards Generic Manipulation Segmentation AAAI 2020
Detecting Human-Object Interactions via Functional Generalization AAAI 2020
Scalable Model Compression by Entropy Penalized Reparameterization ICLR 2020
Boosting Standard Classification Architectures Through a Ranking Regularizer WACV 2020
Hand-Priming in Object Localization for Assistive Egocentric Vision WACV 2020
Curriculum Manager for Source Selection in Multi-Source Domain Adaptation ECCV 2020
Relational Action Forecasting CVPR 2019
Referring to Objects in Videos Using Spatio-Temporal Identifying Descriptions NAACL 2019
EvalNorm: Estimating Batch Normalization Statistics for Evaluation ICCV 2019
Tracking Emerges by Colorizing Videos ECCV 2018
Actor-centric Relation Network ECCV 2018
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection CVPR 2017
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era ICCV 2017
Training Region-Based Object Detectors With Online Hard Example Mining CVPR 2016
Cross-Stitch Networks for Multi-Task Learning CVPR 2016
Watch and Learn: Semi-Supervised Learning for Object Detectors From Video CVPR 2015
Enriching Visual Knowledge Bases via Object Discovery and Segmentation CVPR 2014
Building Part-Based Object Detectors via 3D Geometry ICCV 2013
NEIL: Extracting Visual Knowledge from Web Data ICCV 2013