Anoop Cherian

44 papers · 2014–2026 · 8 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (11) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (12) 🌍 Conference Polyglot (8) 🗺️ Taxonomy Completionist (90)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🔬 Deep Specialist (11) 🧬 Topic Evolution 🏆 Keyword Champion (2) ⚡ Prolific Year (6) 🚀 Conference Pioneer 💎 Century Club (44) 🔥 Unstoppable (11) 🗃️ Keyword Collector (221) ❓ The Questioner 📈 Trend Setter

Conferences

CVPR (14) ICCV (7) WACV (7) AAAI (5) ICML (4) NIPS (4) ECCV (2) INTERSPEECH (1)

Top co-authors

Tim K. Marks (9) Stephen Gould (7) Moitreya Chatterjee (6) Suhas Lohit (5) Jue Wang (5) Ye Wang (5) Chiori Hori (5) Jonathan Le Roux (4) Yuhang He (3) Cristian Rodriguez (3)

Keywords

multimodal learning (10) video understanding (6) contrastive learning (5) representation learning (4) subspace learning (4) self-supervised learning (4) riemannian optimization (3) scene graph (3) zero-shot learning (3) generative adversarial network (3) action recognition (3) audio-visual navigation (3) 3d reconstruction (2) visual reasoning (2) image synthesis (2) dictionary learning (2) multi-modal learning (2) depth estimation (2) sparse coding (2) pose estimation (2)

Papers

MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions WACV 2026
SoundLoc3D: Invisible 3D Sound Source Localization and Classification using a Multimodal RGB-D Acoustic Camera WACV 2025
Temporally Grounding Instructional Diagrams in Unconstrained Videos WACV 2025
Manual-PA: Learning 3D Part Assembly from Instruction Diagrams ICCV 2025
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation CVPR 2024
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models CVPR 2024
Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads NIPS 2024
CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments AAAI 2024
Sound3DVDet: 3D Sound Source Detection Using Multiview Microphone Array and RGB Images WACV 2024
Pixel-Grounded Prototypical Part Networks WACV 2024
Deep Neural Room Acoustics Primitive ICML 2024
HaLP: Hallucinating Latent Positives for Skeleton-Based Self-Supervised Learning of Actions CVPR 2023
Are Deep Neural Networks SMARTer Than Second Graders? CVPR 2023
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations CVPR 2023
Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis ICCV 2023
MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation AAAI 2022
Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation NIPS 2022
(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering AAAI 2022
FeLMi: Few shot Learning with hard Mixup NIPS 2022
Max-Margin Contrastive Learning AAAI 2022
AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments NIPS 2022
Visual Scene Graphs for Audio Source Separation ICCV 2021
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers AAAI 2021
A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction ICCV 2021
InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images ICCV 2021
Sound2Sight: Generating Visual Dynamics from Sound and Context ECCV 2020
LUVLi Face Alignment: Estimating Landmarks' Location, Uncertainty, and Visibility Likelihood CVPR 2020
Representation Learning via Adversarially-Contrastive Optimal Transport ICML 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning WACV 2020
FX-GAN: Self-Supervised GAN Learning via Feature Exchange WACV 2020
Audio Visual Scene-Aware Dialog CVPR 2019
GODS: Generalized One-Class Discriminative Subspaces for Anomaly Detection ICCV 2019
Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog INTERSPEECH 2019
Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function ICML 2019
Learning Discriminative Video Representations Using Adversarial Perturbations ECCV 2018
Non-Linear Temporal Subspace Representations for Activity Recognition CVPR 2018
Video Representation Learning Using Discriminative Pooling CVPR 2018
Scalable Dense Non-Rigid Structure-From-Motion: A Grassmannian Perspective CVPR 2018
Learning Discriminative αβ-Divergences for Positive Definite Matrices ICCV 2017
DeepPermNet: Visual Permutation Learning CVPR 2017
Generalized Rank Pooling for Activity Recognition CVPR 2017
Sparse Coding for Third-Order Super-Symmetric Tensor Descriptors With Application to Texture Recognition CVPR 2016
Mixing Body-Part Sequences for Human Pose Estimation CVPR 2014
Nearest Neighbors Using Compact Sparse Codes ICML 2014