Anand Mishra

20 papers · 2013–2026 · 8 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (8) 🏃 Academic Marathon (12) 🗺️ Taxonomy Completionist (40)

🧭 Keyword Pioneer 🐝 Cross-Pollinator (15) 🔬 Deep Specialist (13) 🌱 Topic Pioneer 🧬 Topic Evolution 🚀 Conference Pioneer ⚡ Prolific Year (5) 🔥 Unstoppable (5) 🗃️ Keyword Collector (91) 💎 Century Club (18) 📈 Trend Setter

Conferences

AAAI (6) ICCV (4) EMNLP (3) WACV (3) AACL (1) CVPR (1) IJCAI (1) IJCNLP (1)

Top co-authors

Abhirama Subramanyam Penamakuri (6) Manish Gupta (5) Prajwal Gatti (4) Yogesh Kumar (4) Revant Teotia (3) Roshni Ramnani (3) Anirban Chakraborty (3) Aditay Tripathi (2) Shubhashis Sengupta (2) Mithun Das Gupta (2)

Research topics

Domain-Specific (1)

Keywords

visual question answering (5) object localization (5) multimodal learning (5) image retrieval (4) vision-language model (4) knowledge graph (3) few-shot learning (3) multimodal transformer (3) factual reasoning (2) transformer architecture (2) visual relationship (2) vision transformer (2) multi-modal learning (2) temporal localization (2) commonsense reasoning (2) information retrieval (2) large multimodal model (2) video understanding (2) knowledge retrieval (2) scene text (2)

Papers

PatientVLM Meets DocVLM: Pre-Consultation Dialogue Between Vision-Language Models for Efficient Diagnosis AAAI 2026
Temporal Object-Aware Vision Transformer for Few-Shot Video Object Detection AAAI 2026
When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs EMNLP 2025
PatentLMM: Large Multimodal Model for Generating Descriptions for Patent Figures AAAI 2025
Aligning Moments in Time using Video Queries ICCV 2025
Semantic Labels-Aware Transformer Model for Searching Over a Large Collection of Lecture-Slides WACV 2024
QDETRv: Query-Guided DETR for One-Shot Object Localization in Videos AAAI 2024
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant EMNLP 2024
Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions AAAI 2024
Query-Guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch WACV 2024
Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering IJCAI 2023
Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing WACV 2023
Few-Shot Referring Relationships in Videos CVPR 2023
VisToT: Vision-Augmented Table-to-Text Generation EMNLP 2022
COFAR: Commonsense and Factual Reasoning in Image Search IJCNLP 2022
COFAR: Commonsense and Factual Reasoning in Image Search AACL 2022
Few-Shot Visual Relationship Co-Localization ICCV 2021
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason ICCV 2019
KVQA: Knowledge-Aware Visual Question Answering AAAI 2019
Image Retrieval Using Textual Cues ICCV 2013