Aishwarya Agrawal

24 papers · 2015–2025 · 9 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (9) 🏃 Academic Marathon (10) 🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (44)

🏆 Keyword Champion (4) 🔬 Deep Specialist (11) 🧬 Topic Evolution 📈 Trend Setter 💎 Century Club (24) ⚡ Prolific Year (7) 🗃️ Keyword Collector (90)

Conferences

EMNLP (8) CVPR (4) EACL (3) ACL (2) ICCV (2) NIPS (2) AAAI (1) ICML (1) NAACL (1)

Top co-authors

Dhruv Batra (5) Le Zhang (5) Rabiul Awal (5) Aida Nematzadeh (4) Devi Parikh (4) Siva Reddy (3) Shravan Nayak (3) Yash Goyal (3) Saba Ahmadi (3) Lisa Anne Hendricks (3)

Research topics

Core AI (1)

Keywords

visual question answering (8) vision language model (6) image-text alignment (4) vision-language model (4) large language model (3) multimodal learning (3) contrastive learning (3) visual grounding (3) benchmark evaluation (2) in-context learning (2) zero-shot learning (2) fine-grained understanding (2) text-to-image generation (2) self-supervised learning (2) multimodal large language model (2) image captioning (1) information retrieval (1) prompt engineering (1) transfer learning (1) out-of-distribution generalization (1)

Papers

Assessing and Learning Alignment of Unimodal Vision and Language Models · CVPR 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning · CVPR 2025
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction · ICML 2025
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning · EMNLP 2025
Controlling Multimodal LLMs via Reward-guided Decoding · ICCV 2025
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation · EMNLP 2025
CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics · EMNLP 2025
VisMin: Visual Minimal-Change Understanding · NIPS 2024
Decompose and Compare Consistency: Measuring VLMs’ Answer Reliability via Task-Decomposition Consistency Comparison · EMNLP 2024
Benchmarking Vision Language Models for Cultural Understanding · EMNLP 2024
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding · CVPR 2024
An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics · EACL 2024
Improving Automatic VQA Evaluation Using Large Language Models · AAAI 2024
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization · EACL 2023
Measuring Progress in Fine-grained Vision-and-Language Understanding · ACL 2023
MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting · EACL 2023
MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model · EMNLP 2023
Vision-Language Pretraining: Current Trends and the Future · ACL 2022
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization · NIPS 2018
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering · CVPR 2018
Visual Storytelling · NAACL 2016
Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes · EMNLP 2016
Analyzing the Behavior of Visual Question Answering Models · EMNLP 2016
VQA: Visual Question Answering · ICCV 2015