Marcella Cornia

24 papers · 2019–2026 · 7 top CS/AI conferences

Achievements


🐝 Cross-Pollinator (10) 🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (7) 🌈 Renaissance Researcher (7)

🗺️ Taxonomy Completionist (54) 🧭 Keyword Pioneer 🔬 Deep Specialist (10) 🧬 Topic Evolution 🤝 Dynamic Duo (24) 💎 Century Club (24) 🗃️ Keyword Collector (109) 🔥 Unstoppable (5) ❓ The Questioner ⚡ Prolific Year (7)

Conferences

CVPR (7) ICCV (6) ECCV (4) WACV (3) IJCAI (2) ACL (1) NIPS (1)

Top co-authors

Rita Cucchiara (24) Lorenzo Baraldi (17) Sara Sarto (7) Giuseppe Cartella (4) Luca Barsellotti (4) Federico Cocchi (4) Vittorio Cuculo (3) Davide Morelli (3) Giuseppe Boccignone (3) Alessandro D'Amelio (3)

Keywords

image captioning (5) multimodal large language model (5) multimodal learning (3) visual grounding (2) scanpath prediction (2) self-supervised learning (2) vision language model (2) visual attention (2) semantic segmentation (2) open-vocabulary segmentation (2) gaze prediction (2) diffusion model (2) missing modality (2) vision transformer (2) image generation (2) attention mechanism (1) video captioning (1) visual question answering (1) metric learning (1) sequence modeling (1)

Papers

Sketch2Stitch: GANs for Abstract Sketch-Based Dress Synthesis WACV 2026
TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes WACV 2025
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives IJCAI 2025
Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction ICCV 2025
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models ICCV 2025
MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models ICCV 2025
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering CVPR 2025
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval CVPR 2025
Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios WACV 2025
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation ICCV 2025
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues ECCV 2024
The Revolution of Multimodal Large Language Models: A Survey ACL 2024
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation CVPR 2024
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models ECCV 2024
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities ECCV 2024
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments NIPS 2024
Trends, Applications, and Challenges in Human Attention Modelling IJCAI 2024
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation CVPR 2023
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing ICCV 2023
With a Little Help from Your Own Past: Prototypical Memory Networks for Image Captioning ICCV 2023
Dress Code: High-Resolution Multi-Category Virtual Try-On ECCV 2022
Meshed-Memory Transformer for Image Captioning CVPR 2020
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions CVPR 2019
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation CVPR 2019