Irfan Essa

29 papers · 2013–2025 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (12) 🐝 Cross-Pollinator (10) 🌍 Conference Polyglot (10) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (8)

🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (10) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (31) 🧬 Topic Evolution 💎 Century Club (29) 📈 Trend Setter ⚡ Prolific Year (6) 🚀 Conference Pioneer 🗃️ Keyword Collector (130) 🔥 Unstoppable (7)

Conferences

CVPR (11) ECCV (5) ICLR (5) AAAI (2) ACL (1) AISTATS (1) EMNLP (1) ICML (1) NIPS (1) WACV (1)

Top co-authors

Lu Jiang (10) José Lezama (8) Kihyuk Sohn (7) Ming-Hsuan Yang (6) Huiwen Chang (5) Stefan Lee (5) Dhruv Batra (5) Lijun Yu (4) Han Zhang (4) Junfeng He (3)

Keywords

image generation (3) embodied question answering (2) diffusion model (2) point cloud (2) scene understanding (2) vision transformer (2) multi-task learning (2) large language model (2) temporal modeling (1) computer vision (1) few-shot learning (1) sentiment analysis (1) semi-supervised learning (1) video segmentation (1) anomaly detection (1) video synthesis (1) transfer learning (1) visual question answering (1) video generation (1) style transfer (1)

Papers

Calibrated Multi-Preference Optimization for Aligning Diffusion Models CVPR 2025
Africa Health Check: Probing Cultural Bias in Medical LLMs EMNLP 2025
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition - And Ways to Overcome Them AAAI 2025
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset ACL 2025
Cropper: Vision-Language Model for Image Cropping through In-Context Learning CVPR 2025
FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models NIPS 2024
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models CVPR 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation ECCV 2024
Photorealistic Video Generation with Diffusion Models ECCV 2024
Language Model Beats Diffusion - Tokenizer is key to visual generation ICLR 2024
VideoPoet: A Large Language Model for Zero-Shot Video Generation ICML 2024
MAGVIT: Masked Generative Video Transformer CVPR 2023
MaskSketch: Unpaired Structure-Guided Masked Image Generation CVPR 2023
Discrete Predictor-Corrector Diffusion Models for Image Synthesis ICLR 2023
Emergence of Maps in the Memories of Blind Navigation Agents ICLR 2023
Visual Prompt Tuning for Generative Transfer Learning CVPR 2023
BLT: Bidirectional Layout Transformer for Controllable Layout Generation ECCV 2022
Improved Masked Image Generation with Token-Critic ECCV 2022
Discrete Representations Strengthen Vision Transformer Robustness ICLR 2022
Sharing Decoders: Network Fission for Multi-Task Pixel Prediction WACV 2022
Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views AAAI 2021
Neural Design Network: Graphic Layout Generation with Constraints ECCV 2020
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames ICLR 2020
Audio Visual Scene-Aware Dialog CVPR 2019
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception CVPR 2019
Efficient Hierarchical Graph-Based Segmentation of RGBD Videos CVPR 2014
Geometric Context from Videos CVPR 2013
Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition CVPR 2013
Beyond Sentiment: The Manifold of Human Emotions AISTATS 2013