Irfan Essa
29 papers · 2013–2025 · 10 top CS/AI conferences
Achievements
🏃 Academic Marathon (12)
🐝 Cross-Pollinator (10)
🌍 Conference Polyglot (10)
🧭 Keyword Pioneer
🌈 Renaissance Researcher (8)
🌉 Interdisciplinary Bridge
🐣 Hot Topic Early Bird
🤝 Dynamic Duo (10)
👑 Triple Crown
🏆 Grand Slam
👥 Mega-Team (31)
🧬 Topic Evolution
💎 Century Club (29)
📈 Trend Setter
⚡ Prolific Year (6)
🚀 Conference Pioneer
🗃️ Keyword Collector (130)
🔥 Unstoppable (7)
Conferences
CVPR (11)
ECCV (5)
ICLR (5)
AAAI (2)
ACL (1)
AISTATS (1)
EMNLP (1)
ICML (1)
NIPS (1)
WACV (1)
Keywords
image generation (3)
embodied question answering (2)
diffusion model (2)
point cloud (2)
scene understanding (2)
vision transformer (2)
multi-task learning (2)
large language model (2)
temporal modeling (1)
computer vision (1)
few-shot learning (1)
sentiment analysis (1)
semi-supervised learning (1)
video segmentation (1)
anomaly detection (1)
video synthesis (1)
transfer learning (1)
visual question answering (1)
video generation (1)
style transfer (1)
Papers
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
CVPR 2025
Africa Health Check: Probing Cultural Bias in Medical LLMs
EMNLP 2025
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition - And Ways to Overcome Them
AAAI 2025
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset
ACL 2025
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
CVPR 2025
FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models
NIPS 2024
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
CVPR 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
ECCV 2024
Photorealistic Video Generation with Diffusion Models
ECCV 2024
Language Model Beats Diffusion - Tokenizer is key to visual generation
ICLR 2024
VideoPoet: A Large Language Model for Zero-Shot Video Generation
ICML 2024
MAGVIT: Masked Generative Video Transformer
CVPR 2023
MaskSketch: Unpaired Structure-Guided Masked Image Generation
CVPR 2023
Discrete Predictor-Corrector Diffusion Models for Image Synthesis
ICLR 2023
Emergence of Maps in the Memories of Blind Navigation Agents
ICLR 2023
Visual Prompt Tuning for Generative Transfer Learning
CVPR 2023
BLT: Bidirectional Layout Transformer for Controllable Layout Generation
ECCV 2022
Improved Masked Image Generation with Token-Critic
ECCV 2022
Discrete Representations Strengthen Vision Transformer Robustness
ICLR 2022
Sharing Decoders: Network Fission for Multi-Task Pixel Prediction
WACV 2022
Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views
AAAI 2021
Neural Design Network: Graphic Layout Generation with Constraints
ECCV 2020
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
ICLR 2020
Audio Visual Scene-Aware Dialog
CVPR 2019
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception
CVPR 2019
Efficient Hierarchical Graph-Based Segmentation of RGBD Videos
CVPR 2014
Geometric Context from Videos
CVPR 2013
Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition
CVPR 2013
Beyond Sentiment: The Manifold of Human Emotions
AISTATS 2013