Alireza Fathi
22 papers · 2013–2025 · 6 top CS/AI conferences
Achievements
Renaissance Researcher
(5)
Interdisciplinary Bridge
Academic Marathon
(12)
Conference Polyglot
(6)
Taxonomy Completionist
(51)
Keyword Pioneer
Hot Topic Early Bird
Triple Crown
Topic Evolution
Dynamic Duo
(10)
Century Club
(22)
Trend Setter
Keyword Collector
(108)
Prolific Year
(5)
Conference Pioneer
Conferences
CVPR (12)
ECCV (5)
NeurIPS (2)
ICCV (1)
ICLR (1)
ICML (1)
Keywords
vision-language model
(2)
object detection
(2)
retrieval-augmented generation
(2)
image generation
(2)
action recognition
(2)
multimodal learning
(2)
generative model
(2)
multimodal large language model
(2)
visual entity recognition
(2)
large language model
(2)
semantic segmentation
(2)
visual question answering
(2)
entity linking
(1)
feature extraction
(1)
weakly supervised learning
(1)
attention mechanism
(1)
transfer learning
(1)
self-supervised learning
(1)
image compression
(1)
noisy label learning
(1)
Papers
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
CVPR 2025
Language-Guided Image Tokenization for Generation
CVPR 2025
Visual Lexicon: Rich Image Features in Language Space
CVPR 2025
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
ICML 2024
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach
NeurIPS 2024
A Generative Approach for Wikipedia-Scale Visual Entity Recognition
CVPR 2024
Retrieval-Enhanced Contrastive Vision-Text Models
ICLR 2024
Improving Image Recognition by Retrieving From Web-Scale Image-Text Data
CVPR 2023
REVEAL: Retrieval-Augmented Visual-Language Pre-Training With Multi-Source Multimodal Knowledge Memory
CVPR 2023
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent
NeurIPS 2023
PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map
ECCV 2022
Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
CVPR 2022
DOPS: Learning to Detect 3D Objects and Predict Their 3D Shapes
CVPR 2020
An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds
ECCV 2020
Pillar-based Object Detection for Autonomous Driving
ECCV 2020
Virtual Multi-view Fusion for 3D Semantic Segmentation
ECCV 2020
3D-MPA: Multi-Proposal Aggregation for 3D Semantic Instance Segmentation
CVPR 2020
Instance Embedding Transfer to Unsupervised Video Object Segmentation
CVPR 2018
Tracking Emerges by Colorizing Videos
ECCV 2018
Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors
CVPR 2017
Modeling Actions through State Changes
CVPR 2013
Learning to Predict Gaze in Egocentric Video
ICCV 2013