Ali Farhadi
111 papers · 2013–2025 · 12 top CS/AI conferences
Achievements
🏃 Academic Marathon (12) 🌍 Conference Polyglot (12) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird
🌈 Renaissance Researcher (11)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌟 Keyword Trendsetter Combo (3)
🏠 Conference Loyalist (20)
🤝 Dynamic Duo (26)
👑 Triple Crown
👥 Mega-Team (50)
🌱 Topic Pioneer
🔬 Deep Specialist (22)
🏆 Keyword Champion
🚀 Conference Pioneer
🗃️ Keyword Collector (433)
📈 Trend Setter
⚡ Prolific Year (12)
🔥 Unstoppable (13)
💎 Century Club (111)
❓ The Questioner (7)
Conferences
CVPR (44)
NIPS (20)
ICLR (11)
ECCV (8)
ICCV (7)
EMNLP (5)
ACL (4)
ICML (4)
NAACL (4)
CORL (2)
IJCNLP (1)
WACV (1)
Research topics
Keywords
neural network (10)
zero-shot learning (8)
transfer learning (7)
convolutional neural network (7)
representation learning (7)
scene understanding (6)
multimodal learning (6)
visual question answering (6)
image classification (6)
action recognition (6)
model compression (6)
video understanding (5)
vision-language model (5)
visual reasoning (5)
semantic segmentation (5)
self-supervised learning (4)
question answering (4)
few-shot learning (4)
language model (4)
egocentric vision (3)
Papers
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
ACL 2025
Contrastive Flow Matching
ICCV 2025
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
CVPR 2025
OLMoE: Open Mixture-of-Experts Language Models
ICLR 2025
Synthetic Visual Genome
CVPR 2025
DRAWER: Digital Reconstruction and Articulation With Environment Realism
CVPR 2025
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
NIPS 2024
MatFormer: Nested Transformer for Elastic Inference
NIPS 2024
Learning to Build by Building Your Own Instructions
ECCV 2024
From an Image to a Scene: Learning to Imagine the World from a Million 360° Videos
NIPS 2024
Selective Visual Representations Improve Convergence and Generalization for Embodied AI
ICLR 2024
Task Me Anything
NIPS 2024
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition
NIPS 2024
Phone2Proc: Bringing Robust Robots Into Our Chaotic World
CVPR 2023
What Does a Platypus Look Like? Generating Customized Prompts for Zero-Shot Image Classification
ICCV 2023
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
ICLR 2023
Impossibly Good Experts and How to Follow Them
ICLR 2023
Neural Radiance Field Codebooks
ICLR 2023
FastFill: Efficient Compatible Model Update
ICLR 2023
Editing models with task arithmetic
ICLR 2023
Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement
ICCV 2023
Objaverse: A Universe of Annotated 3D Objects
CVPR 2023
LCS: Learning Compressible Subspaces for Efficient, Adaptive, Real-Time Network Compression at Inference Time
WACV 2023
Stable and low-precision training for large-scale vision-language models
NIPS 2023
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
NIPS 2023
DataComp: In search of the next generation of multimodal datasets
NIPS 2023
Objaverse-XL: A Universe of 10M+ 3D Objects
NIPS 2023
Neural Priming for Sample-Efficient Adaptation
NIPS 2023
On the Connection between Pre-training Data Diversity and Fine-tuning Robustness
NIPS 2023
AdANNS: A Framework for Adaptive Semantic Search
NIPS 2023
SHARCS: Efficient Transformers Through Routing with Dynamic Width Sub-networks
EMNLP 2023
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
ICML 2022
Exposing the Limits of Video-Text Models through Contrast Sets
NAACL 2022
Patching open-vocabulary models by interpolating weights
NIPS 2022
Matryoshka Representation Learning
NIPS 2022
Forward Compatible Training for Large-Scale Embedding Retrieval Systems
CVPR 2022
Object Manipulation via Visual Target Localization
ECCV 2022
Break and Make: Interactive Structural Understanding Using LEGO Bricks
ECCV 2022
Robust Fine-Tuning of Zero-Shot Models
CVPR 2022
MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound
CVPR 2022
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
EMNLP 2021
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World
IJCNLP 2021
MERLOT: Multimodal Neural Script Knowledge Models
NIPS 2021
Pushing It Out of the Way: Interactive Visual Navigation
CVPR 2021
TuringAdvice: A Generative and Dynamic Evaluation of Language Use
NAACL 2021
LanguageRefer: Spatial-Language Model for 3D Visual Grounding
CORL 2021
Probing Contextual Language Models for Common Ground with Visual Representations
NAACL 2021
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes
NIPS 2021
What Can You Learn From Your Muscles? Learning Visual Representation from Human Interactions
ICLR 2021
Learning Generalizable Visual Representations via Interactive Gameplay
ICLR 2021
Learning Neural Network Subspaces
ICML 2021
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World
ACL 2021
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
CVPR 2020
Supermasks in Superposition
NIPS 2020
RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
CVPR 2020
Butterfly Transform: An Efficient FFT Based Neural Architecture Design
CVPR 2020
What's Hidden in a Randomly Weighted Neural Network?
CVPR 2020
Visual Reaction: Learning to Play Catch With Your Drone
CVPR 2020
Grounded Situation Recognition
ECCV 2020
A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
ECCV 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
ECCV 2020
Soft Threshold Weight Reparameterization for Learnable Sparsity
ICML 2020
Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph
CVPR 2019
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
ACL 2019
Conditional Driving from Natural Language Instructions
CORL 2019
Discovering Neural Wirings
NIPS 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
CVPR 2019
ELASTIC: Improving CNNs With Dynamic Scaling Policies
CVPR 2019
From Recognition to Cognition: Visual Commonsense Reasoning
CVPR 2019
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
CVPR 2019
Two Body Problem: Collaborative Visual Task Completion
CVPR 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
ACL 2019
Defending Against Neural Fake News
NIPS 2019
Visual Semantic Navigation using Scene Priors
ICLR 2019
SeGAN: Segmenting and Generating the Invisible
CVPR 2018
IQA: Visual Question Answering in Interactive Environments
CVPR 2018
Who Let the Dogs Out? Modeling Dog Behavior From Visual Data
CVPR 2018
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
EMNLP 2018
Actor and Observer: Joint Modeling of First and Third-Person Videos
CVPR 2018
Neural Speed Reading via Skim-RNN
ICLR 2018
DOCK: Detecting Objects by transferring Common-sense Knowledge
ECCV 2018
Imagine This! Scripts to Compositions to Videos
ECCV 2018
Structured Set Matching Networks for One-Shot Part Labeling
CVPR 2018
LCNN: Lookup-Based Convolutional Neural Network
CVPR 2017
YOLO9000: Better, Faster, Stronger
CVPR 2017
Commonly Uncommon: Semantic Sparsity in Situation Recognition
CVPR 2017
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
CVPR 2017
Visual Semantic Planning Using Deep Successor Representations
ICCV 2017
See the Glass Half Full: Reasoning About Liquid Containers, Their Volume and Content
ICCV 2017
Asynchronous Temporal Fields for Action Recognition
CVPR 2017
Newtonian Scene Understanding: Unfolding the Dynamics of Objects in Static Images
CVPR 2016
Actions ~ Transformations
CVPR 2016
A Task-Oriented Approach for Cost-Sensitive Recognition
CVPR 2016
You Only Look Once: Unified, Real-Time Object Detection
CVPR 2016
Unsupervised Deep Embedding for Clustering Analysis
ICML 2016
Situation Recognition: Visual Semantic Role Labeling for Image Understanding
CVPR 2016
Stating the Obvious: Extracting Visual Common Sense Knowledge
NAACL 2016
VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases
CVPR 2015
Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off!
ICCV 2015
Discriminative and Consistent Similarities in Instance-Level Multiple Instance Learning
CVPR 2015
Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing
ICCV 2015
Solving Geometry Problems: Combining Text and Diagram Interpretation
EMNLP 2015
Visalogy: Answering Visual Analogy Questions
NIPS 2015
Learning Everything about Anything: Webly-Supervised Visual Concept Learning
CVPR 2014
Incorporating Scene Context and Object Layout into Appearance Modeling
CVPR 2014
Multi-Resolution Language Grounding with Weak Supervision
EMNLP 2014
Predicting Failures of Vision Systems
CVPR 2014
Multi-attribute Queries: To Merge or Not to Merge?
CVPR 2013
Adding Unlabeled Samples to Categories by Learned Attributes
CVPR 2013
Object-Centric Anomaly Detection by Attribute-Based Reasoning
CVPR 2013