Matei Zaharia
34 papers · 2012–2026 · 10 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (10)
Academic Marathon (13)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (12)
Renaissance Researcher (8)
Taxonomy Completionist (45)
Deep Specialist (10)
Triple Crown
Grand Slam
Century Club (33)
Prolific Year (5)
Conference Pioneer
The Questioner (2)
Keyword Collector (161)
Trend Setter
Unstoppable (7)
Conferences
ICLR (7)
NIPS (6)
NSDI (5)
ICML (4)
AAAI (3)
EMNLP (3)
NAACL (2)
OSDI (2)
ACL (1)
JMLR (1)
Research topics
Keywords
information retrieval (4)
machine learning api (2)
prediction api (2)
large language model (2)
neural retrieval (2)
parallel computing (2)
language model (2)
domain adaptation (1)
online learning (1)
active learning (1)
knowledge distillation (1)
speech recognition (1)
claim verification (1)
question answering (1)
scalable learning (1)
sentiment analysis (1)
benchmark evaluation (1)
multi-label classification (1)
semantic search (1)
passage retrieval (1)
Papers
DS SERVE: A Framework for Efficient and Scalable Neural Retrieval
AAAI 2026
World Model on Million-Length Video And Language With Blockwise RingAttention
ICLR 2025
Language Models Can Easily Learn to Reason from Demonstrations
EMNLP 2025
LangProBe: a Language Program Benchmark
EMNLP 2025
ElasticTok: Adaptive Tokenization for Image and Video
ICLR 2025
HashAttention: Semantic Sparsity for Faster Inference
ICML 2025
Compass: Encrypted Semantic Search with High Accuracy
OSDI 2025
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
NAACL 2024
RingAttention with Blockwise Transformers for Near-Infinite Context
ICLR 2024
DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines
ICLR 2024
Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems
NIPS 2024
Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs
EMNLP 2024
HAPI Explorer: Comprehension, Discovery, and Explanation on History of ML APIs
AAAI 2023
Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking
ACL 2023
Similarity Search for Efficient Active Learning and Search of Rare Concepts
AAAI 2022
Data-Parallel Actors: A Programming Model for Scalable Query Serving Systems
NSDI 2022
How Did the Model Change? Efficiently Assessing Machine Learning API Shifts
ICLR 2022
Hindsight: Posterior-guided training of retrievers for improved open-ended generation
ICLR 2022
Efficient Online ML API Selection for Multi-Label Classification Tasks
ICML 2022
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
NAACL 2022
Estimating and Explaining Model Performance When Both Covariates and Labels Shift
NIPS 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
NIPS 2022
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
NIPS 2021
Memory-Efficient Pipeline-Parallel DNN Training
ICML 2021
Contracting Wide-area Network Topologies to Solve Flow Problems Quickly
NSDI 2021
FrugalML: How to use ML Prediction APIs more accurately and cheaply
NIPS 2020
Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
OSDI 2020
Selection via Proxy: Efficient Data Selection for Deep Learning
ICLR 2020
LIT: Learned Intermediate Representation Training for Model Compression
ICML 2019
Splinter: Practical Private Queries on Public Data
NSDI 2017
Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale
NIPS 2016
MLlib: Machine Learning in Apache Spark
JMLR 2016
FairRide: Near-Optimal, Fair Cache Sharing
NSDI 2016
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
NSDI 2012