Ion Stoica
101 papers · 2012–2025 · 12 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (23)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (7)
Hot Topic Early Bird
Conference Loyalist (25)
Keyword Champion (3)
Triple Crown
Grand Slam
Deep Specialist (12)
Dynamic Duo (22)
Conference Pioneer
Prolific Year (16)
Unstoppable (14)
Century Club (101)
Keyword Collector (109)
The Questioner
Conferences
NSDI (30)
ICML (25)
OSDI (14)
ICLR (10)
NIPS (10)
EMNLP (4)
CORL (2)
JMLR (2)
AAAI (1)
AISTATS (1)
CVPR (1)
ECCV (1)
Research topics
Keywords
large language model (7)
inference optimization (6)
reinforcement learning (6)
model compression (4)
model parallelism (3)
multi-armed bandit (3)
online learning (3)
distributed computing (3)
resource allocation (3)
cost optimization (3)
throughput optimization (3)
deep neural network (2)
supervised fine-tuning (2)
language model (2)
stochastic gradient descent (2)
online algorithm (2)
memory optimization (2)
fault tolerance (2)
distributed training (2)
pareto efficiency (2)
Papers
Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
ICML 2025
Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations
ICML 2025
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
ICML 2025
From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline
ICML 2025
RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models
CORL 2025
VisionArena: 230k Real World User-VLM Conversations with Preference Labels
CVPR 2025
SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads
NSDI 2025
S*: Test Time Scaling for Code Generation
EMNLP 2025
Language Models Can Easily Learn to Reason from Demonstrations
EMNLP 2025
A Statistical Framework for Ranking LLM-based Chatbots
ICLR 2025
How to Evaluate Reward Models for RLHF
ICLR 2025
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
ICLR 2025
RouteLLM: Learning to Route LLMs from Preference Data
ICLR 2025
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
ICLR 2025
GameArena: Evaluating LLM Reasoning through Live Computer Games
ICLR 2025
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
JMLR 2025
Fast Video Generation with Sliding Tile Attention
ICML 2025
HashAttention: Semantic Sparsity for Faster Inference
ICML 2025
The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models
ICML 2025
Copilot Arena: A Platform for Code LLM Evaluation in the Wild
ICML 2025
OR-Bench: An Over-Refusal Benchmark for Large Language Models
ICML 2025
Can't Be Late: Optimizing Spot Instance Savings under Deadlines
NSDI 2024
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
ICLR 2024
LLM-Assisted Code Cleaning For Training Accurate Code Generators
ICLR 2024
Trustless Audits without Revealing Data or Models
ICML 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
ICML 2024
Stylus: Automatic Adapter Selection for Diffusion Models
NIPS 2024
Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems
NIPS 2024
Efficient LLM Scheduling by Learning to Rank
NIPS 2024
SGLang: Efficient Execution of Structured Language Model Programs
NIPS 2024
Crafting Interpretable Embeddings for Language Neuroscience by Asking LLMs Questions
NIPS 2024
Fairness in Serving Large Language Models
OSDI 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
ICML 2024
Cloudcast: High-Throughput, Cost-Aware Overlay Multicast in the Cloud
NSDI 2024
R2E: Turning any Github Repository into a Programming Agent Environment
ICML 2024
Online Speculative Decoding
ICML 2024
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
ICML 2024
Cilantro: Performance-Aware Resource Allocation for General Objectives via Online Feedback
OSDI 2023
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
ICML 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
ICML 2023
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
OSDI 2023
Take Out the TraChe: Maximizing (Tra)nsactional Ca(che) Hit Rate
OSDI 2023
SHEPHERD: Serving DNNs in the Wild
NSDI 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
NIPS 2023
SkyPilot: An Intercloud Broker for Sky Computing
NSDI 2023
ExoFlow: A Universal Workflow System for Exactly-Once DAGs
OSDI 2023
VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback
JMLR 2023
Skyplane: Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays
NSDI 2023
Programmatic Modeling and Generation of Real-Time Strategic Soccer Environments for Reinforcement Learning
AAAI 2022
Learning Competitive Equilibria in Exchange Economies with Bandit Feedback
AISTATS 2022
Context-Aware Streaming Perception in Dynamic Environments
ECCV 2022
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging
ICML 2022
Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers
NSDI 2022
NetHint: White-Box Networking for Multi-Tenant Data Centers
NSDI 2022
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
OSDI 2022
TEGRA: Efficient Ad-Hoc Analytics on Evolving Graphs
NSDI 2021
Twenty Years After: Hierarchical Core-Stateless Fair Queueing
NSDI 2021
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
ICML 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
ICML 2021
Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism
ICML 2021
RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem
NIPS 2021
Contrastive Code Representation Learning
EMNLP 2021
Grounded Graph Decoding improves Compositional Generalization in Question Answering
EMNLP 2021
Representing Long-Range Context for Graph Neural Networks with Global Attention
NIPS 2021
Accelerating Quadratic Optimization with Reinforcement Learning
NIPS 2021
Ownership: A Distributed Futures System for Fine-Grained Tasks
NSDI 2021
Caerus: NIMBLE Task Scheduling for Serverless Analytics
NSDI 2021
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
ICLR 2020
Variable Skipping for Autoregressive Range Density Estimation
ICML 2020
FetchSGD: Communication-Efficient Federated Learning with Sketching
ICML 2020
Ansor: Generating High-Performance Tensor Programs for Deep Learning
OSDI 2020
DORY: An Encrypted Search System with Distributed Trust
OSDI 2020
RackSched: A Microsecond-Scale Scheduler for Rack-Scale Computers
OSDI 2020
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
ICML 2019
Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks
NSDI 2019
Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure
NSDI 2019
Communication-efficient Distributed SGD with Sketching
NIPS 2019
NetChain: Scale-Free Sub-RTT Coordination
NSDI 2018
RLlib: Abstractions for Distributed Reinforcement Learning
ICML 2018
Parametrized Hierarchical Procedures for Neural Programming
ICLR 2018
Ray: A Distributed Framework for Emerging AI Applications
OSDI 2018
ASAP: Fast, Approximate Graph Pattern Mining at Scale
OSDI 2018
Clipper: A Low-Latency Online Prediction Serving System
NSDI 2017
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations
CORL 2017
Opaque: An Oblivious and Encrypted Distributed Analytics Platform
NSDI 2017
Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
NSDI 2016
EC-Cache: Load-Balanced, Low-Latency Cluster Caching with Online Erasure Coding
OSDI 2016
HUG: Multi-Resource Fairness for Correlated and Elastic Demands
NSDI 2016
FairRide: Near-Optimal, Fair Cache Sharing
NSDI 2016
CFA: A Practical Prediction System for Video QoE Optimization
NSDI 2016
BlowFish: Dynamic Storage-Performance Tradeoff in Data Stores
NSDI 2016
CellIQ: Real-Time Cellular Network Analytics at Scale
NSDI 2015
Succinct: Enabling Queries on Compressed Data
NSDI 2015
C3: Internet-Scale Control Plane for Video Quality Optimization
NSDI 2015
The Power of Choice in Data-Aware Cluster Scheduling
OSDI 2014
GRASS: Trimming Stragglers in Approximation Analytics
NSDI 2014
GraphX: Graph Processing in a Distributed Dataflow Framework
OSDI 2014
Effective Straggler Mitigation: Attack of the Clones
NSDI 2013
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
NSDI 2012
PACMan: Coordinated Memory Caching for Parallel Jobs
NSDI 2012
Reoptimizing Data Parallel Computing
NSDI 2012