Ion Stoica

101 papers · 2012–2025 · 12 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (23) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (25) 🏆 Keyword Champion (3) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (12) 🤝 Dynamic Duo (22) 🚀 Conference Pioneer ⚡ Prolific Year (16) 🔥 Unstoppable (14) 💎 Century Club (101) 🗃️ Keyword Collector (109) ❓ The Questioner

Conferences

NSDI (30) ICML (25) OSDI (14) ICLR (10) NIPS (10) EMNLP (4) CORL (2) JMLR (2) AAAI (1) AISTATS (1) CVPR (1) ECCV (1)

Top co-authors

Joseph E. Gonzalez (22) Wei-Lin Chiang (14) Hao Zhang (12) Joseph Gonzalez (11) Lianmin Zheng (9) Eric Liang (8) Zhuohan Li (7) Ying Sheng (7) Joseph E Gonzalez (7) Paras Jain (7)

Research topics

Applications (2) Deep Learning (1) Optimization (1)

Keywords

large language model (7) inference optimization (6) reinforcement learning (6) model compression (4) model parallelism (3) multi-armed bandit (3) online learning (3) distributed computing (3) resource allocation (3) cost optimization (3) throughput optimization (3) deep neural network (2) supervised fine-tuning (2) language model (2) stochastic gradient descent (2) online algorithm (2) memory optimization (2) fault tolerance (2) distributed training (2) pareto efficiency (2)

Papers

Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity ICML 2025
Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations ICML 2025
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards ICML 2025
From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline ICML 2025
RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models CORL 2025
VisionArena: 230k Real World User-VLM Conversations with Preference Labels CVPR 2025
SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads NSDI 2025
S*: Test Time Scaling for Code Generation EMNLP 2025
Language Models Can Easily Learn to Reason from Demonstrations EMNLP 2025
A Statistical Framework for Ranking LLM-based Chatbots ICLR 2025
How to Evaluate Reward Models for RLHF ICLR 2025
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code ICLR 2025
RouteLLM: Learning to Route LLMs from Preference Data ICLR 2025
JudgeBench: A Benchmark for Evaluating LLM-Based Judges ICLR 2025
GameArena: Evaluating LLM Reasoning through Live Computer Games ICLR 2025
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers JMLR 2025
Fast Video Generation with Sliding Tile Attention ICML 2025
HashAttention: Semantic Sparsity for Faster Inference ICML 2025
The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models ICML 2025
Copilot Arena: A Platform for Code LLM Evaluation in the Wild ICML 2025
OR-Bench: An Over-Refusal Benchmark for Large Language Models ICML 2025
Can't Be Late: Optimizing Spot Instance Savings under Deadlines NSDI 2024
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset ICLR 2024
LLM-Assisted Code Cleaning For Training Accurate Code Generators ICLR 2024
Trustless Audits without Revealing Data or Models ICML 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving ICML 2024
Stylus: Automatic Adapter Selection for Diffusion Models NIPS 2024
Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems NIPS 2024
Efficient LLM Scheduling by Learning to Rank NIPS 2024
SGLang: Efficient Execution of Structured Language Model Programs NIPS 2024
Crafting Interpretable Embeddings for Language Neuroscience by Asking LLMs Questions NIPS 2024
Fairness in Serving Large Language Models OSDI 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ICML 2024
Cloudcast: High-Throughput, Cost-Aware Overlay Multicast in the Cloud NSDI 2024
R2E: Turning any Github Repository into a Programming Agent Environment ICML 2024
Online Speculative Decoding ICML 2024
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference ICML 2024
Cilantro: Performance-Aware Resource Allocation for General Objectives via Online Feedback OSDI 2023
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning ICML 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU ICML 2023
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving OSDI 2023
Take Out the TraChe: Maximizing (Tra)nsactional Ca(che) Hit Rate OSDI 2023
SHEPHERD: Serving DNNs in the Wild NSDI 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena NIPS 2023
SkyPilot: An Intercloud Broker for Sky Computing NSDI 2023
ExoFlow: A Universal Workflow System for Exactly-Once DAGs OSDI 2023
VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback JMLR 2023
Skyplane: Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays NSDI 2023
Programmatic Modeling and Generation of Real-Time Strategic Soccer Environments for Reinforcement Learning AAAI 2022
Learning Competitive Equilibria in Exchange Economies with Bandit Feedback AISTATS 2022
Context-Aware Streaming Perception in Dynamic Environments ECCV 2022
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging ICML 2022
Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers NSDI 2022
NetHint: White-Box Networking for Multi-Tenant Data Centers NSDI 2022
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning OSDI 2022
TEGRA: Efficient Ad-Hoc Analytics on Evolving Graphs NSDI 2021
Twenty Years After: Hierarchical Core-Stateless Fair Queueing NSDI 2021
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ICML 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models ICML 2021
Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism ICML 2021
RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem NIPS 2021
Contrastive Code Representation Learning EMNLP 2021
Grounded Graph Decoding improves Compositional Generalization in Question Answering EMNLP 2021
Representing Long-Range Context for Graph Neural Networks with Global Attention NIPS 2021
Accelerating Quadratic Optimization with Reinforcement Learning NIPS 2021
Ownership: A Distributed Futures System for Fine-Grained Tasks NSDI 2021
Caerus: NIMBLE Task Scheduling for Serverless Analytics NSDI 2021
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks ICLR 2020
Variable Skipping for Autoregressive Range Density Estimation ICML 2020
FetchSGD: Communication-Efficient Federated Learning with Sketching ICML 2020
Ansor: Generating High-Performance Tensor Programs for Deep Learning OSDI 2020
DORY: An Encrypted Search System with Distributed Trust OSDI 2020
RackSched: A Microsecond-Scale Scheduler for Rack-Scale Computers OSDI 2020
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules ICML 2019
Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks NSDI 2019
Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure NSDI 2019
Communication-efficient Distributed SGD with Sketching NIPS 2019
NetChain: Scale-Free Sub-RTT Coordination NSDI 2018
RLlib: Abstractions for Distributed Reinforcement Learning ICML 2018
Parametrized Hierarchical Procedures for Neural Programming ICLR 2018
Ray: A Distributed Framework for Emerging AI Applications OSDI 2018
ASAP: Fast, Approximate Graph Pattern Mining at Scale OSDI 2018
Clipper: A Low-Latency Online Prediction Serving System NSDI 2017
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations CORL 2017
Opaque: An Oblivious and Encrypted Distributed Analytics Platform NSDI 2017
Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics NSDI 2016
EC-Cache: Load-Balanced, Low-Latency Cluster Caching with Online Erasure Coding OSDI 2016
HUG: Multi-Resource Fairness for Correlated and Elastic Demands NSDI 2016
FairRide: Near-Optimal, Fair Cache Sharing NSDI 2016
CFA: A Practical Prediction System for Video QoE Optimization NSDI 2016
BlowFish: Dynamic Storage-Performance Tradeoff in Data Stores NSDI 2016
CellIQ: Real-Time Cellular Network Analytics at Scale NSDI 2015
Succinct: Enabling Queries on Compressed Data NSDI 2015
C3: Internet-Scale Control Plane for Video Quality Optimization NSDI 2015
The Power of Choice in Data-Aware Cluster Scheduling OSDI 2014
GRASS: Trimming Stragglers in Approximation Analytics NSDI 2014
GraphX: Graph Processing in a Distributed Dataflow Framework OSDI 2014
Effective Straggler Mitigation: Attack of the Clones NSDI 2013
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing NSDI 2012
PACMan: Coordinated Memory Caching for Parallel Jobs NSDI 2012
Reoptimizing Data Parallel Computing NSDI 2012