Zhihao Jia

21 papers · 2012–2025 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (13) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐣 Hot Topic Early Bird 🧬 Topic Evolution 🗃️ Keyword Collector (80) 🚀 Conference Pioneer 💎 Century Club (21) 🔥 Unstoppable (5) 📈 Trend Setter ⚡ Prolific Year (7)

Conferences

OSDI (6) NIPS (4) ICLR (3) NSDI (3) ICML (2) ACL (1) EMNLP (1) IJCAI (1)

Top co-authors

Xupeng Miao (5) Zhihao Zhang (5) Beidi Chen (3) Zhuoming Chen (3) Jidong Zhai (2) Ruslan Svirschevski (2) Guoqing Harry Xu (2) Shizhi Tang (2) Zixuan Ma (2) Max Ryabinin (2)

Keywords

distributed training (6) neural network optimization (3) deep neural network (2) preemptible instance (2) tensor program optimization (2) speculative decoding (2) graph neural network (2) token generation (2) large language model (2) model parallelism (2) communication complexity (1) outlier detection (1) parallel computing (1) dynamic programming (1) model serving (1) decision making (1) adversarial learning (1) kernel optimization (1) anomaly detection (1) graph optimization (1)

Papers

DDO: Dual-Decision Optimization for LLM-Based Medical Consultation via Multi-Agent Collaboration (EMNLP 2025)
Mirage: A Multi-Level Superoptimizer for Tensor Programs (OSDI 2025)
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention (ICLR 2025)
MagicPIG: LSH Sampling for Efficient LLM Generation (ICLR 2025)
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models (ACL 2024)
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation (ICML 2024)
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices (NIPS 2024)
Communication Bounds for the Distributed Experts Problem (NIPS 2024)
Sequoia: Scalable and Robust Speculative Decoding (NIPS 2024)
X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling (IJCAI 2024)
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances (NSDI 2024)
Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs (NSDI 2023)
EINNET: Optimizing Tensor Programs with Derivation-Based Transformations (OSDI 2023)
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs (NSDI 2023)
BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs (NIPS 2022)
GradSign: Model Performance Inference with Theoretical Insights (ICLR 2022)
Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization (OSDI 2022)
Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads (OSDI 2021)
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections (OSDI 2021)
Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks (ICML 2018)
Improving Integer Security for Systems with KINT (OSDI 2012)