Beidi Chen
48 papers · 2019–2026 · 5 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (16)
Interdisciplinary Bridge
Conference Polyglot (5)
Cross-Pollinator (12)
Deep Specialist (11)
Keyword Champion (4)
Dynamic Duo (12)
Triple Crown
Keyword Collector (135)
The Questioner
Prolific Year (20)
Century Club (46)
Unstoppable (7)
Conferences
NIPS (19)
ICML (15)
ICLR (10)
ACL (3)
COLT (1)
Research topics
Keywords
large language model (12)
model compression (9)
inference efficiency (5)
speculative decoding (4)
inference optimization (3)
language model (3)
distributed learning (3)
attention mechanism (3)
stochastic gradient descent (3)
inference acceleration (2)
memory optimization (2)
foundation model (2)
batch processing (2)
contextual sparsity (2)
structured sparsity (2)
knowledge distillation (2)
parameter efficient (2)
computational efficiency (2)
locality sensitive hashing (2)
communication compression (2)
Papers
When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?
ACL 2026
MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution
ACL 2026
GSM-$\infty$: How Do your LLMs Behave over Infinitely Increasing Reasoning Complexity and Context Length?
ICML 2025
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
ICML 2025
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
ICML 2025
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
ICLR 2025
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
ICLR 2025
Memory Mosaics
ICLR 2025
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
ICLR 2025
MagicPIG: LSH Sampling for Efficient LLM Generation
ICLR 2025
On the Surprising Effectiveness of Attention Transfer for Vision Transformers
NIPS 2024
Sequoia: Scalable and Robust Speculative Decoding
NIPS 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
ACL 2024
Soft Prompt Recovers Compressed LLMs, Transferably
ICML 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
NIPS 2024
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
NIPS 2024
SIRIUS: Contextual Sparsity with Correction for Efficient LLMs
NIPS 2024
S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
NIPS 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
NIPS 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
NIPS 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
NIPS 2024
Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training
NIPS 2024
Learn To be Efficient: Build Structured Sparsity in Large Language Models
NIPS 2024
JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
ICLR 2024
Efficient Streaming Language Models with Attention Sinks
ICLR 2024
LoCoCo: Dropping In Convolutions for Long Context Compression
ICML 2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
ICML 2024
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
ICML 2024
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
ICML 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
ICML 2024
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
NIPS 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
ICML 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
ICML 2023
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
ICML 2023
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions
NIPS 2023
Fast Algorithms for a New Relaxation of Optimal Transport
COLT 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
NIPS 2023
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
ICLR 2022
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
ICML 2022
Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees
NIPS 2022
Decentralized Training of Foundation Models in Heterogeneous Environments
NIPS 2022
SOLAR: Sparse Orthogonal Learned and Random Embeddings
ICLR 2021
Locality Sensitive Teaching
NIPS 2021
A Tale of Two Efficient and Informative Negative Sampling Distributions
ICML 2021
Scatterbrain: Unifying Sparse and Low-rank Attention
NIPS 2021
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training
ICLR 2021
Angular Visual Hardness
ICML 2020
Fast and Accurate Stochastic Gradient Estimation
NIPS 2019