Chengruidong Zhang
7 papers · 2024–2025 · 6 top CS/AI conferences
Achievements
Conference Polyglot (6)
Interdisciplinary Bridge
Taxonomy Completionist (12)
Keyword Pioneer
Cross-Pollinator (15)
Conferences
ICML (2)
EMNLP (1)
ICLR (1)
NeurIPS (1)
NSDI (1)
OSDI (1)
Keywords
inference acceleration (2)
large language model (2)
neural architecture search (1)
efficient computing (1)
variational autoencoder (1)
memory optimization (1)
inference optimization (1)
sparse attention (1)
kv cache (1)
request orchestration (1)
channel pruning (1)
decoding efficiency (1)
end-to-end optimization (1)
dynamic sparsity (1)
efficient decoding (1)
latency prediction (1)
hardware-aware optimization (1)
semantic variable (1)
llm application (1)
data flow analysis (1)
Papers
MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention
ICML 2025
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
ICLR 2025
LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
EMNLP 2025
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
OSDI 2024
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
ICML 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
NeurIPS 2024
LitePred: Transferable and Scalable Latency Prediction for Hardware-Aware Neural Architecture Search
NSDI 2024