conftrace_

Chengruidong Zhang

7 papers · 2024–2025 · 6 top CS/AI conferences

Achievements

🌍 Conference Polyglot (6)
🌉 Interdisciplinary Bridge
🗺️ Taxonomy Completionist (12)
🧭 Keyword Pioneer
🐝 Cross-Pollinator (15)

Conferences

ICML (2) EMNLP (1) ICLR (1) NIPS (1) NSDI (1) OSDI (1)

Top co-authors

Yuqing Yang (5) Lili Qiu (5) Huiqiang Jiang (4) Yucheng Li (3) Amir H. Abdi (3) Dongsheng Li (3) Qianhui Wu (3) Xufang Luo (3) Surin Ahn (3) Jianfeng Gao (2)

Keywords

inference acceleration (2) · large language model (2) · neural architecture search (1) · efficient computing (1) · variational autoencoder (1) · memory optimization (1) · inference optimization (1) · sparse attention (1) · kv cache (1) · request orchestration (1) · channel pruning (1) · decoding efficiency (1) · end-to-end optimization (1) · dynamic sparsity (1) · efficient decoding (1) · latency prediction (1) · hardware-aware optimization (1) · semantic variable (1) · llm application (1) · data flow analysis (1)

Papers

MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention · ICML 2025
SCBench: A KV Cache-Centric Analysis of Long-Context Methods · ICLR 2025
LeanK: Learnable K Cache Channel Pruning for Efficient Decoding · EMNLP 2025
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable · OSDI 2024
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens · ICML 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention · NIPS 2024
LitePred: Transferable and Scalable Latency Prediction for Hardware-Aware Neural Architecture Search · NSDI 2024