conftrace_

Pengle Zhang

4 papers · 2023–2025 · 4 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (4) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (15)

Conferences

EMNLP (1) ICLR (1) ICML (1) NIPS (1)

Top co-authors

Jintao Zhang (2) Zhengyan Zhang (2) Yankai Lin (2) Maosong Sun (2) Jun Zhu (2) Zhiyuan Liu (2) Chaojun Xiao (2) Xu Han (2) Jia Wei (2) Jianfei Chen (2)

Keywords

model compression (1) attention mechanism (1) parameter-efficient learning (1) computational efficiency (1) parameter efficient (1) language model (1) inference efficiency (1) parameter efficiency (1) sequence compression (1) context memory (1) large language model (1) neural network (1) plug-and-play module (1) plug-and-play architecture (1) long context extrapolation (1) token relevant unit (1)

Papers

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration · ICLR 2025

SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization · ICML 2025

InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory · NIPS 2024

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules · EMNLP 2023