Pengle Zhang
4 papers · 2023–2025 · 4 top CS/AI conferences
Achievements
🌍 Conference Polyglot (4)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (15)
Conferences
EMNLP (1)
ICLR (1)
ICML (1)
NIPS (1)
Keywords
model compression (1)
attention mechanism (1)
parameter-efficient learning (1)
computational efficiency (1)
parameter efficient (1)
language model (1)
inference efficiency (1)
parameter efficiency (1)
sequence compression (1)
context memory (1)
large language model (1)
neural network (1)
plug-and-play module (1)
plug-and-play architecture (1)
long context extrapolation (1)
token relevant unit (1)
Papers
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
ICLR 2025
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization
ICML 2025
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
NIPS 2024
Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
EMNLP 2023