conftrace_

Qingan Li

3 papers · 2024–2025 · 2 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (2) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐝 Cross-Pollinator (15)

Conferences

EMNLP (2) · ACL (1)

Top co-authors

Chun Jason Xue (3) · Shangyu Wu (3) · Junhui He (3) · Nan Wang (1) · Peng Zhou (1) · Junna Xing (1) · Weidong Wen (1) · Rui Xu (1) · Chun Hu (1) · Yuxin He (1)

Keywords

model compression (3) · post-training quantization (1) · efficient inference (1) · vector quantization (1) · model inference (1) · feed-forward network (1) · inference optimization (1) · inference efficiency (1) · weight quantization (1) · kv cache (1) · long context (1) · bit-width allocation (1) · small language model (1) · activation sparsification (1) · channel-wise thresholding (1) · selective sparsification (1)

Papers

A2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization · ACL 2025

MLWQ: Efficient Small Language Model Deployment via Multi-Level Weight Quantization · EMNLP 2025

CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification · EMNLP 2024