Xiaoxia Wu
14 papers · 2019–2024 · 6 conferences · across top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (6)
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🏃 Academic Marathon (5)
🐝 Cross-Pollinator (12)
🏆 Grand Slam
🗃️ Keyword Collector (51)
💎 Century Club (14)
🔥 Unstoppable (6)
❓ The Questioner
Conferences
NIPS (4)
AAAI (3)
AISTATS (2)
ICLR (2)
ICML (2)
JMLR (1)
Top co-authors
Keywords
model compression (4)
linear convergence (3)
weight quantization (3)
stochastic gradient descent (3)
post-training quantization (2)
deep learning (2)
nonconvex optimization (2)
transformer model (2)
adaptive gradient (2)
knowledge distillation (2)
gradient descent (2)
convergence rate (2)
large language model (2)
efficient computing (1)
model pretraining (1)
convergence analysis (1)
efficient training (1)
adaptive learning rate (1)
attention mechanism (1)
outlier robustness (1)
Papers
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
AAAI 2024
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
AAAI 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
NIPS 2024
ZeRO++: Extremely Efficient Collective Communication for Large Model Training
ICLR 2024
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
ICML 2023
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
NIPS 2022
AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method
AAAI 2022
XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient
NIPS 2022
When Do Curricula Work?
ICLR 2021
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
JMLR 2020
Linear Convergence of Adaptive Stochastic Gradient Descent
AISTATS 2020
Choosing the Sample with Lowest Loss makes SGD Robust
AISTATS 2020
Implicit Regularization and Convergence for Weight Normalization
NIPS 2020
AdaGrad Stepsizes: Sharp Convergence Over Nonconvex Landscapes
ICML 2019