Amir Gholami

27 papers · 2018–2025 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🧭 Keyword Pioneer 🐝 Cross-Pollinator (11) 🗺️ Taxonomy Completionist (39) 🤝 Dynamic Duo (26) 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🧬 Topic Evolution 📈 Trend Setter ⚡ Prolific Year (5) 💎 Century Club (27) 🔥 Unstoppable (8) 🗃️ Keyword Collector (117)

Conferences

NIPS (10) ICML (7) AAAI (3) ACL (2) CVPR (2) EMNLP (1) ICCV (1) WACV (1)

Top co-authors

Kurt Keutzer (26) Michael W. Mahoney (20) Sehoon Kim (13) Zhewei Yao (13) Zhen Dong (7) Nicholas Lee (5) Michael Mahoney (5) Sheng Shen (5) Coleman Richard Charles Hooper (4) Suhong Moon (4)

Keywords

model compression (8) neural network optimization (6) inference efficiency (4) mixed-precision quantization (4) model quantization (4) large language model (3) neural network quantization (3) scientific machine learning (2) hessian spectrum (2) batch normalization (2) hessian analysis (2) transformer architecture (2) neural network (2) knowledge distillation (2) data augmentation (2) partial differential equation (2) large batch training (2) object detection (1) curriculum learning (1) k-means clustering (1)

Papers

Squeezed Attention: Accelerating Long Context Length LLM Inference ACL 2025
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache ICML 2025
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks ICML 2025
SqueezeLLM: Dense-and-Sparse Quantization ICML 2024
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization NIPS 2024
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ACL 2024
An LLM Compiler for Parallel Function Calling ICML 2024
TinyAgent: Function Calling at the Edge EMNLP 2024
Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior NIPS 2023
Speculative Decoding with Big Little Decoder NIPS 2023
Hessian-Aware Pruning and Optimal Neural Implant WACV 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition NIPS 2022
A Fast Post-Training Pruning Framework for Transformers NIPS 2022
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning AAAI 2021
I-BERT: Integer-only BERT Quantization ICML 2021
HAWQ-V3: Dyadic Neural Network Quantization ICML 2021
Characterizing possible failure modes in physics-informed neural networks NIPS 2021
Inefficiency of K-FAC for Large Batch Size Training AAAI 2020
Boundary thickness and robustness in learning models NIPS 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks NIPS 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT AAAI 2020
ZeroQ: A Novel Zero Shot Quantization Framework CVPR 2020
PowerNorm: Rethinking Batch Normalization in Transformers ICML 2020
HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision ICCV 2019
Trust Region Based Adversarial Attack on Neural Networks CVPR 2019
ANODEV2: A Coupled Neural ODE Framework NIPS 2019
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries NIPS 2018