conftrace_

Zhewei Yao

35 papers · 2018–2026 · 11 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🏃 Academic Marathon (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (50) 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion (2) 🀝 Dynamic Duo (16) 👑 Triple Crown 🔬 Deep Specialist (12) 🗃️ Keyword Collector (138) ❓ The Questioner (2) ⚑ Prolific Year (7) 💎 Century Club (33) 🔥 Unstoppable (8) 📈 Trend Setter

Conferences

NIPS (7) ICML (6) AAAI (5) EMNLP (4) ACL (3) ICLR (3) CVPR (2) NAACL (2) EACL (1) ICCV (1) WACV (1)

Papers

TAGQuant: Token-Aware Clustering for Group-Wise Quantization · EACL 2026
GRAD: Generalizing RAG Adaptation with Decoding · ACL 2026
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation · EMNLP 2025
Inference Scaling for Bridging Retrieval and Augmented Generation · NAACL 2025
CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation · NAACL 2025
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning · ACL 2025
Optimizing Reasoning for Text-to-SQL with Execution Feedback · ACL 2025
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation · AAAI 2024
ZeRO++: Extremely Efficient Collective Communication for Large Model Training · ICLR 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding · NIPS 2024
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing · AAAI 2024
Scaling Vision-Language Models with Sparse Mixture of Experts · EMNLP 2023
DySR: Adaptive Super-Resolution via Algorithm and System Co-design · ICLR 2023
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases · ICML 2023
How Much Can CLIP Benefit Vision-and-Language Tasks? · ICLR 2022
Hessian-Aware Pruning and Optimal Neural Implant · WACV 2022
XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient · NIPS 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers · NIPS 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale · ICML 2022
What’s Hidden in a One-layer Randomly Weighted Transformer? · EMNLP 2021
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning · AAAI 2021
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training · ICML 2021
I-BERT: Integer-only BERT Quantization · ICML 2021
HAWQ-V3: Dyadic Neural Network Quantization · ICML 2021
PowerNorm: Rethinking Batch Normalization in Transformers · ICML 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks · NIPS 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks · NIPS 2020
ZeroQ: A Novel Zero Shot Quantization Framework · CVPR 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT · AAAI 2020
Inefficiency of K-FAC for Large Batch Size Training · AAAI 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding · EMNLP 2020
ANODEV2: A Coupled Neural ODE Framework · NIPS 2019
Trust Region Based Adversarial Attack on Neural Networks · CVPR 2019
HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision · ICCV 2019
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries · NIPS 2018