Haoli Bai

21 papers · 2016–2026 · 11 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (9) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (11) 🐝 Cross-Pollinator (13)

🐣 Hot Topic Early Bird 🤝 Dynamic Duo (13) 🏆 Grand Slam 👥 Mega-Team (30) 🔥 Unstoppable (6) 📈 Trend Setter 💎 Century Club (20) 🗃️ Keyword Collector (105) ⚡ Prolific Year (5)

Conferences

ACL (5) AAAI (3) CVPR (2) EMNLP (2) IJCAI (2) NIPS (2) ACML (1) ICLR (1) ICML (1) IJCNLP (1) NAACL (1)

Top co-authors

Lu Hou (13) Xin Jiang (7) Qun Liu (7) Irwin King (5) Lifeng Shang (4) Michael Lyu (4) Haokun Lin (3) Wei Zhang (3) Xianzhi Yu (3) Jiaxiang Wu (3)

Keywords

model compression (10) large language model (5) vision-language model (3) knowledge distillation (3) efficient inference (2) neural architecture search (2) inference efficiency (2) post-training quantization (2) visual document understanding (2) structured pruning (2) model quantization (2) neural network optimization (2) few-shot learning (2) multimodal learning (2) latency optimization (2) weight binarization (2) speech processing (1) speech synthesis (1) bert compression (1) link prediction (1)

Papers

Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats · ACL 2026
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions · CVPR 2025
TreeKV: Smooth Key-Value Cache Compression with Tree Structures · IJCAI 2025
FlatQuant: Flatness Matters for LLM Quantization · ICML 2025
Faster and Better LLMs via Latency-Aware Test-Time Scaling · EMNLP 2025
Efficient Inference for Large Language Models – Algorithm, Model, and System · EMNLP 2025
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric · CVPR 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact · ACL 2024
Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models · ICLR 2024
Visually Guided Generative Text-Layout Pre-training for Document Intelligence · NAACL 2024
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding · ACL 2023
Structured Pruning for Efficient Generative Pre-trained Language Models · ACL 2023
Towards Efficient Post-training Quantization of Pre-trained Language Models · NIPS 2022
BinaryBERT: Pushing the Limit of BERT Quantization · ACL 2021
BinaryBERT: Pushing the Limit of BERT Quantization · IJCNLP 2021
Revisiting Parameter Sharing for Automatic Neural Channel Number Search · NIPS 2020
M-NAS: Meta Neural Architecture Search · AAAI 2020
RTN: Reparameterized Ternary Network · AAAI 2020
Few Shot Network Compression via Cross Distillation · AAAI 2020
Structured Inference for Recurrent Hidden Semi-Markov Model · IJCAI 2018
Hierarchical Probabilistic Matrix Factorization with Network Topology for Multi-relational Social Network · ACML 2016