Yuxiong He
26 papers · 2018–2026 · 8 top CS/AI conferences
Achievements
Conference Polyglot (7)
Academic Marathon (7)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Cross-Pollinator (8)
Taxonomy Completionist (42)
Dynamic Duo (14)
Grand Slam
Topic Evolution
Century Club (24)
Conference Pioneer
Keyword Collector (92)
Prolific Year (5)
Unstoppable (6)
Conferences
NIPS (7)
ICLR (5)
AAAI (3)
ACL (3)
ICML (3)
EMNLP (2)
NAACL (2)
EACL (1)
Keywords
model compression (8)
knowledge distillation (5)
large language model (5)
neural network optimization (3)
inference optimization (3)
efficient computing (3)
mixture of expert (3)
weight quantization (3)
retrieval-augmented generation (3)
batch size (2)
transformer model (2)
efficient training (2)
post-training quantization (2)
sparse model (2)
training efficiency (2)
natural language understanding (1)
model pretraining (1)
domain adaptation (1)
multimodal learning (1)
language modeling (1)
Papers
TAGQuant: Token-Aware Clustering for Group-Wise Quantization
EACL 2026
GRAD: Generalizing RAG Adaptation with Decoding
ACL 2026
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
EMNLP 2025
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
ACL 2025
Optimizing Reasoning for Text-to-SQL with Execution Feedback
ACL 2025
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
ICLR 2025
CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation
NAACL 2025
Inference Scaling for Bridging Retrieval and Augmented Generation
NAACL 2025
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
AAAI 2024
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
AAAI 2024
ZeRO++: Extremely Efficient Collective Communication for Large Model Training
ICLR 2024
Scaling Vision-Language Models with Sparse Mixture of Experts
EMNLP 2023
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
ICML 2023
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
ICLR 2023
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
ICLR 2023
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
NIPS 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
ICML 2022
Adversarial Data Augmentation for Task-Specific Knowledge Distillation of Pre-trained Transformers
AAAI 2022
XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient
NIPS 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
NIPS 2022
NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM
NIPS 2021
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
ICML 2021
SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement
NIPS 2021
Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
NIPS 2020
Learning Intrinsic Sparse Structures within Long Short-Term Memory
ICLR 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
NIPS 2018