Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
Blind-Touch: Homomorphic Encryption-Based Distributed Neural Network Inference for Privacy-Preserving Fingerprint Authentication
AAAI 2024
Revisiting the Information Capacity of Neural Network Watermarks: Upper Bound Estimation and Beyond
AAAI 2024
ShareBERT: Embeddings Are Capable of Learning Hidden Layers
AAAI 2024
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
AAAI 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
NIPS 2024
Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
NIPS 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
COLING 2024
Pruning before Fine-tuning: A Retraining-free Compression Framework for Pre-trained Language Models
COLING 2024
Probe Then Retrieve and Reason: Distilling Probing and Reasoning Capabilities into Smaller Language Models
COLING 2024
Expanding Sparse Tuning for Low Memory Usage
NIPS 2024
Multilingual Brain Surgeon: Large Language Models Can Be Compressed Leaving No Language behind
COLING 2024
LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models
COLING 2024
Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding
ACL 2024
Anchor-based Large Language Models
ACL 2024
SeTAR: Out-of-Distribution Detection with Selective Low-Rank Approximation
NIPS 2024
Fast Randomized Low-Rank Adaptation of Pre-trained Language Models with PAC Regularization
ACL 2024
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
NIPS 2024
FlattenQuant: Breaking through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
COLING 2024
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
ACL 2024
Sinkhorn Distance Minimization for Knowledge Distillation
COLING 2024
NN-Defined Modulator: Reconfigurable and Portable Software Modulator on IoT Gateways
NSDI 2024
Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study
COLING 2024
BMRS: Bayesian Model Reduction for Structured Pruning
NIPS 2024
ELAD: Explanation-Guided Large Language Models Active Distillation
ACL 2024
EFTNAS: Searching for Efficient Language Models in First-Order Weight-Reordered Super-Networks
COLING 2024