Research Explorer
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
Torque Based Structured Pruning for Deep Neural Network
WACV 2024
Fast Randomized Low-Rank Adaptation of Pre-trained Language Models with PAC Regularization
ACL 2024
Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection
CVPR 2024
SnapKV: LLM Knows What You are Looking for Before Generation
NIPS 2024
LLM can Achieve Self-Regulation via Hyperparameter Aware Generation
ACL 2024
PartialFormer: Modeling Part Instead of Whole for Machine Translation
ACL 2024
LM-Cocktail: Resilient Tuning of Language Models via Model Merging
ACL 2024
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
NIPS 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
ACL 2024
D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models
NIPS 2024
Wino Vidi Vici: Conquering Numerical Instability of 8-Bit Winograd Convolution for Accurate Inference Acceleration on Edge
WACV 2024
LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
ACL 2024
Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks
WACV 2024
Reasons and Solutions for the Decline in Model Performance after Editing
NIPS 2024
ResLoRA: Identity Residual Mapping in Low-Rank Adaption
ACL 2024
DB-LLM: Accurate Dual-Binarization for Efficient LLMs
ACL 2024
BASS: Batched Attention-optimized Speculative Sampling
ACL 2024
A Brain-Inspired Way of Reducing the Network Complexity via Concept-Regularized Coding for Emotion Recognition
AAAI 2024
LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild
ACL 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
ACL 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
ACL 2024
LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks
ACL 2024
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
ACL 2024
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
NIPS 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
ACL 2024