Research Explorer
Model Compression
1503 directly classified papers
Papers per year
2006: 2
2010: 2
2011: 1
2013: 5
2014: 3
2015: 4
2016: 3
2017: 14
2018: 36
2019: 55
2020: 117
2021: 171
2022: 172
2023: 175
2024: 331
2025: 402
2026: 10
Papers
A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality
EMNLP 2025
Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study
EMNLP 2025
Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity
EMNLP 2025
FaST: Feature-aware Sampling and Tuning for Personalized Preference Alignment with Limited Data
EMNLP 2025
XQuant: Achieving Ultra-Low Bit KV Cache Quantization with Cross-Layer Compression
EMNLP 2025
Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
EMNLP 2025
Calibrating LLM Confidence by Probing Perturbed Representation Stability
EMNLP 2025
SATER: A Self-Aware and Token-Efficient Approach to Routing and Cascading
EMNLP 2025
IG-Pruning: Input-Guided Block Pruning for Large Language Models
EMNLP 2025
NeuroAda: Activating Each Neuron’s Potential for Parameter-Efficient Fine-Tuning
EMNLP 2025
Gamma-Guard: Lightweight Residual Adapters for Robust Guardrails in Large Language Models
EMNLP 2025
PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
EMNLP 2025
Power doesn’t reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data
EMNLP 2025
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
EMNLP 2025
Controllable Memorization in LLMs via Weight Pruning
EMNLP 2025
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
EMNLP 2025
An Orthogonal High-Rank Adaptation for Large Language Models
EMNLP 2025
MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines
EMNLP 2025
EcoLoRA: Communication-Efficient Federated Fine-Tuning of Large Language Models
EMNLP 2025
GAP: a Global Adaptive Pruning Method for Large Language Models
EMNLP 2025
zFLoRA: Zero-Latency Fused Low-Rank Adapters
EMNLP 2025
Mitigating Catastrophic Forgetting in Large Language Models with Forgetting-aware Pruning
EMNLP 2025
GraphKV: Breaking the Static Selection Paradigm with Graph-Based KV Cache Eviction
EMNLP 2025
Comprehensive and Efficient Distillation for Lightweight Sentiment Analysis Models
EMNLP 2025
EfficientCrackNet: A Lightweight Model for Crack Segmentation
WACV 2025