Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks
NIPS 2024
MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
NIPS 2024
Don't Look Twice: Faster Video Transformers with Run-Length Tokenization
NIPS 2024
Protecting Your LLMs with Information Bottleneck
NIPS 2024
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
AAAI 2024
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
AAAI 2024
MERGE: Fast Private Text Generation
AAAI 2024
Fairness-Aware Structured Pruning in Transformers
AAAI 2024
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
ACL 2024
CeeBERT: Cross-Domain Inference in Early Exit BERT
ACL 2024
Papilusion at DAGPap24: Paper or Illusion? Detecting AI-generated Scientific Papers
ACL 2024
Token Alignment via Character Matching for Subword Completion
ACL 2024
AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
ACL 2024
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
EMNLP 2024
RETAIN: Interactive Tool for Regression Testing Guided LLM Migration
EMNLP 2024
ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency
EMNLP 2024
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models
EMNLP 2024
Adaptive Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization
EMNLP 2024
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
EMNLP 2024
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning
EMNLP 2024
ATQ: Activation Transformation for Weight-Activation Quantization of Large Language Models
EMNLP 2024
Stochastic Fine-Tuning of Language Models Using Masked Gradients
EMNLP 2024
Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
EMNLP 2024
Global-Pruner: A Stable and Efficient Pruner for Retraining-Free Pruning of Encoder-Based Language Models
EMNLP 2024
LinChance-NTU for Unconstrained WMT2024 Literary Translation
EMNLP 2024