conftrace_

model compression

3302 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3725), large language model (13587), neural network (6616), efficient computing (781), neural network optimization (1293), transfer learning (5449), convolutional neural network (4226), neural network pruning (265), language model (4599), parameter efficiency (417)

Papers

Wasserstein Contrastive Representation Distillation CVPR 2021

ProFormer: Towards On-Device LSH Projection Based Transformers EACL 2021

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets IJCNLP 2021

Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework ICML 2021

Accurate Post Training Quantization With Small Calibration Sets ICML 2021

A Unified Lottery Ticket Hypothesis for Graph Neural Networks ICML 2021

BinaryBERT: Pushing the Limit of BERT Quantization IJCNLP 2021

LeeBERT: Learned Early Exit for BERT with cross-level optimization IJCNLP 2021

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization IJCNLP 2021

DPOQ: Dynamic Precision Onion Quantization ACML 2021

RGPNet: A Real-Time General Purpose Semantic Segmentation WACV 2021

HAWQ-V3: Dyadic Neural Network Quantization ICML 2021

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators IJCNLP 2021

Multi-stage Pre-training over Simplified Multimodal Pre-training Models IJCNLP 2021

BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer NeurIPS 2021

Weight Distillation: Transferring the Knowledge in Neural Network Parameters IJCNLP 2021

Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation IJCNLP 2021

Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor IJCNLP 2021

Accelerating BERT Inference for Sequence Labeling via Early-Exit ACL 2021

Making DensePose Fast and Light WACV 2021

Efficient Inference for Multilingual Neural Machine Translation EMNLP 2021

AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition ICCV 2021

Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher ICCV 2021

Universal-KD: Attention-based Output-Grounded Intermediate Layer Knowledge Distillation EMNLP 2021

Does Knowledge Distillation Really Work? NeurIPS 2021