Model Compression

1928 directly classified papers

Papers

COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models EMNLP 2022

Extracted BERT Model Leaks More Information than You Think! EMNLP 2022

Multicoated Supermasks Enhance Hidden Networks ICML 2022

MIA-Former: Efficient and Robust Vision Transformers via Multi-Grained Input-Adaptation AAAI 2022

To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding IJCAI 2022

FedCG: Leverage Conditional GAN for Protecting Privacy and Maintaining Competitive Performance in Federated Learning IJCAI 2022

GPT3.int8(): 8-bit Matrix Multiplication for Transformers at Scale NIPS 2022

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption ACL 2022

Plug and Play Knowledge Distillation for kNN-LM with External Logits AACL 2022

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models ACL 2022

Symbolic Distillation for Learned TCP Congestion Control NIPS 2022

Provable Defense against Backdoor Policies in Reinforcement Learning NIPS 2022

ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models JMLR 2022

DreamShard: Generalizable Embedding Table Placement for Recommender Systems NIPS 2022

BATUDE: Budget-Aware Neural Network Compression Based on Tucker Decomposition AAAI 2022

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm ACL 2022

Isolation Mechanisms for High-Speed Packet-Processing Pipelines NSDI 2022

Redistribution of Weights and Activations for AdderNet Quantization NIPS 2022

Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices INTERSPEECH 2022

Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders EMNLP 2022

Fast Vocabulary Transfer for Language Model Compression EMNLP 2022

Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models EMNLP 2022

Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models EMNLP 2022

Efficient Two-Stage Progressive Quantization of BERT EMNLP 2022

HW-TSC’s Submission for the WMT22 Efficiency Task EMNLP 2022