conftrace_

model compression

3302 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3725); large language model (13587); neural network (6616); efficient computing (781); neural network optimization (1293); transfer learning (5449); convolutional neural network (4226); neural network pruning (265); language model (4599); parameter efficiency (417)

Papers

Compact Speech Translation Models via Discrete Speech Units Pretraining ACL 2024

IT-Tuning: Parameter Efficient Information Token Tuning for Language Model ACL 2024

MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech ACL 2024

Layer-Adaptive State Pruning for Deep State Space Models NIPS 2024

Progressively Knowledge Distillation via Re-parameterizing Diffusion Reverse Process AAAI 2024

Revisiting Knowledge Distillation for Autoregressive Language Models ACL 2024

LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models ACL 2024

Global-Pruner: A Stable and Efficient Pruner for Retraining-Free Pruning of Encoder-Based Language Models CONLL 2024

Optimal and Approximate Adaptive Stochastic Quantization NIPS 2024

All Rivers Run to the Sea: Private Learning with Asymmetric Flows CVPR 2024

Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression ACL 2024

Some Like It Small: Czech Semantic Embedding Models for Industry Applications AAAI 2024

Federated Learning via Input-Output Collaborative Distillation AAAI 2024

Small Scale Data-Free Knowledge Distillation CVPR 2024

A Survey on Efficient Federated Learning Methods for Foundation Model Training IJCAI 2024

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning NIPS 2024

Minimal Distillation Schedule for Extreme Language Model Compression EACL 2024

BabyLM Challenge: Experimenting with Self-Distillation and Reverse-Distillation for Language Model Pre-Training on Constrained Datasets CONLL 2024

Structured Optimal Brain Pruning for Large Language Models EMNLP 2024

Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation EMNLP 2024

DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs NIPS 2024

TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge NIPS 2024

SpaFL: Communication-Efficient Federated Learning With Sparse Models And Low Computational Overhead NIPS 2024

Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner NIPS 2024

UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond NIPS 2024