conftrace_

model compression

3302 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3725)

large language model (13587)

neural network (6616)

efficient computing (781)

neural network optimization (1293)

transfer learning (5449)

convolutional neural network (4226)

neural network pruning (265)

language model (4599)

parameter efficiency (417)

Papers

UltraSparseBERT: 99% Conditionally Sparse Language Modelling ACL 2024

Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ACL 2024

Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding ACL 2024

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model NIPS 2024

Even Sparser Graph Transformers NIPS 2024

Reasons and Solutions for the Decline in Model Performance after Editing NIPS 2024

VkD: Improving Knowledge Distillation using Orthogonal Projections CVPR 2024

Asymmetric Masked Distillation for Pre-Training Small Foundation Models CVPR 2024

Towards Accurate Post-training Quantization for Diffusion Models CVPR 2024

Instance-Aware Group Quantization for Vision Transformers CVPR 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding ACL 2024

DB-LLM: Accurate Dual-Binarization for Efficient LLMs ACL 2024

ResLoRA: Identity Residual Mapping in Low-Rank Adaption ACL 2024

AS-ES Learning: Towards efficient CoT learning in small models ACL 2024

Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning CVPR 2024

A Comprehensive Evaluation of Quantization Strategies for Large Language Models ACL 2024

Differentially Private Knowledge Distillation via Synthetic Text Generation ACL 2024

Sparsity-Accelerated Training for Large Language Models ACL 2024

Compact Speech Translation Models via Discrete Speech Units Pretraining ACL 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything CVPR 2024

SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design CVPR 2024

Spear: Evaluate the Adversarial Robustness of Compressed Neural Models IJCAI 2024

Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation AAAI 2024

Find the Lady: Permutation and Re-synchronization of Deep Neural Networks AAAI 2024

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference AAAI 2024