conftrace_

knowledge distillation

3725 papers

Also known as

KD

Co-occurring keywords

model compression (3302), large language model (13587), transfer learning (5449), domain adaptation (4595), representation learning (6206), neural network (6616), language model (4599), catastrophic forgetting (958), continual learning (1181), contrastive learning (4032)

Papers

BabyLlama-2: Ensemble-Distilled Models Consistently Outperform Teachers With Limited Data CONLL 2024

RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion COLING 2024

All Rivers Run to the Sea: Private Learning with Asymmetric Flows CVPR 2024

Pruning before Fine-tuning: A Retraining-free Compression Framework for Pre-trained Language Models COLING 2024

Probe Then Retrieve and Reason: Distilling Probing and Reasoning Capabilities into Smaller Language Models COLING 2024

PIRB: A Comprehensive Benchmark of Polish Dense and Hybrid Text Retrieval Methods COLING 2024

MoDE-CoTD: Chain-of-Thought Distillation for Complex Reasoning Tasks with Mixture of Decoupled LoRA-Experts COLING 2024

UniPTS: A Unified Framework for Proficient Post-Training Sparsity CVPR 2024

Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation COLING 2024

Self-Supervised Quantization-Aware Knowledge Distillation AISTATS 2024

A Dynamic GCN with Cross-Representation Distillation for Event-Based Learning AAAI 2024

CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning ACL 2024

Knowledge Distillation for Tiny Speech Enhancement with Latent Feature Augmentation INTERSPEECH 2024

RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention INTERSPEECH 2024

Neural Machine Translation between Low-Resource Languages with Synthetic Pivoting COLING 2024

Accurate Knowledge Distillation via n-best Reranking NAACL 2024

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning NAACL 2024

TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale NAACL 2024

CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants NAACL 2024

Mind’s Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models NAACL 2024

A Lightweight Mixture-of-Experts Neural Machine Translation Model with Stage-wise Training Strategy NAACL 2024

Towards an On-device Agent for Text Rewriting NAACL 2024

UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle NAACL 2024

DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation NAACL 2024

A Novel Two-step Fine-tuning Framework for Transfer Learning in Low-Resource Neural Machine Translation NAACL 2024