Neural Network Optimization

3,648 papers

Papers

Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes EMNLP 2024

QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning EMNLP 2024

Make Large Language Model a Better Ranker EMNLP 2024

Tokenization Falling Short: On Subword Robustness in Large Language Models EMNLP 2024

Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models EMNLP 2024

Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology EMNLP 2024

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity EMNLP 2024

In-Context Former: Lightning-fast Compressing Context for Large Language Model EMNLP 2024

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models EMNLP 2024

Tending Towards Stability: Convergence Challenges in Small Language Models EMNLP 2024

Revisiting Catastrophic Forgetting in Large Language Model Tuning EMNLP 2024

Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction EMNLP 2024

On the token distance modeling ability of higher RoPE attention dimension EMNLP 2024

Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection EMNLP 2024

SCA: Selective Compression Attention for Efficiently Extending the Context Window of Large Language Models EMNLP 2024

DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs EMNLP 2024

LongHeads: Multi-Head Attention is Secretly a Long Context Processor EMNLP 2024

LPZero: Language Model Zero-cost Proxy Search from Zero EMNLP 2024

Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs EMNLP 2024

Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation EMNLP 2024

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs EMNLP 2024

FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models EMNLP 2024

Auto-Evolve: Enhancing Large Language Model’s Performance via Self-Reasoning Framework EMNLP 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization EMNLP 2024

Hop, skip, jump to Convergence: Dynamics of Learning Rate Transitions for Improved Training of Large Language Models EMNLP 2024