Efficient Computing

1253 directly classified papers

Papers

NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models ACL 2024

SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding EMNLP 2024

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference ACL 2024

Encoding Spreadsheets for Large Language Models EMNLP 2024

Anchor-based Large Language Models ACL 2024

Extending Context Window of Large Language Models via Semantic Compression ACL 2024

Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding ACL 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules EMNLP 2024

Query-OPT: Optimizing Inference of Large Language Models via Multi-Query Instructions in Meeting Summarization EMNLP 2024

On the token distance modeling ability of higher RoPE attention dimension EMNLP 2024

Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation EMNLP 2024

Scaling Laws for Linear Complexity Language Models EMNLP 2024

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts ACL 2024

LLM Performance Predictors are good initializers for Architecture Search ACL 2024

Accelerating Multilingual Language Model for Excessively Tokenized Languages ACL 2024

InfiniPot: Infinite Context Processing on Memory-Constrained LLMs EMNLP 2024

FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping EMNLP 2024

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention EMNLP 2024

Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression ACL 2024

Joint Pre-Encoding Representation and Structure Embedding for Efficient and Low-Resource Knowledge Graph Completion EMNLP 2024

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning EMNLP 2024

Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training EMNLP 2024

TroL: Traversal of Layers for Large Language and Vision Models EMNLP 2024

Turn Waste into Worth: Rectifying Top-k Router of MoE EMNLP 2024

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters EMNLP 2024