Efficient Computing

1253 directly classified papers

Papers

MOSEL: Inference Serving Using Dynamic Modality Selection EMNLP 2024

InfiniPot: Infinite Context Processing on Memory-Constrained LLMs EMNLP 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules EMNLP 2024

Model Adaptation for Time Constrained Embodied Control CVPR 2024

Exploring Token Pruning in Vision State Space Models NeurIPS 2024

Order of Magnitude Speedups for LLM Membership Inference EMNLP 2024

Optimized Speculative Sampling for GPU Hardware Accelerators EMNLP 2024

Position Engineering: Boosting Large Language Models through Positional Information Manipulation EMNLP 2024

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models CVPR 2024

SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design CVPR 2024

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters EMNLP 2024

TroL: Traversal of Layers for Large Language and Vision Models EMNLP 2024

Cache Me if You Can: Accelerating Diffusion Models through Block Caching CVPR 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding ACL 2024

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs ACL 2024

Harder Task Needs More Experts: Dynamic Routing in MoE Models ACL 2024

LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels CVPR 2024

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces CVPR 2024

Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers CVPR 2024

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling ACL 2024

LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory IJCAI 2024

Speculative Contrastive Decoding ACL 2024

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models ACL 2024

CodeM: Less Data Yields More Versatility via Ability Matrix ACL 2024

Diffusion Models Without Attention CVPR 2024