Efficient Computing

6876 directly classified papers

Papers

CSR: Achieving 1 Bit Key-Value Cache via Sparse Representation AAAI 2025

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model AAAI 2025

Dynamic-Width Speculative Beam Decoding for LLM Inference AAAI 2025

Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference AAAI 2025

Enhancing Large Language Model Performance with Gradient-Based Parameter Selection AAAI 2025

CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models AAAI 2025

EPT: Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion AAAI 2025

ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models COLING 2025

LoSA: Long-Short-Range Adapter for Scaling End-to-End Temporal Action Localization WACV 2025

On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations JMLR 2025

C3oT: Generating Shorter Chain-of-Thought Without Compromising Effectiveness AAAI 2025

Gradient Weight-normalized Low-rank Projection for Efficient LLM Training AAAI 2025

COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism AAAI 2025

Multi-Branch Self-Drafting for LLM Inference Acceleration AAAI 2025

Falcon: Faster and Parallel Inference of Large Language Models Through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree AAAI 2025

Practical Offloading for Fine-Tuning LLM on Commodity GPU via Learned Sparse Projectors AAAI 2025

Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning AAAI 2025

MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices AAAI 2025

ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization AAAI 2025

Channel Merging: Preserving Specialization for Merged Experts AAAI 2025

QiMLP: Quantum-inspired Multilayer Perceptron with Strong Correlation Mining and Parameter Compression AAAI 2025

ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models AAAI 2025

Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models AAAI 2025

ScaleOT: Privacy-utility-scalable Offsite-tuning with Dynamic LayerReplace and Selective Rank Compression AAAI 2025

Pushing the Limits of BFP on Narrow Precision LLM Inference AAAI 2025