Research Explorer
Efficient Computing
596 directly classified papers
Papers per year
2007: 2
2009: 1
2011: 1
2014: 2
2016: 1
2017: 4
2018: 7
2019: 20
2020: 47
2021: 53
2022: 70
2023: 60
2024: 140
2025: 183
2026: 5
Papers
Sustainability of Data Center Digital Twins with Reinforcement Learning
AAAI 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
ACL 2024
ETAS: Zero-Shot Transformer Architecture Search via Network Trainability and Expressivity
ACL 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
ACL 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
ACL 2024
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
CVPR 2024
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
CVPR 2024
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
EMNLP 2024
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
EMNLP 2024
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models
EMNLP 2024
Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training
EMNLP 2024
Turn Waste into Worth: Rectifying Top-k Router of MoE
EMNLP 2024
InfiniPot: Infinite Context Processing on Memory-Constrained LLMs
EMNLP 2024
CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification
EMNLP 2024
AMPO: Automatic Multi-Branched Prompt Optimization
EMNLP 2024
RevMUX: Data Multiplexing with Reversible Adapters for Efficient LLM Batch Inference
EMNLP 2024
LongHeads: Multi-Head Attention is Secretly a Long Context Processor
EMNLP 2024
MobileQuant: Mobile-friendly Quantization for On-device Language Models
EMNLP 2024
Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy
EMNLP 2024
Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles
EMNLP 2024
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs
EMNLP 2024
Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators
NIPS 2024
EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature
NIPS 2024
Efficient LLM Scheduling by Learning to Rank
NIPS 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
NIPS 2024