performance optimization

47 papers

Co-occurring keywords

large language model (12755); reinforcement learning (4122); code generation (699); program analysis (32); network latency (5); distributed storage (13); operating system (17); memory management (37); gpu optimization (27); code optimization (8)

Papers

Deploying Atmospheric and Oceanic AI Models on Chinese Hardware and Framework: Migration Strategies, Performance Optimization and Analysis AAAI 2026

DiffBench Meets DiffAgent: End-to-End LLM-Driven Diffusion Acceleration Code Generation AAAI 2026

CATBench: A Compiler Autotuning Benchmarking Suite for Black-box Optimization AUTOML 2025

Principles and Methodologies for Serial Performance Optimization OSDI 2025

POLO: An LLM-Powered Project-Level Code Performance Optimization Framework IJCAI 2025

Fork in the Road: Reflections and Optimizations for Cold Start Latency in Production Serverless Systems OSDI 2025

Neutrino: Fine-grained GPU Kernel Profiling via Programmable Probing OSDI 2025

ONCache: A Cache-Based Low-Overhead Container Overlay Network NSDI 2025

QiMeng-TensorOp: One-Line Prompt is Enough for High-Performance Tensor Operator Generation with Hardware Primitives IJCAI 2025

QiMeng-GEMM: Automatically Generating High-Performance Matrix Multiplication Code by Exploiting Large Language Models AAAI 2025

Tiered Memory Management Beyond Hotness OSDI 2025

Preventing Network Bottlenecks: Accelerating Datacenter Services with Hotspot-Aware Placement for Compute and Storage NSDI 2025

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning NAACL 2024

Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration OSDI 2024

Balancing Humans and Machines: A Study on Integration Scale and Its Impact on Collaborative Performance AAAI 2024

Optimized Speculative Sampling for GPU Hardware Accelerators EMNLP 2024

Microkernel Goes General: Performance and Compatibility in the HongMeng Production Microkernel OSDI 2024

OPPerTune: Post-Deployment Configuration Tuning of Services Made Easy NSDI 2024

Einsum Benchmark: Enabling the Development of Next-Generation Tensor Execution Engines NIPS 2024

LDB: An Efficient Latency Profiling Tool for Multithreaded Applications NSDI 2024

The Grand Illusion: The Myth of Software Portability and Implications for ML Progress NIPS 2023

PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis NIPS 2023

Electrode: Accelerating Distributed Protocols with eBPF NSDI 2023

Triangulating Python Performance Issues with SCALENE OSDI 2023

Trinity: High-Performance Mobile Emulation through Graphics Projection OSDI 2022