Research Explorer

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Tianle Cai, Yuhong Li, Zhengyang Geng et al.

2024 ICML

EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism

Yanxi Chen, Xuchen Pan, Yaliang Li et al.

2024 ICML

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Wei-Lin Chiang, Lianmin Zheng, Ying Sheng et al.

2024 ICML

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

Minsik Cho, Mohammad Rastegari, Devang Naik

2024 ICML

Embodied CoT Distillation From LLM To Off-the-shelf Agents

Wonje Choi, Woo Kyung Kim, Minjong Yoo et al.

2024 ICML

Multicalibration for Confidence Scoring in LLMs

Gianluca Detommaso, Martin Andres Bertran, Riccardo Fogliato et al.

2024 ICML

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Yiran Ding, Li Lyna Zhang, Chengruidong Zhang et al.

2024 ICML

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

Harry Dong, Xinyu Yang, Zhenyu Zhang et al.

2024 ICML

Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs

Jordan Dotzel, Yuzong Chen, Bahaa Kotb et al.

2024 ICML

MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving

Jiangfei Duan, Runyu Lu, Haojie Duanmu et al.

2024 ICML

Efficient Exploration for LLMs

Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.

2024 ICML

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Kaituo Feng, Changsheng Li, Xiaolu Zhang et al.

2024 ICML

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Yichao Fu, Peter Bailis, Ion Stoica et al.

2024 ICML

Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches

David Glukhov, Ilia Shumailov, Yarin Gal et al.

2024 ICML

Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks

Linyuan Gong, Sida Wang, Mostafa Elhoushi et al.

2024 ICML

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Xiangming Gu, Xiaosen Zheng, Tianyu Pang et al.

2024 ICML

COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability

Xingang Guo, Fangxu Yu, Huan Zhang et al.

2024 ICML

Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation

Danny Halawi, Alexander Wei, Eric Wallace et al.

2024 ICML

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova et al.

2024 ICML

GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

Alexander Havrilla, Sharath Chandra Raparthy, Christoforos Nalmpantis et al.

2024 ICML

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

Junyuan Hong, Jinhao Duan, Chenhui Zhang et al.

2024 ICML

PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs

Charlie Hou, Akshat Shrivastava, Hongyuan Zhan et al.

2024 ICML

SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code

Ziniu Hu, Ahmet Iscen, Aashi Jain et al.

2024 ICML

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Wei Huang, Yangdong Liu, Haotong Qin et al.

2024 ICML

Debating with More Persuasive LLMs Leads to More Truthful Answers

Akbir Khan, John Hughes, Dan Valentine et al.

2024 ICML

Papers