Papers
SpeedLoader: An I/O efficient scheme for heterogeneous and distributed LLM operation
Yiqi Zhang, Yang You
LeDex: Training LLMs to Better Self-Debug and Explain Code
Nan Jiang, Xiaopeng Li, Shiqi Wang et al.
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang et al.
Mobility-LLM: Learning Visiting Intentions and Travel Preference from Human Mobility Data with Large Language Models
Letian Gong, Yan Lin, Xinyue Zhang et al.
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
Edoardo Debenedetti, Javier Rando, Daniel Paleka et al.
StackEval: Benchmarking LLMs in Coding Assistance
Nidhish Shah, Zulkuf Genc, Dogu Araci
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Jingtong Su, Julia Kempe, Karen Ullrich
LLM Circuit Analyses Are Consistent Across Training and Scale
Curt Tigges, Michael Hanna, Qinan Yu et al.
Verified Code Transpilation with LLMs
Sahil Bhatia, Jie Qiu, Niranjan Hasabnis et al.
Exploiting LLM Quantization
Kazuki Egashira, Mark Vero, Robin Staab et al.
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
Amrith Setlur, Saurabh Garg, Xinyang (Young) Geng et al.
Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization
Mucong Ding, Chenghao Deng, Jocelyn Choo et al.
Is Programming by Example Solved by LLMs?
Wen-Ding Li, Kevin Ellis
Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems
Lingjiao Chen, Jared Davis, Boris Hanin et al.
LLMs Can Evolve Continually on Modality for $\mathbb{X}$-Modal Reasoning
Jiazuo Yu, Haomiao Xiong, Lu Zhang et al.
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
Md Tanvirul Alam, Dipkamal Bhusal, Le Nguyen et al.
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang, Yucheng Li, Chengruidong Zhang et al.
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Ye Tian, Baolin Peng, Linfeng Song et al.
EAI: Emotional Decision-Making of LLMs in Strategic Games and Ethical Dilemmas
Mikhail Mozikov, Nikita Severin, Valeria Bodishtianu et al.
Can LLMs Implicitly Learn Numeric Parameter Constraints in Data Science APIs?
Yinlin Deng, Chunqiu Steven Xia, Zhezhen Cao et al.
DALD: Improving Logits-based Detector without Logits from Black-box LLMs
Cong Zeng, Shengkun Tang, Xianjun Yang et al.
Rethinking LLM Memorization through the Lens of Adversarial Compression
Avi Schwarzschild, Zhili Feng, Pratyush Maini et al.
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei, Shengqiong Wu, Hanwang Zhang et al.
NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security
Minghao Shao, Sofija Jancheska, Meet Udeshi et al.
To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty
Yasin Abbasi Yadkori, Ilja Kuzborskij, András György et al.