Papers
TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining
Wanchao Liang, Tianyu Liu, Less Wright et al.
LLMs Can Plan Only If We Tell Them
Bilgehan Sel, Ruoxi Jia, Ming Jin
Do LLM Agents Have Regret? A Case Study in Online Learning and Games
Chanwoo Park, Xiangyu Liu, Asuman E. Ozdaglar et al.
Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning
Charlie Victor Snell, Jaehoon Lee, Kelvin Xu et al.
Benchmarking LLMs' Judgments with No Gold Standard
Shengwei Xu, Yuxuan Lu, Grant Schoenebeck et al.
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu et al.
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Tennison Liu, Nicolas Huynh, Mihaela van der Schaar
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu Zhang, Zechun Liu, Yuandong Tian et al.
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi et al.
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Nguyen Nhat Minh, Andrew Baker, Clement Neo et al.
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
Simran Kaur, Simon Park, Anirudh Goyal et al.
Better autoregressive regression with LLMs via regression-aware fine-tuning
Michal Lukasik, Zhao Meng, Harikrishna Narasimhan et al.
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han, Akiko Eriguchi, Haoran Xu et al.
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Murong Yue, Wenlin Yao, Haitao Mi et al.
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang, Dian Yu, Baolin Peng et al.
Teaching LLMs How to Learn with Contextual Fine-Tuning
Younwoo Choi, Muhammad Adil Asif, Ziwen Han et al.
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
Shubham Ugare, Rohan Gumaste, Tarun Suresh et al.
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.
Language Agents Meet Causality -- Bridging LLMs and Causal World Models
John Gkountouras, Matthias Lindemann, Phillip Lippe et al.
Robotouille: An Asynchronous Planning Benchmark for LLM Agents
Gonzalo Gonzalez-Pumariega, Leong Su Yean, Neha Sunkara et al.
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
Xiaogeng Liu, Peiran Li, G. Edward Suh et al.
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Syeda Nahida Akter, Shrimai Prabhumoye, John Kamalu et al.
Moral Alignment for LLM Agents
Elizaveta Tennant, Stephen Hailes, Mirco Musolesi
From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
Kaustubh Vyas, Damien Graux, Yijun Yang et al.
ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs
Yi-Kai Zhang, Shiyin Lu, Qing-Guo Chen et al.