Papers
Sliding-Window Merging for Compacting Patch-Redundant Layers in LLMs
Xuan Ding, Rui Sun, Yunjian Zhang et al.
Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching
Yanhao Dong, Yubo Miao, Weinan Li et al.
TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking
Yongqi Fan, Xiaoyang Chen, Dezhi Ye et al.
The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks
Wanfu Gao, Zebin He, Jun Gao
FLRQ: Faster LLM Quantization with Flexible Low-Rank Matrix Sketching
Hongyaoxing Gu, Lijuan Hu, Shuzi Niu et al.
From Diagnosis to Generalization: A Cognitive Approach to Data Selection for Educational LLMs
Yuxiang Guo, Yan Zhuang, Qi Liu et al.
HALO: Hardware-Aware Quantization with Low Critical-Path-Delay Weights for LLM Acceleration
Rohan Juneja, Shivam Aggarwal, Safeen Huda et al.
FedP²EFT: Federated Learning to Personalize PEFT for Multilingual LLMs
Royson Lee, Minyoung Kim, Fady Rezk et al.
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
Lujun Li, Qiyuan Zhu, Jiacheng Wang et al.
LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-Play Toolkit
Chengtao Lv, Bilang Zhang, Yang Yong et al.
MMG-Vid: Maximizing Marginal Gains at Segment-level and Token-level for Efficient Video LLMs
Junpeng Ma, Qizhe Zhang, Ming Lu et al.
Prototype Entropy Alignment: Reinforcing Structured Uncertainty in LLM Reasoning
Zhengyuan Pan, Yanhao Chen, Zhongquan Jian et al.
What Makes a Good Generated Image? Investigating Human and Multimodal LLM Image Preference Alignment
Rishab Parthasarathy, Jasmine Collins, Cory Stephenson
Online Multi-LLM Selection via Contextual Bandits Under Unstructured Context Evolution
Manhin Poon, Xiangxiang Dai, Xutong Liu et al.
Next Generation Active Learning: Mixture of LLMs in the Loop
Yuanyuan Qi, Xiaohao Yang, Jueqing Lu et al.
BitDP: Ultra-low-bit Communication for Data Parallelism in LLM Training
Xiaozhe Ren, Qiong Luo
A Solver-in-the-Loop Framework for Improving LLMs on Answer Set Programming for Logic Puzzle Solving
Timo Pierre Schrader, Lukas Lange, Tobias Kaminski et al.
Low-Rank Curvature for Zeroth-Order Optimization in LLM Fine-tuning
Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko
URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding
Yongxin Shi, Jiapeng Wang, Zeyu Shan et al.
Conformal Constrained Policy Optimization for Cost-Effective LLM Agents
Wenwen Si, Sooyong Jang, Insup Lee et al.
Learning to Collaborate: An Orchestrated-Decentralized Framework for Peer-to-Peer LLM Federation
Inderjeet Singh, Eleonore Vissol-Gaudin, Andikan Otung et al.
DAWN: Distributed LLM Multi-Agent Workflow Synthesis
Guancheng Wan, Mo Zhou, Ziyi Wang et al.
TowerMind: A Tower Defence Game Learning Environment and Benchmark for LLM as Agents
Dawei Wang, Chengming Zhou, Di Zhao et al.
MemeBQ: Memory Efficient Binary Quantization of LLMs
Yuanhui Wang, Kunlong Liu, Minnan Pei et al.
Making Sense of LLM Decisions: A Prototype-based Framework for Explainable Classification
Bowen Wei, Mehrdad Fazli, Ziwei Zhu