Papers
Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving
Chenyang An, Zhibo Chen, Qihao Ye et al.
FineSurE: Fine-grained Summarization Evaluation using LLMs
Hwanjun Song, Hang Su, Igor Shalyminov et al.
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation
Zhaowei Wang, Wei Fan, Qing Zong et al.
EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs
Xiangyu Zhao, Bo Liu, Qijiong Liu et al.
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games
Yizhe Zhang, Jiarui Lu, Navdeep Jaitly
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Huiqiang Jiang, Qianhui Wu, Xufang Luo et al.
Dissecting Human and LLM Preferences
Junlong Li, Fan Zhou, Shichao Sun et al.
AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators
Jingwei Ni, Minjing Shi, Dominik Stammbach et al.
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski, Jingwei Ni, Mathias Kraus et al.
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang, Baolin Peng, Ye Tian et al.
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
Arnav Chavan, Nahush Lele, Deepak Gupta
SirLLM: Streaming Infinite Retentive LLM
Yao Yao, Zuchao Li, Hai Zhao
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
Zimu Lu, Aojun Zhou, Houxing Ren et al.
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
Qintong Li, Leyang Cui, Xueliang Zhao et al.
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
Chanjun Park, Hyeonwoo Kim, Dahyun Kim et al.
GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?
Dayoon Ko, Jinyoung Kim, Hahyeon Choi et al.
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
Xuan-Phi Nguyen, Mahani Aljunied, Shafiq Joty et al.
Metaphor Understanding Challenge Dataset for LLMs
Xiaoyu Tong, Rochelle Choenni, Martha Lewis et al.
A Multi-Task Embedder For Retrieval Augmented LLMs
Peitian Zhang, Zheng Liu, Shitao Xiao et al.
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang, Juntae Lee, Kyuhong Shim et al.
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Ajay Patel, Colin Raffel, Chris Callison-Burch
In-context Mixing (ICM): Code-mixed Prompts for Multilingual LLMs
Bhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya
Intuitive or Dependent? Investigating LLMs’ Behavior Style to Conflicting Prompts
Jiahao Ying, Yixin Cao, Kai Xiong et al.
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
Jiejun Tan, Zhicheng Dou, Yutao Zhu et al.
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators
Matéo Mahaut, Laura Aina, Paula Czarnowska et al.