Papers
Courtroom-LLM: A Legal-Inspired Multi-LLM Framework for Resolving Ambiguous Text Classifications
Sangkeun Jung, Jeesu Jung
Can LLMs Verify Arabic Claims? Evaluating the Arabic Fact-Checking Abilities of Multilingual LLMs
Ayushman Gupta, Aryan Singhal, Thomas Law et al.
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus: A Case Study for Hindi LLMs
Raviraj Joshi, Kanishk Singla, Anusha Kamath et al.
Generative FrameNet: Scalable and Adaptive Frames for Interpretable Knowledge Storage and Retrieval for LLMs Powered by LLMs
Harish Tayyar Madabushi, Taylor Hudson, Claire Bonial
Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy
Joonhyun Jeong, Seyun Bae, Yeonsung Jung et al.
LLM See, LLM Do: Leveraging Active Inheritance to Target Non-Differentiable Objectives
Luísa Shimabucoro, Sebastian Ruder, Julia Kreutzer et al.
BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment
Wenda Xu, Jiachen Li, William Yang Wang et al.
Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators
Prasoon Bajpai, Niladri Chatterjee, Subhabrata Dutta et al.
Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs
Alexander Spangher, Nanyun Peng, Sebastian Gehrmann et al.
Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues
Deuksin Kwon, Emily Weiss, Tara Kulshrestha et al.
From General LLM to Translation: How We Dramatically Improve Translation Quality Using Human Evaluation Data for LLM Finetuning
Denis Elshin, Nikolay Karpachev, Boris Gruzdev et al.
Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLMs
Zhen Xiong, Yujun Cai, Zhecheng Li et al.
Can LLMs be Literary Companions?: Analysing LLMs on Bengali Figures of Speech Identification
Sourav Das, Kripabandhu Ghosh
Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon
Nurit Cohen Inger, Yehonatan Elisha, Bracha Shapira et al.
How Far Can LLMs Improve from Experience? Measuring Test-Time Learning Ability in LLMs with Human Comparison
Jiayin Wang, Zhiqiang Guo, Weizhi Ma et al.
Do LLMs Behave as Claimed? Investigating How LLMs Follow Their Own Claims using Counterfactual Questions
Haochen Shi, Shaobo Li, Guoqing Chao et al.
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang, Guoming Ling, Yupei Lin et al.
Teaching LLMs to Plan, Not Just Solve: Plan Learning Boosts LLMs Generalization in Reasoning Tasks
Tianlong Wang, Junzhe Chen, Weibin Liao et al.
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong, Zhiyuan Hu, Xinyang Lu et al.
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
Siyan Zhao, Mingyi Hong, Yang Liu et al.
When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings
Jérémy Perez, Grgur Kovač, Corentin Léger et al.
How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM
Jirong Zha, Yuxuan Fan, Xiao Yang et al.
Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?
Kai Sun, Yifan Xu, Hanwen Zha et al.
The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth
Shir Lissak, Nitay Calderon, Geva Shenkman et al.