Papers
LLMs are not Zero-Shot Reasoners for Biomedical Information Extraction
Aishik Nagar, Viktor Schlegel, Thanh-Tung Nguyen et al.
Exploring Limitations of LLM Capabilities with Multi-Problem Evaluation
Zhengxiang Wang, Jordan Kodner, Owen Rambow
Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use
Joshua Vigel, Renpei Cai, Eleanor Chen et al.
Evaluating Robustness of LLMs to Numerical Variations in Mathematical Reasoning
Yuli Yang, Hiroaki Yamada, Takenobu Tokunaga
Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation
Bryan Li, Jiaming Luo, Eleftheria Briakou et al.
LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning
Shuguang Chen, Guang Lin
RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs
Vibha Belavadi, Tushar Vatsa, Dewang Sultania et al.
Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs
Mirazul Haque, Petr Babkin, Farima Farmahinifarahani et al.
AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers
Alexander Wuttke, Matthias Aßenmacher, Christopher Klamm et al.
Prompting the Past: Exploring Zero-Shot Learning for Named Entity Recognition in Historical Texts Using Prompt-Answering LLMs
Crina Tudor, Beata Megyesi, Robert Östling
Using LLMs to Advance Idiom Corpus Construction
Doğukan Arslan, Hüseyin Anıl Çakmak, Gülşen Eryiğit et al.
Assessing Crowdsourced Annotations with LLMs: Linguistic Certainty as a Proxy for Trustworthiness
Tianyi Li, Divya Sree, Tatiana Ringenberg
A Comparative Analysis of Ethical and Safety Gaps in LLMs using Relative Danger Coefficient
Yehor Tereshchenko, Mika Hämäläinen
VLG-BERT: Towards Better Interpretability in LLMs through Visual and Linguistic Grounding
Toufik Mechouma, Ismail Biskri, Serge Robert
A Comprehensive Evaluation of Cognitive Biases in LLMs
Simon Malberg, Roman Poletukhin, Carolin M. Schuster et al.
Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs
Lynn Greschner, Roman Klinger
Named Entity Inference Attacks on Clinical LLMs: Exploring Privacy Risks and the Impact of Mitigation Strategies
Adam Sutton, Xi Bai, Kawsar Noor et al.
Line of Duty: Evaluating LLM Self-Knowledge via Consistency in Feasibility Boundaries
Sahil Kale, Vijaykant Nadadur
Multi-lingual Multi-turn Automated Red Teaming for LLMs
Abhishek Singhania, Christophe Dupuy, Shivam Sadashiv Mangale et al.
Summary the Savior: Harmful Keyword and Query-based Summarization for LLM Jailbreak Defense
Shagoto Rahman, Ian Harris
Monte Carlo Temperature: a robust sampling strategy for LLM’s uncertainty quantification methods
Nicola Cecere, Andrea Bacciu, Ignacio Fernández-Tobías et al.