Papers
5,479 papers found
Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction
Hongjin Kim, Jaewook Lee, Kiyoung Lee et al.
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
Yohan Mathew, Ollie Matthews, Robert McCarthy et al.
Multilingual, Not Multicultural: Uncovering the Cultural Empathy Gap in LLMs through a Comparative Empathetic Dialogue Benchmark
Woojin Lee, Yujin Sim, Hongjin Kim et al.
Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration
Songyuan Sui, Hongyi Liu, Serena Liu et al.
ProofTeller: Exposing recency bias in LLM reasoning and its side effects on communication
Mayank Jobanputra, Alisa Kovtunova, Brisca Balthes et al.
An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring
Sana Ebrahimi, Mohsen Dehghankar, Abolfazl Asudeh
Are LLMs Rigorous Logical Reasoners? Empowering Natural Language Proof Generation by Stepwise Decoding with Contrastive Learning
Ying Su, Mingwen Liu, Zhijiang Guo
Exploring Working Memory Capacity in LLMs: From Stressors to Human-Inspired Strategies
Eunjin Hong, Sumin Cho, Juae Kim
Doppelganger-JC: Benchmarking the LLMs’ Understanding of Cross-Lingual Homographs between Japanese and Chinese
Yuka Kitamura, Jiahao Huang, Akiko Aizawa
LLMs Do Not See Age: Assessing Demographic Bias in Automated Systematic Review Synthesis
Favour Y. Aghaebe, Elizabeth A Williams, Tanefa Apekey et al.
Decode Like a Clinician: Enhancing LLM Fine-Tuning with Temporal Structured Data Representation
Daniel Fadlon, David Dov, Aviya Bennett et al.
The Confidence Paradox: Can LLM Know When It’s Wrong?
Sahil Tripathi, MD Tabrez Nafis, Imran Hussain et al.
Large Temporal Models: Unlocking Temporal Understanding in LLMs for Temporal Relation Classification
Omri Homburger, Kfir Bar
Interpreting the Effects of Quantization on LLMs
Manpreet Singh, Hassan Sajjad
What Would You Ask When You First Saw a²+b²=c²? Evaluating LLM on Curiosity-Driven Question Generation
Shashidhar Reddy Javaji, Zining Zhu
Can AI Validate Science? Benchmarking LLMs on Claim → Evidence Reasoning in AI Papers
Shashidhar Reddy Javaji, Yupeng Cao, Haohang Li et al.
More Than a Score: Probing the Impact of Prompt Specificity on LLM Code Generation
Yangtian Zi, Harshitha Menon, Arjun Guha
Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs
Takuma Sato, Seiya Kawano, Koichiro Yoshino
Crypto-LLM: Two-Stage Language Model Pre-training with Ciphered and Natural Language Data
Yohei Kobashi, Fumiya Uchiyama, Takeshi Kojima et al.
From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning
Chalamalasetti Kranti, Sherzod Hakimov, David Schlangen
Small Changes, Large Consequences: Analyzing the Allocational Fairness of LLMs in Hiring Contexts
Preethi Seshadri, Hongyu Chen, Sameer Singh et al.
Revisiting Word Embeddings in the LLM Era
Yash Mahajan, Matthew Freestone, Naman Bansal et al.
Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models
Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner et al.
Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate
Ashim Gupta, Maitrey Mehta, Zhichao Xu et al.