Papers
Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning
Zheng Chen, Zhaoxin Feng, Jianfei Ma et al.
Reasoning or Memorization? Investigating LLMs’ Capability in Restoring Chinese Internet Homophones
Jianfei Ma, Zhaoxin Feng, Huacheng Song et al.
On the Way to LLM Personalization: Learning to Remember User Conversations
Lucie Charlotte Magister, Katherine Metcalf, Yizhe Zhang et al.
Understanding Verbatim Memorization in LLMs Through Circuit Discovery
Ilya Lasy, Peter Knees, Stefan Woltran
Memorization is Language-Sensitive: Analyzing Memorization and Inference Risks of LLMs in a Multilingual Setting
Ali Satvaty, Anna Visman, Dan Seidel et al.
Bring Your Own Knowledge: A Survey of Methods for LLM Knowledge Expansion
Mingyang Wang, Alisa Stoll, Lukas Lange et al.
Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases
Shanshan Xu, Santosh T.Y.S.S, Yanai Elazar et al.
UTF: Under-trained Tokens as Fingerprints —— a Novel Approach to LLM Identification
Jiacheng Cai, Jiahao Yu, Yangguang Shao et al.
LongSafety: Enhance Safety for Long-Context LLMs
Mianqiu Huang, Xiaoran Liu, Shaojun Zhou et al.
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving
Zain Ul Abedin, Shahzeb Qamar, Lucie Flek et al.
Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems
William Hackett, Lewis Birch, Stefan Trawicki et al.
1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning
Wenkai Li, Liwen Sun, Zhenxiang Guan et al.
Guardians of Trust: Risks and Opportunities for LLMs in Mental Health
Miguel Baidal, Erik Derner, Nuria Oliver
What Counts Underlying LLMs’ Moral Dilemma Judgments?
Wenya Wu, Weihong Deng
Hybrid Annotation for Propaganda Detection: Integrating LLM Pre-Annotations with Human Intelligence
Ariana Sahitaj, Premtim Sahitaj, Veronika Solopova et al.
Safe in Isolation, Dangerous Together: Agent-Driven Multi-Turn Decomposition Jailbreaks on LLMs
Devansh Srivastav, Xiao Zhang
StateAct: Enhancing LLM Base Agents via Self-prompting and State-tracking
Nikolai Rozanov, Marek Rei
FrontierScience Bench: Evaluating AI Research Capabilities in LLMs
Matthew Li, Santiago Torres-Garcia, Shayan Halder et al.
Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
Saif Punjwani, Larry Heck
TeXpert: A Multi-Level Benchmark for Evaluating LaTeX Code Generation by LLMs
Sahil Kale, Vijaykant Nadadur
Predicting The Scholarly Impact of Research Papers Using Retrieval-Augmented LLMs
Tamjid Azad, Ibrahim Al Azher, Sagnik Ray Choudhury et al.
Inductive Learning on Heterogeneous Graphs Enhanced by LLMs for Software Mention Detection
Gabriel Silva, Mário Rodrigues, António Teixeira et al.
Comparing LLMs and BERT-based Classifiers for Resource-Sensitive Claim Verification in Social Media
Max Upravitelev, Nicolau Duran-Silva, Christian Woerle et al.