Papers
2,781 papers found
What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs
Sangyeop Kim, Yohan Lee, Yongwoo Song et al.
Bias in the Mirror : Are LLMs opinions robust to their own adversarial attacks
Virgile Rennard, Christos Xypolopoulos, Michalis Vazirgiannis
Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation
Dimitris Gkoumas, Maria Liakata
LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs
Jianghao Chen, Junhong Wu, Yangyifan Xu et al.
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan, Wenxiang Jiao, Wenxuan Wang et al.
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
Yuchen Fu, Zifeng Cheng, Zhiwei Jiang et al.
Taming LLMs with Gradient Grouping
Siyuan Li, Juanxi Tian, Zedong Wang et al.
Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above
Nishant Balepur, Rachel Rudinger, Jordan Lee Boyd-Graber
Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering
Zifeng Cheng, Zhonghui Wang, Yuchen Fu et al.
Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations
Chaoyi Xiang, Chunhua Liu, Simon De Deyne et al.
Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs
Haritz Puerto, Tilek Chubakov, Xiaodan Zhu et al.
Do Large Language Models have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo, Simone Conia, Zelin Zhou et al.
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Zhu Xu, Zhiqiang Zhao, Zihan Zhang et al.
Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs
Zhe Yang, Yichang Zhang, Yudong Wang et al.
Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation
Kangcheng Luo, Quzhe Huang, Cong Jiang et al.
Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases
Rena Gao, Xuetong Wu, Tatsuki Kuribayashi et al.
How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian
Andrea Pedrotti, Giulia Rambelli, Caterina Villani et al.
Stepwise Reasoning Disruption Attack of LLMs
Jingyu Peng, Maolin Wang, Xiangyu Zhao et al.
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs
Giovanni Servedio, Alessandro De Bellis, Dario Di Palma et al.
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs
Qing Li, Jiahui Geng, Zongxiong Chen et al.
Can Graph Descriptive Order Affect Solving Graph Problems with LLMs?
Yuyao Ge, Shenghua Liu, Baolong Bi et al.
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov et al.
Biased LLMs can Influence Political Decision-Making
Jillian Fisher, Shangbin Feng, Robert Aron et al.
FineReason: Evaluating and Improving LLMs’ Deliberate Reasoning through Reflective Puzzle Solving
Guizhen Chen, Weiwen Xu, Hao Zhang et al.
The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs
Sergey Berezin, Reza Farahbakhsh, Noel Crespi