Papers
2,781 papers found
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
Junxiao Yang, Zhexin Zhang, Shiyao Cui et al.
Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts
Hongyu Chen, Seraphina Goldfarb-Tarrant
Vulnerability of LLMs to Vertically Aligned Text Manipulations
Zhecheng Li, Yiwei Wang, Bryan Hooi et al.
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs
Nicholas E. Corrado, Julian Katz-Samuels, Adithya M Devraj et al.
Amplifying Trans and Nonbinary Voices: A Community-Centred Harm Taxonomy for LLMs
Eddie L. Ungless, Sunipa Dev, Cynthia L. Bennett et al.
Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers
Zhijian Xu, Yilun Zhao, Manasi Patwardhan et al.
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
Tingyu Song, Tongyan Hu, Guo Gan et al.
On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures
Minh Duc Bui, Kyung Eun Park, Goran Glavaš et al.
Veracity Bias and Beyond: Uncovering LLMs’ Hidden Beliefs in Problem-Solving Reasoning
Yue Zhou, Barbara Di Eugenio
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu, Yugeng Liu, Ziqing Yang et al.
Enhancing Mathematical Reasoning in LLMs by Stepwise Correction
Zhenyu Wu, Qingkai Zeng, Zhihan Zhang et al.
Evaluating Personalized Tool-Augmented LLMs from the Perspectives of Personalization and Proactivity
Yupu Hao, Pengfei Cao, Zhuoran Jin et al.
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization
Xinghua Zhang, Haiyang Yu, Cheng Fu et al.
Flipping Knowledge Distillation: Leveraging Small Models’ Expertise to Enhance LLMs in Text Matching
Mingzhe Li, Jing Xiang, Qishen Zhang et al.
S2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Ruotian Ma, Peisong Wang, Cheng Liu et al.
Look Both Ways and No Sink: Converting LLMs into Text Encoders without Training
Ziyong Lin, Haoyi Wu, Shu Wang et al.
From English to Second Language Mastery: Enhancing LLMs with Cross-Lingual Continued Instruction Tuning
Linjuan Wu, Hao-Ran Wei, Baosong Yang et al.
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi, Yuhui Xu, Heng Chang et al.
Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Zhaoxin Feng, Jianfei Ma, Emmanuele Chersoni et al.
Tracing and Dissecting How LLMs Recall Factual Knowledge for Real World Questions
Yiqun Wang, Chaoqun Wan, Sile Hu et al.
Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation
Shuo Tang, Xianghe Pang, Zexi Liu et al.
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
Yige Xu, Xu Guo, Zhiwei Zeng et al.
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
Mengru Wang, Ziwen Xu, Shengyu Mao et al.
Finding Needles in Images: Can Multi-modal LLMs Locate Fine Details?
Parth Thakkar, Ankush Agarwal, Prasad Kasu et al.
Prediction Hubs are Context-Informed Frequent Tokens in LLMs
Beatrix Miranda Ginn Nielsen, Iuri Macocco, Marco Baroni