Papers
Cross-Task Defense: Instruction-Tuning LLMs for Content Safety
Yu Fu, Wen Xiao, Jia Chen et al.
Introducing GenCeption for Multimodal LLM Benchmarking: You May Bypass Annotations
Lele Cao, Valentin Buchner, Zineb Senane et al.
Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs
Bibek Upadhayay, Vahid Behzadan
Can LLMs Handle Low-Resource Dialects? A Case Study on Translation and Common Sense Reasoning in Šariš
Viktória Ondrejová, Marek Šuppa
Data-Augmentation-Based Dialectal Adaptation for LLMs
Fahim Faisal, Antonios Anastasopoulos
Incorporating Dialect Understanding Into LLM Using RAG and Prompt Engineering Techniques for Causal Commonsense Reasoning
Benedikt Perak, Slobodan Beliga, Ana Meštrović
Simple LLM based Approach to Counter Algospeak
Jan Fillies, Adrian Paschke
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart
Hybrid Graphs for Table-and-Text based Question Answering using LLMs
Ankush Agarwal, Chaitanya Devaguptapu, Ganesh S
SeqAR: Jailbreak LLMs with Sequential Auto-Generated Characters
Yan Yang, Zeguan Xiao, Xin Lu et al.
Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios
Bryan Chen Zhengyu Tan, Roy Ka-Wei Lee
Balancing Forget Quality and Model Utility: A Reverse KL-Divergence Knowledge Distillation Approach for Better Unlearning in LLMs
Bichen Wang, Yuzhe Zi, Yixin Sun et al.
Can LLMs Convert Graphs to Text-Attributed Graphs?
Zehong Wang, Sidney Liu, Zheyuan Zhang et al.
CompAct: Compressed Activations for Memory-Efficient LLM Training
Yara Shamshoum, Nitzan Hodos, Yuval Sieradzki et al.
What Did I Do Wrong? Quantifying LLMs’ Sensitivity and Consistency to Prompt Engineering
Federico Errica, Davide Sanvito, Giuseppe Siracusano et al.
SafetyQuizzer: Timely and Dynamic Evaluation on the Safety of LLMs
Zhichao Shi, Shaoling Jing, Yi Cheng et al.
The Impact of Inference Acceleration on Bias of LLMs
Elisabeth Kirsten, Ivan Habernal, Vedant Nanda et al.
From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection
Rupeng Zhang, Haowei Wang, Junjie Wang et al.
Fine-Tuned LLMs are “Time Capsules” for Tracking Societal Bias Through Books
Sangmitra Madhusudan, Robert Morabito, Skye Reid et al.
SafeQuant: LLM Safety Analysis via Quantized Gradient Inspection
Sindhu Padakandla, Sadbhavana Babar, Rathod Darshan D et al.
Have LLMs Reopened the Pandora’s Box of AI-Generated Fake News?
Xinyu Wang, Wenbo Zhang, Sai Koneru et al.
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou, Yang Zhang, Jacob Andreas et al.
Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs?
So Young Lee, Russell Scheinberg, Amber Shore et al.
Can Unconfident LLM Annotations Be Used for Confident Conclusions?
Kristina Gligoric, Tijana Zrnic, Cinoo Lee et al.