Responsible AI

1991 directly classified papers

Papers

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness ACL 2024

SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models ACL 2024

Perceptions of Language Technology Failures from South Asian English Speakers ACL 2024

The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) ACL 2024

The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance ACL 2024

Mitigating Privacy Seesaw in Large Language Models: Augmented Privacy Neuron Editing via Activation Patching ACL 2024

Subtle Signatures, Strong Shields: Advancing Robust and Imperceptible Watermarking in Large Language Models ACL 2024

All Languages Matter: On the Multilingual Safety of LLMs ACL 2024

Bias in News Summarization: Measures, Pitfalls and Corpora ACL 2024

Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models ACL 2024

Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models ACL 2024

John vs. Ahmed: Debate-Induced Bias in Multilingual LLMs ACL 2024

CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models ACL 2024

Do Multilingual Large Language Models Mitigate Stereotype Bias? ACL 2024

Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT ACL 2024

Generative Debunking of Climate Misinformation ACL 2024

Unlearning Climate Misinformation in Large Language Models ACL 2024

MBIAS: Mitigating Bias in Large Language Models While Retaining Context ACL 2024

Jailbreak Open-Sourced Large Language Models via Enforced Decoding ACL 2024

The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts ACL 2024

LIRE: listwise reward enhancement for preference alignment ACL 2024

More than Minorities and Majorities: Understanding Multilateral Bias in Language Generation ACL 2024

Pause-Aware Automatic Dubbing using LLM and Voice Cloning ACL 2024

FBK@IWSLT Test Suites Task: Gender Bias evaluation with MuST-SHE ACL 2024

ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models ACL 2024