Responsible AI

1991 directly classified papers

Papers

ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging ACL 2025

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval ACL 2025

COVER: Context-Driven Over-Refusal Verification in LLMs ACL 2025

EMNLP: Educator-role Moral and Normative Large Language Models Profiling EMNLP 2025

Exploring the Impact of Personality Traits on LLM Bias and Toxicity EMNLP 2025

MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Vision Language Models AAAI 2025

LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts ACL 2025

Self-Pluralising Culture Alignment for Large Language Models NAACL 2025

GUIR at SemEval-2025 Task 4: Adaptive Weight Tuning with Gradual Negative Matching for LLM Unlearning ACL 2025

Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs AAAI 2025

WaterPool: A Language Model Watermark Mitigating Trade-Offs among Imperceptibility, Efficacy and Robustness NAACL 2025

Developing a Postgraduate Program for AI in Medicine with Kern’s Six-Step Curriculum Development Approach in Singapore AAAI 2025

Meta-Cultural Competence: Climbing the Right Hill of Cultural Awareness NAACL 2025

The Essentials of AI for Life and Society: An AI Literacy Course for the University Community AAAI 2025

Synthetic Paths to Integral Truth: Mitigating Hallucinations Caused by Confirmation Bias with Synthetic Data COLING 2025

Position: Iterative Online-Offline Joint Optimization is Needed to Manage Complex LLM Copyright Risks ICML 2025

NLP for Counterspeech against Hate and Misinformation (CSHAM) ACL 2025

BasqBBQ: A QA Benchmark for Assessing Social Biases in LLMs for Basque, a Low-Resource Language COLING 2025

What’s the most important value? INVP: INvestigating the Value Priorities of LLMs through Decision-making in Social Scenarios COLING 2025

Making Transparency Advocates: An Educational Approach Towards Better Algorithmic Transparency in Practice AAAI 2025

Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning COLING 2025

Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering COLING 2025

Mitigating Bias in Machine Learning: A Comprehensive Review and Novel Approaches AAAI 2025

An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4 ACL 2025

Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training ACL 2025