Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Synthetic Paths to Integral Truth: Mitigating Hallucinations Caused by Confirmation Bias with Synthetic Data
COLING 2025
Position: Iterative Online-Offline Joint Optimization is Needed to Manage Complex LLM Copyright Risks
ICML 2025
DAMAGeR: Deploying Automatic and Manual Approaches to GenAI Red-teaming
NAACL 2025
BasqBBQ: A QA Benchmark for Assessing Social Biases in LLMs for Basque, a Low-Resource Language
COLING 2025
What’s the most important value? INVP: INvestigating the Value Priorities of LLMs through Decision-making in Social Scenarios
COLING 2025
Making Transparency Advocates: An Educational Approach Towards Better Algorithmic Transparency in Practice
AAAI 2025
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
COLING 2025
Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering
COLING 2025
Mitigating Bias in Machine Learning: A Comprehensive Review and Novel Approaches
AAAI 2025
An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4
ACL 2025
LEGEND: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets
AAAI 2025
Beyond the Binary: Analysing Transphobic Hate and Harassment Online
ACL 2025
Alleviating Hallucinations in Large Language Models via Truthfulness-driven Rank-adaptive LoRA
ACL 2025
sudoLLM: On Multi-role Alignment of Language Models
EMNLP 2025
Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models
EMNLP 2025
Standard Quality Criteria Derived from Current NLP Evaluations for Guiding Evaluation Design and Grounding Comparability and AI Compliance Assessments
ACL 2025
Saudi-Alignment Benchmark: Assessing LLMs Alignment with Cultural Norms and Domain Knowledge in the Saudi Context
EMNLP 2025
Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model’s Empathy
EMNLP 2025
Data Attribution: A Data-Centric Approach for Trustworthy AI Development
AAAI 2025
LLMs as Medical Safety Judges: Evaluating Alignment with Human Annotation in Patient-Facing QA
ACL 2025
The AI Race: Why Current Neural Network-based Architectures are a Poor Basis for Artificial General Intelligence
AAAI 2025
SweEval: Do LLMs Really Swear? A Safety Benchmark for Testing Limits for Enterprise Use
NAACL 2025
Lessons for Editors of AI Incidents from the AI Incident Database
AAAI 2025
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
NAACL 2025
iShumei-Chinchunmei at SemEval-2025 Task 4: A balanced forgetting and retention multi-task framework using effective unlearning loss
SEMEVAL 2025