Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset
EMNLP 2025
MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
EMNLP 2025
Fostering Digital Inclusion for Low-Resource Nigerian Languages: A Case Study of Igbo and Nigerian Pidgin
NAACL 2025
Alignment for Efficient Tool Calling of Large Language Models
EMNLP 2025
Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing
AACL 2025
Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
EMNLP 2025
A Comparative Analysis of Ethical and Safety Gaps in LLMs using Relative Danger Coefficient
NAACL 2025
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
EMNLP 2025
MPTA: MultiTask Personalization Assessment
EMNLP 2025
Iterative Prompt Refinement for Safer Text-to-Image Generation
EMNLP 2025
HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech
NAACL 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
EMNLP 2025
Evaluating and Mitigating Linguistic Discrimination in Large Language Models: Perspectives on Safety Equity and Knowledge Equity
IJCAI 2025
MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry via Modular Error Detection, Localization, Explanation, and Correction
EMNLP 2025
Rainbow-Teaming for the Polish Language: A Reproducibility Study
NAACL 2025
Bias Amplification: Large Language Models as Increasingly Biased Media
AACL 2025
Role-Aware Language Models for Secure and Contextualized Access Control in Organizations
AACL 2025
On the Convergence of Moral Self-Correction in Large Language Models
AACL 2025
PMPO: A Self-Optimizing Framework for Creating High-Fidelity Measurement Tools for Social Bias in Large Language Models
AACL 2025
IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization
AACL 2025
Small Changes, Large Consequences: Analyzing the Allocational Fairness of LLMs in Hiring Contexts
AACL 2025
ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting
AACL 2025
Quantifying Cognitive Bias Induction in LLM-Generated Content
AACL 2025
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1
AACL 2025
Are LLMs Rational Investors? A Study on the Financial Bias in LLMs
ACL 2025