Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
AccessEval: Benchmarking Disability Bias in Large Language Models
EMNLP 2025
Controlled Generation for Private Synthetic Text
EMNLP 2025
A Simple Yet Effective Method for Non-Refusing Context Relevant Fine-grained Safety Steering in LLMs
EMNLP 2025
Certified Mitigation of Worst-Case LLM Copyright Infringement
EMNLP 2025
Retracing the Past: LLMs Emit Training Data When They Get Lost
EMNLP 2025
Mind the Blind Spots: A Focus-Level Evaluation Framework for LLM Reviews
EMNLP 2025
Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites
EMNLP 2025
SynthTextEval: Synthetic Text Data Generation and Evaluation for High-Stakes Domains
EMNLP 2025
PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications
EMNLP 2025
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
EMNLP 2025
Truth, Trust, and Trouble: Medical AI on the Edge
EMNLP 2025
Experience Report: Implementing Machine Translation in a Regulated Industry
EMNLP 2025
HalluDetect: Detecting, Mitigating, and Benchmarking Hallucinations in Conversational Systems in the Legal Domain
EMNLP 2025
Depression Detection on Social Media with Large Language Models
EMNLP 2025
Toward Optimal LLM Alignments Using Two-Player Games
EMNLP 2025
Safety in Large Reasoning Models: A Survey
EMNLP 2025
From Measurement to Mitigation: Exploring the Transferability of Debiasing Approaches to Gender Bias in Maltese Language Models
ACL 2025
Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text
EMNLP 2025
GenWriter: Reducing Gender Cues in Biographies through Text Rewriting
ACL 2025
Language Models Resist Alignment: Evidence From Data Compression
ACL 2025
Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language
ACL 2025
Linguistic and Embedding-Based Profiling of Texts Generated by Humans and Large Language Models
EMNLP 2025
Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context
ACL 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
CVPR 2025
Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
EMNLP 2025