Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in PubMed Abstracts
ACL 2025
The Invisible Hand: Unveiling Provider Bias in Large Language Models for Code Generation
ACL 2025
Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design
ACL 2025
Multilingual NLP for African Healthcare: Bias, Translation, and Explainability Challenges
ACL 2025
Improved Unbiased Watermark for Large Language Models
ACL 2025
Moral Compass: A Data-Driven Benchmark for Ethical Cognition in AI
IJCAI 2025
Rethinking Prompt-based Debiasing in Large Language Model
ACL 2025
7 Points to Tsinghua but 10 Points to 清华? Assessing Large Language Models in Agentic Multilingual National Bias
ACL 2025
If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World
ACL 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
ACL 2025
SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
ICCV 2025
ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models
ACL 2025
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale Datasets for Responsible LLMs
IJCAI 2025
AdvERSEM: Adversarial Robustness Testing and Training of LLM-based Groundedness Evaluators via Semantic Structure Manipulation
EMNLP 2025
Scaling Trends for Data Poisoning in LLMs
AAAI 2025
Data Attribution: A Data-Centric Approach for Trustworthy AI Development
AAAI 2025
Fair Domain Generalization with Heterogeneous Sensitive Attributes Across Domains
WACV 2025
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning
ICCV 2025
Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy
ICCV 2025
Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
ACL 2025
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
ICCV 2025
Which Demographics do LLMs Default to During Annotation?
ACL 2025
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
IJCAI 2025
Investigating and Mitigating Undesirable Biases in Large Language Models
AAAI 2025
Defining and Quantifying Visual Hallucinations in Vision-Language Models
NAACL 2025