Research Explorer
Responsible AI
1,991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Balancing Social Impact, Opportunities, and Ethical Constraints of Using AI in the Documentation and Vitalization of Indigenous Languages
IJCAI 2023
Pushing the Limits of Fairness in Algorithmic Decision-Making
IJCAI 2023
Fairlearn: Assessing and Improving Fairness of AI Systems
JMLR 2023
The Effects of AI Biases and Explanations on Human Decision Fairness: A Case Study of Bidding in Rental Housing Markets
IJCAI 2023
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression
ICML 2023
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks
EMNLP 2023
“Are Your Explanations Reliable?” Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack
EMNLP 2023
A Fine-Grained Taxonomy of Replies to Hate Speech
EMNLP 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
EMNLP 2023
BiasX: “Thinking Slow” in Toxic Content Moderation with Explanations of Implied Social Biases
EMNLP 2023
Toxicity in Multilingual Machine Translation at Scale
EMNLP 2023
Geographical Erasure in Language Generation
EMNLP 2023
Rehabilitating Homeless: Dataset and Key Insights
AAAI 2023
Leveraging Domain Knowledge for Inclusive and Bias-aware Humanitarian Response Entry Classification
IJCAI 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NeurIPS 2023
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
NeurIPS 2023
Attribution-based Explanations that Provide Recourse Cannot be Robust
JMLR 2023
Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond
JMLR 2023
Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study
EACL 2023
Understanding Ethics in NLP Authoring and Reviewing
EACL 2023
Building Stereotype Repositories with Complementary Approaches for Scale and Depth
EACL 2023
Toward Cultural Bias Evaluation Datasets: The Case of Bengali Gender, Religious, and National Identity
EACL 2023
Measuring Gender Bias in West Slavic Language Models
EACL 2023
Combining Psychological Theory with Language Models for Suicide Risk Detection
EACL 2023
Performance and Risk Trade-offs for Multi-word Text Prediction at Scale
EACL 2023