Research Explorer
Responsible AI
1,991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
ICCV 2025
Which Demographics do LLMs Default to During Annotation?
ACL 2025
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
IJCAI 2025
Defining and Quantifying Visual Hallucinations in Vision-Language Models
NAACL 2025
Battling Misinformation: An Empirical Study on Adversarial Factuality in Open-Source Large Language Models
NAACL 2025
Can’t See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs
ACL 2025
Rainbow-Teaming for the Polish Language: A Reproducibility Study
NAACL 2025
HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech
NAACL 2025
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models
ACL 2025
Evaluating and Mitigating Linguistic Discrimination in Large Language Models: Perspectives on Safety Equity and Knowledge Equity
IJCAI 2025
Human-Centered Disability Bias Detection in Large Language Models
IJCNLP 2025
A Comparative Analysis of Ethical and Safety Gaps in LLMs using Relative Danger Coefficient
NAACL 2025
Fostering Digital Inclusion for Low-Resource Nigerian Languages: A Case Study of Igbo and Nigerian Pidgin
NAACL 2025
Stealing Training Data from Large Language Models in Decentralized Training through Activation Inversion Attack
ACL 2025
shimig@DravidianLangTech2025: Stratification of Abusive content on Women in Social Media
NAACL 2025
CUET_Absolute_Zero@DravidianLangTech 2025: Detecting AI-Generated Product Reviews in Malayalam and Tamil Language Using Transformer Models
NAACL 2025
Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?
ACL 2025
Wisdom from Diversity: Bias Mitigation Through Hybrid Human-LLM Crowds
IJCAI 2025
NLP_goats@DravidianLangTech 2025: Detecting AI-Written Reviews for Consumer Trust
NAACL 2025
LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones
NAACL 2025
GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models
ACL 2025
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation
NAACL 2025
Evaluating Cultural and Social Awareness of LLM Web Agents
NAACL 2025
PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance
ACL 2025
Large Language Models Discriminate Against Speakers of German Dialects
EMNLP 2025