Research Explorer
Responsible AI
1,991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
ACL 2024
AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
ACL 2024
IMGTB: A Framework for Machine-Generated Text Detection Benchmarking
ACL 2024
OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety
ACL 2024
Bypassing LLM Watermarks with Color-Aware Substitutions
ACL 2024
Are LLM-based Evaluators Confusing NLG Quality Criteria?
ACL 2024
Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications
ACL 2024
Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models
ACL 2024
XAI for Better Exploitation of Text in Medical Decision Support
ACL 2024
SeeGULL Multilingual: a Dataset of Geo-Culturally Situated Stereotypes
ACL 2024
A Chinese Dataset for Evaluating the Safeguards in Large Language Models
ACL 2024
Pro-Woman, Anti-Man? Identifying Gender Bias in Stance Detection
ACL 2024
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
ACL 2024
KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge
ACL 2024
Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies
ACL 2024
Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models
EMNLP 2024
Recent Advances in Online Hate Speech Moderation: Multimodality and the Role of Large Models
EMNLP 2024
Quantifying Generative Media Bias with a Corpus of Real-world and Generated News Articles
EMNLP 2024
LLM Tropes: Revealing Fine-Grained Values and Opinions in Large Language Models
EMNLP 2024
Dial BeInfo for Faithfulness: Improving Factuality of Information-Seeking Dialogue via Behavioural Fine-Tuning
EMNLP 2024
PG-Story: Taxonomy, Dataset, and Evaluation for Ensuring Child-Safe Content for Story Generation
EMNLP 2024
Investigating Ableism in LLMs through Multi-turn Conversation
EMNLP 2024
Decoding Ableism in Large Language Models: An Intersectional Approach
EMNLP 2024
DiversityMedQA: A Benchmark for Assessing Demographic Biases in Medical Diagnosis using Large Language Models
EMNLP 2024
TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation
EMNLP 2024