Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
AI Chatbots as Professional Service Agents: Developing a Professional Identity
EMNLP 2025
Advancing Oversight Reasoning across Languages for Audit Sycophantic Behaviour via X-Agent
EMNLP 2025
The discordance between embedded ethics and cultural inference in large language models
EMNLP 2025
Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification
EMNLP 2025
Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
EMNLP 2025
A Necessary Step toward Faithfulness: Measuring and Improving Consistency in Free-Text Explanations
EMNLP 2025
The State of Multilingual LLM Safety Research: From Measuring The Language Gap To Mitigating It
EMNLP 2025
Implicit Values Embedded in How Humans and LLMs Complete Subjective Everyday Tasks
EMNLP 2025
Incorporating Diverse Perspectives in Cultural Alignment: Survey of Evaluation Benchmarks Through A Three-Dimensional Framework
EMNLP 2025
Governance in Motion: Co-evolution of Constitutions and AI models for Scalable Safety
EMNLP 2025
Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models
EMNLP 2025
Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset
EMNLP 2025
MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
EMNLP 2025
Alignment for Efficient Tool Calling of Large Language Models
EMNLP 2025
Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
EMNLP 2025
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
EMNLP 2025
Iterative Prompt Refinement for Safer Text-to-Image Generation
EMNLP 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
EMNLP 2025
MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry via Modular Error Detection, Localization, Explanation, and Correction
EMNLP 2025
Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities
EMNLP 2025
Artificial Impressions: Evaluating Large Language Model Behavior Through the Lens of Trait Impressions
EMNLP 2025
Path Drift in Large Reasoning Models: How First-Person Commitments Override Safety
EMNLP 2025
Who Holds the Pen? Caricature and Perspective in LLM Retellings of History
EMNLP 2025
SAFENUDGE: Safeguarding Large Language Models in Real-time with Tunable Safety-Performance Trade-offs
EMNLP 2025
Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
EMNLP 2025