Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Social, Legal, Ethical, Empathetic, and Cultural Rules: Compilation and Reasoning
AAAI 2024
“Thinking” Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models
EMNLP 2024
Systematic Biases in LLM Simulations of Debates
EMNLP 2024
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments
EMNLP 2024
Towards Tool Use Alignment of Large Language Models
EMNLP 2024
Glue pizza and eat rocks - Exploiting Vulnerabilities in Retrieval-Augmented Generative Models
EMNLP 2024
SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation
EMNLP 2024
CMD: a framework for Context-aware Model self-Detoxification
EMNLP 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models
EMNLP 2024
ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings
EMNLP 2024
GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory
EMNLP 2024
Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism
EMNLP 2024
Dissecting Fine-Tuning Unlearning in Large Language Models
EMNLP 2024
Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models
EMNLP 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
EMNLP 2024
Outcome-Constrained Large Language Models for Countering Hate Speech
EMNLP 2024
How Does the Disclosure of AI Assistance Affect the Perceptions of Writing?
EMNLP 2024
Thinking Outside of the Differential Privacy Box: A Case Study in Text Privatization with Language Model Prompting
EMNLP 2024
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment
EMNLP 2024
ChatGPT Doesn’t Trust Chargers Fans: Guardrail Sensitivity in Context
EMNLP 2024
Aligning Large Language Models with Diverse Political Viewpoints
EMNLP 2024
“You Gotta be a Doctor, Lin”: An Investigation of Name-Based Bias of Large Language Models in Employment Recommendations
EMNLP 2024
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
EMNLP 2024
Humans or LLMs as the Judge? A Study on Judgement Bias
EMNLP 2024
Walking in Others’ Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
EMNLP 2024