Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Silenced Biases: The Dark Side LLMs Learned to Refuse
AAAI 2026
Reducing the Scope of Language Models
AAAI 2026
Steering Representations, Safeguarding Privacy: A Cross-Modal Privacy Protection Method for Generative AI
AAAI 2026
ShadeEdit: A Utility-Preserving and Defense-Evasive Knowledge Manipulation Attack in Federated LLMs
AAAI 2026
SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs
AAAI 2026
ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs
AAAI 2026
Fairness Perceptions of Large Language Models
AAAI 2026
Beyond World Models: Rethinking Understanding in AI Models
AAAI 2026
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning
AAAI 2026
Hallucination as a Computational Boundary: A Hierarchy of Inevitability and the Oracle Escape
AAAI 2026
From Chaos to Clarity: A Knowledge Graph-Driven Audit Dataset Generation Framework for LLM Unlearning
AAAI 2026
Large Language Model Unlearning for Source Code
AAAI 2026
“Yuki Gets Sushi, David Gets Steak?”: Uncovering Gender and Racial Biases in LLM-Based Meal Recommendations
EACL 2026
Safe RAG by RAG: Untying the Bell That RAG Rang with the RAG Hand
AAAI 2026
HEV Generative Sandbox: A Framework for Assessing Domain-Specific Social Risks Through Human-LLM Simulation
AAAI 2026
Adaptive Hallucination Alleviation in Multimodal Large Language Models: From Strategic Data Selection to Severity-Guided Training
AAAI 2026
Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models
AAAI 2026
On the Misalignment Between Data Learnability and Forgettability in Machine Unlearning
AAAI 2026
Simulated Rewards, Skewed Strategies: Tracing the Acquired Preference Bias in LLM-Based Dialogue Planners
AAAI 2026
An Information Theoretic Evaluation Metric for Strong Unlearning
AAAI 2026
OmniDPO: A Preference Optimization Framework to Address Omni-Modal Hallucination
AAAI 2026
Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms
AAAI 2026
A Robust Unlearning Method with Adaptive Knowledge Guidance and Memory Preservation
AAAI 2026
FGD-Align: Pluralistic Alignment for Large Language Models via Fuzzy Group Decision-Making
AAAI 2026
WaterMod: Modular Token-Rank Partitioning for Probability-Balanced LLM Watermarking
AAAI 2026