Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Catch Me If You GPT: Tutorial on Deepfake Texts
NAACL 2024
Combating Security and Privacy Issues in the Era of Large Language Models
NAACL 2024
Citation: A Key to Building Responsible and Accountable Large Language Models
NAACL 2024
REQUAL-LM: Reliability and Equity through Aggregation in Large Language Models
NAACL 2024
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing
NAACL 2024
Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing
NAACL 2024
Pollice Verso at SemEval-2024 Task 6: The Roman Empire Strikes Back
NAACL 2024
MARiA at SemEval 2024 Task-6: Hallucination Detection Through LLMs, MNLI, and Cosine similarity
NAACL 2024
Towards Healthy AI: Large Language Models Need Therapists Too
NAACL 2024
Cross-Task Defense: Instruction-Tuning LLMs for Content Safety
NAACL 2024
Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs
NAACL 2024
BELIEVE: Belief-Enhanced Instruction Generation and Augmentation for Zero-Shot Bias Mitigation
NAACL 2024
Adventures of Trustworthy Vision-Language Models: A Survey
AAAI 2024
Novax or Novak? Estimating Social Media Stance towards Celebrity Vaccine Hesitancy (Student Abstract)
AAAI 2024
Merging AI Incidents Research with Political Misinformation Research: Introducing the Political Deepfakes Incidents Database
AAAI 2024
Evaluating the Effectiveness of Explainable Artificial Intelligence Approaches (Student Abstract)
AAAI 2024
Diverse Yet Biased: Towards Mitigating Biases in Generative AI (Student Abstract)
AAAI 2024
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
ACL 2024
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
ACL 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
ACL 2024
A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models
ACL 2024
On the Vulnerability of Safety Alignment in Open-Access LLMs
ACL 2024
Making Harmful Behaviors Unlearnable for Large Language Models
ACL 2024
Debiasing Large Language Models with Structured Knowledge
ACL 2024
Duwak: Dual Watermarks in Large Language Models
ACL 2024