conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
ACL 2024
Steering Llama 2 via Contrastive Activation Addition
ACL 2024
ToMBench: Benchmarking Theory of Mind in Large Language Models
ACL 2024
ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer
ACL 2024
ECBD: Evidence-Centered Benchmark Design for NLP
ACL 2024
Monotonic Representation of Numeric Attributes in Language Models
ACL 2024
Learnable Privacy Neurons Localization in Language Models
ACL 2024
What Does Parameter-free Probing Really Uncover?
ACL 2024
Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
ACL 2024
AGR: Reinforced Causal Agent-Guided Self-explaining Rationalization
ACL 2024
The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models
ACL 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
ACL 2024
VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning
ACL 2024
ELLA: Empowering LLMs for Interpretable, Accurate and Informative Legal Advice
ACL 2024
On the Interpretability of Deep Learning Models for Collaborative Argumentation Analysis in Classrooms
ACL 2024
Vulnerabilities of Large Language Models to Adversarial Attacks
ACL 2024
The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?
ACL 2024
CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support
ACL 2024
Are self-explanations from Large Language Models faithful?
ACL 2024
Benchmarking Cognitive Biases in Large Language Models as Evaluators
ACL 2024
Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
ACL 2024
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
ACL 2024
Neurons in Large Language Models: Dead, N-gram, Positional
ACL 2024
Unveiling the Achilles’ Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models
ACL 2024
Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models
ACL 2024