conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization
EMNLP 2025
Discourse Heuristics For Paradoxically Moral Self-Correction
EMNLP 2025
Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models
EMNLP 2025
Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs
EMNLP 2025
Token Knowledge: A New Perspective For Knowledge in Large Language Models
EMNLP 2025
A Group Fairness Lens for Large Language Models
EMNLP 2025
Distributional Surgery for Language Model Activations
EMNLP 2025
Are the Reasoning Models Good at Automated Essay Scoring?
EMNLP 2025
SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
EMNLP 2025
DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMs
EMNLP 2025
Joint Enhancement of Relational Reasoning for Long-Context LLMs
EMNLP 2025
CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs
EMNLP 2025
Unveiling Multimodal Processing: Exploring Activation Patterns in Multimodal LLMs for Interpretability and Efficiency
EMNLP 2025
HARE: an entity and relation centric evaluation framework for histopathology reports
EMNLP 2025
Extracting Conceptual Spaces from LLMs Using Prototype Embeddings
EMNLP 2025
SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs
EMNLP 2025
LLMs as a synthesis between symbolic and distributed approaches to language
EMNLP 2025
Understanding How Value Neurons Shape the Generation of Specified Values in LLMs
EMNLP 2025
Sugar-Coated Poison: Benign Generation Unlocks Jailbreaking
EMNLP 2025
FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-Checking
EMNLP 2025
Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State
EMNLP 2025
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
EMNLP 2025
Conflicts in Texts: Data, Implications and Challenges
EMNLP 2025
Recognizing Limits: Investigating Infeasibility in Large Language Models
EMNLP 2025
AIRepr: An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science
EMNLP 2025