conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
EMNLP 2023
Analyzing Pre-trained and Fine-tuned Language Models
EMNLP 2023
Emergent Linear Representations in World Models of Self-Supervised Sequence Models
EMNLP 2023
Explaining Data Patterns in Natural Language with Language Models
EMNLP 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT
EMNLP 2023
“Honey, Tell Me What’s Wrong”, Global Explanation of Textual Discriminative Models through Cooperative Generation
EMNLP 2023
Self-Consistency of Large Language Models under Ambiguity
EMNLP 2023
Character-Level Chinese Backpack Language Models
EMNLP 2023
Investigating Semantic Subspaces of Transformer Sentence Embeddings through Linear Structural Probing
EMNLP 2023
Enhancing Interpretability Using Human Similarity Judgements to Prune Word Embeddings
EMNLP 2023
When Your Language Model Cannot Even Do Determiners Right: Probing for Anti-Presuppositions and the Maximize Presupposition! Principle
EMNLP 2023
Introducing VULCAN: A Visualization Tool for Understanding Our Models and Data by Example
EMNLP 2023
Investigating the Effect of Discourse Connectives on Transformer Surprisal: Language Models Understand Connectives, Even So They Are Surprised
EMNLP 2023
Investigating the Encoding of Words in BERT’s Neurons Using Feature Textualization
EMNLP 2023
Rigorously Assessing Natural Language Explanations of Neurons
EMNLP 2023
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model
EMNLP 2023
Humans and language models diverge when predicting repeating text
EMNLP 2023
A Comparative Study on Textual Saliency of Styles from Eye Tracking, Annotations, and Language Models
EMNLP 2023
Revising with a Backward Glance: Regressions and Skips during Reading as Cognitive Signals for Revision Policies in Incremental Processing
EMNLP 2023
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State
EMNLP 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets
EMNLP 2023
REFER: An End-to-end Rationale Extraction Framework for Explanation Regularization
EMNLP 2023
Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models
EMNLP 2023
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs
EMNLP 2023
Evaluating Neural Language Models as Cognitive Models of Language Acquisition
EMNLP 2023