conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition
ACL 2024
Label-Efficient Model Selection for Text Generation
ACL 2024
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals
ACL 2024
Bypassing LLM Watermarks with Color-Aware Substitutions
ACL 2024
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
ACL 2024
Faithful Chart Summarization with ChaTS-Pi
ACL 2024
I am a Strange Dataset: Metalinguistic Tests for Language Models
ACL 2024
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
ACL 2024
MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception
ACL 2024
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
ACL 2024
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
ACL 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
ACL 2024
Are LLM-based Evaluators Confusing NLG Quality Criteria?
ACL 2024
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines
ACL 2024
Do Large Language Models Latently Perform Multi-Hop Reasoning?
ACL 2024
Harnessing Toulmin’s theory for zero-shot argument explication
ACL 2024
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models
ACL 2024
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People
ACL 2024
Ask Again, Then Fail: Large Language Models’ Vacillations in Judgment
ACL 2024
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models
ACL 2024
CLOMO: Counterfactual Logical Modification with Large Language Models
ACL 2024
Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models
ACL 2024
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
ACL 2024
Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models
ACL 2024
An Entropy-based Text Watermarking Detection Method
ACL 2024