conftrace
Transformers
9,294 papers
Papers per year
2011: 1
2014: 2
2015: 6
2016: 17
2017: 67
2018: 156
2019: 404
2020: 769
2021: 1217
2022: 1446
2023: 1628
2024: 1574
2025: 1647
2026: 360
Papers
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs
ACL 2024
EIT: Enhanced Interactive Transformer
ACL 2024
Decoder-only Streaming Transformer for Simultaneous Translation
ACL 2024
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
ACL 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
ACL 2024
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
ACL 2024
Dodo: Dynamic Contextual Compression for Decoder-only LMs
ACL 2024
PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary Representations
ACL 2024
Exploring Hybrid Question Answering via Program-based Prompting
ACL 2024
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
ACL 2024
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
ACL 2024
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
ACL 2024
Tree Transformer’s Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects
ACL 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
ACL 2024
Harder Task Needs More Experts: Dynamic Routing in MoE Models
ACL 2024
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
ACL 2024
From Sights to Insights: Towards Summarization of Multimodal Clinical Documents
ACL 2024
Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends
ACL 2024
PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering
ACL 2024
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
ACL 2024
Why are Sensitive Functions Hard for Transformers?
ACL 2024
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
ACL 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
ACL 2024
Causal Estimation of Memorisation Profiles
ACL 2024
UltraSparseBERT: 99% Conditionally Sparse Language Modelling
ACL 2024