Transformers

9,294 papers

Papers

An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs ACL 2024

EIT: Enhanced Interactive Transformer ACL 2024

Decoder-only Streaming Transformer for Simultaneous Translation ACL 2024

NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data ACL 2024

Linear Transformers with Learnable Kernel Functions are Better In-Context Models ACL 2024

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling ACL 2024

Dodo: Dynamic Contextual Compression for Decoder-only LMs ACL 2024

PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary Representations ACL 2024

Exploring Hybrid Question Answering via Program-based Prompting ACL 2024

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use ACL 2024

Layer-Condensed KV Cache for Efficient Inference of Large Language Models ACL 2024

ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition ACL 2024

Tree Transformer’s Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects ACL 2024

VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild ACL 2024

Harder Task Needs More Experts: Dynamic Routing in MoE Models ACL 2024

XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ACL 2024

From Sights to Insights: Towards Summarization of Multimodal Clinical Documents ACL 2024

Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends ACL 2024

PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering ACL 2024

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers ACL 2024

Why are Sensitive Functions Hard for Transformers? ACL 2024

What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages ACL 2024

Do Llamas Work in English? On the Latent Language of Multilingual Transformers ACL 2024

Causal Estimation of Memorisation Profiles ACL 2024

UltraSparseBERT: 99% Conditionally Sparse Language Modelling ACL 2024