Transformers

1816 directly classified papers

Papers

From Alignment to Entailment: A Unified Textual Entailment Framework for Entity Alignment ACL 2023

Causes and Cures for Interference in Multilingual Translation ACL 2023

Statistical Foundations of Prior-Data Fitted Networks ICML 2023

DEplain: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification ACL 2023

Reprogramming Pretrained Language Models for Antibody Sequence Infilling ICML 2023

Bridging The Gap: Entailment Fused-T5 for Open-retrieval Conversational Machine Reading Comprehension ACL 2023

Abstractive Summarizers are Excellent Extractive Summarizers ACL 2023

Integrally Pre-Trained Transformer Pyramid Networks CVPR 2023

CLUSTSEG: Clustering for Universal Segmentation ICML 2023

SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process ICML 2023

LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding ACL 2023

Token-Level Self-Evolution Training for Sequence-to-Sequence Learning ACL 2023

It is a Bird Therefore it is a Robin: On BERT’s Internal Consistency Between Hypernym Knowledge and Logical Words ACL 2023

Care4Lang at MEDIQA-Chat 2023: Fine-tuning Language Models for Classifying and Summarizing Clinical Dialogues ACL 2023

Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN ICML 2023

Dynamic Routing Transformer Network for Multimodal Sarcasm Detection ACL 2023

Fast Inference from Transformers via Speculative Decoding ICML 2023

How Does Generative Retrieval Scale to Millions of Passages? EMNLP 2023

Explaining How Transformers Use Context to Build Predictions ACL 2023

VIMA: Robot Manipulation with Multimodal Prompts ICML 2023

GNOT: A General Neural Operator Transformer for Operator Learning ICML 2023

Self-Distillation into Self-Attention Heads for Improving Transformer-based End-to-End Neural Speaker Diarization INTERSPEECH 2023

Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis EMNLP 2023

Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization ACL 2023

Scaling Laws for Multilingual Neural Machine Translation ICML 2023