conftrace_

Transformers

1,816 papers

Papers

Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning NIPS 2024

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains NIPS 2024

Rethinking Parity Check Enhanced Symmetry-Preserving Ansatz NIPS 2024

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers NIPS 2024

NeuroPath: A Neural Pathway Transformer for Joining the Dots of Human Connectomes NIPS 2024

FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion NIPS 2024

Scaling transformer neural networks for skillful and reliable medium-range weather forecasting NIPS 2024

Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts NIPS 2024

Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series NIPS 2024

Transformers Represent Belief State Geometry in their Residual Stream NIPS 2024

Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context NIPS 2024

Understanding Transformer Reasoning Capabilities via Graph Algorithms NIPS 2024

Humanoid Locomotion as Next Token Prediction NIPS 2024

Compact Proofs of Model Performance via Mechanistic Interpretability NIPS 2024

SongCreator: Lyrics-based Universal Song Generation NIPS 2024

Non-asymptotic Convergence of Training Transformers for Next-token Prediction NIPS 2024

Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs NIPS 2024

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction NIPS 2024

In-Context Learning with Representations: Contextual Generalization of Trained Transformers NIPS 2024

Transcendence: Generative Models Can Outperform The Experts That Train Them NIPS 2024

Base of RoPE Bounds Context Length NIPS 2024

How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers NIPS 2024

AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers NIPS 2024

Molecule Design by Latent Prompt Transformer NIPS 2024

DeTrack: In-model Latent Denoising Learning for Visual Object Tracking NIPS 2024