Transformers

9,294 papers

Papers

ScaleKD: Strong Vision Transformers Could Be Excellent Teachers NIPS 2024

SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion NIPS 2024

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains NIPS 2024

DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain NIPS 2024

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding NIPS 2024

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation NIPS 2024

Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation NIPS 2024

Amortized Planning with Large-Scale Transformers: A Case Study on Chess NIPS 2024

Model Decides How to Tokenize: Adaptive DNA Sequence Tokenization with MxDNA NIPS 2024

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers NIPS 2024

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention NIPS 2024

SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation NIPS 2024

Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers NIPS 2024

NeuroPath: A Neural Pathway Transformer for Joining the Dots of Human Connectomes NIPS 2024

FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion NIPS 2024

iVideoGPT: Interactive VideoGPTs are Scalable World Models NIPS 2024

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision NIPS 2024

Scaling transformer neural networks for skillful and reliable medium-range weather forecasting NIPS 2024

Approximation Rate of the Transformer Architecture for Sequence Modeling NIPS 2024

Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning NIPS 2024

A Consistency-Aware Spot-Guided Transformer for Versatile and Hierarchical Point Cloud Registration NIPS 2024

MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map NIPS 2024

MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection NIPS 2024

Even Sparser Graph Transformers NIPS 2024

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length NIPS 2024