conftrace_

Transformers

9,294 papers

Papers

DeTrack: In-model Latent Denoising Learning for Visual Object Tracking NIPS 2024

Mitigating Object Hallucination via Concentric Causal Attention NIPS 2024

Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration NIPS 2024

Unraveling the Gradient Descent Dynamics of Transformers NIPS 2024

In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness NIPS 2024

BEACON: Benchmark for Comprehensive RNA Tasks and Language Models NIPS 2024

InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques NIPS 2024

CALVIN: Improved Contextual Video Captioning via Instruction Tuning NIPS 2024

Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers NIPS 2024

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation NIPS 2024

LiT: Unifying LiDAR "Languages" with LiDAR Translator NIPS 2024

Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation NIPS 2024

Multi-Head Mixture-of-Experts NIPS 2024

Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers NIPS 2024

Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers NIPS 2024

Accelerating Augmentation Invariance Pretraining NIPS 2024

SyncTweedies: A General Generative Framework Based on Synchronized Diffusions NIPS 2024

Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization NIPS 2024

Real-time Core-Periphery Guided ViT with Smart Data Layout Selection on Mobile Devices NIPS 2024

A distributional simplicity bias in the learning dynamics of transformers NIPS 2024

Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models NIPS 2024

Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training NIPS 2024

MC-DiT: Contextual Enhancement via Clean-to-Clean Reconstruction for Masked Diffusion Models NIPS 2024

Transformers need glasses! Information over-squashing in language tasks NIPS 2024

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression NIPS 2024