Transformers

9,294 papers

Papers

EVDM: Event-based Real-world Video Deblurring with Mamba ICCV 2025

Towards Fine-grained Interactive Segmentation in Images and Videos ICCV 2025

FaceXFormer: A Unified Transformer for Facial Analysis ICCV 2025

EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients ICCV 2025

Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data ICCV 2025

SAFER: Sharpness Aware layer-selective Finetuning for Enhanced Robustness in vision transformers ICCV 2025

Transformer-based Tooth Alignment Prediction with Occlusion and Collision Constraints ICCV 2025

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation ICCV 2025

FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention ICCV 2025

Is CLIP ideal? No. Can we fix it? Yes! ICCV 2025

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition ICCV 2025

Bringing RNNs Back to Efficient Open-Ended Video Understanding ICCV 2025

RANKCLIP: Ranking-Consistent Language-Image Pretraining ICCV 2025

Global Regulation and Excitation via Attention Tuning for Stereo Matching ICCV 2025

STaR: Seamless Spatial-Temporal Aware Motion Retargeting with Penetration and Consistency Constraints ICCV 2025

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation ICCV 2025

Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis ICCV 2025

Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics ICCV 2025

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models ICCV 2025

Unified Open-World Segmentation with Multi-Modal Prompts ICCV 2025

PixTalk: Controlling Photorealistic Image Processing and Editing with Language ICCV 2025

Learning Streaming Video Representation via Multitask Training ICCV 2025

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer ICCV 2025

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens ICCV 2025

CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation ICCV 2025