
Papers

Tight Clusters Make Specialized Experts (ICLR 2025)
Promoting Ensemble Diversity with Interactive Bayesian Distributional Robustness for Fine-tuning Foundation Models (ICML 2025)
Tree-Sliced Wasserstein Distance with Nonlinear Projection (ICML 2025)
Tree-Sliced Wasserstein Distance: A Geometric Perspective (ICML 2025)
Equivariant Polynomial Functional Networks (ICML 2025)
MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling (ICLR 2025)
Spherical Tree-Sliced Wasserstein Distance (ICLR 2025)
Distance-Based Tree-Sliced Wasserstein Distance (ICLR 2025)
CAMEx: Curvature-aware Merging of Experts (ICLR 2025)
Equivariant Neural Functional Networks for Transformers (ICLR 2025)
Demystifying the Token Dynamics of Deep Selective State Space Models (ICLR 2025)
Transformer Meets Twicing: Harnessing Unattended Residual Information (ICLR 2025)
Revisiting Kernel Attention with Correlated Gaussian Process Representation (UAI 2024)
Beyond Vanilla Variational Autoencoders: Detecting Posterior Collapse in Conditional and Hierarchical Variational Autoencoders (ICLR 2024)
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model (ICML 2024)
PIDformer: Transformer Meets Control Theory (ICML 2024)
From Coupled Oscillators to Graph Neural Networks: Reducing Over-smoothing via a Kuramoto Model-based Approach (AISTATS 2024)
A Primal-Dual Framework for Transformers and Neural Networks (ICLR 2023)
Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data (ICML 2023)
Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature (ICML 2023)
Hierarchical Sliced Wasserstein Distance (ICLR 2023)
GRAND++: Graph Neural Diffusion with A Source Term (ICLR 2022)
Improving Transformers with Probabilistic Attention Keys (ICML 2022)