Kaiyue Wen
13 papers · 2022–2025 · 6 top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (6)
Cross-Pollinator (7)
Renaissance Researcher (5)
Taxonomy Completionist (25)
Interdisciplinary Bridge
Prolific Year (7)
Trend Setter
Century Club (13)
The Questioner (2)
Conferences
ICLR (5)
ACL (2)
ICML (2)
NIPS (2)
EMNLP (1)
NAACL (1)
Top co-authors
Keywords
transfer learning (2)
large language model (2)
prompt tuning (2)
meta-learning (1)
knowledge transfer (1)
vision-language alignment (1)
model analysis (1)
visual grounding (1)
pre-trained language model (1)
task generalization (1)
neural network analysis (1)
neural network optimization (1)
compositional learning (1)
neural network theory (1)
mixture of expert (1)
parameter-efficient fine-tuning (1)
vision-language model (1)
formal language (1)
network pruning (1)
zero-shot learning (1)
Papers
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
ICLR 2025
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
ICLR 2025
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
ICLR 2025
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?
ICML 2025
Overtrained Language Models Are Harder to Fine-Tune
ICML 2025
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
ACL 2025
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
ACL 2025
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
NIPS 2023
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars
NIPS 2023
How Sharpness-Aware Minimization Minimizes Sharpness?
ICLR 2023
Benign Overfitting in Classification: Provably Counter Label Noise with Larger Models
ICLR 2023
Finding Skill Neurons in Pre-trained Transformer-based Language Models
EMNLP 2022
On Transferability of Prompt Tuning for Natural Language Processing
NAACL 2022