conftrace_

Kaiyue Wen

13 papers · 2022–2025 · 6 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🐝 Cross-Pollinator (7) 🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (25)

🌉 Interdisciplinary Bridge ⚡ Prolific Year (7) 📈 Trend Setter 💎 Century Club (13) ❓ The Questioner (2)

Conferences

ICLR (5) ACL (2) ICML (2) NIPS (2) EMNLP (1) NAACL (1)

Top co-authors

Tengyu Ma (3) Jingzhao Zhang (3) Zhiyuan Li (3) Huaqing Zhang (2) Xiaozhi Wang (2) Hongzhou Lin (2) Lei Hou (2) Zhiyuan Liu (2) Juanzi Li (2) Jason S. Wang (1)

Keywords

transfer learning (2) large language model (2) prompt tuning (2) meta-learning (1) knowledge transfer (1) vision-language alignment (1) model analysis (1) visual grounding (1) pre-trained language model (1) task generalization (1) neural network analysis (1) neural network optimization (1) compositional learning (1) neural network theory (1) mixture of expert (1) parameter-efficient fine-tuning (1) vision-language model (1) formal language (1) network pruning (1) zero-shot learning (1)

Papers

Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View · ICLR 2025
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval · ICLR 2025
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency · ICLR 2025
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks? · ICML 2025
Overtrained Language Models Are Harder to Fine-Tune · ICML 2025
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models · ACL 2025
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images · ACL 2025
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization · NIPS 2023
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars · NIPS 2023
How Sharpness-Aware Minimization Minimizes Sharpness? · ICLR 2023
Benign Overfitting in Classification: Provably Counter Label Noise with Larger Models · ICLR 2023
Finding Skill Neurons in Pre-trained Transformer-based Language Models · EMNLP 2022
On Transferability of Prompt Tuning for Natural Language Processing · NAACL 2022