Yuan Cao

72 papers · 2014–2026 · 14 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (16) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (14) 🤝 Dynamic Duo (21) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (14) 🏆 Keyword Champion (2) ⚡ Prolific Year (5) ❓ The Questioner (3) 🗃️ Keyword Collector (256) 📈 Trend Setter 💎 Century Club (65) 🔥 Unstoppable (8)

Conferences

NIPS (17) AAAI (13) ICLR (11) ACL (6) EMNLP (5) ICML (5) IJCAI (4) AISTATS (3) NAACL (3) COLT (1) ECCV (1) INTERSPEECH (1) JMLR (1) UAI (1)

Top co-authors

Quanquan Gu (21) Izhak Shafran (8) Yonghui Wu (7) Difan Zou (7) Zixiang Chen (6) Jeffrey Zhao (6) Mingqiu Wang (6) Orhan Firat (5) Yanwei Yu (5) Abhinav Rastogi (4)

Keywords

gradient descent (10) language model (5) convolutional neural network (5) dialogue state tracking (4) model compression (4) stochastic gradient descent (4) attention mechanism (4) unsupervised learning (4) transfer learning (4) representation learning (4) graph neural network (3) neural machine translation (3) contrastive learning (3) dialog state tracking (3) neural tangent kernel (3) generalization bound (3) image retrieval (3) benign overfitting (3) deep learning (2) neural network optimization (2)

Papers

TrajAgg: Dual-Scale Feature Aggregation with Hybrid Training for Trajectory Similarity Computation in Free Space · AAAI 2026
Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety · ACL 2026
Proxy Zero-Shot Hashing with Multimodal Fusion via Stable Diffusion · AAAI 2026
Self-Supervised Cross-City Trajectory Representation Learning Based on Meta-Learning · AAAI 2026
Automatic Channel Pruning by Searching with Structure Embedding for Hash Network · AAAI 2026
Multiplex Heterogeneous Graph Neural Networks with Euclidean-Riemannian Mutual Space Synergy · AAAI 2026
Towards Understanding Generalization in DP-GD: A Case Study in Training Two-Layer CNNs · AAAI 2026
Vision-guided Text Mining for Unsupervised Cross-modal Hashing with Community Similarity Quantization · AAAI 2025
Quantifying the Optimization and Generalization Advantages of Graph Neural Networks Over Multilayer Perceptrons · AISTATS 2025
On the Power of Multitask Representation Learning with Gradient Descent · AISTATS 2025
On the Feature Learning in Diffusion Models · ICLR 2025
Transformer Learns Optimal Variable Selection in Group-Sparse Classification · ICLR 2025
Deep Graph Online Hashing for Multi-Label Image Retrieval · AAAI 2025
Taxonomy Driven Fast Adversarial Training · AAAI 2024
On the Comparison between Multi-modal and Single-modal Contrastive Learning · NIPS 2024
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context · NIPS 2024
Attention boosted Individualized Regression · NIPS 2024
The Implicit Bias of Adam on Separable Data · NIPS 2024
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data · ICML 2024
Multiple Descent in the Multiple Random Feature Model · JMLR 2024
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers · ECCV 2024
Global Convergence in Training Large-Scale Transformers · NIPS 2024
Can Public Large Language Models Help Private Cross-device Federated Learning? · NAACL 2024
MUX-PLMs: Data Multiplexing for High-throughput Language Models · EMNLP 2023
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization · ICLR 2023
Understanding Train-Validation Split in Meta-Learning with Neural Networks · ICLR 2023
How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study · ICLR 2023
Tree of Thoughts: Deliberate Problem Solving with Large Language Models · NIPS 2023
Binarized Neural Machine Translation · NIPS 2023
Grammar Prompting for Domain-Specific Language Generation with Large Language Models · NIPS 2023
Benign Overfitting in Adversarially Robust Linear Classification · UAI 2023
Fast Online Hashing with Multi-Label Projection · AAAI 2023
Graph Structure Learning on User Mobility Data for Social Relationship Inference · AAAI 2023
Speech Aware Dialog System Technology Challenge (DSTC11) · INTERSPEECH 2023
MUX-PLMs: Pre-training Language Models with Data Multiplexing · ACL 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks · COLT 2023
The Benefits of Mixup for Feature Learning · ICML 2023
AnyTOD: A Programmable Task-Oriented Dialog System · EMNLP 2023
ReAct: Synergizing Reasoning and Acting in Language Models · ICLR 2023
Knowledge-grounded Dialog State Tracking · EMNLP 2022
Benign Overfitting in Two-layer Convolutional Neural Networks · NIPS 2022
SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems · AAAI 2022
Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation · ACL 2022
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision · ICLR 2022
On the Channel Pruning using Graph Convolution Network for Convolutional Neural Network Acceleration · IJCAI 2022
Unsupervised Slot Schema Induction for Task-oriented Dialog · NAACL 2022
Show, Don’t Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue · NAACL 2022
A Comprehensive Survey on Image Dehazing Based on Deep Learning · IJCAI 2021
The geometry of integration in text classification RNNs · ICLR 2021
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures · NIPS 2021
Understanding How Encoder-Decoder Architectures Attend · NIPS 2021
Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins · ICML 2021
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise · ICML 2021
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models · ICLR 2021
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? · ICLR 2021
Effective Sequence-to-Sequence Dialogue State Tracking · EMNLP 2021
Towards Understanding the Spectral Bias of Deep Learning · IJCAI 2021
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation · ACL 2020
Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization · AISTATS 2020
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks · IJCAI 2020
Agnostic Learning of a Single Neuron with Gradient Descent · NIPS 2020
Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks · AAAI 2020
A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks · NIPS 2020
Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling · NIPS 2020
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks · NIPS 2019
Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks · NIPS 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis · ICLR 2019
Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B · ACL 2019
Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks · NIPS 2019
Training Deeper Neural Machine Translation Models with Transparent Attention · EMNLP 2018
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference · ICML 2018
Online Learning in Tensor Space · ACL 2014