Masashi Sugiyama

224 papers · 2002–2026 · 17 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (55) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (8) 🐣 Hot Topic Early Bird

🌟 Keyword Trendsetter Combo (7) 🏠 Conference Loyalist (60) 👑 Domain Dominant (6) 🏆 Keyword Champion (4) 🔬 Deep Specialist (16) 🤝 Dynamic Duo (87) 🏆 Grand Slam 👑 Triple Crown 🗃️ Keyword Collector (256) ❓ The Questioner (8) ⚡ Prolific Year (9) 📈 Trend Setter 🔥 Unstoppable (20) 🚀 Conference Pioneer 💎 Century Club (223)

Conferences

NIPS (60) ICML (58) AISTATS (26) ICLR (20) ACML (16) JMLR (16) AAAI (9) IJCAI (4) ICCV (3) EMNLP (2) ECCV (2) CVPR (2) WACV (2) EACL (1) IJCNLP (1) COLT (1) UAI (1)

Top co-authors

Gang Niu (88) Bo Han (33) Tongliang Liu (22) Issei Sato (21) Jingfeng Zhang (14) Shinichi Nakajima (12) Junya Honda (11) Nontawat Charoenphakdee (11) Lei Feng (10) Nan Lu (9)

Keywords

weakly supervised learning (19) variational inference (13) semi-supervised learning (13) label noise (13) domain adaptation (13) binary classification (11) bayesian inference (10) importance weighting (9) adversarial training (9) adversarial robustness (9) noisy label (9) distribution shift (8) density ratio estimation (8) online learning (7) kernel methods (7) dimensionality reduction (7) multi-armed bandit (7) multi-class classification (6) variational bayesian (6) representation learning (6)

Papers

Robust Learning from Noisily Labeled Long-Tailed Data via Fairness Regularizer AAAI 2026
Label Distribution Learning with Biased Annotations Assisted by Multi-Label Learning IJCAI 2025
Multi-Player Approaches for Dueling Bandits AISTATS 2025
Domain Adaptation and Entanglement: an Optimal Transport Perspective AISTATS 2025
Learning View-invariant World Models for Visual Robotic Manipulation ICLR 2025
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods ICLR 2025
Towards Out-of-Modal Generalization without Instance-level Modal Correspondence ICLR 2025
Realistic Evaluation of Deep Partial-Label Learning Algorithms ICLR 2025
Sharpness-Aware Black-Box Optimization ICLR 2025
Adaptive Localization of Knowledge Negation for Continual LLM Unlearning ICML 2025
Parallel Simulation for Log-concave Sampling and Score-based Diffusion Models ICML 2025
Action-Agnostic Point-Level Supervision for Temporal Action Detection AAAI 2025
Non-stationary Online Learning for Curved Losses: Improved Dynamic Regret via Mixability ICML 2025
Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation ICCV 2025
What Makes Partial-Label Learning Algorithms Effective? NIPS 2024
Slight Corruption in Pre-training Data Makes Better Diffusion Models NIPS 2024
Thompson Sampling for Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit AAAI 2024
The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models AAAI 2024
Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought ICML 2024
Balancing Similarity and Complementarity for Federated Learning ICML 2024
Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training ICML 2024
Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical ICML 2024
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation ICML 2024
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization ICML 2024
A General Framework for Learning from Weak Supervision ICML 2024
Accurate Forgetting for Heterogeneous Federated Continual Learning ICLR 2024
Robust Similarity Learning with Difference Alignment Regularization ICLR 2024
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks ICLR 2024
Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification EMNLP 2024
Direct Distillation between Different Domains ECCV 2024
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning ECCV 2024
Fixed-Budget Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit AISTATS 2024
VEC-SBM: Optimal Community Detection with Vectorial Edges Covariates AISTATS 2024
Appearance-Based Curriculum for Semi-Supervised Learning With Multi-Angle Unlabeled Data WACV 2024
Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations NIPS 2024
Enriching Disentanglement: From Logical Definitions to Quantitative Metrics NIPS 2024
Test-time Adaptation in Non-stationary Environments via Adaptive Representation Alignment NIPS 2024
Online (Multinomial) Logistic Bandit: Improved Regret and Constant Computation Cost NIPS 2023
Imitation Learning from Vague Feedback NIPS 2023
Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits ICML 2023
Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection NIPS 2023
Universal Approximation Property of Invertible Neural Networks JMLR 2023
A Category-theoretical Meta-analysis of Definitions of Disentanglement ICML 2023
Thompson Exploration with Best Challenger Rule in Best Arm Identification ACML 2023
On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective NIPS 2023
Binary Classification with Confidence Difference NIPS 2023
Distributional Pareto-Optimal Multi-Objective Reinforcement Learning NIPS 2023
Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization NIPS 2023
Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation NIPS 2023
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification ICLR 2023
Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning ICLR 2023
Multi-Label Knowledge Distillation ICCV 2023
Distribution Shift Matters for Knowledge Distillation with Webly Collected Images ICCV 2023
Generalizing Importance Weighting to A Universal Solver for Distribution Shift Problems NIPS 2023
Class-Distribution-Aware Pseudo-Labeling for Semi-Supervised Multi-Label Learning NIPS 2023
Adapting to Continuous Covariate Shift via Online Density Ratio Estimation NIPS 2023
Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation ICML 2023
GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks ICML 2023
Instance-Dependent Label-Noise Learning With Manifold-Regularized Transition Matrix Estimation CVPR 2022
Generalizing Consistent Multi-Class Classification with Rejection to be Compatible with Arbitrary Losses NIPS 2022
Learning Contrastive Embedding in Low-Dimensional Space NIPS 2022
Adversarial Training with Complementary Labels: On the Benefit of Gradually Informative Attacks NIPS 2022
Adapting to Online Label Shift with Provable Guarantees NIPS 2022
Synergy-of-Experts: Collaborate to Improve Adversarial Robustness NIPS 2022
Robust computation of optimal transport by $β$-potential regularization ACML 2022
Multi-class Classification from Multiple Unlabeled Datasets with Partial Risk Regularization ACML 2022
Pairwise Supervision Can Provably Elicit a Decision Boundary AISTATS 2022
Predictive variational Bayesian inference as risk-seeking optimization AISTATS 2022
Sample Selection with Uncertainty of Losses for Learning with Noisy Labels ICLR 2022
Federated Learning from Only Unlabeled Data with Class-conditional-sharing Clients ICLR 2022
Exploiting Class Activation Value for Partial-Label Learning ICLR 2022
Rethinking Class-Prior Estimation for Positive-Unlabeled Learning ICLR 2022
Meta Discovery: Learning to Discover Novel Classes given Very Limited Data ICLR 2022
To Smooth or Not? When Label Smoothing Meets Noisy Labels ICML 2022
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum ICML 2022
Adversarial Attack and Defense for Non-Parametric Two-Sample Tests ICML 2022
Towards Adversarially Robust Deep Image Denoising IJCAI 2022
Fast and Robust Rank Aggregation against Model Misspecification JMLR 2022
Learning from Noisy Pairwise Similarity and Unlabeled Data JMLR 2022
Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification ICML 2021
Provably End-to-end Label-noise Learning without Anchor Points ICML 2021
Probabilistic Margins for Instance Reweighting in Adversarial Training NIPS 2021
Loss function based second-order Jensen inequality and its application to particle variational inference NIPS 2021
Classification with Rejection Based on Cost-sensitive Classification ICML 2021
Large-Margin Contrastive Learning with Distance Polarization Regularizer ICML 2021
Lower-Bounded Proper Losses for Weakly Supervised Classification ICML 2021
On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective CVPR 2021
Maximum Mean Discrepancy Test is Aware of Adversarial Attacks ICML 2021
Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning EACL 2021
CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection ICML 2021
Confidence Scores Make Instance-dependent Label-noise Learning Possible ICML 2021
Geometry-aware Instance-reweighted Adversarial Training ICLR 2021
Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences ICML 2021
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization ICML 2021
Learning from Similarity-Confidence Data ICML 2021
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima ICLR 2021
Learning Diverse-Structured Networks for Adversarial Robustness ICML 2021
Pointwise Binary Classification with Pairwise Confidence Comparisons ICML 2021
Learning Noise Transition Matrix from Only Noisy Labels via Total Variation Regularization ICML 2021
Incorporating causal graphical prior knowledge into predictive modeling via simple data augmentation UAI 2021
Robust Imitation Learning from Noisy Demonstrations AISTATS 2021
Fenchel-Young Losses with Skewed Entropies for Class-posterior Probability Estimation AISTATS 2021
γ-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator AISTATS 2021
A unified view of likelihood ratio and reparameterization gradients AISTATS 2021
Progressive Identification of True Labels for Partial-Label Learning ICML 2020
Do We Need Zero Training Loss After Achieving Zero Training Error? ICML 2020
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust ICML 2020
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels ICML 2020
Learning with Multiple Complementary Labels ICML 2020
Accelerating the diffusion-based ensemble sampling by non-reversible dynamics ICML 2020
A One-step Approach to Covariate Shift Adaptation ACML 2020
Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach AISTATS 2020
Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification AISTATS 2020
Rethinking Importance Weighting for Deep Learning under Distribution Shift NIPS 2020
Provably Consistent Partial-Label Learning NIPS 2020
Calibrated Surrogate Losses for Adversarially Robust Classification COLT 2020
Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring NIPS 2020
Learning from Aggregate Observations NIPS 2020
Part-dependent Label Noise: Towards Instance-dependent Label Noise NIPS 2020
Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning NIPS 2020
Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators NIPS 2020
Partially Zero-shot Domain Adaptation from Incomplete Target Data with Missing Classes WACV 2020
Binary Classification from Positive Data with Skewed Confidence IJCAI 2020
Attacks Which Do Not Kill Training Make Adversarial Learning Stronger ICML 2020
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks Using PAC-Bayesian Analysis ICML 2020
Few-shot Domain Adaptation by Causal Mechanism Transfer ICML 2020
Variational Imitation Learning with Diverse-quality Demonstrations ICML 2020
Online Dense Subgraph Discovery via Blurred-Graph Feedback ICML 2020
How does Disagreement Help Generalization against Label Corruption? ICML 2019
Clipped Matrix Completion: A Remedy for Ceiling Effects AAAI 2019
Imitation Learning from Imperfect Demonstration ICML 2019
Uncoupled Regression from Pairwise Comparison Data NIPS 2019
On the Calibration of Multiclass Classification with Rejection NIPS 2019
Dueling Bandits with Qualitative Feedback AAAI 2019
Learning Only from Relevant Keywords and Unlabeled Documents IJCNLP 2019
On Symmetric Losses for Learning from Corrupted Labels ICML 2019
Zero-shot Domain Adaptation Based on Attribute Information ACML 2019
Complementary-Label Learning for Arbitrary Losses and Models ICML 2019
Classification from Positive, Unlabeled and Biased Negative Data ICML 2019
Bézier Simplex Fitting: Describing Pareto Fronts of Simplicial Problems with Small Samples in Multi-Objective Optimization AAAI 2019
On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data ICLR 2019
Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization ICLR 2019
Are Anchor Points Really Indispensable in Label-Noise Learning? NIPS 2019
Learning Only from Relevant Keywords and Unlabeled Documents EMNLP 2019
Unsupervised Domain Adaptation Based on Source-Guided Discrepancy AAAI 2019
Bayesian Posterior Approximation via Greedy Particle Optimization AAAI 2019
Analysis of Minimax Error Rate for Crowdsourcing and Its Application to Worker Clustering Model ICML 2018
Variational Inference based on Robust Divergences AISTATS 2018
A fully adaptive algorithm for pure exploration in linear bandits AISTATS 2018
Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling AISTATS 2018
Guide Actor-Critic for Continuous Control ICLR 2018
Mode-Seeking Clustering and Density Ridge Estimation via Direct Estimation of Density-Derivative-Ratios JMLR 2018
Binary Classification from Positive-Confidence Data NIPS 2018
Masking: A New Perspective of Noisy Supervision NIPS 2018
Co-teaching: Robust training of deep neural networks with extremely noisy labels NIPS 2018
Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces NIPS 2018
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks NIPS 2018
Uplift Modeling from Separate Labels NIPS 2018
Classification from Pairwise Similarity and Unlabeled Data ICML 2018
Does Distributionally Robust Supervised Learning Give Robust Classifiers? ICML 2018
Generative Local Metric Learning for Kernel Regression NIPS 2017
Whitening-Free Least-Squares Non-Gaussian Component Analysis ACML 2017
Expectation Propagation for t-Exponential Family Using q-Algebra NIPS 2017
Estimating Density Ridges by Direct Estimation of Density-Derivative-Ratios AISTATS 2017
Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data ICML 2017
Learning from Complementary Labels NIPS 2017
Learning Discrete Representations via Information Maximizing Self-Augmented Training ICML 2017
Least-Squares Log-Density Gradient Clustering for Riemannian Manifolds AISTATS 2017
Positive-Unlabeled Learning with Non-Negative Risk Estimator NIPS 2017
Structure Learning of Partitioned Markov Networks ICML 2016
Multitask Principal Component Analysis ACML 2016
Non-Gaussian Component Analysis with Log-Density Gradient Estimation AISTATS 2016
Geometry-aware stationary subspace analysis ACML 2016
Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning NIPS 2016
Continuous Target Shift Adaptation in Supervised Learning ACML 2015
Geometry-Aware Principal Component Analysis for Symmetric Positive Definite Matrices ACML 2015
Stroke-Based Stylization Learning and Rendering with Inverse Reinforcement Learning IJCAI 2015
Condition for Perfect Dimensionality Recovery by Variational Bayesian PCA JMLR 2015
Convex Formulation for Learning from Positive and Unlabeled Data ICML 2015
Regularized Policy Gradients: Direct Variance Reduction in Policy Gradient Estimation ACML 2015
Direct Density-Derivative Estimation and Its Application in KL-Divergence Approximation AISTATS 2015
Sufficient Dimension Reduction via Direct Estimation of the Gradients of Logarithmic Conditional Densities ACML 2015
Class-prior Estimation for Learning from Positive and Unlabeled Data ACML 2015
Multitask learning meets tensor factorization: task imputation via convex optimization NIPS 2014
Outlier Path: A Homotopy Algorithm for Robust SVM ICML 2014
Transductive Learning with Multi-class Volume Approximation ICML 2014
Bias Reduction and Metric Learning for Nearest-Neighbor Estimation of Kullback-Leibler Divergence AISTATS 2014
Analysis of Empirical MAP and Empirical Partially Bayes: Can They be Alternatives to Variational Bayes? AISTATS 2014
Analysis of Learning from Positive and Unlabeled Data NIPS 2014
Analysis of Variational Bayesian Latent Dirichlet Allocation: Weaker Sparsity Than MAP NIPS 2014
Parametric Task Learning NIPS 2013
Infinitesimal Annealing for Training Semi-Supervised Support Vector Machines ICML 2013
Squared-loss Mutual Information Regularization: A Novel Information-theoretic Approach to Semi-supervised Learning ICML 2013
Maximum Volume Clustering: A New Discriminative Clustering Approach JMLR 2013
Global Analytic Solution of Fully-observed Variational Bayesian Matrix Factorization JMLR 2013
Global Solver and Its Efficient Approximation for Variational Bayesian Low-rank Subspace Clustering NIPS 2013
Density-Difference Estimation NIPS 2012
Perfect Dimensionality Recovery by Variational Bayesian PCA NIPS 2012
Sparse Additive Matrix Factorization for Robust PCA and Its Generalization ACML 2012
Fast Learning Rate of Multiple Kernel Learning: Trade-Off between Sparsity and Smoothness AISTATS 2012
Cross-Domain Object Matching with Model Selection AISTATS 2011
Global Solution of Fully-Observed Variational Bayesian Matrix Factorization is Column-Wise Independent NIPS 2011
Maximum Volume Clustering AISTATS 2011
Theoretical Analysis of Bayesian Matrix Factorization JMLR 2011
A Refined Margin Analysis for Boosting Algorithms via Equilibrium Margin JMLR 2011
Relative Density-Ratio Estimation for Robust Distribution Comparison NIPS 2011
Analysis and Improvement of Policy Gradient Estimation NIPS 2011
Super-Linear Convergence of Dual Augmented Lagrangian Algorithm for Sparsity Regularized Estimation JMLR 2011
Computationally Efficient Sufficient Dimension Reduction via Squared-Loss Mutual Information ACML 2011
Target Neighbor Consistent Feature Weighting for Nearest Neighbor Classification NIPS 2011
Conditional Density Estimation via Least-Squares Density Ratio Estimation AISTATS 2010
Global Analytic Solution for Variational Bayesian Matrix Factorization NIPS 2010
Sufficient Dimension Reduction via Squared-loss Mutual Information Estimation AISTATS 2010
Single versus Multiple Sorting in All Pairs Similarity Search ACML 2010
A Least-squares Approach to Direct Importance Estimation JMLR 2009
Efficient Direct Density Ratio Estimation for Non-stationarity Adaptation and Outlier Detection NIPS 2008
Covariate Shift Adaptation by Importance Weighted Cross Validation JMLR 2007
Multi-Task Learning via Conic Programming NIPS 2007
Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation NIPS 2007
Dimensionality Reduction of Multimodal Labeled Data by Local Fisher Discriminant Analysis JMLR 2007
Active Learning in Approximately Linear Regression Based on Conditional Expectation of Generalization Error JMLR 2006
Mixture Regression for Covariate Shift NIPS 2006
In Search of Non-Gaussian Components of a High-Dimensional Distribution JMLR 2006
The Subspace Information Criterion for Infinite Dimensional Hypothesis Spaces JMLR 2002