Wei Xiong
43 papers · 2017–2025 · 12 top CS/AI conferences
Achievements
Academic Marathon (8)
Cross-Pollinator (14)
Conference Polyglot (12)
Interdisciplinary Bridge
Renaissance Researcher (7)
Taxonomy Completionist (69)
Hot Topic Early Bird
Dynamic Duo (14)
Triple Crown
Grand Slam
Topic Evolution
Keyword Champion (2)
Century Club (43)
Prolific Year (13)
Conference Pioneer
Unstoppable (9)
Keyword Collector (165)
Conferences
ICML (8)
CVPR (7)
NIPS (6)
ECCV (5)
EMNLP (4)
ICLR (4)
ACL (2)
ICCV (2)
NAACL (2)
AAAI (1)
AISTATS (1)
SEMEVAL (1)
Keywords
reinforcement learning from human feedback (5)
regret bound (4)
diffusion model (4)
markov game (3)
identity preservation (3)
image editing (3)
generative adversarial network (2)
language model (2)
image generation (2)
reinforcement learning (2)
human feedback (2)
adversarial training (2)
function approximation (2)
reward model (2)
policy learning (2)
contrastive learning (2)
markov decision process (2)
minimax optimization (2)
offline reinforcement learning (2)
multi-armed bandit (2)
Papers
DIVE: Taming DINO for Subject-Driven Video Editing
ICCV 2025
Building Math Agents with Multi-Turn Iterative Preference Learning
ICLR 2025
RRM: Robust Reward Model Training Mitigates Reward Hacking
ICLR 2025
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis
CVPR 2025
Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity
EMNLP 2025
Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
ICLR 2025
From Lists to Emojis: How Format Bias Affects Model Alignment
ACL 2025
Logarithmic Regret for Online KL-Regularized Reinforcement Learning
ICML 2025
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective
ICML 2025
DPO Meets PPO: Reinforced Token Optimization for RLHF
ICML 2025
polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design
EMNLP 2025
Mitigating the Alignment Tax of RLHF
EMNLP 2024
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts
EMNLP 2024
Relightful Harmonization: Lighting-aware Portrait Background Replacement
CVPR 2024
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
NAACL 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
NIPS 2024
Earthfarsser: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model
AAAI 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
ACL 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
ICML 2024
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
CVPR 2024
IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation
CVPR 2024
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing
ECCV 2024
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
ECCV 2024
WAS: Dataset and Methods for Artistic Text Segmentation
ECCV 2024
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
ICML 2023
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
ICML 2023
PHOTOSWAP: Personalized Subject Swapping in Images
NIPS 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game
ICLR 2023
ZhichunRoad at SemEval-2022 Task 2: Adversarial Training and Contrastive Learning for Multiword Representations
SEMEVAL 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
ICML 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
ICML 2022
ZhichunRoad at SemEval-2022 Task 2: Adversarial Training and Contrastive Learning for Multiword Representations
NAACL 2022
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions
NIPS 2021
Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization
NIPS 2021
(Almost) Free Incentivized Exploration from Decentralized Learning Agents
NIPS 2021
Example-Guided Image Synthesis using Masked Spatial-Channel Attention and Self-Supervision
ECCV 2020
Decentralized Multi-player Multi-armed Bandits with No Collision Information
AISTATS 2020
Fine-Grained Image-to-Image Transformation Towards Visual Recognition
CVPR 2020
Foreground-Aware Image Inpainting
CVPR 2019
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
CVPR 2018
Focus, Segment and Erase: An Efficient Network for Multi-Label Brain Tumor Segmentation
ECCV 2018
Regional Interactive Image Segmentation Networks
ICCV 2017