Wei Xiong

43 papers · 2017–2025 · 12 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🐝 Cross-Pollinator (14) 🌍 Conference Polyglot (12) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7)

🗺️ Taxonomy Completionist (69) 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (14) 👑 Triple Crown 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion (2) 💎 Century Club (43) ⚡ Prolific Year (13) 🚀 Conference Pioneer 🔥 Unstoppable (9) 🗃️ Keyword Collector (165)

Conferences

ICML (8) CVPR (7) NIPS (6) ECCV (5) EMNLP (4) ICLR (4) ACL (2) ICCV (2) NAACL (2) AAAI (1) AISTATS (1) SEMEVAL (1)

Top co-authors

Tong Zhang (14) Chengshuai Shi (7) He Zhang (6) Jianming Zhang (6) Cong Shen (6) Han Zhong (6) Zhifei Zhang (5) Zhe Lin (5) Chenlu Ye (4) Hanze Dong (4)

Keywords

reinforcement learning from human feedback (5) regret bound (4) diffusion model (4) markov game (3) identity preservation (3) image editing (3) generative adversarial network (2) language model (2) image generation (2) reinforcement learning (2) human feedback (2) adversarial training (2) function approximation (2) reward model (2) policy learning (2) contrastive learning (2) markov decision process (2) minimax optimization (2) offline reinforcement learning (2) multi-armed bandit (2)

Papers

DIVE: Taming DINO for Subject-Driven Video Editing ICCV 2025
Building Math Agents with Multi-Turn Iterative Preference Learning ICLR 2025
RRM: Robust Reward Model Training Mitigates Reward Hacking ICLR 2025
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis CVPR 2025
Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity EMNLP 2025
Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment ICLR 2025
From Lists to Emojis: How Format Bias Affects Model Alignment ACL 2025
Logarithmic Regret for Online KL-Regularized Reinforcement Learning ICML 2025
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective ICML 2025
DPO Meets PPO: Reinforced Token Optimization for RLHF ICML 2025
polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design EMNLP 2025
Mitigating the Alignment Tax of RLHF EMNLP 2024
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts EMNLP 2024
Relightful Harmonization: Lighting-aware Portrait Background Replacement CVPR 2024
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models NAACL 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model NIPS 2024
Earthfarsser: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model AAAI 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards ACL 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint ICML 2024
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning CVPR 2024
IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation CVPR 2024
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing ECCV 2024
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization ECCV 2024
WAS: Dataset and Methods for Artistic Text Segmentation ECCV 2024
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources ICML 2023
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes ICML 2023
PHOTOSWAP: Personalized Subject Swapping in Images NIPS 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration NIPS 2023
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game ICLR 2023
ZhichunRoad at SemEval-2022 Task 2: Adversarial Training and Contrastive Learning for Multiword Representations SEMEVAL 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games ICML 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets ICML 2022
ZhichunRoad at SemEval-2022 Task 2: Adversarial Training and Contrastive Learning for Multiword Representations NAACL 2022
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions NIPS 2021
Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization NIPS 2021
(Almost) Free Incentivized Exploration from Decentralized Learning Agents NIPS 2021
Example-Guided Image Synthesis using Masked Spatial-Channel Attention and Self-Supervision ECCV 2020
Decentralized Multi-player Multi-armed Bandits with No Collision Information AISTATS 2020
Fine-Grained Image-to-Image Transformation Towards Visual Recognition CVPR 2020
Foreground-Aware Image Inpainting CVPR 2019
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks CVPR 2018
Focus, Segment and Erase: An Efficient Network for Multi-Label Brain Tumor Segmentation ECCV 2018
Regional Interactive Image Segmentation Networks ICCV 2017