Weinan Zhang
127 papers · 2012–2026 · 14 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (21)
Keyword Pioneer
Hot Topic Early Bird
Renaissance Researcher (5)
Interdisciplinary Bridge
Cross-Pollinator (14)
Conference Loyalist (20)
Keyword Champion (4)
Triple Crown
Topic Evolution
Grand Slam
Mega-Team (32)
Deep Specialist (20)
Dynamic Duo (52)
Conference Pioneer
Prolific Year (12)
Unstoppable (9)
Keyword Collector (77)
Trend Setter
Century Club (116)
The Questioner
Conferences
NIPS (20)
ACL (19)
AAAI (16)
IJCAI (15)
ICLR (14)
ICML (12)
EMNLP (11)
COLING (5)
CORL (4)
AISTATS (3)
JMLR (3)
IJCNLP (2)
RSS (2)
NAACL (1)
Research topics
Keywords
reinforcement learning (20)
offline reinforcement learning (9)
large language model (9)
policy optimization (8)
multi-agent system (8)
multi-agent reinforcement learning (8)
recommender system (6)
sample efficiency (5)
model-based reinforcement learning (5)
representation learning (5)
diffusion model (5)
transfer learning (5)
policy gradient (4)
graph neural network (4)
proximal policy optimization (4)
sequence generation (4)
generative adversarial network (4)
neural machine translation (4)
code generation (4)
process reward model (4)
Papers
ColorBrowserAgent: Complex Long-Horizon Browser Agent with Adaptive Knowledge Evolution
ACL 2026
Exploring and Distilling Multi-Dimensional Clues for Interpretable Social Bot Detection
ACL 2026
CoreCodeBench: Decoupling Code Intelligence via Fine-Grained Repository-Level Tasks
ACL 2026
Bridging Scale Discrepancies in Robotic Control via Language-Based Action Representations
AAAI 2026
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls
ACL 2026
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
ACL 2026
A Survey of Large Language Model-Based Search Agents
ACL 2026
ACE-Router: Generalizing History-Aware Routing from MCP Tools to the Agent Web
ACL 2026
A Comprehensive Survey of Process Reward Models: Data Generation, Model Construction, and Usage
ACL 2026
Offline Fictitious Self-Play for Competitive Games
AAAI 2026
PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork
AAAI 2026
Beyond Graph Convolution: Multimodal Recommendation with Topology-aware MLPs
AAAI 2025
Stimulate the Critical Thinking of LLMs via Debiasing Discussion
EMNLP 2025
RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation
EMNLP 2025
NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging
EMNLP 2025
BeamDojo: Learning Agile Humanoid Locomotion on Sparse Footholds
RSS 2025
A Unified and General Humanoid Whole-Body Controller for Fine-Grained Locomotion
RSS 2025
Fast Second-Order Online Kernel Learning Through Incremental Matrix Sketching and Decomposition
IJCAI 2025
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
ICML 2025
Large Language Models are Demonstration Pre-Selectors for Themselves
ICML 2025
ContraDiff: Planning Towards High Return States via Contrastive Learning
ICLR 2025
Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency
ICLR 2025
Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation
AAAI 2025
Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation
ACL 2025
Robust Function-Calling for On-Device Language Model via Function Masking
ICLR 2025
Nullspace Disentanglement for Red Teaming Language Models
EMNLP 2025
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
ACL 2025
DebateCoder: Towards Collective Intelligence of LLMs via Test Case Driven LLM Debate for Code Generation
ACL 2025
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Assistant Scenarios
ACL 2025
CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation
ACL 2025
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning
ACL 2025
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training
NIPS 2024
ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
NIPS 2024
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization
NIPS 2024
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning
NIPS 2024
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
NIPS 2024
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on RDBs
NIPS 2024
Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective
CORL 2024
OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
CORL 2024
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs
CORL 2024
ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update
ICLR 2024
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
ACL 2024
MADiff: Offline Multi-agent Learning with Diffusion Models
NIPS 2024
Prove Your Point!: Bringing Proof-Enhancement Principles to Argumentative Essay Generation
EMNLP 2024
Vision-Language Foundation Models as Effective Robot Imitators
ICLR 2024
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
ICML 2024
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
ICML 2024
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
JMLR 2023
Lending Interaction Wings to Recommender Systems with Conversational Agents
NIPS 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
NIPS 2023
Set-to-Sequence Ranking-Based Concept-Aware Learning Path Recommendation
AAAI 2023
Learning Decomposed Spatial Relations for Multi-Variate Time-Series Modeling
AAAI 2023
Harnessing the Power of Large Language Models for Empathetic Response Generation: Empirical Investigations and Improvements
EMNLP 2023
Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue
EMNLP 2023
Order Matters: Agent-by-agent Policy Optimization
ICLR 2023
Visual Imitation Learning with Patch Rewards
ICLR 2023
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
ICML 2023
Large Decision Models
IJCAI 2023
Adaptation Augmented Model-based Policy Optimization
JMLR 2023
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
NIPS 2022
CLIP Models are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment
ACL 2022
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation
COLING 2022
Neural Re-ranking in Multi-stage Recommender Systems: A Review
IJCAI 2022
Goal-Conditioned Reinforcement Learning: Problems and Solutions
IJCAI 2022
Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble
IJCAI 2022
Learning Enhanced Representation for Tabular Data via Neighborhood Propagation
NIPS 2022
Why Propagate Alone? Parallel Use of Labels and Features on Graphs
ICLR 2022
Inductive Relation Prediction Using Analogy Subgraph Embeddings
ICLR 2022
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
NIPS 2022
Reinforcement Learning with Automated Auxiliary Loss Search
NIPS 2022
Nested Named Entity Recognition with Span-level Graphs
ACL 2022
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
NIPS 2022
Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection
AAAI 2022
Bootstrapped Transformer for Offline Reinforcement Learning
NIPS 2022
PerfectDou: Dominating DouDizhu with Perfect Information Distillation
NIPS 2022
Towards Return Parity in Markov Decision Processes
AISTATS 2022
Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization
ICML 2022
SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation
COLING 2022
Deep Learning for Click-Through Rate Estimation
IJCAI 2021
Glancing Transformer for Non-Autoregressive Neural Machine Translation
ACL 2021
On Effective Scheduling of Model-based Reinforcement Learning
NIPS 2021
Learning Logic Rules for Document-Level Relation Extraction
EMNLP 2021
Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts
IJCAI 2021
MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks
IJCAI 2021
MARS: Markov Molecular Sampling for Multi-objective Drug Discovery
ICLR 2021
Curriculum Offline Imitating Learning
NIPS 2021
Data-Driven Multimodal Patrol Planning for Anti-poaching
AAAI 2021
Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings
AISTATS 2021
Universal Trading for Order Execution with Oracle Policy Distillation
AAAI 2021
Glancing Transformer for Non-Autoregressive Neural Machine Translation
IJCNLP 2021
Active Sentence Learning by Adversarial Uncertainty Sampling in Discrete Space
EMNLP 2020
Efficient Spectrum-Revealing CUR Matrix Decomposition
AISTATS 2020
Multi-Agent Interactions Modeling with Correlated Policies
ICLR 2020
GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation
ICLR 2020
Bidirectional Model-based Policy Optimization
ICML 2020
Multi-Agent Determinantal Q-Learning
ICML 2020
Towards Making the Most of BERT in Neural Machine Translation
AAAI 2020
Bi-Level Actor-Critic for Multi-Agent Coordination
AAAI 2020
Author Name Disambiguation on Heterogeneous Information Network with Adversarial Representation Learning
AAAI 2020
Aggregating Crowd Wisdom with Side Information via a Clustering-based Label-aware Autoencoder
IJCAI 2020
DropNAS: Grouped Operation Dropout for Differentiable Architecture Search
IJCAI 2020
Efficient and Robust High-Dimensional Linear Contextual Bandits
IJCAI 2020
Model-based Policy Optimization with Unsupervised Model Adaptation
NIPS 2020
Efficient Projection-free Algorithms for Saddle Point Problems
NIPS 2020
SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving
CORL 2020
Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient
AAAI 2019
Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space
IJCAI 2019
CoT: Cooperative Training for Generative Modeling of Discrete Data
ICML 2019
Academic Reader: An Interactive Question Answering System on Academic Literatures
AAAI 2019
Dynamically Fused Graph Network for Multi-hop Reasoning
ACL 2019
Lipschitz Generative Adversarial Nets
ICML 2019
AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods
ICLR 2019
Exploring Diverse Expressions for Paraphrase Generation
EMNLP 2019
Exploring Diverse Expressions for Paraphrase Generation
IJCNLP 2019
Deep Recurrent Survival Analysis
AAAI 2019
Context-Sensitive Generation of Open-Domain Conversational Responses
COLING 2018
Activation Maximization Generative Adversarial Nets
ICLR 2018
Learning to Design Games: Strategic Environments in Reinforcement Learning
IJCAI 2018
Label-Aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition
NAACL 2018
Mean Field Multi-Agent Reinforcement Learning
ICML 2018
Path-Level Network Transformation for Efficient Architecture Search
ICML 2018
Zero Pronoun Resolution with Attention-based Neural Network
COLING 2018
Aggregating Crowd Wisdoms with Label-aware Autoencoders
IJCAI 2017
Chinese Zero Pronoun Resolution with Deep Memory Network
EMNLP 2017
A Deep Neural Network for Chinese Zero Pronoun Resolution
IJCAI 2017
SVDFeature: A Toolkit for Feature-based Collaborative Filtering
JMLR 2012
The Use of Dependency Relation Graph to Enhance the Term Weighting in Question Retrieval
COLING 2012