Nan Jiang
90 papers · 2014–2025 · 18 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (18)
Conference Polyglot (18)
Interdisciplinary Bridge
Hot Topic Early Bird
Keyword Pioneer
Keyword Trendsetter Combo (3)
Conference Loyalist (22)
Topic Evolution
Dynamic Duo (10)
Grand Slam
Triple Crown
Topic Pioneer
Deep Specialist (21)
Keyword Champion (3)
Prolific Year (12)
Keyword Collector (60)
Trend Setter
Century Club (90)
Unstoppable (12)
The Questioner (3)
Conference Pioneer
Conferences
NIPS (22)
ICML (17)
ICLR (10)
AAAI (7)
ACL (4)
AISTATS (4)
COLT (4)
CVPR (4)
IJCAI (4)
UAI (3)
EMNLP (3)
JMLR (2)
WACV (1)
NAACL (1)
ICCV (1)
ECCV (1)
CORL (1)
ALT (1)
Keywords
reinforcement learning (11)
off-policy evaluation (10)
offline reinforcement learning (10)
function approximation (8)
sample complexity (7)
value function (6)
markov decision process (6)
model-based reinforcement learning (5)
policy optimization (5)
importance sampling (5)
minimax optimization (4)
representation learning (4)
language model (4)
density ratio (4)
code generation (3)
policy learning (3)
value function approximation (3)
sequential decision making (3)
policy gradient (3)
sample efficiency (3)
Papers
Can Language Models Replace Programmers for Coding? REPOCOD Says “Not Yet”
ACL 2025
Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
ICLR 2025
DART: Distilling Autoregressive Reasoning to Silent Thought
EMNLP 2025
Commit0: Library Generation from Scratch
ICLR 2025
WAFFLE: Fine-tuning Multi-Modal Model for Automated Front-End Development
ACL 2025
MLLM-as-a-Judge for Image Safety without Human Labeling
CVPR 2025
Dynamic Motion Blending for Versatile Motion Editing
CVPR 2025
GameArena: Evaluating LLM Reasoning through Live Computer Games
ICLR 2025
Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching
AAAI 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
ICLR 2025
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
ICLR 2025
LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement
AAAI 2025
Solving Satisfiability Modulo Counting Exactly with Probabilistic Circuits
ICML 2025
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
ICML 2025
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
ICLR 2024
Mitigating the Alignment Tax of RLHF
EMNLP 2024
Scaling Up Dynamic Human-Scene Interaction Modeling
CVPR 2024
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
ECCV 2024
Model-Free Representation Learning and Exploration in Low-Rank MDPs
JMLR 2024
Vertical Symbolic Regression via Deep Policy Gradient
IJCAI 2024
Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality
NIPS 2024
PhyRecon: Physically Plausible Neural Scene Reconstruction
NIPS 2024
LeDex: Training LLMs to Better Self-Debug and Explain Code
NIPS 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
NIPS 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
NIPS 2024
Reinforcement Learning Under Latent Dynamics: Toward Statistical and Algorithmic Modularity
NIPS 2024
Racing Control Variable Genetic Programming for Symbolic Regression
AAAI 2024
Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration with Provable Guarantees
AAAI 2024
Word Embeddings Are Steers for Language Models
ACL 2024
Harnessing Density Ratios for Online Reinforcement Learning
ICLR 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
ICML 2024
The Role of Coverage in Online Reinforcement Learning
ICLR 2023
Adversarial Model for Offline Reinforcement Learning
NIPS 2023
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation
ICML 2023
Reinforcement Learning in Low-rank MDPs with Density Features
ICML 2023
Offline Learning in Markov Games with General Function Approximation
ICML 2023
Explaining RL Decisions with Trajectories
ICLR 2023
Full-Body Articulated Human-Object Interaction
ICCV 2023
Learning Markov Random Fields for Combinatorial Structures via Sampling through Lovász Local Lemma
AAAI 2023
Marginalized Importance Sampling for Off-Environment Policy Evaluation
CORL 2023
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
NIPS 2023
Interaction-Grounded Learning with Action-Inclusive Feedback
NIPS 2022
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
NIPS 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
NIPS 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
NIPS 2022
A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation
NIPS 2022
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
AISTATS 2022
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
COLT 2022
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
ICLR 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning
ICML 2022
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
ICML 2022
Constraint Reasoning Embedded Structured Prediction
JMLR 2022
Offline reinforcement learning under value and density-ratio realizability: The power of gaps
UAI 2022
Bellman-consistent Pessimism for Offline Reinforcement Learning
NIPS 2021
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
AAAI 2021
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function
COLT 2021
Minimax Model Learning
AISTATS 2021
Batch Value-function Approximation with Only Realizability
ICML 2021
PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment
UAI 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
NIPS 2021
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
NIPS 2021
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
AAAI 2020
Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach
EMNLP 2020
Scale Match for Tiny Person Detection
WACV 2020
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
AISTATS 2020
A Question Type Driven and Copy Loss Enhanced Framework for Answer-Agnostic Neural Question Generation
ACL 2020
From Importance Sampling to Doubly Robust Policy Gradient
ICML 2020
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
ICML 2020
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison
UAI 2020
When Counterpoint Meets Chinese Folk Melodies
NIPS 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
NIPS 2020
Provably Efficient Q-Learning with Low Switching Cost
NIPS 2019
Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches
COLT 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
ICML 2019
Provably efficient RL with Rich Observations via Latent State Decoding
ICML 2019
LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics
NAACL 2018
Markov Decision Processes with Continuous Side Information
ALT 2018
Hierarchical Imitation and Reinforcement Learning
ICML 2018
On Oracle-Efficient PAC RL with Rich Observations
NIPS 2018
Completing State Representations using Spectral Learning
NIPS 2018
Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon
COLT 2018
Exploration of Tree-based Hierarchical Softmax for Recurrent Language Models
IJCAI 2017
Contextual Decision Processes with low Bellman rank are PAC-Learnable
ICML 2017
Repeated Inverse Reinforcement Learning
NIPS 2017
The Dependence of Effective Planning Horizon on Model Accuracy
IJCAI 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
ICML 2016
On Structural Properties of MDPs that Bound Loss Due to Shallow Planning
IJCAI 2016
Abstraction Selection in Model-based Reinforcement Learning
ICML 2015
Low-Rank Spectral Learning with Weighted Loss Functions
AISTATS 2015
Unifying Spatial and Attribute Selection for Distracter-Resilient Tracking
CVPR 2014