Xiaolong Wang
181 papers · 2005–2026 · 19 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (19)
Hot Topic Early Bird
Keyword Pioneer
Academic Marathon (20)
Renaissance Researcher (11)
Interdisciplinary Bridge
Cross-Pollinator (12)
Conference Loyalist (20)
The Namer
Dynamic Duo (22)
Triple Crown
Grand Slam
Deep Specialist (17)
Topic Evolution
Keyword Champion (6)
Conference Pioneer
Prolific Year (30)
Keyword Collector (464)
Century Club (178)
Unstoppable (18)
Trend Setter
Conferences
CVPR (36)
ICLR (20)
CORL (18)
ACL (14)
ICCV (14)
NIPS (14)
ECCV (10)
IJCNLP (9)
COLING (8)
ICML (8)
RSS (6)
CONLL (5)
EMNLP (5)
SEMEVAL (4)
IJCAI (3)
AAAI (3)
NAACL (2)
JMLR (1)
WACV (1)
Keywords
self-supervised learning (12)
reinforcement learning (8)
large language model (7)
novel view synthesis (7)
convolutional neural network (7)
contrastive learning (6)
object detection (6)
sim-to-real transfer (6)
representation learning (6)
dexterous manipulation (5)
neural radiance field (5)
image generation (4)
camera pose estimation (4)
domain adaptation (4)
imitation learning (4)
dialogue system (4)
3d reconstruction (4)
few-shot learning (4)
video generation (4)
zero-shot learning (4)
Papers
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
AAAI 2026
UR2: Unify RAG and Reasoning through Reinforcement Learning
ACL 2026
Beyond "I Don’t Know": Evaluating LLM Self-Awareness in Discriminating Data and Model Uncertainty
ACL 2026
Humanoid Policy ~ Human Policy
CORL 2025
Co-Design of Soft Gripper with Neural Physics
CORL 2025
Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
CORL 2025
Perspective Transition of Large Language Models for Solving Subjective Tasks
ACL 2025
HomoMatcher: Achieving Dense Feature Matching with Semi-Dense Efficiency by Homography Estimation
AAAI 2025
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
ACL 2025
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
ICML 2025
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes
ICLR 2025
3D-Spatial MultiModal Memory
ICLR 2025
Hierarchical World Models as Visual Whole-Body Humanoid Controllers
ICLR 2025
Consistent Flow Distillation for Text-to-3D Generation
ICLR 2025
Hallucination Detection in Structured Query Generation via LLM Self-Debating
EMNLP 2025
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models
EMNLP 2025
One-Minute Video Generation with Test-Time Training
CVPR 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
CVPR 2025
Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation
RSS 2025
EditAR: Unified Conditional Generation with Autoregressive Models
CVPR 2025
AMO: Adaptive Motion Optimization for Hyper-Dexterous Humanoid Whole-Body Control
RSS 2025
NaVILA: Legged Robot Vision-Language-Action Model for Navigation
RSS 2025
Test-Time Training on Video Streams
JMLR 2025
EmoCharacter: Evaluating the Emotional Fidelity of Role-Playing Agents in Dialogues
NAACL 2025
ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training
CORL 2025
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning
CORL 2025
Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
CORL 2024
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data
NIPS 2024
SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models
NIPS 2024
Visual Whole-Body Control for Legged Loco-Manipulation
CORL 2024
GraspSplats: Efficient Manipulation with 3D Feature Splatting
CORL 2024
Lessons from Learning to Spin “Pens”
CORL 2024
Visual Manipulation with Legs
CORL 2024
Generalized Animal Imitator: Agile Locomotion with Versatile Motion Prior
CORL 2024
ACE: A Cross-platform and visual-Exoskeletons System for Low-Cost Dexterous Teleoperation
CORL 2024
CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models
ACL 2024
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages
ACL 2024
Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models
ACL 2024
DEEM: Dynamic Experienced Expert Modeling for Stance Detection
COLING 2024
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
COLING 2024
RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos
CVPR 2024
Pixel-Aligned Language Model
CVPR 2024
Image Neural Field Diffusion Models
CVPR 2024
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data
CVPR 2024
COLMAP-Free 3D Gaussian Splatting
CVPR 2024
CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation
CVPR 2024
Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios
CVPR 2024
Editable Image Elements for Controllable Synthesis
ECCV 2024
PointLLM: Empowering Large Language Models to Understand Point Clouds
ECCV 2024
Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting
ECCV 2024
GenSim: Generating Robotic Simulation Tasks via Large Language Models
ICLR 2024
TD-MPC2: Scalable, Robust World Models for Continuous Control
ICLR 2024
3D Reconstruction with Generalizable Neural Fields using Scene Priors
ICLR 2024
TUVF: Learning Generalizable Texture UV Radiance Fields
ICLR 2024
Expressive Whole-Body Control for Humanoid Robots
RSS 2024
A Multimodal Benchmark and Improved Architecture for Zero Shot Learning
WACV 2024
Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding
AAAI 2023
DexArt: Benchmarking Generalizable Dexterous Manipulation With Articulated Objects
CVPR 2023
Dynamic Inference With Grounding Based Vision and Language Models
CVPR 2023
Open-Vocabulary Panoptic Segmentation With Text-to-Image Diffusion Models
CVPR 2023
Zero-Shot Pose Transfer for Unrigged Stylized 3D Characters
CVPR 2023
Policy Adaptation From Foundation Model Feedback
CVPR 2023
Neural Volumetric Memory for Visual Locomotion Control
CVPR 2023
Elastic Decision Transformer
NIPS 2023
Learning Dense Correspondences between Photos and Sketches
ICML 2023
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System
RSS 2023
Rotating without Seeing: Towards In-hand Dexterity through Touch
RSS 2023
FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models
ICCV 2023
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
ICML 2023
MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses
ICML 2023
ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs
ICCV 2023
Dynamic Handover: Throw and Catch with Bimanual Hands
CORL 2023
Finetuning Offline World Models in the Real World
CORL 2023
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields
CORL 2023
Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator
NIPS 2023
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
ICLR 2023
Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild
ICLR 2023
MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
ICLR 2023
GIFS: Neural Implicit Function for General Shape Representation
CVPR 2022
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
ECCV 2022
Scraping Textures from Natural Images for Synthesis and Editing
ECCV 2022
Transformers As Meta-Learners for Implicit Neural Representations
ECCV 2022
Learning Implicit Feature Alignment Function for Semantic Segmentation
ECCV 2022
Temporal Difference Learning for Model Predictive Control
ICML 2022
Graph Inverse Reinforcement Learning from Diverse Videos
CORL 2022
DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation
CORL 2022
Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset
NIPS 2022
CoordGAN: Self-Supervised Dense Correspondences Emerge From GANs
CVPR 2022
VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution
CVPR 2022
Learning Generalizable Dexterous Manipulation from Human Grasp Affordance
CORL 2022
Look Outside the Room: Synthesizing a Consistent Long-Term 3D Scene Video From a Single Image
CVPR 2022
GroupViT: Semantic Segmentation Emerges From Text Supervision
CVPR 2022
Joint Hand Motion and Interaction Hotspots Prediction From Egocentric Videos
CVPR 2022
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
ICLR 2022
Learning Continuous Environment Fields via Implicit Functions
ICLR 2022
Multi-Person 3D Motion Prediction with Multi-Range Transformers
NIPS 2021
Semi-Supervised 3D Hand-Object Poses Estimation With Interactions in Time
CVPR 2021
Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes
CVPR 2021
Learning Continuous Image Representation With Local Implicit Image Function
CVPR 2021
Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
NIPS 2021
Test-Time Personalization with a Transformer for Human Pose Estimation
NIPS 2021
Rethinking Self-Supervised Correspondence Learning: A Video Frame-Level Similarity Perspective
ICCV 2021
Video Autoencoder: Self-Supervised Disentanglement of Static 3D Structure and Motion
ICCV 2021
Contrastive Learning of Image Representations With Cross-Video Cycle-Consistency
ICCV 2021
Robust Object Detection via Instance-Level Temporal Cycle Confusion
ICCV 2021
A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation
ICCV 2021
Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning
ICCV 2021
Region Similarity Representation Learning
ICCV 2021
Hand-Object Contact Consistency Reasoning for Human Grasps Generation
ICCV 2021
Rethinking Preventing Class-Collapsing in Metric Learning With Margin-Based Losses
ICCV 2021
Solving Compositional Reinforcement Learning Problems via Task Reduction
ICLR 2021
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization
ICLR 2021
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
ICLR 2021
What Should Not Be Contrastive in Contrastive Learning
ICLR 2021
Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency
ICLR 2021
Self-Supervised Policy Adaptation during Deployment
ICLR 2021
Compositional Video Synthesis with Action Graphs
ICML 2021
NovelD: A Simple yet Effective Exploration Criterion
NIPS 2021
Deep Isometric Learning for Visual Recognition
ICML 2020
Multi-Task Reinforcement Learning with Soft Modularization
NIPS 2020
MedWriter: Knowledge-Aware Medical Text Generation
COLING 2020
Hierarchical Style-based Networks for Motion Synthesis
ECCV 2020
Online Adaptation for Consistent Mesh Reconstruction in the Wild
NIPS 2020
Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks
CVPR 2020
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
ICML 2020
Continual Learning Long Short Term Memory
EMNLP 2020
Learning Correspondence From the Cycle-Consistency of Time
CVPR 2019
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments
CVPR 2019
Joint-task Self-supervised Learning for Temporal Correspondence
NIPS 2019
Visual Semantic Navigation using Scene Priors
ICLR 2019
A Deep Learning-Based System for PharmaCoNER
EMNLP 2019
LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics
NAACL 2018
Interpretable Intuitive Physics Model
ECCV 2018
Videos as Space-Time Region Graphs
ECCV 2018
Dynamically Hierarchy Revolution: DirNet for Compressing Recurrent Neural Network on Mobile Devices
IJCAI 2018
Non-Local Neural Networks
CVPR 2018
Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs
CVPR 2018
3D Human Pose Estimation in the Wild by Adversarial Learning
CVPR 2018
Transitive Invariance for Self-Supervised Visual Representation Learning
ICCV 2017
Temporal Dynamic Graph LSTM for Action-Driven Video Object Detection
ICCV 2017
Predicting Users’ Negative Feedbacks in Multi-Turn Human-Computer Dialogues
IJCNLP 2017
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
CVPR 2017
Binge Watching: Scaling Affordance Learning From Sitcoms
CVPR 2017
Neural Response Generation via GAN with an Approximate Embedding Layer
EMNLP 2017
Incorporating Label Dependency for Answer Quality Tagging in Community Question Answering via CNN-LSTM-CRF
COLING 2016
Actions ~ Transformations
CVPR 2016
Answer Sequence Learning with Neural Networks for Answer Selection in Community Question Answering
IJCNLP 2015
ICRC-HIT: A Deep Learning based Comment Sequence Labeling System for Answer Selection Challenge
SEMEVAL 2015
HITSZ-ICRC: An Integration Approach for QA TempEval Challenge
SEMEVAL 2015
Designing Deep Networks for Surface Normal Estimation
CVPR 2015
Modeling Mention, Context and Entity with Neural Networks for Entity Disambiguation
IJCAI 2015
VRCA: A Clustering Algorithm for Massive Amount of Texts
IJCAI 2015
yiGou: A Semantic Text Similarity Computing System Based on SVM
SEMEVAL 2015
HITSZ-ICRC: Exploiting Classification Approach for Answer Selection in Community Question Answering
SEMEVAL 2015
Unsupervised Learning of Visual Representations Using Videos
ICCV 2015
Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory
ACL 2015
Answer Sequence Learning with Neural Networks for Answer Selection in Community Question Answering
ACL 2015
Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory
IJCNLP 2015
Hybrid Deep Belief Networks for Semi-supervised Sentiment Classification
COLING 2014
Identification of Basic Phrases for Kazakh Language using Maximum Entropy Model
COLING 2014
Cross-lingual Opinion Analysis via Negative Transfer Detection
ACL 2014
WINGS: Writing with Intelligent Guidance and Suggestions
ACL 2014
Deep Joint Task Learning for Generic Object Extraction
NIPS 2014
Grammatical Error Correction Using Feature Selection and Confidence Tuning
IJCNLP 2013
Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
CVPR 2013
A Hybrid Model For Grammatical Error Correction
CONLL 2013
Multimodal DBN for Predicting High-Quality Answers in cQA portals
ACL 2013
PAL: A Chatterbot System for Answering Domain-specific Questions
ACL 2013
Automatic Corpora Construction for Text Classification
IJCNLP 2013
Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
NIPS 2012
A Mixed Deterministic Model for Coreference Resolution
CONLL 2012
Generating Questions from Web Community Contents
COLING 2012
Diversifying Information Needs in Results of Question Retrieval
IJCNLP 2011
A Cascade Method for Detecting Hedges and their Scope in Natural Language Text
CONLL 2010
Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
ACL 2010
Active Deep Networks for Semi-Supervised Sentiment Classification
COLING 2010
A Joint Syntactic and Semantic Dependency Parsing System based on Maximum Entropy Models
CONLL 2009
Name Origin Recognition Using Maximum Entropy Model and Diverse Features
IJCNLP 2008
Discriminative Learning of Syntactic and Semantic Dependencies
CONLL 2008
Detecting Segmentation Errors in Chinese Annotated Corpus
IJCNLP 2005
Principles of Non-stationary Hidden Markov Model and Its Applications to Sequence Labeling Task
IJCNLP 2005