conftrace_

Wanli Ouyang

234 papers · 2013–2026 · 12 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (16) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🌟 Keyword Trendsetter Combo (7) 🏠 Conference Loyalist (23) 🤝 Dynamic Duo (43) 🌱 Topic Pioneer 🔬 Deep Specialist (39) 🧬 Topic Evolution 🏆 Keyword Champion (10) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (20) 📈 Trend Setter 💎 Century Club (230) ⚡ Prolific Year (20) 🔥 Unstoppable (13) ❓ The Questioner (2) 🗃️ Keyword Collector (757) 🚀 Conference Pioneer

Conferences

CVPR (77) ICCV (40) AAAI (25) ECCV (25) NIPS (24) ICLR (15) ACL (10) ICML (7) IJCAI (4) EMNLP (3) WACV (3) NAACL (1)

Top co-authors

Xiaogang Wang (43) Lei Bai (35) Tong He (22) Shixiang Tang (22) Junjie Yan (16) Dongzhan Zhou (15) Peng Ye (15) Hongsheng Li (14) Rui Zhao (14) Di Huang (14)

Research topics

Robotics (1) Applications (1)

Keywords

convolutional neural network (27) object detection (17) large language model (15) point cloud (13) person re-identification (12) human pose estimation (12) neural architecture search (11) transfer learning (10) representation learning (10) self-supervised learning (10) feature extraction (10) 3d object detection (10) neural network (9) semantic segmentation (9) feature representation (8) pose estimation (8) multimodal learning (8) image classification (8) autonomous driving (8) deep learning (8)

Papers

ARCHE: A Novel Task to Evaluate LLMs on Latent Reasoning Chain Extraction · AAAI 2026
A Scalable Multi-LLM Collaboration System with Retrieval-based Selection and Exploration-Exploitation-Driven Enhancement · ACL 2026
Nature-Inspired Population-Based Evolution of Large Language Models · ACL 2026
Mitigating Low-Quality Reasoning in MLLMs: Self-Driven Refined Multimodal CoT with Selective Thinking and Step-wise Visual Enhancement · AAAI 2026
Depth Any Video with Scalable Synthetic Data · ICLR 2025
Neural Representational Consistency Emerges from Probabilistic Neural-Behavioral Representation Alignment · ICML 2025
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning · CVPR 2025
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines · CVPR 2025
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems · CVPR 2025
Neuro-3D: Towards 3D Visual Decoding from EEG Signals · CVPR 2025
Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution · CVPR 2025
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area · AAAI 2025
Multi-Modal Latent Variables for Cross-Individual Primary Visual Cortex Modeling and Analysis · AAAI 2025
GigaGS: 3D Gaussian Based Planar Representation for Large-Scene Surface Reconstruction · AAAI 2025
Towards Efficient and Intelligent Laser Weeding: Method and Dataset for Weed Stem Detection · AAAI 2025
Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models · EMNLP 2025
EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds · ICCV 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation · ICCV 2025
MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data · ICML 2025
SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling · ICCV 2025
Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback · ACL 2025
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System · ACL 2025
ROGRAG: A Robustly Optimized GraphRAG Framework · ACL 2025
WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning · ICLR 2025
LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search · NAACL 2025
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling · IJCAI 2025
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction · ICLR 2025
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling · ICLR 2025
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction · ICCV 2025
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses · ICLR 2025
A CLIP-Powered Framework for Robust and Generalizable Data Selection · ICLR 2025
HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction · ICLR 2025
Dense Connector for MLLMs · NIPS 2024
DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion · NIPS 2024
Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling · NIPS 2024
ProSST: Protein Language Modeling with Quantized Structure and Disentangled Attention · NIPS 2024
Empowering and Assessing the Utility of Large Language Models in Crop Science · NIPS 2024
Model Decides How to Tokenize: Adaptive DNA Sequence Tokenization with MxDNA · NIPS 2024
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning · NIPS 2024
AFBench: A Large-scale Benchmark for Airfoil Design · NIPS 2024
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models · NIPS 2024
NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction · NIPS 2024
EMR-Merging: Tuning-Free High-Performance Model Merging · NIPS 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT · NIPS 2024
FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation · NIPS 2024
An Embarrassingly Simple Approach to Enhance Transformer Performance in Genomic Selection for Crop Breeding · IJCAI 2024
ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing · AAAI 2024
Frozen CLIP Transformer Is an Efficient Point Cloud Encoder · AAAI 2024
Boosting Residual Networks with Group Knowledge · AAAI 2024
Semi-supervised 3D Object Detection with PatchTeacher and PillarMix · AAAI 2024
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators · AAAI 2024
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning · AAAI 2024
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues · ACL 2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! · ACL 2024
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models · ACL 2024
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization · ACL 2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models · ACL 2024
Towards a Self-contained Data-driven Global Weather Forecasting Framework · ICML 2024
FiT: Flexible Vision Transformer for Diffusion Model · ICML 2024
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling · ICML 2024
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE · ICLR 2024
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models · EMNLP 2024
LOCR: Location-Guided Transformer for Optical Character Recognition · EMNLP 2024
DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior · ECCV 2024
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines · ECCV 2024
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM · ECCV 2024
Agent3D-Zero: An Agent for Zero-shot 3D Understanding · ECCV 2024
GVGEN: Text-to-3D Generation with Volumetric Representation · ECCV 2024
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation · ECCV 2024
Point Cloud Pre-training with Diffusion Models · CVPR 2024
Point Transformer V3: Simpler Faster Stronger · CVPR 2024
Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions · CVPR 2024
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving · CVPR 2024
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation · CVPR 2024
Taming Stable Diffusion for Text to 360 Panorama Image Generation · CVPR 2024
Masked Motion Predictors are Strong 3D Action Representation Learners · ICCV 2023
STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning · ICCV 2023
NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space · ICCV 2023
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-Training · ICCV 2023
Ponder: Point Cloud Pre-training via Neural Rendering · ICCV 2023
Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization · CVPR 2023
PVT-SSD: Single-Stage 3D Object Detector With Point-Voxel Transformer · CVPR 2023
GD-MAE: Generative Decoder for MAE Pre-Training on LiDAR Point Clouds · CVPR 2023
Crossing the Gap: Domain Generalization for Image Captioning · CVPR 2023
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? · CVPR 2023
MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling With Informative-Preserved Reconstruction and Self-Distilled Consistency · CVPR 2023
Learning to Parameterize Visual Attributes for Open-set Fine-grained Retrieval · NIPS 2023
CluB: Cluster Meets BEV for LiDAR-Based 3D Object Detection · NIPS 2023
Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images · NIPS 2023
UniHCP: A Unified Model for Human-Centric Perceptions · CVPR 2023
Open-Set Fine-Grained Retrieval via Prompting Vision-Language Evaluator · CVPR 2023
ACE: Cooperative Multi-Agent Q-learning with Bidirectional Action-Dependency · AAAI 2023
Multi-Scale Control Signal-Aware Transformer for Motion Synthesis without Phase · AAAI 2023
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition · AAAI 2023
Fine-Grained Retrieval Prompt Tuning · AAAI 2023
Exploiting Visual Context Semantics for Sound Source Localization · WACV 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models · CVPR 2023
Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization · ICLR 2023
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation · ICLR 2023
SeCo: Separating Unknown Musical Visual Sounds With Consistency Guidance · WACV 2023
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark · NIPS 2023
HumanBench: Towards General Human-Centric Perception With Projector Assisted Pretraining · CVPR 2023
Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection · CVPR 2023
Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection · ICCV 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling? · ICCV 2023
Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups · ICCV 2023
Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization · ICLR 2022
Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing · NIPS 2022
Unsupervised Object Detection Pretraining with Joint Object Priors Generation and Detector Learning · NIPS 2022
Category-Specific Nuance Exploration Network for Fine-Grained Object Retrieval · AAAI 2022
SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation · AAAI 2022
Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation · CVPR 2022
Accelerating Neural Network Optimization Through an Automated Control Theory Lens · CVPR 2022
Unsupervised Learning of Accurate Siamese Tracking · CVPR 2022
DR.VIC: Decomposition and Reasoning for Video Individual Counting · CVPR 2022
Not All Tokens Are Equal: Human-Centric Visual Analysis via Token Clustering Transformer · CVPR 2022
Revisiting the Transferability of Supervised Pretraining: An MLP Perspective · CVPR 2022
b-DARTS: Beta-Decay Regularization for Differentiable Architecture Search · CVPR 2022
3D Interacting Hand Pose Estimation by Hand De-Occlusion and Removal · ECCV 2022
Pose for Everything: Towards Category-Agnostic Pose Estimation · ECCV 2022
Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking · ECCV 2022
Fast-MoCo: Boost Momentum-Based Contrastive Learning with Combinatorial Patches · ECCV 2022
Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective · ECCV 2022
Relative Contrastive Loss for Unsupervised Representation Learning · ECCV 2022
Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains · ECCV 2022
NSNet: Non-Saliency Suppression Sampler for Efficient Video Recognition · ECCV 2022
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection · ICLR 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm · ICLR 2022
RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training · IJCAI 2022
PyMAF: 3D Human Pose and Shape Regression With Pyramidal Mesh Alignment Feedback Loop · ICCV 2021
Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images · ICCV 2021
Evolving Search Space for Neural Architecture Search · ICCV 2021
GLiT: Neural Architecture Search for Global and Local Image Transformer · ICCV 2021
BN-NAS: Neural Architecture Search With Batch Normalization · ICCV 2021
Leveraging Auxiliary Tasks With Affinity Learning for Weakly Supervised Semantic Segmentation · ICCV 2021
Geometry Uncertainty Projection Network for Monocular 3D Object Detection · ICCV 2021
Aggregation With Feature Detection · ICCV 2021
A Continuous Mapping For Augmentation Design · NIPS 2021
Mutual CRF-GNN for Few-Shot Learning · CVPR 2021
Inception Convolution With Efficient Dilation Search · CVPR 2021
Layerwise Optimization by Gradient Decomposition for Continual Learning · CVPR 2021
Delving Into Localization Errors for Monocular 3D Object Detection · CVPR 2021
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search · CVPR 2021
Gradient Regularized Contrastive Learning for Continual Domain Adaptation · AAAI 2021
Dynamic Position-aware Network for Fine-grained Image Recognition · AAAI 2021
AutoSampling: Search for Effective Data Sampling Schedules · ICML 2021
Once Quantization-Aware Training: High Performance Extremely Low-Bit Architecture Search · ICCV 2021
3D Hand Pose Estimation with Disentangled Cross-Modal Latent Space · WACV 2020
DASOT: A Unified Framework Integrating Data Association and Single Object Tracking for Online Multi-Object Tracking · AAAI 2020
Hierarchical Online Instance Matching for Person Search · AAAI 2020
Computation Reallocation for Object Detection · ICLR 2020
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition · CVPR 2020
Improving Deep Video Compression by Resolution-adaptive Flow Coding · ECCV 2020
Content Adaptive and Error Propagation Aware Deep Video Compression · ECCV 2020
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation · ECCV 2020
Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection · ECCV 2020
Whole-Body Human Pose Estimation in the Wild · ECCV 2020
Rethinking Pseudo-LiDAR Representation · ECCV 2020
Equalization Loss for Long-Tailed Object Recognition · CVPR 2020
Relational Prototypical Network for Weakly Supervised Temporal Action Localization · AAAI 2020
Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition · AAAI 2020
Multi-Dimensional Pruning: A Unified Framework for Model Compression · CVPR 2020
3D Human Mesh Regression With Dense Correspondence · CVPR 2020
EcoNAS: Finding Proxies for Economical Neural Architecture Search · CVPR 2020
Improving One-Shot NAS by Suppressing the Posterior Fading · CVPR 2020
Improving Auto-Augment via Augmentation-Wise Weight Sharing · NIPS 2020
Channel Pruning Guided by Classification Loss and Feature Importance · AAAI 2020
Libra R-CNN: Towards Balanced Learning for Object Detection · CVPR 2019
Improving Action Localization by Progressive Cross-Stream Cooperation · CVPR 2019
DVC: An End-To-End Deep Video Compression Framework · CVPR 2019
Multi-Person Articulated Tracking With Spatial and Temporal Embeddings · CVPR 2019
Hybrid Task Cascade for Instance Segmentation · CVPR 2019
Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation · CVPR 2019
GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving · CVPR 2019
SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction · CVPR 2019
Crowd Counting With Deep Structured Scale Integration Network · ICCV 2019
LAP-Net: Level-Aware Progressive Network for Image Dehazing · ICCV 2019
Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection · ICCV 2019
Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM · ICCV 2019
GradNet: Gradient-Guided Network for Visual Object Tracking · ICCV 2019
Online Hyper-Parameter Learning for Auto-Augmentation Strategy · ICCV 2019
Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving · ICCV 2019
AM-LFS: AutoML for Loss Function Search · ICCV 2019
TRB: A Novel Triplet Representation for Understanding 2D Human Body · ICCV 2019
Feature Intertwiner for Object Detection · ICLR 2019
Quantization Mimic: Towards Very Tiny CNN for Object Detection · ECCV 2018
Person Search via A Mask-guided Two-stream CNN Model · ECCV 2018
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction · NIPS 2018
Dividing and Aggregating Network for Multi-view Action Recognition · ECCV 2018
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation · ECCV 2018
Visual Question Generation as Dual Task of Visual Question Answering · CVPR 2018
3D Human Pose Estimation in the Wild by Adversarial Learning · CVPR 2018
Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning · CVPR 2018
Collaborative and Adversarial Network for Unsupervised Domain Adaptation · CVPR 2018
Crowd Counting using Deep Recurrent Spatial-Aware Network · IJCAI 2018
Mask-Guided Contrastive Attention Model for Person Re-Identification · CVPR 2018
PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing · CVPR 2018
Style Aggregated Network for Facial Landmark Detection · CVPR 2018
Attention-Aware Compositional Network for Person Re-Identification · CVPR 2018
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition · CVPR 2018
Deep Kalman Filtering Network for Video Compression Artifact Reduction · ECCV 2018
Multi-Context Attention for Human Pose Estimation · CVPR 2017
Learning Feature Pyramids for Human Pose Estimation · ICCV 2017
Scene Graph Generation From Objects, Phrases and Region Captions · ICCV 2017
Quality Aware Network for Set to Set Recognition · CVPR 2017
Learning Spatial Regularization With Image-Level Supervisions for Multi-Label Image Classification · CVPR 2017
Learning Cross-Modal Deep Representations for Robust Pedestrian Detection · CVPR 2017
Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation · CVPR 2017
Chained Cascade Network for Object Detection · ICCV 2017
ViP-CNN: Visual Phrase Guided Convolutional Neural Network · CVPR 2017
Object Detection in Videos With Tubelet Proposal Networks · CVPR 2017
Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction · NIPS 2017
Online Multi-Object Tracking Using CNN-Based Single Object Tracker With Spatial-Temporal Attention Mechanism · ICCV 2017
STCT: Sequentially Training Convolutional Networks for Visual Tracking · CVPR 2016
End-To-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation · CVPR 2016
Multi-Bias Non-linear Activation in Deep Neural Networks · ICML 2016
CRF-CNN: Modeling Structured Information in Human Pose Estimation · NIPS 2016
Structured Feature Learning for Pose Estimation · CVPR 2016
Object Detection From Video Tubelets With Convolutional Neural Networks · CVPR 2016
Factors in Finetuning Deep Model for Object Detection With Long-Tail Distribution · CVPR 2016
Learning Deep Feature Representations With Domain Guided Dropout for Person Re-Identification · CVPR 2016
Learning Deep Representation With Large-Scale Attributes · ICCV 2015
Saliency Detection by Multi-Context Deep Learning · CVPR 2015
DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection · CVPR 2015
Visual Tracking With Fully Convolutional Networks · ICCV 2015
Multi-Task Recurrent Neural Network for Immediacy Prediction · ICCV 2015
Multi-source Deep Learning for Human Pose Estimation · CVPR 2014
Learning Mid-level Filters for Person Re-identification · CVPR 2014
Multi-stage Contextual Deep Learning for Pedestrian Detection · ICCV 2013
Person Re-identification by Salience Matching · ICCV 2013
Joint Deep Learning for Pedestrian Detection · ICCV 2013
Modeling Mutual Visibility Relationship in Pedestrian Detection · CVPR 2013
Single-Pedestrian Detection Aided by Multi-pedestrian Detection · CVPR 2013
Unsupervised Salience Learning for Person Re-identification · CVPR 2013