Bo Zhang

171 papers · 2008–2026 · 18 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (29) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (8) 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (21) 🔬 Deep Specialist (15) 👑 Triple Crown 🧬 Topic Evolution 🏆 Keyword Champion 🏆 Grand Slam 👥 Mega-Team (38) 🤝 Dynamic Duo (38) ❓ The Questioner 🚀 Conference Pioneer ⚡ Prolific Year (14) 🔥 Unstoppable (14) 🗃️ Keyword Collector (73) 💎 Century Club (157) 📈 Trend Setter

Conferences

CVPR (34) ACL (22) NIPS (21) AAAI (16) ICCV (15) ICLR (13) ICML (12) ECCV (9) EMNLP (9) IJCAI (7) NAACL (3) CLEAR (2) IJCNLP (2) JMLR (2) EACL (1) INTERSPEECH (1) AISTATS (1) SEMEVAL (1)

Top co-authors

Jun Zhu (38) Botian Shi (14) Chongxuan Li (13) Dong Chen (13) Xiangxiang Chu (12) Jiakang Yuan (11) Tao Chen (11) Zhenghua Li (11) Fang Wen (11) Hang Su (10)

Keywords

large language model (10) diffusion model (10) model compression (8) multimodal learning (7) variational inference (7) autonomous driving (7) vision-language model (6) convolutional neural network (6) image generation (6) generative adversarial network (6) neural network (6) variational autoencoder (5) domain adaptation (5) semi-supervised learning (5) image restoration (5) grammatical error correction (5) data augmentation (5) zero-shot learning (5) text generation (5) generative model (5)

Papers

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing ACL 2026
A Scalable Multi-LLM Collaboration System with Retrieval-based Selection and Exploration-Exploitation-Driven Enhancement ACL 2026
FlowSearch: Advancing Deep Research with Dynamic Structured Knowledge Flow ACL 2026
AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images ACL 2026
ViG-RAG: Video-aware Graph Retrieval-Augmented Generation via Temporal and Semantic Hybrid Reasoning AAAI 2026
MTRouter: Cost-Aware Multi-Turn LLM Routing with History–Model Joint Embeddings ACL 2026
A Survey of Reinforcement Learning for Large Language Models under Data Scarcity: Challenges and Solutions ACL 2026
Counterfactual Question Generation Uncovering Learner Contradictions AAAI 2026
Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports from Scratch with Agentic Framework AAAI 2026
Learning to LEAP: Efficient Dense Point Tracking by Focusing Where It Matters AAAI 2026
DisCal: Distribution-Aware Calibration for Mathematical Reasoning Under Character-Level Noisy Inputs ACL 2026
Global-Local Confidence Fusion for Hallucination Detection in Mathematical Reasoning Task AAAI 2026
Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning AAAI 2026
CareCom: Generative Image Composition with Calibrated Reference Features AAAI 2026
MLVU: Benchmarking Multi-task Long Video Understanding CVPR 2025
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data CVPR 2025
DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation AAAI 2025
LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data AAAI 2025
What Is a Good Question? Assessing Question Quality via Meta-Fact Checking AAAI 2025
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency ICML 2025
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation CVPR 2025
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training ICLR 2025
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences ICLR 2025
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training ICLR 2025
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching CVPR 2025
SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing ACL 2025
Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback ACL 2025
A Training-free LLM-based Approach to General Chinese Character Error Correction ACL 2025
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check ACL 2025
Dynamic Evil Score-Guided Decoding: An Efficient Decoding Framework For Red-Team Model ACL 2025
dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations ACL 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ICLR 2025
Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios EMNLP 2025
SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models EMNLP 2025
dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations SEMEVAL 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations CVPR 2025
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework ICCV 2025
DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving ICCV 2025
Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation ICCV 2025
Chimera: Improving Generalist Model with Domain-Specific Experts ICCV 2025
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization ICCV 2025
Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification ECCV 2024
Multimodal Clickbait Detection by De-confounding Biases Using Causal Representation Inference EMNLP 2024
A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models EMNLP 2024
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding EMNLP 2024
mABC: Multi-Agent Blockchain-inspired Collaboration for Root Cause Analysis in Micro-Services Architecture EMNLP 2024
Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy NIPS 2024
DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning NIPS 2024
Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks NIPS 2024
Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving NIPS 2024
ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving NIPS 2024
DAGCN: Distance-based and Aspect-oriented Graph Convolutional Network for Aspect-based Sentiment Analysis NAACL 2024
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection AAAI 2024
Make RepVGG Greater Again: A Quantization-Aware Approach AAAI 2024
Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models AAAI 2024
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm ICML 2024
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ACL 2024
Towards Better Utilization of Multi-Reference Training Data for Chinese Grammatical Error Correction ACL 2024
Estimating the Causal Effect of Early ArXiving on Paper Acceptance CLEAR 2024
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation ICLR 2024
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection ICLR 2024
OWL: A Large Language Model for IT Operations ICLR 2024
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior ICLR 2024
Shadow Generation for Composite Image Using Diffusion Model CVPR 2024
Language-Driven Anchors for Zero-Shot Adversarial Robustness CVPR 2024
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression CVPR 2024
Better Regression Makes Better Test-time Adaptive 3D Object Detection ECCV 2024
ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion ECCV 2024
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks ECCV 2024
Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection CVPR 2023
Bi3D: Bi-Domain Active Learning for Cross-Domain 3D Object Detection CVPR 2023
Generative Diffusion Prior for Unified Image Restoration and Enhancement CVPR 2023
Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling CVPR 2023
Image Cropping With Spatial-Aware Feature and Rank Consistency CVPR 2023
Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior ICCV 2023
AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset NIPS 2023
Improving Seq2Seq Grammatical Error Correction via Decoding Interventions EMNLP 2023
Fine-grained Visible Watermark Removal ICCV 2023
Conditional Positional Encodings for Vision Transformers ICLR 2023
NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts ACL 2023
LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control ICML 2023
OPT-GAN: A Broad-Spectrum Global Optimizer for Black-Box Problems by Learning Distribution AAAI 2023
Better Pre-Training by Reducing Representation Confusion EACL 2023
MixPath: A Unified Approach for One-shot Neural Architecture Search ICCV 2023
ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation ICCV 2023
Foreground Object Search by Distilling Composite Image Feature ICCV 2023
UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework ICCV 2023
Delving Into Shape-Aware Zero-Shot Semantic Segmentation CVPR 2023
Paint by Example: Exemplar-Based Image Editing With Diffusion Models CVPR 2023
RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion CVPR 2023
MetaPortrait: Identity-Preserving Talking Head Generation With Fast Personalized Adaptation CVPR 2023
Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation ECCV 2022
Some Reflections on Drawing Causal Inference using Textual Data: Parallels Between Human Subjects and Organized Texts CLEAR 2022
StyleSwin: Transformer-Based GAN for High-Resolution Image Generation CVPR 2022
Adversarial Texture for Fooling Person Detectors in the Physical World CVPR 2022
Bringing Old Films Back to Life CVPR 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis CVPR 2022
Human-Centric Image Cropping with Partition-Aware and Content-Preserving Features ECCV 2022
Real-Time Neural Character Rendering with Pose-Guided Multiplane Images ECCV 2022
SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser EMNLP 2022
Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models ICLR 2022
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models ICML 2022
Fast Lossless Neural Compression with Integer-Only Discrete Flows ICML 2022
MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction NAACL 2022
A Unified Span-Based Approach for Opinion Mining with Syntactic Constituents NAACL 2021
Matching Distributions between Model and Data: Cross-domain Knowledge Distillation for Unsupervised Domain Adaptation IJCNLP 2021
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation CVPR 2021
MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes CVPR 2021
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators ICLR 2021
FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search ICCV 2021
Let's See Clearly: Contaminant Artifact Removal for Moving Cameras ICCV 2021
Matching Distributions between Model and Data: Cross-domain Knowledge Distillation for Unsupervised Domain Adaptation ACL 2021
Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion CVPR 2021
Stability and Generalization of Bilevel Programming in Hyperparameter Optimization NIPS 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers NIPS 2021
CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation CVPR 2021
Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models ICML 2021
Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks ACL 2020
Cross-Domain Correspondence Learning for Exemplar-Based Image Translation CVPR 2020
Bringing Old Photos Back to Life CVPR 2020
Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search ECCV 2020
Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters ECCV 2020
To Relieve Your Headache of Training an MRF, Take AdVIL ICLR 2020
Pruning from Scratch AAAI 2020
Dynamic Network Pruning with Interpretable Layerwise Channel Selection AAAI 2020
Neural Architecture Search on Acoustic Scene Classification INTERSPEECH 2020
A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models AISTATS 2020
Bi-level Score Matching for Learning Energy-based Latent Variable Models NIPS 2020
An Attention-based Model for Conversion Rate Prediction with Delayed Feedback via Post-click Calibration IJCAI 2020
Internal and Contextual Attention Network for Cold-start Multi-channel Matching in Recommendation IJCAI 2020
Understanding and Stabilizing GANs’ Training Dynamics Using Control Theory ICML 2020
Function Space Particle Optimization for Bayesian Neural Networks ICLR 2019
Hierarchy Response Learning for Neural Conversation Generation EMNLP 2019
Multi-objects Generation with Amortized Structural Regularization NIPS 2019
Hierarchy Response Learning for Neural Conversation Generation IJCNLP 2019
Deep Exemplar-Based Video Colorization CVPR 2019
Blind Geometric Distortion Correction on Images Through Deep Learning CVPR 2019
DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning NIPS 2018
Interpret Neural Networks by Identifying Critical Data Routing Paths CVPR 2018
Smooth Neighbors on Teacher Graphs for Semi-Supervised Learning CVPR 2018
Textbook Question Answering Under Instructor Guidance With Memory Networks CVPR 2018
Supervised Treebank Conversion: Data and Approaches ACL 2018
Message Passing Stein Variational Gradient Descent ICML 2018
Graphical Generative Adversarial Networks NIPS 2018
Semi-crowdsourced Clustering with Deep Generative Models NIPS 2018
Forecast the Plausible Paths in Crowd Scenes IJCAI 2017
Improving Interpretability of Deep Neural Networks With Semantic Information CVPR 2017
Triple Generative Adversarial Nets NIPS 2017
Semi-supervised Max-margin Topic Model with Manifold Posterior Regularization IJCAI 2017
Discriminative Deep Random Walk for Network Classification ACL 2016
Segment-Level Sequence Modeling using Gated Recursive Semi-Markov Conditional Random Fields ACL 2016
Crowd Scene Understanding with Coherent Recurrent Neural Networks IJCAI 2016
Learning to Generate with Memory ICML 2016
Max-Margin Deep Generative Models NIPS 2015
RIDE: Reversal Invariant Descriptor Enhancement ICCV 2015
Adaptive Dropout Rates for Learning with Corrupted Features IJCAI 2015
Convolutional Neural Networks with Intra-Layer Recurrent Connections for Scene Labeling NIPS 2015
Orientational Pyramid Matching for Recognizing Indoor Scenes CVPR 2014
Gibbs Max-margin Topic Models with Data Augmentation JMLR 2014
Max-Margin Infinite Hidden Markov Models ICML 2014
Distributed Bayesian Posterior Sampling via Moment Sharing NIPS 2014
Fast Max-Margin Matrix Factorization with Data Augmentation ICML 2013
Hierarchical Part Matching for Fine-Grained Visual Categorization ICCV 2013
Scalable Inference for Logistic-Normal Topic Models NIPS 2013
Improved Bayesian Logistic Supervised Topic Models with Data Augmentation ACL 2013
Generalized Relational Topic Models with Data Augmentation IJCAI 2013
Gibbs Max-Margin Topic Models with Fast Sampling Algorithms ICML 2013
Super-Bit Locality-Sensitive Hashing NIPS 2012
Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction NIPS 2012
Partially Observed Maximum Entropy Discrimination Markov Networks NIPS 2008
Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction JMLR 2008