Dan Xu

75 papers · 2017–2026 · 11 top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🏃 Academic Marathon (8)

🐝 Cross-Pollinator (13) 🗺️ Taxonomy Completionist (114) 🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (30) 🤝 Dynamic Duo (13) 🔬 Deep Specialist (15) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🏆 Grand Slam 🗃️ Keyword Collector (323) 🔥 Unstoppable (9) 💎 Century Club (73) 📈 Trend Setter ⚡ Prolific Year (12)

Conferences

CVPR (30) ICCV (11) ACL (7) ECCV (7) AAAI (6) ICLR (4) NIPS (3) SEMEVAL (3) ICML (2) COLING (1) IJCAI (1)

Top co-authors

Jin Wang (13) Xuejie Zhang (13) You Zhang (12) Nicu Sebe (11) Wanli Ouyang (11) Hanrong Ye (8) Elisa Ricci (6) Xiaogang Wang (6) Yan Yan (5) Zhenguo Li (5)

Keywords

depth estimation (9) multi-task learning (7) novel view synthesis (6) diffusion model (6) text classification (5) neural radiance field (5) semantic segmentation (5) conditional random field (5) multi-label classification (4) convolutional neural network (4) talking head generation (4) focal loss (4) video generation (4) multimodal learning (4) facial animation (3) contrastive learning (3) 3d reconstruction (3) generative adversarial network (3) knowledge distillation (3) generative model (3)

Papers

Empowering Sparse-Input Neural Radiance Fields with Dual-Level Semantic Guidance from Dense Novel Views AAAI 2026
Emotion-Conditioned Motion Sub-spaces with Flow Matching for Real-Time Audio-Driven Talking Heads AAAI 2026
Free-viewpoint Human Animation with Pose-correlated Reference Selection CVPR 2025
From One to More: Contextual Part Latents for 3D Generation ICCV 2025
Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective AAAI 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation ICCV 2025
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation AAAI 2025
Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations CVPR 2025
GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping CVPR 2025
DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting ICCV 2025
Taming LLMs with Gradient Grouping ACL 2025
YNU-HPCC at SemEval-2025 Task 6: Using BERT Model with R-drop for Promise Verification ACL 2025
YNU-HPCC at SemEval-2025 Task 10: A Two-Stage Approach to Solving Multi-Label and Multi-Class Role Classification Based on DeBERTa ACL 2025
YNU-HPCC at SemEval-2025 Task 1: Enhancing Multimodal Idiomaticity Representation via LoRA and Hybrid Loss Optimization ACL 2025
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning ICCV 2025
YNU-HPCC at SemEval-2025 Task 1: Enhancing Multimodal Idiomaticity Representation via LoRA and Hybrid Loss Optimization SEMEVAL 2025
YNU-HPCC at SemEval-2025 Task 10: A Two-Stage Approach to Solving Multi-Label and Multi-Class Role Classification Based on DeBERTa SEMEVAL 2025
YNU-HPCC at SemEval-2025 Task 6: Using BERT Model with R-drop for Promise Verification SEMEVAL 2025
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling IJCAI 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models ICML 2025
UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation ICML 2025
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation CVPR 2025
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA ICLR 2025
Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs CVPR 2025
RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting ECCV 2024
YNU-HPCC at SIGHAN-2024 dimABSA Task: Using PLMs with a Joint Learning Strategy for Dimensional Intensity Prediction ACL 2024
Improving Personalized Sentiment Representation with Knowledge-enhanced and Parameter-efficient Layer Normalization COLING 2024
Personalized LoRA for Human-Centered Text Understanding AAAI 2024
Interactive3D: Create What You Want by Interactive 3D Generation CVPR 2024
Implicit Event-RGBD Neural SLAM CVPR 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection CVPR 2024
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors CVPR 2024
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data CVPR 2024
Efficient Multitask Dense Predictor via Binarization CVPR 2024
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting CVPR 2024
CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs CVPR 2024
Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal ECCV 2024
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis ECCV 2024
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling ECCV 2024
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection NIPS 2023
Contrastive Multi-Task Dense Prediction AAAI 2023
Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis ICCV 2023
Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization CVPR 2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment CVPR 2023
Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis ICLR 2023
Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields ICLR 2023
TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding ICLR 2023
YNU-HPCC at WASSA 2023: Using Text-Mixed Data Augmentation for Emotion Classification on Code-Mixed Text Message ACL 2023
Domain Generalization via Switch Knowledge Distillation for Robust Review Representation ACL 2023
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts ICCV 2023
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation ICCV 2023
Lipschitz Continuity Retained Binary Neural Network ECCV 2022
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection NIPS 2022
Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation CVPR 2022
Depth-Aware Generative Adversarial Network for Talking Head Video Generation CVPR 2022
Generalized Binary Search Network for Highly-Efficient Multi-View Stereo CVPR 2022
Network Binarization via Contrastive Learning ECCV 2022
Inverted Pyramid Multi-task Transformer for Dense Scene Understanding ECCV 2022
Delving Into Localization Errors for Monocular 3D Object Detection CVPR 2021
Leveraging Auxiliary Tasks With Affinity Learning for Weakly Supervised Semantic Segmentation ICCV 2021
SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks ICCV 2021
Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction CVPR 2021
Dynamic Graph Message Passing Networks CVPR 2020
Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation CVPR 2020
Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation CVPR 2019
Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection ICCV 2019
Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM ICCV 2019
Every Smile Is Unique: Landmark-Guided Diverse Smile Generation CVPR 2018
Group Consistent Similarity Learning via Deep CRF for Person Re-Identification CVPR 2018
Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation CVPR 2018
PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing CVPR 2018
Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction NIPS 2017
Viraliency: Pooling Local Virality CVPR 2017
Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation CVPR 2017
Learning Cross-Modal Deep Representations for Robust Pedestrian Detection CVPR 2017