conftrace_

Hongtao Xie

56 papers · 2019–2026 · 7 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer
🌍 Conference Polyglot (7)
🗺️ Taxonomy Completionist (10)
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (6)
🐣 Hot Topic Early Bird
🤝 Dynamic Duo (34)
🏆 Keyword Champion
🔬 Deep Specialist (16)
📈 Trend Setter
🔥 Unstoppable (7)
🚀 Conference Pioneer
⚡ Prolific Year (13)
❓ The Questioner
🗃️ Keyword Collector (279)
💎 Century Club (54)

Conferences

CVPR (15) AAAI (11) IJCAI (10) ICCV (8) NIPS (6) ECCV (4) ACL (2)

Papers

SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability · AAAI 2026
RegionRAG: Region-level Retrieval-Augmented Generation for Visual Document Understanding · AAAI 2026
Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation · CVPR 2025
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models · CVPR 2025
SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis · CVPR 2025
IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation · IJCAI 2025
IDseq: Decoupled and Sequentially Detecting and Grounding Multi-Modal Media Manipulation · AAAI 2025
Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design · ICCV 2025
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition · ICCV 2025
GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation · ICCV 2025
CLIP-Adapted Region-to-Text Learning for Generative Open-Vocabulary Semantic Segmentation · ICCV 2025
Forensic-MoE: Exploring Comprehensive Synthetic Image Detection Traces with Mixture of Experts · ICCV 2025
IGD: Instructional Graphic Design with Multimodal Layer Generation · ICCV 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering · CVPR 2025
Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling · ECCV 2024
How Control Information Influences Multilingual Text Image Generation and Editing? · NIPS 2024
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling · NIPS 2024
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing · NIPS 2024
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval · AAAI 2024
Knowledge Context Modeling with Pre-trained Language Models for Contrastive Knowledge Graph Completion · ACL 2024
DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection · CVPR 2024
OTE: Exploring Accurate Scene Text Recognition Using One Token · CVPR 2024
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition Removal and Editing · CVPR 2024
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations · CVPR 2024
AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation · ECCV 2024
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition · IJCAI 2024
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition · IJCAI 2024
TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition · IJCAI 2023
Exploring Stroke-Level Modifications for Scene Text Editing · AAAI 2023
Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval · ICCV 2023
Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition · IJCAI 2023
Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation · CVPR 2023
MomentDiff: Generative Video Moment Retrieval from Random to Real · NIPS 2023
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets · NIPS 2022
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval · ECCV 2022
Detecting Tampered Scene Text in the Wild · ECCV 2022
Neighborhood-Adaptive Structure Augmented Metric Learning · AAAI 2022
Partial Class Activation Attention for Semantic Segmentation · CVPR 2022
From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network · ICCV 2021
Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection · CVPR 2021
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition · CVPR 2021
Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation · AAAI 2021
Dynamic Inconsistency-aware DeepFake Video Detection · IJCAI 2021
Semantic-guided Reinforced Region Embedding for Generalized Zero-Shot Learning · AAAI 2021
Hierarchical Granularity Transfer Learning · NIPS 2020
ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection · CVPR 2020
CircleNet for Hip Landmark Detection · AAAI 2020
Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization · AAAI 2020
Real-World Automatic Makeup via Identity Preservation Makeup Net · IJCAI 2020
Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning · CVPR 2020
Curriculum Learning for Natural Language Understanding · ACL 2020
Graph Structured Network for Image-Text Matching · CVPR 2020
Learning to Draw Text in Natural Images with Conditional Adversarial Networks · IJCAI 2019
Semi-supervised User Profiling with Heterogeneous Graph Attention Networks · IJCAI 2019
DSRN: A Deep Scale Relationship Network for Scene Text Detection · IJCAI 2019
Robust Deep Co-Saliency Detection with Group Semantic · AAAI 2019