conftrace_

Yixiao Ge

58 papers · 2018–2025 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (7)
🌍 Conference Polyglot (10)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (14)
🌈 Renaissance Researcher (6)
🗺️ Taxonomy Completionist (79)
🏠 Conference Loyalist (21)
🤝 Dynamic Duo (43)
🏆 Keyword Champion (2)
👑 Triple Crown
🧬 Topic Evolution
🔬 Deep Specialist (21)
🏆 Grand Slam
🔥 Unstoppable (6)
💎 Century Club (58)
⚡ Prolific Year (9)
🗃️ Keyword Collector (203)

Conferences

CVPR (21) ICCV (10) ECCV (6) ICLR (6) NIPS (6) AAAI (3) ICML (3) ACL (1) IJCAI (1) NAACL (1)

Papers

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers · ICCV 2025
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation · CVPR 2025
VoCo-LLaMA: Towards Vision Compression with Large Language Models · CVPR 2025
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models · CVPR 2025
Scalable Image Tokenization with Index Backpropagation Quantization · ICCV 2025
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots · NAACL 2025
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos · ICCV 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction · ICCV 2025
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding · ICML 2025
LoRA-Gen: Specializing Large Language Model via Online LoRA Generation · ICML 2025
ST-LLM: Large Language Models Are Effective Temporal Learners · ECCV 2024
MambaTree: Tree Topology is All You Need in State Space Model · NIPS 2024
Cached Transformers: Improving Transformers with Differentiable Memory Cache · AAAI 2024
LLaMA Pro: Progressive LLaMA with Block Expansion · ACL 2024
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis · CVPR 2024
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models · CVPR 2024
Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs · CVPR 2024
BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning · CVPR 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities · CVPR 2024
YOLO-World: Real-Time Open-Vocabulary Object Detection · CVPR 2024
ViT-Lens: Towards Omni-modal Representations · CVPR 2024
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition · CVPR 2024
SEED-Bench: Benchmarking Multimodal Large Language Models · CVPR 2024
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment · ECCV 2024
Making LLaMA SEE and Draw with SEED Tokenizer · ICLR 2024
$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation · ICML 2023
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model · NIPS 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models · NIPS 2023
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection · ICCV 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation · ICCV 2023
BoxSnake: Polygonal Instance Segmentation with Box Supervision · ICCV 2023
Exploring Model Transferability through the Lens of Potential Energy · ICCV 2023
Darwinian Model Upgrades: Model Evolving with Selective Compatibility · AAAI 2023
Video-Text Pre-training with Learned Regions for Retrieval · AAAI 2023
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction · NIPS 2023
Accelerating Vision-Language Pretraining With Free Language Modeling · CVPR 2023
All in One: Exploring Unified Video-Language Pre-Training · CVPR 2023
Learning Transferable Spatiotemporal Representations From Natural Script Knowledge · CVPR 2023
RILS: Masked Visual Reconstruction in Language Semantic Space · CVPR 2023
Masked Image Modeling with Denoising Contrast · ICLR 2023
Object-Aware Video-Language Pre-Training for Retrieval · CVPR 2022
Uncertainty Modeling for Out-of-Distribution Generalization · ICLR 2022
Mc-BEiT: Multi-Choice Discretization for Image BERT Pre-training · ECCV 2022
Not All Models Are Equal: Predicting Model Transferability in a Self-Challenging Fisher Space · ECCV 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval · ECCV 2022
Towards Universal Backward-Compatible Representation Learning · IJCAI 2022
Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval · ICLR 2022
Dynamic Token Normalization improves Vision Transformers · ICLR 2022
Bridging Video-Text Retrieval With Multiple Choice Questions · CVPR 2022
Progressive Correspondence Pruning by Consensus Learning · ICCV 2021
Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-Identification · ICCV 2021
Mutual CRF-GNN for Few-Shot Learning · CVPR 2021
Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised Object Re-Identification · CVPR 2021
DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network · CVPR 2021
Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification · ICLR 2020
Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID · NIPS 2020
Self-supervising Fine-grained Region Similarities for Large-scale Image Localization · ECCV 2020
FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification · NIPS 2018