conftrace_

Yansong Tang

62 papers · 2018–2025 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐝 Cross-Pollinator (12) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (91) 🏠 Conference Loyalist (25) 🔬 Deep Specialist (18) 🤝 Dynamic Duo (24) ⚡ Prolific Year (22) 💎 Century Club (62) 🗃️ Keyword Collector (267)

Conferences

CVPR (25) ICCV (11) ECCV (8) NIPS (8) AAAI (4) ICLR (4) ACL (1) IJCAI (1)

Papers

ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion · ICCV 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct · ICLR 2025
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis · AAAI 2025
Ponder & Press: Advancing Visual GUI Agent towards General Computer Control · ACL 2025
InstaRevive: One-Step Image Enhancement via Dynamic Score Matching · ICLR 2025
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning · ICLR 2025
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams · ICCV 2025
KV-Edit: Training-Free Image Editing for Precise Background Preservation · ICCV 2025
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation · ICCV 2025
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation · ICCV 2025
AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation · ICCV 2025
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction · ICCV 2025
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes · CVPR 2025
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models · CVPR 2025
Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model · CVPR 2025
VoCo-LLaMA: Towards Vision Compression with Large Language Models · CVPR 2025
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing · CVPR 2025
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction · CVPR 2024
PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild · CVPR 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer · CVPR 2024
Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution · ECCV 2024
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation · NIPS 2024
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling · NIPS 2024
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models · ECCV 2024
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model · ECCV 2024
Plan, Posture and Go: Towards Open-vocabulary Text-to-Motion Generation · ECCV 2024
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation · ECCV 2024
Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models · ECCV 2024
WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena · NIPS 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models · NIPS 2024
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery · CVPR 2024
Learning Multi-Scale Video-Text Correspondence for Weakly Supervised Temporal Article Grounding · AAAI 2024
CoSTA: End-to-End Comprehensive Space-Time Entanglement for Spatio-Temporal Video Grounding · AAAI 2024
Open-Vocabulary Segmentation with Semantic-Assisted Calibration · CVPR 2024
FlowIE: Efficient Image Enhancement via Rectified Flow · CVPR 2024
Segment and Caption Anything · CVPR 2024
Universal Segmentation at Arbitrary Granularity with Language Instruction · CVPR 2024
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression · CVPR 2024
Towards Accurate Post-training Quantization for Diffusion Models · CVPR 2024
HOI-aware Adaptive Network for Weakly-supervised Action Segmentation · IJCAI 2023
MCUFormer: Deploying Vision Tranformers on Microcontrollers with Limited Memory · NIPS 2023
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation · NIPS 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation · AAAI 2023
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment · CVPR 2023
FLAG3D: A 3D Fitness Activity Dataset With Language Instruction · CVPR 2023
Global Knowledge Calibration for Fast Open-Vocabulary Segmentation · ICCV 2023
Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning · ICCV 2023
FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation · ICCV 2023
Tem-Adapter: Adapting Image-Text Pretraining for Video Question Answer · ICCV 2023
GAIN: On the Generalization of Instructional Action Understanding · ICLR 2023
ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer · ECCV 2022
YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset · CVPR 2022
DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting · CVPR 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation · CVPR 2022
Semantic-Aware Auto-Encoders for Self-Supervised Representation Learning · CVPR 2022
BNV-Fusion: Dense 3D Reconstruction Using Bi-Level Neural Volume Fusion · CVPR 2022
Global Spectral Filter Memory Network for Video Object Segmentation · ECCV 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions · NIPS 2022
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression · NIPS 2022
Uncertainty-Aware Score Distribution Learning for Action Quality Assessment · CVPR 2020
COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis · CVPR 2019
Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition · CVPR 2018