Yi Wang

136 papers · 2013–2026 · 18 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (16) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (18)

🏃 Academic Marathon (12) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (30) 🧬 Topic Evolution 🔬 Deep Specialist (15) 🏆 Keyword Champion 🤝 Dynamic Duo (20) 👑 Triple Crown 👥 Mega-Team (69) 🏆 Grand Slam 🗃️ Keyword Collector (537) ❓ The Questioner (2) ⚡ Prolific Year (23) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (120) 🔥 Unstoppable (13)

Conferences

AAAI (30) CVPR (19) ICCV (14) NIPS (12) IJCAI (9) ICLR (9) ICML (7) ECCV (6) ACL (6) INTERSPEECH (6) MICCAI (6) NSDI (4) EMNLP (2) MIDL (2) COLING (1) JMLR (1) AACL (1) WACV (1)

Top co-authors

Yu Qiao (21) Limin Wang (16) Wei Wang (13) Yali Wang (13) Yinan He (12) Kunchang Li (10) Jiaya Jia (8) Lu Qi (7) Bo Huang (6) Ming Li (6)

Keywords

large language model (10) video understanding (7) semantic segmentation (6) image generation (6) neural network (5) multimodal learning (5) representation learning (5) knowledge distillation (5) diffusion model (5) domain adaptation (4) multi-modal learning (4) transfer learning (4) multimodal large language model (4) deep neural network (4) generative model (4) attention mechanism (4) model compression (4) automatic speech recognition (4) vision-language model (4) image restoration (3)

Papers

Semi-supervised Latent Disentangled Diffusion Model for Textile Pattern Generation AAAI 2026
TraveLLaMA: A Multimodal Travel Assistant with Large-Scale Dataset and Structured Reasoning AAAI 2026
REFO: Reinforced Evolutionary Faithfulness Optimization for Large Language Models AAAI 2026
Permutation Equivariant Framelet-based Hypergraph Neural Networks AAAI 2026
Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction AAAI 2026
Think Then Rewrite: Reasoning Enhanced Query Rewriting for Domain Specific Retrieval AAAI 2026
Diffusion Reconstruction-based Data Likelihood Estimation for Core-Set Selection AAAI 2026
CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation AAAI 2026
VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning AAAI 2026
InterCoser: Interactive 3D Character Creation with Disentangled Fine-Grained Features AAAI 2026
FreeMem: Enhancing Consistency in Long Video Generation via Tuning-Free Memory AAAI 2026
SSTODE: Ocean-Atmosphere Physics-Informed Neural ODEs for Sea Surface Temperature Prediction AAAI 2026
P2S: Probabilistic Process Supervision for General-Domain Reasoning Question Answering AAAI 2026
Evolutionary Negative Module Pruning for Better LoRA Merging ACL 2026
From Short Video to Clickable Search: RLVR-Enabled Listwise Query Suggestion with Retrieval-Augmented Context ACL 2026
ConstructAI: From Real-Time Safety Insight to Skill Growth in Deployed Construction AI Systems AAAI 2026
Deep Hypergraph Neural Networks with Tight Framelets AAAI 2025
Enhancing Vision-Language Models with Morphological and Taxonomic Knowledge: Towards Coral Recognition for Ocean Health AAAI 2025
XCotton: Advancing AI-Enabled Hardware/Software Integrated System for Foreign Fiber Cleaning AAAI 2025
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing ACL 2025
When Evolution Strategy Meets Language Models Tuning COLING 2025
Hybrid-View Attention Network for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound MICCAI 2025
EUReg: End-to-end Framework for Efficient 2D-3D Ultrasound Registration MICCAI 2025
EchoCardMAE: Video Masked Auto-Encoders Customized for Echocardiography MICCAI 2025
Clinical Prior-Guided Tumor Generation for Breast Ultrasound with Cross Domain Adaptation MICCAI 2025
Bidirectional Search while Ensuring Meet-In-The-Middle via Effective and Efficient-to-Compute Termination Conditions IJCAI 2025
All Roads Lead to Rome: Exploring Edge Distribution Shifts for Heterophilic Graph Learning IJCAI 2025
A Non-isotropic Time Series Diffusion Model with Moving Average Transitions ICML 2025
HyperNear: Unnoticeable Node Injection Attacks on Hypergraph Neural Networks ICML 2025
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning ICLR 2025
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel ICLR 2025
OccProphet: Pushing the Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with an Observer-Forecaster-Refiner Framework ICLR 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ICLR 2025
FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering CVPR 2025
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment CVPR 2025
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction CVPR 2025
Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models CVPR 2025
Influence-Guided Diffusion for Dataset Distillation ICLR 2025
ViLLa: Video Reasoning Segmentation with Large Language Model ICCV 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos ICCV 2025
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs ICCV 2025
INDOORWORLD: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent Environment EMNLP 2025
Make Your Training Flexible: Towards Deployment-Efficient Video Models ICCV 2025
DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations ICCV 2025
Adaptive Learning of High-Value Regions for Semi-Supervised Medical Image Segmentation ICCV 2025
Towards a Unified Copernicus Foundation Model for Earth Vision ICCV 2025
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding AAAI 2025
ML-GOOD: Towards Multi-Label Graph Out-Of-Distribution Detection AAAI 2025
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis AAAI 2025
When Hypergraph Meets Heterophily: New Benchmark Datasets and Baseline AAAI 2025
NetAssistant: Dialogue Based Network Diagnosis in Data Center Networks NSDI 2024
F-OAL: Forward-only Online Analytic Learning with Fast Training and Low Memory Footprint in Class Incremental Learning NIPS 2024
Does Video-Text Pretraining Help Open-Vocabulary Online Action Detection? NIPS 2024
SyncVIS: Synchronized Video Instance Segmentation NIPS 2024
Vision Mamba Mender NIPS 2024
Voxel Proposal Network via Multi-Frame Knowledge Distillation for Semantic Scene Completion NIPS 2024
Task-oriented Time Series Imputation Evaluation via Generalized Representers NIPS 2024
PointPatchMix: Point Cloud Mixing with Patch Scoring AAAI 2024
DCV2I: A Practical Approach for Supporting Geographers’ Visual Interpretation in Dune Segmentation with Deep Vision Models AAAI 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM ACL 2024
Learning to Maximize Mutual Information for Chain-of-Thought Distillation ACL 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark CVPR 2024
VideoMamba: State Space Model for Efficient Video Understanding ECCV 2024
Decoupling Common and Unique Representations for Multimodal Self-supervised Learning ECCV 2024
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding ECCV 2024
Explaining Time Series via Contrastive and Locally Sparse Perturbations ICLR 2024
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation ICLR 2024
Fast Equilibrium of SGD in Generic Situations ICLR 2024
CW Complex Hypothesis for Image Data ICML 2024
Purpose Enhanced Reasoning through Iterative Prompting: Uncover Latent Robustness of ChatGPT on Code Comprehension IJCAI 2024
Novelty Detection Based Discriminative Multiple Instance Feature Mining to Classify NSCLC PD-L1 Status on HE-Stained Histopathological Images MICCAI 2024
Towards Multi-modality Fusion and Prototype-based Feature Refinement for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound MICCAI 2024
Crescent: Emulating Heterogeneous Production Network at Scale NSDI 2024
NEWTON: Are Large Language Models Capable of Physical Reasoning? EMNLP 2023
NeuralLift-360: Lifting an In-the-Wild 2D Photo to a 3D Object With 360° Views CVPR 2023
Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection CVPR 2023
Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation CVPR 2023
NeRFLix: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-Viewpoint MiXer CVPR 2023
Bitstream-Corrupted Video Recovery: A Novel Benchmark Dataset and Method NIPS 2023
Learning Open-Vocabulary Semantic Segmentation Models From Natural Language Supervision CVPR 2023
VideoMAE V2: Scaling Video Masked Autoencoders With Dual Masking CVPR 2023
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery NIPS 2023
Bitstream-Corrupted JPEG Images Are Restorable: Two-Stage Compensation and Alignment Framework for Image Restoration CVPR 2023
Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders INTERSPEECH 2023
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition INTERSPEECH 2023
Speaker Extraction with Detection of Presence and Absence of Target Speakers INTERSPEECH 2023
SQLFlow: An Extensible Toolkit Integrating DB and AI JMLR 2023
JourneyDB: A Benchmark for Generative Image Understanding NIPS 2023
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation NIPS 2023
OPRADI: Applying Security Game to Fight Drive under the Influence in Real-World AAAI 2023
ScatterFormer: Locally-Invariant Scattering Transformer for Patient-Independent Multispectral Detection of Epileptiform Discharges AAAI 2023
Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition AAAI 2023
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts ICCV 2023
Scaling Data Generation in Vision-and-Language Navigation ICCV 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models ICCV 2023
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding ICCV 2023
PalGAN: Image Colorization with Palette Generative Adversarial Networks ECCV 2022
DoTAT: A Domain-oriented Text Annotation Tool ACL 2022
Towards Implicit Text-Guided 3D Shape Generation CVPR 2022
MAT: Mask-Aware Transformer for Large Hole Image Inpainting CVPR 2022
Diversity Features Enhanced Prototypical Network for Few-shot Intent Detection IJCAI 2022
Exploring linguistic feature and model combination for speech recognition based automatic AD detection INTERSPEECH 2022
Conformer Based Elderly Speech Recognition System for Alzheimer’s Disease Detection INTERSPEECH 2022
Nonlinear ICA Using Volume-Preserving Transformations ICLR 2022
Three-stage Evolution and Fast Equilibrium for SGD with Non-degenerate Critical Points ICML 2022
Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction ECCV 2022
Adversarial Defence by Diversified Simultaneous Training of Deep Ensembles AAAI 2021
Image Synthesis via Semantic Composition ICCV 2021
Efficient Folded Attention for Medical Image Reconstruction and Segmentation AAAI 2021
Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher ICCV 2021
Multi-Scale Aligned Distillation for Low-Resolution Detection CVPR 2021
Hybrid optimization between iterative and network fine-tuning reconstructions for fast quantitative susceptibility mapping MIDL 2021
LightGuardian: A Full-Visibility, Lightweight, In-band Telemetry System Using Sketchlets NSDI 2021
Ensembling Low Precision Models for Binary Biomedical Image Segmentation WACV 2021
Logics of Allies and Enemies: A Formal Approach to the Dynamics of Social Balance Theory IJCAI 2020
Chinese Grammatical Error Correction Based on Hybrid Models with Data Augmentation AACL 2020
Group-Wise Dynamic Dropout Based on Latent Semantic Variations AAAI 2020
GraphER: Token-Centric Entity Resolution with Graph Convolutional Neural Networks AAAI 2020
VCNet: A Robust Approach to Blind Image Inpainting ECCV 2020
Bayesian Learning of Probabilistic Dipole Inversion for Quantitative Susceptibility Mapping MIDL 2020
RANet: Region Attention Network for Semantic Segmentation NIPS 2020
Attentive Normalization for Conditional Image Generation CVPR 2020
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition INTERSPEECH 2019
Model-Agnostic Adversarial Detection by Random Perturbations IJCAI 2019
Wide-Context Semantic Image Extrapolation CVPR 2019
Generalized Robust Bayesian Committee Machine for Large-scale Gaussian Process Regression ICML 2018
Fast Factorization-free Kernel Learning for Unlabeled Chunk Data Streams IJCAI 2018
Image Inpainting via Generative Multi-column Convolutional Neural Networks NIPS 2018
Online Robust Image Alignment via Subspace Learning From Gradient Orientations ICCV 2017
Incremental Kernel Null Space Discriminant Analysis for Novelty Detection CVPR 2017
Bayesian Optimization of Partition Layouts for Mondrian Processes IJCAI 2016
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin ICML 2016
Saliency Detection with a Deeper Investigation of Light Field IJCAI 2015
Metadata Dependent Mondrian Processes ICML 2015
Stable Learning in Coding Space for Multi-Class Decoding and Its Extension for Multi-Class Hypothesis Transfer Learning CVPR 2014
Wire Speed Name Lookup: A GPU-based Approach NSDI 2013