Yiran Zhong

34 papers · 2016–2026 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🏃 Academic Marathon (9) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (13)

🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (60) 🧬 Topic Evolution 🤝 Dynamic Duo (13) 👑 Triple Crown 🏆 Grand Slam 💎 Century Club (33) 🚀 Conference Pioneer ⚡ Prolific Year (5) 🗃️ Keyword Collector (144) 🔥 Unstoppable (8)

Conferences

CVPR (11) AAAI (6) ECCV (4) NIPS (4) EMNLP (3) ICLR (3) ICCV (2) ICML (1)

Top co-authors

Yuchao Dai (14) Zhen Qin (12) Hongdong Li (11) Weixuan Sun (7) Yuxin Mao (6) Jinxing Zhou (6) Dongxu Li (5) Meng Wang (5) Jianyuan Wang (5) Dong Li (5)

Keywords

multimodal learning (6) optical flow (5) depth estimation (5) attention mechanism (4) multi-modal learning (4) optical flow estimation (3) sequence modeling (3) stereo matching (3) motion estimation (3) transformer architecture (3) language modeling (3) video understanding (3) visual slam (2) linear complexity (2) audio-visual learning (2) 3d vision (2) semantic segmentation (2) unsupervised learning (2) 3d reconstruction (2) state space model (2)

Papers

Learning Spatial Decay for Vision Transformers · AAAI 2026
Deep Non-Rigid Structure-from-Motion Revisited: Canonicalization and Sequence Modeling · AAAI 2025
Tri-Ergon: Fine-Grained Video-to-Audio Generation with Multi-Modal Conditions and LUFS Control · AAAI 2025
Towards Open-Vocabulary Audio-Visual Event Localization · CVPR 2025
Exploring Transformer Extrapolation · AAAI 2024
CO2: Efficient Distributed Training with Full Communication-Computation Overlap · ICLR 2024
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing · ECCV 2024
Scaling Laws for Linear Complexity Language Models · EMNLP 2024
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map · NIPS 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention · ICML 2024
Improving Audio-Visual Segmentation with Bidirectional Generation · AAAI 2024
Fine-Grained Audible Video Description · CVPR 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling · NIPS 2023
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning · CVPR 2023
Accelerating Toeplitz Neural Network with Constant-time Inference Complexity · EMNLP 2023
Multimodal Variational Auto-encoder based Audio-Visual Segmentation · ICCV 2023
Toeplitz Neural Network for Sequence Modeling · ICLR 2023
Audio-Visual Segmentation · ECCV 2022
Implicit Motion Handling for Video Camouflaged Object Detection · CVPR 2022
The Devil in Linear Transformer · EMNLP 2022
cosFormer: Rethinking Softmax In Attention · ICLR 2022
Transcribing Natural Languages for the Deaf via Neural Editing Programs · AAAI 2022
Deep Two-View Structure-From-Motion Revisited · CVPR 2021
RGB-D Saliency Detection via Cascaded Mutual Information Minimization · ICCV 2021
ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring · CVPR 2021
Positive Sample Propagation Along the Audio-Visual Event Line · CVPR 2021
Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation · NIPS 2020
Hierarchical Neural Architecture Search for Deep Stereo Matching · NIPS 2020
Deblurring by Realistic Blurring · CVPR 2020
Noise-Aware Unsupervised Deep Lidar-Stereo Fusion · CVPR 2019
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes · CVPR 2019
Stereo Computation for a Single Mixture Image · ECCV 2018
Open-World Stereo Video Matching with Deep RNN · ECCV 2018
Robust Multi-Body Feature Tracker: A Segmentation-Free Approach · CVPR 2016