Yao Lu

65 papers · 2015–2026 · 16 top CS/AI conferences

Achievements


🌍 Conference Polyglot (16) 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (10)

🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (10) 🗺️ Taxonomy Completionist (88) 🤝 Dynamic Duo (12) 🏆 Grand Slam 👥 Mega-Team (58) 🧬 Topic Evolution 💎 Century Club (62) 🚀 Conference Pioneer 🗃️ Keyword Collector (244) 🔥 Unstoppable (8) ⚡ Prolific Year (6)

Conferences

CVPR (9) AAAI (7) ICLR (7) CORL (6) EMNLP (6) ICCV (4) ICML (4) IJCAI (4) NIPS (4) ACL (3) NAACL (3) MICCAI (2) RSS (2) WACV (2) ACML (1) ECCV (1)

Top co-authors

Ted Xiao (12) Song Han (11) Karol Hausman (11) Sergey Levine (10) Guangming Lu (8) Chelsea Finn (8) Keerthana Gopalakrishnan (8) Kanishka Rao (7) Yevgen Chebotar (7) Hongxu Yin (6)

Research topics

Optimization (1)

Keywords

multi-task learning (5) model compression (5) vision-language model (4) robotic manipulation (4) large language model (4) reinforcement learning (3) convolutional neural network (3) imitation learning (3) neural network (3) optical flow (3) information retrieval (2) representation learning (2) offline reinforcement learning (2) contrastive learning (2) unsupervised learning (2) multimodal learning (2) knowledge distillation (2) motion estimation (2) self-supervised learning (2) question answering (2)

Papers

TimeCAP: A Channel-Aware Pre-Training Framework for Multivariate Time Series Forecasting AAAI 2026
The Role of Mixed-Language Documents for Multilingual Large Language Model Pretraining ACL 2026
SepPrune: Structured Pruning for Efficient Deep Speech Separation AAAI 2026
Efficient and Separate Authentication Image Steganography Network ICML 2025
ALRMR-GEC: Adjusting Learning Rate Based on Memory Rate to Optimize the Edit Scorer for Grammatical Error Correction AAAI 2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models CVPR 2025
RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models CVPR 2025
NVILA: Efficient Frontier Visual Language Models CVPR 2025
Scaling Vision Pre-Training to 4K Resolution CVPR 2025
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge CVPR 2025
SynC-LLM: Generation of Large-Scale Synthetic Circuit Code with Hierarchical Language Models EMNLP 2025
Multilingual Language Model Pretraining using Machine-translated Data EMNLP 2025
OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Question Answering EMNLP 2025
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference ICCV 2025
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer ICCV 2025
COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training ICLR 2025
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ICLR 2025
LongVILA: Scaling Long-Context Visual Language Models for Long Videos ICLR 2025
SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers ICLR 2025
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ICLR 2025
SSHR: More Secure Generative Steganography with High-Quality Revealed Secret Images ICML 2025
Multimodal Fusion Network with Distribution-based Tumor-Marker Imputation for Multi-Origin Metastatic Cervical Lymphadenopathy Classification MICCAI 2025
RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches ICLR 2024
QFormer: An Efficient Quaternion Transformer for Image Denoising IJCAI 2024
Implicit Prompt Learning for Image Denoising IJCAI 2024
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation EMNLP 2024
Dual-Modality Watershed Fusion Network for Thyroid Nodule Classification of Dual-View CEUS Video MICCAI 2024
UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-World Document Analysis NIPS 2024
Strings from the Library of Babel: Random Sampling as a Strong Baseline for Prompt Optimisation NAACL 2024
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages NAACL 2024
Jump-Start Reinforcement Learning ICML 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents NIPS 2023
Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators RSS 2023
RT-1: Robotics Transformer for Real-World Control at Scale RSS 2023
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research NIPS 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CORL 2023
Token Turing Machines CVPR 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions CORL 2023
Open-World Object Manipulation using Pre-Trained Vision-Language Models CORL 2023
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity ACL 2022
Understanding the Dynamics of DNNs Using Graph Modularity ECCV 2022
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning ICLR 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances CORL 2022
PI-QT-Opt: Predictive Information Improves Multi-Task Robotic Reinforcement Learning at Scale CORL 2022
Detail-Preserving Transformer for Light Field Image Super-resolution AAAI 2022
Learning To Estimate Hidden Motions With Global Motion Aggregation ICCV 2021
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale CORL 2021
Learning Optical Flow From a Few Matches CVPR 2021
Taskology: Utilizing Task Relations at Scale CVPR 2021
A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry AAAI 2021
Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning NIPS 2021
Vid2Int: Detecting Implicit Intention From Long Dialog Videos WACV 2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills ICML 2021
Devon: Deformable Volume Network for Learning Optical Flow WACV 2020
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction ACL 2020
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles EMNLP 2020
Natural Language Generation for Effective Knowledge Distillation EMNLP 2019
A Multi-Task Learning Framework for Abstractive Text Summarization AAAI 2019
Separate Loss for Basic and Compound Facial Expression Recognition in the Wild ACML 2019
Super Sparse Convolutional Neural Networks AAAI 2019
AAR-CNNs: Auto Adaptive Regularized Convolutional Neural Networks IJCAI 2018
Unsupervised Learning on Neural Network Outputs: With Application in Zero-Shot Learning IJCAI 2016
Coherent Parametric Contours for Interactive Video Object Segmentation CVPR 2016
Detecting “Smart” Spammers on Social Network: A Topic Model Approach NAACL 2016
Contour Flow: Middle-Level Motion Estimation by Combining Motion Segmentation and Contour Alignment ICCV 2015