Tao Yu

115 papers · 2017–2026 · 13 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🌍 Conference Polyglot (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (5) 🏠 Conference Loyalist (20) 🌟 Keyword Trendsetter Combo (4) 🤝 Dynamic Duo (21) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (35) 🔬 Deep Specialist (23) 🧬 Topic Evolution 🏆 Keyword Champion (3) ❓ The Questioner 📈 Trend Setter 🗃️ Keyword Collector (434) ⚡ Prolific Year (16) 🔥 Unstoppable (9) 💎 Century Club (110) 🚀 Conference Pioneer

Conferences

CVPR (23) EMNLP (20) ICLR (15) NIPS (11) AAAI (9) ACL (8) ECCV (7) ICML (7) ICCV (5) IJCNLP (3) NAACL (3) AISTATS (2) INTERSPEECH (2)

Top co-authors

Yebin Liu (22) Dragomir Radev (18) Rui Zhang (17) Caiming Xiong (16) Zerong Zheng (12) Qionghai Dai (12) Xi Victoria Lin (9) Hongjin Su (8) Tianbao Xie (8) Michihiro Yasunaga (7)

Research topics

Generation (1)

Keywords

3d reconstruction (15) semantic parsing (8) large language model (7) human pose estimation (7) attention mechanism (6) code generation (5) neural network (5) language model (5) contrastive learning (4) benchmark evaluation (4) 3d vision (4) in-context learning (4) motion capture (4) neural radiance field (4) question answering (3) unsupervised learning (3) knowledge base (3) knowledge distillation (3) representation learning (3) neural rendering (3)

Papers

Dynamic Deep Graph Learning for Incomplete Multi-View Clustering with Masked Graph Reconstruction Loss AAAI 2026
Improving Generalization in LLM Structured Pruning via Function-Aware Neuron Grouping AAAI 2026
Scaling Law for Multimodal Large Language Model Supervised Fine-Tuning ACL 2026
Monocular Mesh Recovery and Body Measurement of Female Saanen Goats AAAI 2026
Improving Deepfake Detection with Reinforcement Learning-Based Adaptive Data Augmentation AAAI 2026
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows ICLR 2025
V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy CVPR 2025
GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration CVPR 2025
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment ICML 2025
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ICML 2025
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement CVPR 2025
MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond CVPR 2025
View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection AAAI 2025
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing CVPR 2025
Systematic Outliers in Large Language Models ICLR 2025
Neural Fluid Simulation on Geometric Surfaces ICLR 2025
Attacking Vision-Language Computer Agents via Pop-ups ACL 2025
Digest the Knowledge: Large Language Models empowered Message Passing for Knowledge Graph Question Answering ACL 2025
Stochastic Rounding for LLM Training: Theory and Practice AISTATS 2025
Training LLMs with MXFP4 AISTATS 2025
Generative Representational Instruction Tuning ICLR 2025
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials ICLR 2025
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ICLR 2025
Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments ICLR 2025
OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning CVPR 2024
HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models CVPR 2024
DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation CVPR 2024
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark NIPS 2024
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments NIPS 2024
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? NIPS 2024
Neural Physical Simulation with Multi-Resolution Hash Grid Encoding AAAI 2024
Fluctuation-Based Adaptive Structured Pruning for Large Language Models AAAI 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training ICML 2024
Lemur: Harmonizing Natural Language and Code for Language Agents ICLR 2024
Shadow Cones: A Generalized Framework for Partial Order Embeddings ICLR 2024
Text2Reward: Reward Shaping with Language Models for Reinforcement Learning ICLR 2024
EvoR: Evolving Retrieval for Code Generation EMNLP 2024
Language Agents: Foundations, Prospects, and Risks EMNLP 2024
FOLIO: Natural Language Reasoning with First-Order Logic EMNLP 2024
MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors CVPR 2024
Batch Prompting: Efficient Inference with Large Language Model APIs EMNLP 2023
Triangulation Residual Loss for Data-efficient 3D Pose Estimation NIPS 2023
Coneheads: Hierarchy Aware Attention NIPS 2023
Complex Reasoning in Natural Language ACL 2023
One Embedder, Any Task: Instruction-Finetuned Text Embeddings ACL 2023
Learning Visibility Field for Detailed 3D Human Reconstruction and Relighting CVPR 2023
ZBS: Zero-Shot Background Subtraction via Instance-Level Background Modeling and Foreground Selection CVPR 2023
Task Residual for Tuning Vision-Language Models CVPR 2023
Generating Data for Symbolic Language with Large Language Models EMNLP 2023
PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis ICCV 2023
Random Laplacian Features for Learning with Hyperbolic Space ICLR 2023
Selective Annotation Makes Language Models Better Few-Shot Learners ICLR 2023
Binding Language Models in Symbolic Languages ICLR 2023
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation ICML 2023
Compositional Exemplars for In-context Learning ICML 2023
Coder Reviewer Reranking for Code Generation ICML 2023
Hybrid AHS: A Hybrid of Kalman Filter and Deep Learning for Acoustic Howling Suppression INTERSPEECH 2023
Speech Enhancement with Fullband-Subband Cross-Attention Network INTERSPEECH 2022
Understanding Hyperdimensional Computing for Parallel Single-Pass Learning NIPS 2022
DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization ACL 2022
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models EMNLP 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation EMNLP 2022
In-Context Learning for Few-Shot Dialogue State Tracking EMNLP 2022
ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback EMNLP 2022
Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play EMNLP 2022
GIMO: Gaze-Informed Human Motion Prediction in Context ECCV 2022
Mask-based Latent Reconstruction for Reinforcement Learning NIPS 2022
DoubleField: Bridging the Neural Surface and Radiance Fields for High-Fidelity Human Reconstruction and Rendering CVPR 2022
HuMMan: Multi-modal 4D Human Dataset for Versatile Sensing and Modeling ECCV 2022
Structured Local Radiance Fields for Human Avatar Modeling CVPR 2022
Interacting Attention Graph for Single Image Two-Hand Reconstruction CVPR 2022
Geometry-Aware Single-Image Full-Body Human Relighting ECCV 2022
FaceVerse: A Fine-Grained and Detail-Controllable 3D Face Morphable Model From a Hybrid Dataset CVPR 2022
Deep Implicit Templates for 3D Shape Representation CVPR 2021
PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning NIPS 2021
Effective Fine-Tuning Methods for Cross-lingual Adaptation EMNLP 2021
Representing Hyperbolic Space Accurately using Multi-Component Floats NIPS 2021
Logic-Consistency Text Generation from Semantic Parses ACL 2021
SummerTime: Text Summarization Toolkit for Non-experts EMNLP 2021
An Exploratory Study on Long Dialogue Summarization: What Works and What’s Next EMNLP 2021
Testing Cross-Database Semantic Parsers With Canonical Utterances EMNLP 2021
Logic-Consistency Text Generation from Semantic Parses IJCNLP 2021
DeepMultiCap: Performance Capture of Multiple Characters Using Sparse Multiview Cameras ICCV 2021
Lightweight Multi-Person Total Motion Capture Using Sparse Multi-View Cameras ICCV 2021
Learning Omni-Frequency Region-adaptive Representations for Real Image Super-Resolution AAAI 2021
SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing ICLR 2021
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing ICLR 2021
DART: Open-Domain Structured Data Record to Text Generation NAACL 2021
QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization NAACL 2021
Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors CVPR 2021
POSEFusion: Pose-Guided Selective Fusion for Single-View Human Volumetric Capture CVPR 2021
RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera ECCV 2020
Semantic Evaluation for Text-to-SQL with Distilled Test Suites EMNLP 2020
Online Conversation Disentanglement with Pointer Networks EMNLP 2020
4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras CVPR 2020
Robust 3D Self-Portraits in Seconds CVPR 2020
Region Normalization for Image Inpainting AAAI 2020
NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image ECCV 2020
Learning Disentangled Feature Representation for Hybrid-distorted Image Restoration ECCV 2020
SimulCap: Single-View Human Performance Capture With Cloth Simulation CVPR 2019
DeepHuman: 3D Human Reconstruction From a Single Image ICCV 2019
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases EMNLP 2019
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models NIPS 2019
A New Defense Against Adversarial Images: Turning a Weakness into a Strength NIPS 2019
Simplifying Graph Convolutional Networks ICML 2019
SParC: Cross-Domain Semantic Parsing in Context ACL 2019
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions EMNLP 2019
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases IJCNLP 2019
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions IJCNLP 2019
TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation NAACL 2018
HybridFusion: Real-Time Performance Capture Using a Single Depth Sensor and Sparse IMUs ECCV 2018
DoubleFusion: Real-Time Capture of Human Performances With Inner Body Shapes From a Single Depth Sensor CVPR 2018
SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task EMNLP 2018
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task EMNLP 2018
BodyFusion: Real-Time Capture of Human Motion and Surface Geometry Using a Single Depth Camera ICCV 2017