Dit-Yan Yeung

71 papers · 2008–2026 · 13 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (18) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (13)

🌟 Keyword Trendsetter Combo (4) 🌱 Topic Pioneer 🧬 Topic Evolution 🏆 Keyword Champion 👥 Mega-Team (30) 👑 Triple Crown 🤝 Dynamic Duo (12) 🏆 Grand Slam ⚡ Prolific Year (10) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (14) 💎 Century Club (69) 🗃️ Keyword Collector (298)

Conferences

NIPS (14) CVPR (9) ICCV (9) ICLR (8) ACL (6) ECCV (6) EMNLP (5) IJCAI (3) WACV (3) AAAI (2) ICML (2) IJCNLP (2) NAACL (2)

Top co-authors

Kai Chen (12) Lanqing Hong (12) Zhenguo Li (11) Hang Xu (9) Naiyan Wang (8) Xingjian SHI (6) Lemao Liu (6) Jianhua Han (5) Zhili Liu (5) Yangqiu Song (5)

Keywords

large language model (6) self-supervised learning (4) precipitation nowcasting (4) visual tracking (4) multimodal learning (3) feature extraction (3) bayesian inference (3) benchmark evaluation (3) vision-language model (3) hate speech detection (3) diffusion model (3) deep learning (3) convolutional neural network (3) toxicity detection (2) feature learning (2) autonomous driving (2) data augmentation (2) representation learning (2) point cloud (2) dimensionality reduction (2)

Papers

Situated Embedding Models for Context-Aware Dense Retrieval · ACL 2026
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback · AAAI 2026
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o · AAAI 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions · CVPR 2025
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models · EMNLP 2025
TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models · WACV 2025
Automated Evaluation of Large Vision-Language Models on Self-Driving Corner Cases · WACV 2025
Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models · ACL 2025
Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage · ICLR 2025
Understanding LLMs’ Fluid Intelligence Deficiency: An Analysis of the ARC Task · NAACL 2025
The Stochastic Parrot on LLM’s Shoulder: A Summative Assessment of Physical Concept Understanding · NAACL 2025
AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models · CVPR 2025
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation · ECCV 2024
Gaussian Shell Maps for Efficient 3D Human Generation · CVPR 2024
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception · CVPR 2024
Implicit Concept Removal of Diffusion Models · ECCV 2024
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation · ECCV 2024
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis · ICLR 2024
RoboDreamer: Learning Compositional World Models for Robot Imagination · ICML 2024
Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability · EMNLP 2024
Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting · NIPS 2024
MagicDrive: Street View Generation with Diverse 3D Geometry Control · ICLR 2024
Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection · ECCV 2024
GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation · ICLR 2024
Mixed Autoencoder for Self-Supervised Visual Representation Learning · CVPR 2023
Detection Recovery in Online Multi-Object Tracking With Sparse Graph Tracker · WACV 2023
Adaptive Online Replanning with Diffusion Models · NIPS 2023
SongRewriter: A Chinese Song Rewriting System with Controllable Content and Rhyme Scheme · ACL 2023
Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture · ACL 2023
Learning 3D-Aware Image Synthesis With Unknown Pose Distribution · CVPR 2023
CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data · CVPR 2023
Towards General Error Diagnosis via Behavioral Testing in Machine Translation · EMNLP 2023
ILA-DA: Improving Transferability of Intermediate Level Attack with Data Augmentation · ICLR 2023
SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation · ICCV 2023
Controlled Text Generation Using Dictionary Prior in Variational Autoencoders · ACL 2022
AdaAug: Learning Class- and Instance-adaptive Data Augmentation Policies · ICLR 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting · NIPS 2022
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator · NIPS 2022
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving · ECCV 2022
3D-Aware Indoor Scene Synthesis with Depth Priors · ECCV 2022
Probing Toxic Content in Large Pre-Trained Language Models · ACL 2021
Probing Toxic Content in Large Pre-Trained Language Models · IJCNLP 2021
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space · ICLR 2021
MultiSiam: Self-Supervised Multi-Instance Siamese Representation Learning for Autonomous Driving · ICCV 2021
Comparative Evaluation of Label-Agnostic Selection Bias in Multilingual Hate Speech Datasets · EMNLP 2020
Multilingual and Multi-Aspect Hate Speech Analysis · EMNLP 2019
Marginalized Average Attentional Network for Weakly-Supervised Learning · ICLR 2019
Multilingual and Multi-Aspect Hate Speech Analysis · IJCNLP 2019
Learning Unmanned Aerial Vehicle Control for Autonomous Target Following · IJCAI 2018
Temporal Dynamic Graph LSTM for Action-Driven Video Object Detection · ICCV 2017
Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model · NIPS 2017
Spatiotemporal Modeling for Crowd Counting in Videos · ICCV 2017
Lattice Long Short-Term Memory for Human Action Recognition · ICCV 2017
Natural-Parameter Networks: A Class of Probabilistic Neural Networks · NIPS 2016
Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks · NIPS 2016
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting · NIPS 2015
Bayesian Adaptive Matrix Factorization With Automatic Model Selection · CVPR 2015
DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting · CVPR 2015
Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks · ICCV 2015
Understanding and Diagnosing Visual Tracking Systems · ICCV 2015
Ensemble-Based Tracking: Aggregating Crowdsourced Structured Time Series Data · ICML 2014
Online Robust Non-negative Dictionary Learning for Visual Tracking · ICCV 2013
Learning High-Order Task Relationships in Multi-Task Learning · IJCAI 2013
SCMF: Sparse Covariance Matrix Factorization for Collaborative Filtering · IJCAI 2013
Learning a Deep Compact Image Representation for Visual Tracking · NIPS 2013
Bayesian Robust Matrix Factorization for Image and Video Processing · ICCV 2013
Co-Regularized Hashing for Multimodal Data · NIPS 2012
Probabilistic Multi-Task Feature Selection · NIPS 2010
Worst-Case Linear Discriminant Analysis · NIPS 2010
Probabilistic Relational PCA · NIPS 2009
Posterior Consistency of the Silverman g-prior in Bayesian Model Choice · NIPS 2008