Yifan Zhang

119 papers · 2013–2026 · 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (16) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🌍 Conference Polyglot (15) 🏃 Academic Marathon (12) 👥 Mega-Team (20) 🏆 Grand Slam 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (3) 🤝 Dynamic Duo (16) 👑 Triple Crown 🗃️ Keyword Collector (434) ❓ The Questioner (3) ⚡ Prolific Year (8) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (110) 🔥 Unstoppable (11)

Conferences

AAAI (17) ICML (14) CVPR (13) NIPS (13) ACL (12) EMNLP (9) ICLR (9) IJCAI (9) ECCV (8) ICCV (8) NAACL (3) COLING (1) EACL (1) IJCNLP (1) INTERSPEECH (1)

Top co-authors

Jian Cheng (16) Hanqing Lu (11) Shuaicheng Niu (8) Mingkui Tan (7) Liang Wang (6) Qingsong Wen (6) Zhang Zhang (6) Jiashi Feng (6) Rong Jin (6) Bo Pang (5)

Research topics

Applications (1)

Keywords

large language model (11) contrastive learning (6) model compression (5) retrieval-augmented generation (4) reinforcement learning (4) action recognition (4) knowledge distillation (4) object detection (4) semantic segmentation (4) multimodal learning (4) diffusion model (4) mathematical reasoning (3) generative model (3) 3d object detection (3) representation learning (3) domain adaptation (3) autonomous driving (3) transformer architecture (3) data augmentation (3) video understanding (3)

Papers

MCIE: Multimodal LLM-Driven Complex Instruction Image Editing with Spatial Guidance · AAAI 2026
EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention · ACL 2026
Scaling Law for Multimodal Large Language Model Supervised Fine-Tuning · ACL 2026
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization · ACL 2026
SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search · AAAI 2026
AirWino: Optimized Winograd Convolution for Accelerating CNN Inference on ARMv8 Processors · AAAI 2026
CoGenSAM: Codebook-Interactive Generative Labeling for Adapting SAM to Crack Segmentation · AAAI 2026
DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving · AAAI 2026
Explicit Temporal-Semantic Modeling for Dense Video Captioning via Context-Aware Cross-Modal Interaction · AAAI 2026
Computational Complexity of Planning for Recursive Primitive Task Networks: Selective Action Nullification with State Preservation · IJCAI 2025
Improving Multimodal Social Media Popularity Prediction via Selective Retrieval Knowledge Augmentation · AAAI 2025
Augmenting Math Word Problems via Iterative Question Composing · AAAI 2025
A Comprehensive Evaluation on Event Reasoning of Large Language Models · AAAI 2025
AI-Driven Virtual Teacher for Enhanced Educational Efficiency: Leveraging Large Pretrain Models for Autonomous Error Analysis and Correction · AAAI 2025
Efficient Image Similarity Search with Quadtrees (Student Abstract) · AAAI 2025
ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming · ACL 2025
Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking · ACL 2025
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models · ACL 2025
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? · ACL 2025
V-Oracle: Making Progressive Reasoning in Deciphering Oracle Bones for You and Me · ACL 2025
One-Dimensional Object Detection for Streaming Text Segmentation of Meeting Dialogue · ACL 2025
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts · ACL 2025
TopNet: Transformer-Efficient Occupancy Prediction Network for Octree-Structured Point Cloud Geometry Compression · CVPR 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation · CVPR 2025
Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval · EMNLP 2025
MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models · EMNLP 2025
From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations · EMNLP 2025
SPO: Self Preference Optimization with Self Regularization · EMNLP 2025
Chatbot To Help Patients Understand Their Health · EMNLP 2025
LoRaDA: Low-Rank Direct Attention Adaptation for Efficient LLM Fine-tuning · EMNLP 2025
Training LLMs for Optimization Modeling via Iterative Data Synthesis and Structured Validation · EMNLP 2025
MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes · ICCV 2025
Learning to Generalize without Bias for Open-Vocabulary Action Recognition · ICCV 2025
Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation · ICCV 2025
Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks · ICLR 2025
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? · ICLR 2025
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting · ICLR 2025
DAMA: Data- and Model-aware Alignment of Multi-modal LLMs · ICML 2025
e-GAI: e-value-based Generalized α-Investing for Online False Discovery Rate Control · ICML 2025
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment · ICML 2025
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment · ICML 2025
Position: Trustworthy AI Agents Require the Integration of Large Language Models and Formal Methods · ICML 2025
MEGAD: A Memory-Efficient Framework for Large-Scale Attributed Graph Anomaly Detection · IJCAI 2025
Learning to Extrapolate and Adjust: Two-Stage Meta-Learning for Concept Drift in Online Time Series Forecasting · IJCAI 2025
CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models · NAACL 2025
Contrastive Learning is Spectral Clustering on Similarity Graph · ICLR 2024
Evaluating Step-by-Step Reasoning through Symbolic Verification · NAACL 2024
MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection · ECCV 2024
Information Flow in Self-Supervised Learning · ICML 2024
One-Shot Diffusion Mimicker for Handwritten Text Generation · ECCV 2024
Generalization Analysis for Label-Specific Representation Learning · NIPS 2024
Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models · NIPS 2024
Position: What Can Large Language Models Tell Us about Time Series Analysis · ICML 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models · EMNLP 2024
Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens · CVPR 2024
Towards Automated Chinese Ancient Character Restoration: A Diffusion-Based Method with a New Dataset · AAAI 2024
Generalization Analysis for Multi-Label Learning · ICML 2024
Matrix Information Theory for Self-Supervised Learning · ICML 2024
Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning · AAAI 2024
MEEL: Multi-Modal Event Evolution Learning · ACL 2024
A Study on the Calibration of In-context Learning · NAACL 2024
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming · ICML 2024
Implicit Concept Removal of Diffusion Models · ECCV 2024
OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling · NIPS 2023
Disentangling Writer and Character Styles for Handwriting Generation · CVPR 2023
On-the-Fly Adapting Code Summarization on Trainable Cost-Effective Language Models · NIPS 2023
Expanding Small-Scale Datasets with Guided Imagination · NIPS 2023
UnSE: Unsupervised Speech Enhancement Using Optimal Transport · INTERSPEECH 2023
Free Lunch for Domain Adversarial Training: Environment Label Smoothing · ICLR 2023
On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning · ICLR 2023
Nearly-tight Bounds for Deep Kernel Learning · ICML 2023
AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation · ICML 2023
Dataset Quantization · ICCV 2023
Asynchronous Event Processing with Local-Shift Graph Convolutional Network · AAAI 2023
Trade-off Between Efficiency and Consistency for Removal-based Explanations · NIPS 2023
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection · NIPS 2023
QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection · ICCV 2023
Towards Stable Test-time Adaptation in Dynamic Wild World · ICLR 2023
Semantic Segmentation by Early Region Proxy · CVPR 2022
MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing · ECCV 2022
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping · ECCV 2022
Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation · ECCV 2022
AutoMS: Automatic Model Selection for Novelty Detection with Error Rate Control · NIPS 2022
Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition · NIPS 2022
PKD: General Distillation Framework for Object Detectors via Pearson Correlation Coefficient · NIPS 2022
Unsupervised Representation for Semantic Segmentation by Implicit Cycle-Attention Contrastive Learning · AAAI 2022
How Well Does Self-Supervised Pre-Training Perform with Streaming Data? · ICLR 2022
Not All Points Are Equal: Learning Highly Efficient Point-Based Detectors for 3D LiDAR Point Clouds · CVPR 2022
Efficient Test-Time Model Adaptation without Forgetting · ICML 2022
AdaXpert: Adapting Neural Architecture for Growing Data · ICML 2021
Source-free Domain Adaptation via Avatar Prototype Generation and Adaptation · IJCAI 2021
Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning · NIPS 2021
AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition · ICCV 2021
No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data · NIPS 2021
StablePose: Learning 6D Object Poses From Geometrically Stable Patches · CVPR 2021
A Novel Learning Framework for Sampling-Based Motion Planning in Autonomous Driving · AAAI 2020
Interpretable Complex-Valued Neural Networks for Privacy Protection · ICLR 2020
Prta: A System to Support the Analysis of Propaganda Techniques in the News · ACL 2020
Relation-Aware Transformer for Portfolio Policy Learning · IJCAI 2020
Skeleton-Based Action Recognition With Shift Graph Convolutional Network · CVPR 2020
TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model · CVPR 2020
Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition · ECCV 2020
Further Understanding Videos through Adverbs: A New Video Task · AAAI 2020
Multi-marginal Wasserstein GAN · NIPS 2019
Tanbih: Get To Know What You Are Reading · EMNLP 2019
Skeleton-Based Action Recognition With Directed Graph Neural Networks · CVPR 2019
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition · CVPR 2019
Tanbih: Get To Know What You Are Reading · IJCNLP 2019
Two-Step Quantization for Low-Bit Neural Networks · CVPR 2018
Training Binary Weight Networks via Semi-Binary Decomposition · ECCV 2018
QCRI Live Speech Translation System · EACL 2017
Egocentric Gesture Recognition Using Recurrent 3D Convolutional Neural Networks With Spatiotemporal Transformer Modules · ICCV 2017
Switched Linear Multi-Robot Navigation Using Hierarchical Model Predictive Control · IJCAI 2017
Leveraging Multiple Domains for Sentiment Classification · COLING 2016
Action Recognition with Joints-Pooled 3D Deep Convolutional Descriptors · IJCAI 2016
Hierarchical Model Predictive Control for Multi-Robot Navigation · IJCAI 2016
Face Clustering in Videos with Proportion Prior · IJCAI 2015
Constrained Clustering and Its Application to Face Clustering in Videos · CVPR 2013
Event Detection in Complex Scenes Using Interval Temporal Constraints · ICCV 2013