conftrace_

Yang Zhao

120 papers · 2017–2026 · 16 top CS/AI conferences

Achievements


🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (15) 🤝 Dynamic Duo (23) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (30) 🧬 Topic Evolution 🏆 Keyword Champion 🔥 Unstoppable (9) ⚡ Prolific Year (15) ❓ The Questioner (2) 💎 Century Club (113) 🗃️ Keyword Collector (500) 📈 Trend Setter 🚀 Conference Pioneer

Conferences

ACL (22) EMNLP (17) CVPR (15) AAAI (14) ICML (8) NIPS (8) ICCV (7) COLING (6) ICLR (6) IJCAI (5) ECCV (3) IJCNLP (3) NAACL (3) AACL (1) EACL (1) WACV (1)

Top co-authors

Chengqing Zong (24) Zhou Zhao (12) Yu Zhou (12) Changyou Chen (9) Jiajun Zhang (9) Haifeng Huang (9) Yupu Liang (8) Wei Jia (8) Zehan Wang (8) Yaping Zhang (8)

Keywords

large language model (14) multimodal learning (10) neural machine translation (10) reinforcement learning (10) multi-modal learning (9) multimodal large language model (8) image generation (7) machine translation (7) diffusion model (6) visual grounding (5) unsupervised learning (5) video generation (5) generative adversarial network (5) document image translation (5) language model (5) semantic alignment (4) domain adaptation (4) generative model (4) knowledge distillation (4) text summarization (4)

Papers

DART: Disambiguation-Aware Reasoning for Video-guided Machine Translation ACL 2026
Deep Clustering Based on Sparse Kolmogorov-Arnold Network and Spectral Constraint AAAI 2026
LSAP-PV: High-Fidelity Palm Vein Image Synthesis via Layered Spectral Absorption Projection-Guided Diffusion Model AAAI 2026
Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration ACL 2026
Event-Guided Scene Text Image Super-Resolution AAAI 2026
VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery EACL 2026
MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization ACL 2026
RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models ICML 2025
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration CVPR 2025
Diff-Palm: Realistic Palmprint Generation with Polynomial Creases and Intra-Class Variation Controllable Diffusion Models CVPR 2025
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification CVPR 2025
A Query-Response Framework for Whole-Page Complex-Layout Document Image Translation with Relevant Regional Concentration ACL 2025
Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation ACL 2025
A Self-Improving Method for Generating Descriptions of Financial Data Quality Grading Using LLMs EMNLP 2025
Unified Adversarial Augmentation for Improving Palmprint Recognition ICCV 2025
SimulPL: Aligning Human Preferences in Simultaneous Machine Translation ICLR 2025
E4: Energy-Efficient DNN Inference for Edge Video Analytics via Early Exiting and DVFS AAAI 2025
PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks AAAI 2025
Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing AAAI 2025
Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification EMNLP 2025
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models ACL 2025
Improving MLLM’s Document Image Machine Translation via Synchronously Self-reviewing Its OCR Proficiency ACL 2025
PresentAgent: Multimodal Agent for Presentation Video Generation EMNLP 2025
Data-Efficiently Learn Large Language Model for Universal 3D Scene Perception NAACL 2025
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs EMNLP 2025
SHIFT: Selected Helpful Informative Frame for Video-guided Machine Translation EMNLP 2025
Permutative Preference Alignment from Listwise Ranking of Human Judgments EMNLP 2025
3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery ICLR 2025
Seeing Symbols, Missing Cultures: Probing Vision-Language Models’ Reasoning on Fire Imagery and Cultural Meaning EMNLP 2025
VideoAuteur: Towards Long Narrative Video Generation ICCV 2025
How Far Is Video Generation from World Model: A Physical Law Perspective ICML 2025
Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation ACL 2025
Analyzing the Rapid Generalization of SFT via the Perspective of Attention Head Activation Patterns ACL 2025
Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection ACL 2025
LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection ACL 2025
A Simple-Yet-Efficient Instruction Augmentation Method for Zero-Shot Sentiment Classification COLING 2025
TriFine: A Large-Scale Dataset of Vision-Audio-Subtitle for Tri-Modal Machine Translation and Benchmark with Fine-Grained Annotated Tags COLING 2025
From Chaotic OCR Words to Coherent Document: A Fine-to-Coarse Zoom-Out Network for Complex-Layout Document Image Translation COLING 2025
Occult: Optimizing Collaborative Communications across Experts for Accelerated Parallel MoE Training and Inference ICML 2025
MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game IJCAI 2025
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices ECCV 2024
Extending Multi-modal Contrastive Representations NIPS 2024
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers NIPS 2024
Stereo Vision Conversion from Planar Videos Based on Temporal Multiplane Images AAAI 2024
PCE-Palm: Palm Crease Energy Based Two-Stage Realistic Pseudo-Palmprint Generation AAAI 2024
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA ACL 2024
Causal-Guided Active Learning for Debiasing Large Language Models ACL 2024
Incorporating Syntax and Lexical Knowledge to Multilingual Sentiment Classification on Large Language Models ACL 2024
Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning ACL 2024
Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling NAACL 2024
Born a BabyNet with Hierarchical Parental Supervision for End-to-End Text Image Machine Translation COLING 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance CVPR 2024
Deep Video Inverse Tone Mapping Based on Temporal Clues CVPR 2024
OHTA: One-shot Hand Avatar via Data-driven Implicit Priors CVPR 2024
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs CVPR 2024
Instruct-Imagen: Image Generation with Multi-modal Instruction CVPR 2024
When Will Gradient Regularization Be Harmful? ICML 2024
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion ICML 2024
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding EMNLP 2024
Image Understanding Makes for A Good Tokenizer for Image Generation NIPS 2024
De novo Drug Design using Reinforcement Learning with Multiple GPT Agents NIPS 2023
LayoutDIT: Layout-Aware End-to-End Document Image Translation with Multi-Step Conductive Decoder EMNLP 2023
Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond ICCV 2023
A Simple Yet Strong Domain-Agnostic De-bias Method for Zero-Shot Sentiment Classification ACL 2023
DATE: Domain Adaptive Product Seeker for E-Commerce CVPR 2023
Scene-robust Natural Language Video Localization via Learning Domain-invariant Representations ACL 2023
Multilingual Knowledge Graph Completion with Language-Sensitive Multi-Graph Attention ACL 2023
CoopInit: Initializing Generative Adversarial Networks via Cooperative Learning AAAI 2023
Revisiting the Stack-Based Inverse Tone Mapping CVPR 2023
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding ICCV 2023
RPG-Palm: Realistic Pseudo-data Generation for Palmprint Recognition ICCV 2023
Connecting Multi-modal Contrastive Representations NIPS 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding EMNLP 2023
Towards Informative Open-ended Text Generation with Dynamic Knowledge Triples EMNLP 2023
CCIM: Cross-modal Cross-lingual Interactive Image Translation EMNLP 2023
A Simple Yet Effective Hybrid Pre-trained Language Model for Unsupervised Sentence Acceptability Prediction IJCNLP 2022
Calibrating CNNs for Few-Shot Meta Learning WACV 2022
A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning NAACL 2022
Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation ICLR 2022
A Simple Yet Effective Hybrid Pre-trained Language Model for Unsupervised Sentence Acceptability Prediction AACL 2022
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning ICML 2022
Rethinking Deep Face Restoration CVPR 2022
Towards Effective Multi-Modal Interchanges in Zero-Resource Sounding Object Localization NIPS 2022
HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark ICLR 2021
Synchronous Interactive Decoding for Multilingual Neural Machine Translation AAAI 2021
Cascaded Prediction Network via Segment Tree for Temporal Video Grounding CVPR 2021
Unpaired Image-to-Image Translation via Latent Energy Transport CVPR 2021
Rethinking Sentiment Style Transfer EMNLP 2021
Benchmark Platform for Ultra-Fine-Grained Visual Categorization Beyond Human Performance ICCV 2021
Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling ICLR 2021
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training NIPS 2020
Bayesian Meta Sampling for Fast Uncertainty Adaptation ICLR 2020
Variance Reduction in Stochastic Particle-Optimization Sampling ICML 2020
Feature Quantization Improves GAN Training ICML 2020
Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning EMNLP 2020
Q-learning with Language Model for Edit-based Unsupervised Summarization EMNLP 2020
Structure-Aware Human-Action Generation ECCV 2020
A Flexible Recurrent Residual Pyramid Network for Video Frame Interpolation ECCV 2020
Learning From Multi-Dimensional Partial Labels IJCAI 2020
Knowledge Graphs Enhanced Neural Machine Translation IJCAI 2020
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences CVPR 2020
Knowledge Graph Enhanced Neural Machine Translation via Multi-task Learning on Sub-entity Granularity COLING 2020
Deconstruct to Reconstruct a Configurable Evaluation Metric for Open-Domain Dialogue Systems COLING 2020
CASIA’s System for IWSLT 2020 Open Domain Translation ACL 2020
Patchy Image Structure Classification Using Multi-Orientation Region Transform AAAI 2020
Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions AAAI 2020
Discriminative and Correlative Partial Multi-Label Learning IJCAI 2019
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator EMNLP 2019
E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings NIPS 2019
Self-Adversarially Learned Bayesian Sampling AAAI 2019
Unsupervised Rewriter for Multi-Sentence Compression ACL 2019
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator IJCNLP 2019
Addressing the Under-Translation Problem from the Entropy Perspective AAAI 2019
A Language Model based Evaluator for Sentence Compression ACL 2018
Phrase Table as Recommendation Memory for Neural Machine Translation IJCAI 2018
Multispectral Image Intrinsic Decomposition via Subspace Constraint CVPR 2018
Addressing Troublesome Words in Neural Machine Translation EMNLP 2018
A Conditional Variational Framework for Dialog Generation ACL 2017
Towards Neural Machine Translation with Partially Aligned Corpora IJCNLP 2017
Automatic Spatially-Aware Fashion Concept Discovery ICCV 2017