conftrace_

Di Zhang

59 papers · 2023–2026 · 12 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (12) 🐝 Cross-Pollinator (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🏆 Grand Slam 👑 Triple Crown 🤝 Dynamic Duo (20) 🔬 Deep Specialist (11) Prolific Year (16) 🚀 Conference Pioneer 🗃️ Keyword Collector (244) 💎 Century Club (54)

Conferences

CVPR (10) ACL (8) EMNLP (8) ICCV (8) ICLR (7) AAAI (4) ICML (4) COLING (3) NIPS (3) NAACL (2) CORL (1) MICCAI (1)

Papers

FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion AAAI 2026
Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings AAAI 2026
From Detection to Understanding: Multi-Turn Reasoning for Video Misinformation Analysis ACL 2026
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts ACL 2026
TIME: Temporal-Sensitive Multi-Dimensional Instruction Tuning and Robust Benchmarking for Video-LLMs AAAI 2026
iMOVE: Instance-Motion-Aware Video Understanding ACL 2025
Stable Segment Anything Model ICLR 2025
KineDex: Learning Tactile-Informed Visuomotor Policies via Kinesthetic Teaching for Dexterous Manipulation CORL 2025
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area AAAI 2025
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models ACL 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation ACL 2025
How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach ICCV 2025
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video ICCV 2025
GameFactory: Creating New Games with Generative Interactive Videos ICCV 2025
Imbalance in Balance: Online Concept Balancing in Generation Models ICCV 2025
GGTalker: Talking Head Systhesis with Generalizable Gaussian Priors and Identity-Specific Adaptation ICCV 2025
FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention ICCV 2025
MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion ICCV 2025
Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification ICCV 2025
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model ICLR 2025
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ICLR 2025
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control ICLR 2025
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints ICLR 2025
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation ICLR 2025
MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding ICML 2025
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment ICML 2025
CERTAIN: Context Uncertainty-aware One-Shot Adaptation for Context-based Offline Meta Reinforcement Learning ICML 2025
LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search NAACL 2025
Chain-of-Specificity: Enhancing Task-Specific Constraint Adherence in Large Language Models COLING 2025
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models COLING 2025
SketchVideo: Sketch-based Video Generation and Editing CVPR 2025
StyleMaster: Stylize Your Video with Artistic Generation and Translation CVPR 2025
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content CVPR 2025
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning CVPR 2025
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution CVPR 2025
GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections CVPR 2025
Libra-Merging: Importance-redundancy and Pruning-merging Trade-off for Acceleration Plug-in in Large Vision-Language Model CVPR 2025
Towards Precise Scaling Laws for Video Diffusion Transformers CVPR 2025
Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation CVPR 2025
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs EMNLP 2025
SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin EMNLP 2025
Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models EMNLP 2025
Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs COLING 2024
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors EMNLP 2024
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models ACL 2024
Learning Multi-Dimensional Human Preference for Text-to-Image Generation CVPR 2024
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization ICLR 2024
Focus On What Matters: Separated Models For Visual-Based RL Generalization NIPS 2024
Hierarchical multiple instance learning for COPD grading with relatively specific similarity MICCAI 2024
DialogBench: Evaluating LLMs as Human-like Dialogue Systems NAACL 2024
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming EMNLP 2024
VideoTetris: Towards Compositional Text-to-Video Generation NIPS 2024
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint ACL 2024
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization ICML 2024
Just Ask One More Time! Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios ACL 2024
Evaluating Readability and Faithfulness of Concept-based Explanations EMNLP 2024
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector EMNLP 2024
Inductive-Deductive Strategy Reuse for Multi-Turn Instructional Dialogues EMNLP 2024
How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization NIPS 2023