Tsu-Jui Fu
32 papers · 2018–2025 · 12 conferences · across top CS/AI conferences
Achievements
Academic Marathon (7)
Conference Polyglot (12)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (10)
Renaissance Researcher (9)
Taxonomy Completionist (71)
Dynamic Duo (18)
Deep Specialist (11)
Unstoppable (8)
The Questioner
Century Club (32)
Keyword Collector (155)
Prolific Year (8)
Conferences
EMNLP (7)
CVPR (4)
NIPS (4)
AAAI (3)
ACL (3)
EACL (2)
ECCV (2)
ICCV (2)
ICLR (2)
CORL (1)
IJCNLP (1)
NAACL (1)
Keywords
multimodal learning (6)
video understanding (3)
multi-modal learning (3)
text-to-image generation (3)
diffusion model (3)
image editing (3)
text-to-video generation (3)
video generation (3)
text-guided generation (2)
vision-language navigation (2)
vision-and-language navigation (2)
deep reinforcement learning (2)
embodied agent (2)
named entity recognition (2)
self-supervised learning (2)
representation learning (2)
in-context learning (2)
cross-modal learning (2)
generative model (2)
large language model (2)
Papers
TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation
ACL 2025
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
ICCV 2025
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
AAAI 2024
Guiding Instruction-based Image Editing via Multimodal Large Language Models
ICLR 2024
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
NIPS 2024
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
EMNLP 2023
EDIS: Entity-Driven Image Search over Multimodal Web Content
EMNLP 2023
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
ICLR 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
NIPS 2023
An Empirical Study of End-to-End Video-Language Transformers With Masked Visual Modeling
CVPR 2023
Tell Me What Happened: Unifying Text-Guided Video Completion via Multimodal Masked Video Generation
CVPR 2023
PHOTOSWAP: Personalized Subject Swapping in Images
NIPS 2023
Text-guided 3D Human Generation from 2D Collections
EMNLP 2023
CPL: Counterfactual Prompt Learning for Vision and Language Models
EMNLP 2022
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents
AAAI 2022
ULN: Towards Underspecified Vision-and-Language Navigation
EMNLP 2022
M3L: Language-Based Video Editing via Multi-Modal Multi-Level Transformers
CVPR 2022
Language-Driven Artistic Style Transfer
ECCV 2022
Semi-Supervised Policy Initialization for Playing Games with Language Hints
NAACL 2021
H-FND: Hierarchical False-Negative Denoising for Distant Supervision Relation Extraction
ACL 2021
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
EACL 2021
L2C: Describing Visual Differences Needs Semantic Understanding of Individuals
EACL 2021
H-FND: Hierarchical False-Negative Denoising for Distant Supervision Relation Extraction
IJCNLP 2021
SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning
EMNLP 2020
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
ECCV 2020
Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER
AAAI 2020
GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction
ACL 2019
Adversarial Active Exploration for Inverse Dynamics Model Learning
CORL 2019
Speed Reading: Learning to Read ForBackward via Shuttle
EMNLP 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
NIPS 2018
Dynamic Video Segmentation Network
CVPR 2018