Yinfei Yang
69 papers · 2012–2025 · 15 conferences · across top CS/AI conferences
Achievements
Academic Marathon (13)
Conference Polyglot (15)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (6)
Renaissance Researcher (10)
Taxonomy Completionist (92)
Topic Pioneer
Deep Specialist (11)
Mega-Team (29)
Keyword Champion (6)
Topic Evolution
Dynamic Duo (16)
Conference Pioneer
Keyword Collector (230)
Prolific Year (12)
Unstoppable (9)
Trend Setter
Century Club (69)
Conferences
EMNLP (14)
ACL (11)
ICLR (9)
EACL (6)
ICCV (5)
NAACL (5)
CVPR (4)
ECCV (3)
ICML (3)
IJCNLP (3)
WACV (2)
AAAI (1)
AACL (1)
CONLL (1)
IJCAI (1)
Keywords
transfer learning (9)
contrastive learning (7)
zero-shot learning (6)
dual encoder (6)
sentence embedding (6)
neural retrieval (5)
semantic similarity (5)
multimodal learning (5)
question answering (5)
domain adaptation (4)
information retrieval (4)
cross-lingual transfer (4)
cross-lingual retrieval (4)
representation learning (3)
neural machine translation (3)
information extraction (3)
text-to-image generation (3)
data augmentation (3)
machine translation (3)
vision-language model (3)
Papers
Improve Vision Language Model Chain-of-thought Reasoning
ACL 2025
MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
ICCV 2025
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
ICLR 2025
Contrastive Localized Language-Image Pre-Training
ICML 2025
Multimodal Autoregressive Pre-training of Large Vision Encoders
CVPR 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
ICCV 2025
CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling
EMNLP 2025
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
ICLR 2025
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
ICLR 2025
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
ICLR 2025
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
ICLR 2025
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
ECCV 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
ECCV 2024
On the Intractability to Synthesize Factual Inconsistencies in Summarization
EACL 2024
Guiding Instruction-based Image Editing via Multimodal Large Language Models
ICLR 2024
Ferret: Refer and Ground Anything Anywhere at Any Granularity
ICLR 2024
Compressing LLMs: The Truth is Rarely Pure and Never Simple
ICLR 2024
MOFI: Learning Image Representations from Noisy Entity Annotated Images
ICLR 2024
Empowering Unsupervised Domain Adaptation With Large-Scale Pre-Trained Vision-Language Models
WACV 2024
VeCLIP: Improving CLIP Training via Visual-enriched Captions
ECCV 2024
Perceptual Grouping in Contrastive Vision-Language Models
ICCV 2023
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
CVPR 2023
A New Path: Scaling Vision-and-Language Navigation With Synthetic Instructions and Imitation Learning
CVPR 2023
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens
EMNLP 2023
DocAsRef: An Empirical Study on Repurposing Reference-based Summary Quality Metrics as Reference-free Metrics
EMNLP 2023
Simple and Effective Synthesis of Indoor 3D Scenes
AAAI 2023
Robustness in Multimodal Learning under Train-Test Modality Mismatch
ICML 2023
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
ACL 2022
Language-agnostic BERT Sentence Embedding
ACL 2022
Large Dual Encoders Are Generalizable Retrievers
EMNLP 2022
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
NAACL 2022
LongT5: Efficient Text-To-Text Transformer for Long Sequences
NAACL 2022
A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations
EMNLP 2021
Multi-stage Training with Improved Negative Contrast for Neural Passage Retrieval
EMNLP 2021
Universal Sentence Representation Learning with Conditional Masked Language Model
EMNLP 2021
MURAL: Multimodal, Multitask Representations Across Languages
EMNLP 2021
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
EACL 2021
Cross-Modal Contrastive Learning for Text-to-Image Generation
CVPR 2021
MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
EACL 2021
Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation
EACL 2021
Pathdreamer: A World Model for Indoor Navigation
ICCV 2021
Text-to-Image Generation Grounded by Fine-Grained User Attention
WACV 2021
Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation
IJCNLP 2021
Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation
ACL 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
ICML 2021
Self-Supervised Learning for Pairwise Data Refinement
AACL 2020
LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool
EMNLP 2020
Multilingual Universal Sentence Encoder for Semantic Retrieval
ACL 2020
Learning a Multi-Domain Curriculum for Neural Machine Translation
ACL 2020
Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction
NAACL 2019
Hierarchical Document Encoder for Parallel Corpus Mining
ACL 2019
Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model
ACL 2019
PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification
EMNLP 2019
ReQA: An Evaluation for End-to-End Answer Retrieval Models
EMNLP 2019
PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification
IJCNLP 2019
Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
IJCAI 2019
Cross-Domain Review Helpfulness Prediction Based on Convolutional Neural Networks with Auxiliary Domain Discriminators
NAACL 2018
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
EMNLP 2018
Learning Semantic Textual Similarity from Conversations
ACL 2018
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature
ACL 2018
Syntactic Patterns Improve Information Extraction for Medical Search
NAACL 2018
Universal Sentence Encoder for English
EMNLP 2018
Aspect Extraction from Product Reviews Using Category Hierarchy Information
EACL 2017
Detecting (Un)Important Content for Single-Document News Summarization
EACL 2017
Semantic Analysis and Helpfulness Prediction of Text for Online Product Reviews
ACL 2015
Semantic Analysis and Helpfulness Prediction of Text for Online Product Reviews
IJCNLP 2015
Linking Named Entities to Any Database
EMNLP 2012
Linking Named Entities to Any Database
CONLL 2012