Hao Fei
97 papers · 2020–2026 · 14 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (12)
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (14)
Hot Topic Early Bird
Conference Loyalist (22)
Grand Slam
Keyword Champion (3)
Dynamic Duo (31)
Mega-Team (32)
Deep Specialist (25)
Topic Evolution
The Questioner (2)
Prolific Year (31)
Keyword Collector (390)
Trend Setter
Century Club (93)
Conference Pioneer
Unstoppable (6)
Conferences
ACL (24)
AAAI (16)
EMNLP (13)
NIPS (11)
ICML (7)
COLING (5)
IJCAI (5)
CVPR (4)
ICCV (4)
ICLR (2)
IJCNLP (2)
NAACL (2)
AACL (1)
SEMEVAL (1)
Keywords
large language model (19)
multimodal learning (13)
multimodal large language model (7)
scene graph (6)
graph neural network (6)
visual reasoning (6)
sentiment analysis (5)
zero-shot learning (5)
semantic role labeling (5)
named entity recognition (5)
vision-language model (5)
instruction tuning (4)
transfer learning (4)
diffusion model (4)
relation extraction (4)
vision language model (4)
syntactic parsing (4)
attention mechanism (4)
text generation (4)
dependency parsing (4)
Papers
Dynamic Emotion and Personality Profiling for Multimodal Deception Detection
ACL 2026
Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment
ACL 2026
DragNeXt: Rethinking Drag-Based Image Editing
AAAI 2026
Orthogonal Spatial-temporal Distributional Transfer for 4D Generation
AAAI 2026
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
ICML 2025
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
CVPR 2025
Universal Scene Graph Generation
CVPR 2025
Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models
ICML 2025
David vs. Goliath: Cost-Efficient Financial QA via Cascaded Multi-Agent Reasoning
EMNLP 2025
InTriage: Intelligent Telephone Triage in Pre-Hospital Emergency Care
EMNLP 2025
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
ICLR 2025
Towards Semantic Equivalence of Tokenization in Multimodal LLM
ICLR 2025
Where, What, Why: Towards Explainable Driver Attention Prediction
ICCV 2025
Enhancing Hyperbole and Metaphor Detection with Their Bidirectional Dynamic Interaction and Emotion Knowledge
ACL 2025
Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework
ACL 2025
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models
ACL 2025
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
AAAI 2025
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
AAAI 2025
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
AAAI 2025
Divide-Solve-Combine: An Interpretable and Accurate Prompting Framework for Zero-shot Multi-Intent Detection
AAAI 2025
Multi-Granular Multimodal Clue Fusion for Meme Understanding
AAAI 2025
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology
ICCV 2025
PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting
ICCV 2025
Improving Consistency Identification in Task-oriented Dialogue Through Multi-Agent Collaboration
IJCAI 2025
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
ICCV 2025
When Words Smile: Generating Diverse Emotional Facial Expressions from Text
EMNLP 2025
CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs
EMNLP 2025
On Path to Multimodal Generalist: General-Level and General-Bench
ICML 2025
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
NIPS 2024
Synergistic Dual Spatial-aware Generation of Image-to-text and Text-to-image
NIPS 2024
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
ICML 2024
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
ICML 2024
What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration
NIPS 2024
Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction
AAAI 2024
Harnessing Holistic Discourse Features and Triadic Interaction for Sentiment Quadruple Extraction in Dialogues
AAAI 2024
Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought
AAAI 2024
NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations
SEMEVAL 2024
NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations
NAACL 2024
Actively Learn from LLMs with Uncertainty Propagation for Generalized Category Discovery
NAACL 2024
Unified Generative and Discriminative Training for Multi-modal Large Language Models
NIPS 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
NIPS 2024
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding
ACL 2024
Revisiting Structured Sentiment Analysis as Latent Dependency Graph Parsing
ACL 2024
Faithful Logical Reasoning via Symbolic Chain-of-Thought
ACL 2024
XNLP: An Interactive Demonstration System for Universal Structured NLP
ACL 2024
EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot
ACL 2024
Synergizing Large Language Models and Pre-Trained Smaller Models for Conversational Intent Discovery
ACL 2024
Recognizing Everything from All Modalities at Once: Grounded Multimodal Universal Information Extraction
ACL 2024
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
NIPS 2024
What Factors Influence LLMs' Judgments? A Case Study on Question Answering
COLING 2024
From Multimodal LLM to Human-level AI: Modality, Instruction, Reasoning, Efficiency and beyond
COLING 2024
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
CVPR 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
CVPR 2024
NExT-GPT: Any-to-Any Multimodal LLM
ICML 2024
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
NIPS 2024
A Survey of Ontology Expansion for Conversational Understanding
EMNLP 2024
Guided Knowledge Generation with Language Models for Commonsense Reasoning
EMNLP 2024
Divide and Conquer: Legal Concept-guided Criminal Court View Generation
EMNLP 2024
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
NIPS 2024
DiaASQ: A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis
ACL 2023
Generating Visual Spatial Description via Holistic 3D Scene Understanding
ACL 2023
Imagine That! Abstract-to-Intricate Text-to-Image Synthesis with Scene Graph Hallucination Diffusion
NIPS 2023
Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination
ACL 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs
NIPS 2023
MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter
EMNLP 2023
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment
ACL 2023
Constructing Code-mixed Universal Dependency Forest for Unbiased Cross-lingual Relation Extraction
ACL 2023
Reasoning Implicit Sentiment with Chain-of-Thought Prompting
ACL 2023
Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling
ACL 2023
Entity-centered Cross-document Relation Extraction
EMNLP 2022
Unified Named Entity Recognition as Word-Word Relation Classification
AAAI 2022
Mastering the Explicit Opinion-Role Interaction: Syntax-Aided Neural Transition System for Unified Opinion Role Labeling
AAAI 2022
Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages
ACL 2022
Effective Token Graph Modeling using a Novel Labeling Strategy for Structured Sentiment Analysis
ACL 2022
OneEE: A One-Stage Framework for Fast Overlapping and Nested Event Extraction
COLING 2022
Joint Alignment of Multi-Task Feature and Label Spaces for Emotion Cause Pair Extraction
COLING 2022
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model
NIPS 2022
Conversation Disentanglement with Bi-Level Contrastive Learning
EMNLP 2022
Matching Structure for Dual Learning
ICML 2022
Global Inference with Explicit Syntactic and Discourse Structures for Dialogue-Level Relation Extraction
IJCAI 2022
Conversational Semantic Role Labeling with Predicate-Oriented Latent Graph
IJCAI 2022
Inheriting the Wisdom of Predecessors: A Multiplex Cascade Framework for Unified Aspect-based Sentiment Analysis
IJCAI 2022
Better Combine Them Together! Integrating Syntactic Constituency and Dependency Representations for Semantic Role Labeling
ACL 2021
Encoder-Decoder Based Unified Semantic Role Labeling with Label-Aware Syntax
AAAI 2021
Rethinking Boundaries: End-To-End Recognition of Discontinuous Mentions with Pointer Networks
AAAI 2021
End-to-end Semantic Role Labeling with Neural Transition-based Model
AAAI 2021
Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Rich Syntactic Knowledge
IJCAI 2021
MRN: A Locally and Globally Mention-Based Reasoning Network for Document-Level Relation Extraction
ACL 2021
Better Combine Them Together! Integrating Syntactic Constituency and Dependency Representations for Semantic Role Labeling
IJCNLP 2021
MRN: A Locally and Globally Mention-Based Reasoning Network for Document-Level Relation Extraction
IJCNLP 2021
Retrofitting Structure-aware Transformer Language Model for End Tasks
EMNLP 2020
Latent Emotion Memory for Multi-Label Emotion Classification
AAAI 2020
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
ACL 2020
High-order Refining for End-to-end Chinese Semantic Role Labeling
AACL 2020
Mimic and Conquer: Heterogeneous Tree Structure Distillation for Syntactic NLP
EMNLP 2020
Improving Text Understanding via Deep Syntax-Semantics Communication
EMNLP 2020
Modeling Local Contexts for Joint Dialogue Act Recognition and Sentiment Classification with Bi-channel Dynamic Convolutions
COLING 2020