Ruihua Song
28 papers · 2015–2026 · 12 conferences · across top CS/AI conferences
Achievements
Academic Marathon (10)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (12)
Cross-Pollinator (13)
Taxonomy Completionist (49)
Topic Evolution
Prolific Year (11)
The Questioner (2)
Century Club (27)
Keyword Collector (134)
Conferences
ACL (7)
AAAI (3)
EMNLP (3)
INTERSPEECH (3)
COLING (2)
ICCV (2)
ICLR (2)
IJCAI (2)
CVPR (1)
NAACL (1)
NIPS (1)
WACV (1)
Keywords
large language model (4)
multimodal learning (4)
preference optimization (3)
retrieval-augmented generation (3)
instruction following (2)
video generation (2)
automatic speech recognition (2)
dialogue system (2)
knowledge graph (2)
persuasive dialogue (2)
text generation (2)
multimodal large language model (2)
multi-agent system (2)
visual question answering (1)
robotic manipulation (1)
curriculum learning (1)
visual saliency (1)
imitation learning (1)
reinforcement learning (1)
natural language processing (1)
Papers
DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing
ACL 2026
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
ICCV 2025
EyEar: Learning Audio Synchronized Human Gaze Trajectory Based on Physics-Informed Dynamics
AAAI 2025
Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization
AAAI 2025
Transferring Foundation Models for Generalizable Robotic Manipulation
WACV 2025
VAFlow: Video-to-Audio Generation with Cross-Modality Flow Matching
ICCV 2025
What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning
COLING 2025
MuKA: Multimodal Knowledge Augmented Visual Information-Seeking
COLING 2025
Animate and Sound an Image
CVPR 2025
Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation
ICLR 2025
Towards Effective and Efficient Continual Pre-training of Large Language Models
ACL 2025
Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation
ACL 2025
Persuading across Diverse Domains: a Dataset and Persuasion Large Language Model
ACL 2024
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models
ACL 2024
BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain
EMNLP 2024
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
NAACL 2024
Joint Semantic and Strategy Matching for Persuasive Dialogue
EMNLP 2023
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing
AAAI 2023
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment
ICLR 2023
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios
INTERSPEECH 2023
A Multi-Modal Knowledge Graph for Classical Chinese Poetry
EMNLP 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
NIPS 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
INTERSPEECH 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
INTERSPEECH 2022
ScriptWriter: Narrative-Guided Script Generation
ACL 2020
Composing a Picture Book by Automatic Story Understanding and Visualization
ACL 2019
Understanding People Lifestyles: Construction of Urban Movement Knowledge Graph from GPS Trajectory
IJCAI 2017
Mobile Query Recommendation via Tensor Function Learning
IJCAI 2015