Bowen Shi
37 papers · 2019–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (16)
Hot Topic Early Bird
Deep Specialist (10)
Keyword Champion (2)
Dynamic Duo (13)
Grand Slam
Keyword Collector (137)
Prolific Year (8)
Trend Setter
Century Club (35)
Unstoppable (7)
Conferences
INTERSPEECH (8)
ACL (7)
ICLR (4)
CVPR (3)
EMNLP (3)
ICML (3)
NIPS (3)
ECCV (2)
ICCV (2)
AAAI (1)
JMLR (1)
Keywords
video understanding (5)
sign language translation (4)
self-supervised learning (4)
representation learning (3)
audio-visual speech recognition (3)
sign language recognition (3)
zero-shot learning (3)
speech recognition (3)
american sign language (3)
multimodal learning (3)
speech generation (2)
audio-visual speech (2)
vision transformer (2)
speech translation (2)
automatic speech recognition (2)
model quantization (2)
model compression (2)
multi-task learning (2)
speech synthesis (2)
self-supervised pretraining (2)
Papers
Profiling-Free Mixed-Precision Quantization for MoE LLMs via Fuzzy Rule Interpolation
ACL 2026
CT-FineBench: A Diagnostic Fidelity Benchmark for Fine-Grained Evaluation of CT Report Generation
ACL 2026
METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models
ICCV 2025
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
ACL 2025
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
ICML 2024
BarLeRIa: An Efficient Tuning Framework for Referring Image Segmentation
ICLR 2024
Scaling Speech Technology to 1,000+ Languages
JMLR 2024
Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners
ICLR 2024
Generative Pre-training for Speech with Flow Matching
ICLR 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
INTERSPEECH 2024
Towards Privacy-Aware Sign Language Translation at Scale
ACL 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
ACL 2024
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation
ICML 2024
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
ECCV 2024
Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation
AAAI 2023
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
NIPS 2023
Adapting Shortcut With Normalizing Flow: An Efficient Tuning Framework for Visual Recognition
CVPR 2023
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration
CVPR 2023
TTIC's Submission to WMT-SLT 23
EMNLP 2023
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning
ICML 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
INTERSPEECH 2023
Expresso: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis
INTERSPEECH 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
NIPS 2023
TTIC's WMT-SLT 22 Sign Language Translation System
EMNLP 2022
Open-Domain Sign Language Translation Learned from Online Video
EMNLP 2022
A Transformer-Based Decoder for Semantic Segmentation with Multi-level Context Mining
ECCV 2022
Robust Self-Supervised Audio-Visual Speech Recognition
INTERSPEECH 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
INTERSPEECH 2022
Searching for fingerspelled content in American Sign Language
ACL 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
ICLR 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
NIPS 2022
Fingerspelling Detection in American Sign Language
CVPR 2021
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling
INTERSPEECH 2020
A Cross-Task Analysis of Text Span Representations
ACL 2020
Compression of Acoustic Event Detection Models with Quantized Distillation
INTERSPEECH 2019
On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval
INTERSPEECH 2019
Fingerspelling Recognition in the Wild With Iterative Visual Attention
ICCV 2019