Yifan Peng
48 papers · 2013–2026 · 10 top CS/AI conferences
Achievements
Taxonomy Completionist (16)
Keyword Pioneer
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (10)
Cross-Pollinator (10)
Dynamic Duo (23)
Topic Evolution
Mega-Team (21)
Keyword Champion (3)
Keyword Collector (211)
Conference Pioneer
Prolific Year (10)
Trend Setter
Century Club (45)
Unstoppable (9)
Conferences
ACL (11)
CVPR (10)
INTERSPEECH (10)
NAACL (6)
AAAI (2)
EMNLP (2)
ICCV (2)
ICML (2)
WACV (2)
ICLR (1)
Keywords
automatic speech recognition (6)
medical imaging (6)
speech recognition (6)
contrastive learning (4)
large language model (4)
speech translation (4)
clinical text (3)
multi-label classification (3)
named entity recognition (3)
speech processing (3)
disease classification (3)
radiology report (3)
spoken language understanding (3)
self-supervised learning (3)
end-to-end asr (3)
multi-task learning (2)
language model (2)
model compression (2)
multimodal learning (2)
zero-shot learning (2)
Papers
A Disease-Aware Dual-Stage Framework for Chest X-ray Report Generation
AAAI 2026
MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation
ACL 2026
Improving Retrieval-Augmented Generation without Taxonomy-based Error Categorization
ACL 2026
ESPnet-SpeechLM: An Open Speech Language Model Toolkit
NAACL 2025
Natural Language Processing in Support of Evidence-based Medicine: A Scoping Review
ACL 2025
Glossy Object Reconstruction with Cost-effective Polarized Acquisition
CVPR 2025
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
CVPR 2025
Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues
CVPR 2025
Context-aware Dynamic Pruning for Speech Foundation Models
ICLR 2025
OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models
ICML 2025
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
NAACL 2025
Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization
AAAI 2025
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
NAACL 2025
Towards Robust Speech Representation Learning for Thousands of Languages
EMNLP 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
ACL 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models
INTERSPEECH 2024
Learned Scanpaths Aid Blind Panoramic Video Quality Assessment
CVPR 2024
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss
INTERSPEECH 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
INTERSPEECH 2024
MULTI-CONVFORMER: Extending Conformer with Multiple Convolution Kernels
INTERSPEECH 2024
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
NAACL 2024
CMU's IWSLT 2023 Simultaneous Speech Translation System
ACL 2023
Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
ICCV 2023
Attend Who Is Weak: Pruning-Assisted Medical Image Localization Under Sophisticated and Implicit Imbalances
WACV 2023
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models
INTERSPEECH 2023
Tensor decomposition for minimization of E2E SLU model toward on-device processing
INTERSPEECH 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
INTERSPEECH 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
INTERSPEECH 2023
Time-synchronous one-pass Beam Search for Parallel Online and Offline Transducers with Dynamic Block Training
INTERSPEECH 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
ACL 2023
Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses
ACL 2023
CMU's IWSLT 2022 Dialect Speech Translation System
ACL 2022
Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-Rays With Radiomics Using a Feedback Loop
WACV 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
ICML 2022
EchoGen: Generating Conclusions from Echocardiogram Notes
ACL 2022
Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR
INTERSPEECH 2022
Leveraging Deep Representations of Radiology Reports in Survival Analysis for Predicting Heart Failure Patient Mortality
NAACL 2021
Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction
NAACL 2021
Automatic recognition of abdominal lymph nodes from clinical text
EMNLP 2020
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
ACL 2020
Deep Optics for Single-Shot High-Dynamic-Range Imaging
CVPR 2020
Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology
CVPR 2019
Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets
ACL 2019
Depth and Transient Imaging With Compressive SPAD Array Cameras
CVPR 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays
CVPR 2018
ChestX-ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
CVPR 2017
Revisiting Cross-Channel Information Transfer for Chromatic Aberration Correction
ICCV 2017
Studying Relationships between Human Gaze, Description, and Computer Vision
CVPR 2013