Shimin Tao
47 papers · 2019–2026 · 9 top CS/AI conferences
Achievements
🌍 Conference Polyglot (9)
🧭 Keyword Pioneer
🌈 Renaissance Researcher (5)
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (6)
🐝 Cross-Pollinator (13)
🗺️ Taxonomy Completionist (74)
🏠 Conference Loyalist (20)
🏆 Keyword Champion (3)
🤝 Dynamic Duo (40)
🔬 Deep Specialist (20)
🧬 Topic Evolution
🗃️ Keyword Collector (180)
💎 Century Club (41)
🔥 Unstoppable (7)
❓ The Questioner
⚡ Prolific Year (15)
Conferences
EMNLP (20)
ACL (11)
AAAI (6)
INTERSPEECH (3)
COLING (2)
NAACL (2)
AACL (1)
IJCAI (1)
SEMEVAL (1)
Keywords
machine translation (19)
quality estimation (10)
neural machine translation (9)
large language model (8)
automatic speech recognition (4)
document-level translation (4)
named entity recognition (4)
knowledge distillation (3)
multitask learning (3)
text generation (3)
speech recognition (3)
bottleneck adapter (3)
word-level prediction (3)
model compression (2)
pseudo labeling (2)
multi-task learning (2)
ensemble learning (2)
domain adaptation (2)
prompt engineering (2)
data augmentation (2)
Papers
MIDB: Multilingual Instruction Data Booster for Enhancing Cultural Equality in Multilingual Instruction Synthesis
AAAI 2026
ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences via Tournament Graph Reconstruction
AAAI 2026
Measuring the Unmeasurable: Unveiling Latent Cognitive Capabilities of LLM
AAAI 2026
The “Knowledge–Behavior Gap” in Cultural Taboo Safety of Large Language Models
ACL 2026
The GaoYao Benchmark: A Comprehensive Framework for Evaluating Multilingual and Multicultural Abilities of Large Language Models
ACL 2026
DeReA: Improving Idiom Translation with Detect-Retrieve-Arbitrate Reasoning
ACL 2026
Taming Text-to-Image Synthesis for Novices: User-centric Prompt Generation via Multi-turn Guidance
EMNLP 2025
SRDC: Semantics-based Ransomware Detection and Classification with LLM-assisted Pre-training
AAAI 2025
Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement
ACL 2025
M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models
EMNLP 2025
HW-TSC at TextGraphs-17 Shared Task: Enhancing Inference Capabilities of LLMs with Knowledge Graphs
ACL 2024
Using Large Language Model for End-to-End Chinese ASR and NER
INTERSPEECH 2024
DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators
EMNLP 2024
Evaluation Dataset for Lexical Translation Consistency in Chinese-to-English Document-level Translation
COLING 2024
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
EMNLP 2024
Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models
AAAI 2024
A Multitask Training Approach to Enhance Whisper with Open-Vocabulary Keyword Spotting
INTERSPEECH 2024
Denoising Pre-training for Machine Translation Quality Estimation with Curriculum Learning
AAAI 2023
Lexical Translation Inconsistency-Aware Document-Level Translation Repair
ACL 2023
The HW-TSC’s Speech-to-Speech Translation System for IWSLT 2023
ACL 2023
Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search
EMNLP 2023
SmartSpanNER: Making SpanNER Robust in Low Resource Scenarios
EMNLP 2023
Empowering a Metric with LLM-assisted Named Entity Annotation: HW-TSC’s Submission to the WMT23 Metrics Shared Task
EMNLP 2023
Unify Word-level and Span-level Tasks: NJUNLP’s Participation for the WMT2023 Quality Estimation Shared Task
EMNLP 2023
HW-TSC’s Participation in the WMT 2023 Automatic Post Editing Shared Task
EMNLP 2023
WhiSLU: End-to-End Spoken Language Understanding with Whisper
INTERSPEECH 2023
The HW-TSC’s Speech to Speech Translation System for IWSLT 2022 Evaluation
ACL 2022
The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation
ACL 2022
The HW-TSC’s Offline Speech Translation System for IWSLT 2022 Evaluation
ACL 2022
Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference
ACL 2022
Modeling Consistency Preference via Lexical Chains for Document-level Neural Machine Translation
EMNLP 2022
Exploring Robustness of Machine Translation Metrics: A Study of Twenty-Two Automatic Metrics in the WMT22 Metric Task
EMNLP 2022
Partial Could Be Better than Whole. HW-TSC 2022 Submission for the Metrics Shared Task
EMNLP 2022
NJUNLP’s Participation for the WMT2022 Quality Estimation Shared Task
EMNLP 2022
CrossQE: HW-TSC 2022 Submission for the Quality Estimation Shared Task
EMNLP 2022
HW-TSC’s Submission for the WMT22 Efficiency Task
EMNLP 2022
HwTscSU’s Submissions on WAT 2022 Shared Task
COLING 2022
Part Represents Whole: Improving the Evaluation of Machine Translation System Using Entropy Enhanced Metrics
AACL 2022
Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints
NAACL 2022
HW-TSC at SemEval-2022 Task 7: Ensemble Model Based on Pretrained Models for Identifying Plausible Clarifications
NAACL 2022
HW-TSC at SemEval-2022 Task 7: Ensemble Model Based on Pretrained Models for Identifying Plausible Clarifications
SEMEVAL 2022
HW-TSC’s Participation at WMT 2021 Quality Estimation Shared Task
EMNLP 2021
HW-TSC’s Participation in the WMT 2021 Efficiency Shared Task
EMNLP 2021
How Length Prediction Influence the Performance of Non-Autoregressive Translation?
EMNLP 2021
HW-TSC’s Participation at WMT 2020 Automatic Post Editing Shared Task
EMNLP 2020
HW-TSC’s Participation at WMT 2020 Quality Estimation Shared Task
EMNLP 2020
LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs
IJCAI 2019