Papers
Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets
EMNLP 2025
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
ACL 2025
Reasoning or Memorization? Investigating LLMs’ Capability in Restoring Chinese Internet Homophones
ACL 2025
Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
EMNLP 2025