Papers
VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought
Eunsoo Lee, Jeongwoo Lee, Minki Hong et al.
Vision-Language Models Align with Human Neural Representations in Concept Processing
Anna Bavaresco, Marianne De Heer Kloots, Sandro Pezzelle et al.
Visual–Linguistic Abductive Reasoning with LLMs for Knowledge-based Visual Question Answering
Jieun Kim, Yujin Jeong, Sung-Bae Cho
VN-MTEB: Vietnamese Massive Text Embedding Benchmark
Loc Pham, Tung Luu, Thu Vo et al.
VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy
Yu Cui, Sicheng Pan, Yifei Liu et al.
Weakly-supervised Argument Mining with Boundary Refinement and Relation Denoising
Wei Sun, Mingxiao Li, Jesse Davis et al.
Weakly Supervised Named Entity Recognition for Historical Texts
Marco Sorbi, Laurent Moccozet, Stephane Marchand-Maillet
“We Are (Language) Family”: Adapting Transformer models to related minority languages with linguistic data
Miguel López-Otal, Jorge Gracia
We Are What We Repeatedly Do: Improving Long Context Instruction Following
Preston K Robinette, Andrew Hard, Swaroop Ramaswamy et al.
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution
Liangtao Lin, Jun Zheng, Haidong Wang
WebRollback: Enhancing Web Agents with Explicit Rollback Mechanisms
Zhisong Zhang, Tianqing Fang, Kaixin Ma et al.
What Breaks Knowledge Graph based RAG? Benchmarking and Empirical Insights into Reasoning under Incomplete Knowledge
Dongzhuoran Zhou, Yuqicheng Zhu, Xiaxia Wang et al.
What Does Infect Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs
Xinlan Yan, Di Wu, Yibin Lei et al.
What does Surprisal have to do with Information Status?
Andrew Thomas Dyer
What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance
William Watson, Nicole Cho, Sumitra Ganesh et al.
What Matters to an LLM? Behavioral and Computational Evidences from Summarization
Yongxin Zhou, Changshun Wu, Philippe Mulhem et al.
What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects
Naihao Deng, Sheng Zhang, Henghui Zhu et al.
What’s Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
Zhaotian Weng, Haoxuan Li, Xin Eric Wang et al.
What the Router Sees Matters: Funnel Pooling for Fast, Content Driven Expert Routing
Josef Pichlmeier, Sebastian Nicolas Mueller, Jakob Sturm et al.
When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation
Xunyi Jiang, Dingyi Chang, Julian McAuley et al.
When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
Abeer Badawi, Elahe Rahimi, Md Tahmid Rahman Laskar et al.
When Does Auxiliary Modality Matter in Solving Geometric Problems? A Comprehensive Study of Textual, Formal, and Visual Modalities
Hyuk Namgoong, Jeesu Jung, Yerim Han et al.
When Do Language Models Endorse Limitations on Human Rights Principles?
Keenan Samway, Miu Nicole Takagi, Rada Mihalcea et al.
When Flores Bloomz Wrong: Cross-Direction Contamination in Machine Translation Evaluation
David Tan, Pinzhen Chen, Josef Van Genabith et al.