Papers
LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation
Ming Zhang, Yujiong Shen, Zelin Li et al.
LlmFixer: Fix the Helpfulness of Defensive Large Language Models
Zelong Yu, Xiaoming Zhang, Litian Zhang et al.
LLM-Guided Co-Training for Text Classification
Md Mezbaur Rahman, Cornelia Caragea
LLM-Guided Semantic Relational Reasoning for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Yifan Wang et al.
LLM-Independent Adaptive RAG: Let the Question Speak for Itself
Maria Marina, Nikolay Ivanov, Sergey Pletenev et al.
LLMInit: A Free Lunch from Large Language Models for Selective Initialization of Recommendation
Weizhi Zhang, Liangwei Yang, Wooseong Yang et al.
LLM Jailbreak Detection for (Almost) Free!
Guorui Chen, Yifan Xia, Xiaojun Jia et al.
LLM×MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System
Yu Chao, Siyu Lin, Xiaorong Wang et al.
LLM-OREF: An Open Relation Extraction Framework Based on Large Language Models
Hongyao Tu, Liang Zhang, Yujie Lin et al.
LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition
Fan Bai, Hamid Hassanzadeh, Ardavan Saeedi et al.
LLMs are Privacy Erasable
Zipeng Ye, Wenjian Luo
LLMs as annotators of argumentation
Anna Lindahl
LLMs as World Models: Data-Driven and Human-Centered Pre-Event Simulation for Disaster Impact Assessment
Lingyao Li, Dawei Li, Zhenhui Ou et al.
LLMs Behind the Scenes: Enabling Narrative Scene Illustration
Melissa Roemmele, John Joon Young Chung, Taewook Kim et al.
LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita, Jay Gala, Abdelrahman Mohamed et al.
LLMs cannot spot math errors, even when allowed to peek into the solution
Kv Aditya Srivatsa, Kaushal Kumar Maurya, Ekaterina Kochmar
LLMs Don’t Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
Harry Mayne, Ryan Othniel Kearns, Yushi Yang et al.
LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
Rushil Gupta, Jason Hartford, Bang Liu
LLMs on a Budget? Say HOLA
Zohaib Hasan Siddiqui, Jiechao Gao, Ebad Shabbir et al.
LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts
Junhao Chen, Jingbo Sun, Xiang Li et al.
LLMs Reproduce Stereotypes of Sexual and Gender Minorities
Ruby Ostrow, Adam Lopez
LM2Protein: A Structure-to-Token Protein Large Language Model
Chang Zhou, Yuheng Shan, Pengan Chen et al.
LMR-BENCH: Evaluating LLM Agent’s Ability on Reproducing Language Modeling Research
Shuo Yan, Ruochen Li, Ziming Luo et al.
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
Yuxuan Hu, Jihao Liu, Ke Wang et al.