Papers
5,479 papers found
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
Kaizhi Zheng, Xiaotong Chen, Xuehai He et al.
LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases
Armin Toroghi, Ali Pesaranghader, Tanmana Sadhu et al.
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Ke Yang, Yao Liu, Sapana Chaudhary et al.
Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents
BOLUN SUN, Yifan Zhou, Haiyun Jiang
BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge
Terry Tong, Fei Wang, Zhe Zhao et al.
Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems
Ruochen Jiao, Shaoyuan Xie, Justin Yue et al.
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Hadas Orgad, Michael Toker, Zorik Gekhman et al.
Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
Zhen Yang, Ziwei Du, Minghan Zhang et al.
Learning LLM-as-a-Judge for Preference Alignment
Ziyi Ye, Xiangsheng Li, Qiuchi Li et al.
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang, Yufei Wang, Tiezheng YU et al.
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Hsun-Yu Kuo, Yin-Hsiang Liao, Yu-Chieh Chao et al.
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
Amaia Cardiel, Eloi Zablocki, Elias Ramzi et al.
How efficient is LLM-generated code? A rigorous & high-standard benchmark
Ruizhong Qiu, Weiliang Will Zeng, James Ezick et al.
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang, Jingyuan Huang, Kai Mei et al.
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
Stefan Sylvius Wagner, Maike Behrendt, Marc Ziegele et al.
Training-free LLM-generated Text Detection by Mining Token Probability Sequences
Yihuai Xu, Yongwei Wang, Yifei Bi et al.
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
Ziyang Li, Saikat Dutta, Mayur Naik
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Jiayi Ye, Yanbo Wang, Yue Huang et al.
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Yilun Hao, Yang Zhang, Chuchu Fan
From Commands to Prompts: LLM-based Semantic File System for AIOS
Zeru Shi, Kai Mei, Mingyu Jin et al.
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
Sijun Tan, Siyuan Zhuang, Kyle Montgomery et al.
Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection
Guangsheng Bao, Yanbin Zhao, Juncai He et al.
Improving Data Efficiency via Curating LLM-Driven Rating Systems
Jinlong Pang, Jiaheng Wei, Ankit Shah et al.
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
Guibin Zhang, Yanwei Yue, Zhixun Li et al.