Papers
5,479 papers found
Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling
Guangya Wan, Yuqi Wu, Jie Chen et al.
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments
Kung-Hsiang Huang, Akshara Prabhakar, Sidharth Dhawan et al.
An Efficient Gloss-Free Sign Language Translation Using Spatial Configurations and Motion Dynamics with LLMs
Eui Jun Hwang, Sukmin Cho, Junmyeong Lee et al.
Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication
Weicheng Ma, Hefan Zhang, Ivory Yang et al.
LLM4DistReconfig: A Fine-tuned Large Language Model for Power Distribution Network Reconfiguration
Panayiotis Christou, Md. Zahidul Islam, Yuzhang Lin et al.
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
Yifan Song, Guoyin Wang, Sujian Li et al.
ToolFlow: Boosting LLM Tool-Calling Through Natural and Coherent Dialogue Synthesis
Zezhong Wang, Xingshan Zeng, Weiwen Liu et al.
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression
Xin Wang, Samiul Alam, Zhongwei Wan et al.
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han, Liang Du, Hongwei Du et al.
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
Hanqing Wang, Yixia Li, Shuo Wang et al.
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
Wentao Ge, Shunian Chen, Hardy Chen et al.
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling
Yakun Zhu, Shaohang Wei, Xu Wang et al.
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao, Alessio Devoto, Giwon Hong et al.
DIRAS: Efficient LLM Annotation of Document Relevance for Retrieval Augmented Generation
Jingwei Ni, Tobias Schimanski, Meihong Lin et al.
My LLM might Mimic AAE - But When Should It?
Sandra Camille Sandoval, Christabel Acquaye, Kwesi Adu Cobbina et al.
CSEval: Towards Automated, Multi-Dimensional, and Reference-Free Counterspeech Evaluation using Auto-Calibrated LLMs
Amey Hengle, Aswini Kumar Padhi, Anil Bandhakavi et al.
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
Bang An, Shiyue Zhang, Mark Dredze
Arabic Dataset for LLM Safeguard Evaluation
Yasser Ashraf, Yuxia Wang, Bin Gu et al.
Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning
Rujing Yao, Yang Wu, Chenghao Wang et al.
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
Daniil Moskovskiy, Nikita Sushko, Sergey Pletenev et al.
Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities
Chung-En Sun, Xiaodong Liu, Weiwei Yang et al.
AEGIS2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails
Shaona Ghosh, Prasoon Varshney, Makesh Narsimhan Sreedhar et al.
Rethinking the Role of LLMs for Document-level Relation Extraction: a Refiner with Task Distribution and Probability Fusion
Fu Zhang, Xinlong Jin, Jingwei Cheng et al.
Model Surgery: Modulating LLM’s Behavior Via Simple Parameter Editing
Huanqian Wang, Yang Yue, Rui Lu et al.
CharacterBox: Evaluating the Role-Playing Capabilities of LLMs in Text-Based Virtual Worlds
Lei Wang, Jianxun Lian, Yi Huang et al.