Papers
LLM Stinger: Jailbreaking LLMs Using RL Fine-Tuned LLMs (Student Abstract)
Piyush Jha, Arnav Arora, Vijay Ganesh
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Md Tahmid Rahman Laskar, Israt Jahan, Elham Dolatabadi et al.
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
Chansung Park, Juyong Jiang, Fan Wang et al.
An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4
Hui Huang, Xingyuan Bu, Hongli Zhou et al.
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
Eunsu Kim, Juyoung Suk, Seungone Kim et al.
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots
Dongge Han, Trevor McInroe, Adam Jelley et al.
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
Vyas Raina, Adian Liusie, Mark Gales
LLM-Evolve: Evaluation for LLM’s Evolving Capability on Benchmarks
Jiaxuan You, Mingjie Liu, Shrimai Prabhumoye et al.
LLM-as-a-tutor in EFL Writing Education: Focusing on Evaluation of Student-LLM Interaction
Jieun Han, Haneul Yoo, Junho Myung et al.
The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations
Yubo Zhu, Dongrui Liu, Zecheng Lin et al.
Dial-In LLM: Human-Aligned LLM-in-the-loop Intent Clustering for Customer Service Dialogues
Mengze Hong, Wailing Ng, Chen Jason Zhang et al.
You are an LLM teaching a smaller model everything you know: Multi-task pretraining of language models with LLM-designed study plans
Wiktor Kamzela, Mateusz Lango, Ondrej Dusek
LeMAJ (Legal LLM-as-a-Judge): Bridging Legal Reasoning and LLM Evaluation
Joseph Enguehard, Morgane Van Ermengem, Kate Atkinson et al.
LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation
Suhyeon Lee, Won Jun Kim, Jinho Chang et al.
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
Junyeong Park, Seogyeong Jeong, Seyoung Song et al.
Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
Haoxuan Ji, Zheng Lin, Zhenxing Niu et al.
LLM-Pruner: On the Structural Pruning of Large Language Models
Xinyin Ma, Gongfan Fang, Xinchao Wang
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng et al.
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Hang Yin, Xiuwei Xu, Zhenyu Wu et al.
Can Graph Learning Improve Planning in LLM-based Agents?
Xixi Wu, Yifei Shen, Caihua Shan et al.
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings
Duo Wang, Yuan Zuo, Fengzhi Li et al.
LLM-based Skill Diffusion for Zero-shot Policy Adaptation
Woo Kyung Kim, Youngseok Lee, Jooyoung Kim et al.
LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation
Qidong Liu, Xian Wu, Yejing Wang et al.
LLM-Check: Investigating Detection of Hallucinations in Large Language Models
Gaurang Sriramanan, Siddhant Bharti, Vinu Sankar Sadasivan et al.
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach
Mathilde Caron, Alireza Fathi, Cordelia Schmid et al.