Papers
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Dongjun Kim, Gyuho Shim, Yongchan Chun et al.
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Xu Huang, Wenhao Zhu, Hanxu Hu et al.
Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts
Heehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang
BeSimulator: A Large Language Model Powered Text-based Behavior Simulator
Jianan Wang, Bin Li, Jingtao Qi et al.
Beyond A Single AI Cluster: A Survey of Decentralized LLM Training
Haotian Dong, Jingyan Jiang, Rongwei Lu et al.
Beyond Averages: Learning with Annotator Disagreement in STS
Alejandro Benito-Santos, Adrian Ghajari
Beyond Binary Preferences: Semi-Online Label-Free GRACE-KTO with Group-Wise Adaptive Calibration for High-Quality Long-Text Generation
Jingyang Deng, Ran Chen, Jo-Ku Cheng et al.
Beyond Checkmate: Exploring the Creative Choke Points for AI Generated Texts
Nafis Irtiza Tripto, Saranya Venkatraman, Mahjabin Nahar et al.
Beyond Coarse Labels: Fine-Grained Problem Augmentation and Multi-Dimensional Feedback for Emotional Support Conversation
Yuanchen Shi, Jiawang Hao, Fang Kong
Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models
Muhammed Saeed, Shaina Raza, Ashmal Vayani et al.
Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance
Reza Esfandiarpoor, George Zerveas, Ruochen Zhang et al.
Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning
Qianxi He, Qingyu Ren, Shanzhe Lei et al.
Beyond Demographics: Enhancing Cultural Value Survey Simulation with Multi-Stage Personality-Driven Cognitive Reasoning
Haijiang Liu, Qiyuan Li, Chao Gao et al.
Beyond Demonstrations: Dynamic Vector Construction from Latent Representations
Wang Cai, Hsiu-Yuan Huang, Zhixiang Wang et al.
Beyond Dynamic Quantization: An Efficient Static Hierarchical Mix-precision Framework for Near-Lossless LLM Compression
Yi Zhang, Kai Zhang, Zheyang Li et al.
Beyond Fixed-Length Calibration for Post-Training Compression of LLMs
Jaehoon Oh, Dokwan Oh
Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification
Aofan Liu, Song Shiyuan, Haoxuan Li et al.
Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning
Kepu Zhang, Haoyue Yang, Xu Tang et al.
Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language
Hamidreza Saffari, Mohammadamin Shafiei, Hezhao Zhang et al.
Beyond Human Judgment: A Bayesian Evaluation of LLMs’ Moral Values Understanding
Maciej Skorski, Alina Landowska
Beyond Human Labels: A Multi-Linguistic Auto-Generated Benchmark for Evaluating Large Language Models on Resume Parsing
Zijian Ling, Han Zhang, Jiahao Cui et al.
Beyond Inherent Cognition Biases in LLM-Based Event Forecasting: A Multi-Cognition Agentic Framework
Zhen Wang, Xi Zhou, Yating Yang et al.
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders
Dong Shu, Xuansheng Wu, Haiyan Zhao et al.
Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
Narmeen Fatimah Oozeer, Luke Marks, Fazl Barez et al.