Papers
2,781 papers found
CodeMixBench: Evaluating Code-Mixing Capabilities of LLMs Across 18 Languages
Yilun Yang, Yekun Chai
Unveiling Internal Reasoning Modes in LLMs: A Deep Dive into Latent Reasoning vs. Factual Shortcuts with Attribute Rate Ratio
Yiran Yang, Haifeng Sun, Jingyu Wang et al.
LLMs Behind the Scenes: Enabling Narrative Scene Illustration
Melissa Roemmele, John Joon Young Chung, Taewook Kim et al.
FilBench: Can LLMs Understand and Generate Filipino?
Lester James Validad Miranda, Elyanah Aco, Conner G. Manuel et al.
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
Dayu Yang, Tianyang Liu, Daoan Zhang et al.
Read to Hear: A Zero-Shot Pronunciation Assessment Using Textual Descriptions and LLMs
Yu-Wen Chen, Melody Ma, Julia Hirschberg
Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning
Jiayuan Zhu, Jiazhen Pan, Yuyuan Liu et al.
Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem
Yubo Wang, Ping Nie, Kai Zou et al.
LLMs as World Models: Data-Driven and Human-Centered Pre-Event Simulation for Disaster Impact Assessment
Lingyao Li, Dawei Li, Zhenhui Ou et al.
Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?
Hua Shen, Nicholas Clark, Tanu Mitra
TokenSkip: Controllable Chain-of-Thought Compression in LLMs
Heming Xia, Chak Tou Leong, Wenjie Wang et al.
Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs
Mahammed Kamruzzaman, Gene Louis Kim
RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions
Wanlong Liu, Junying Chen, Ke Ji et al.
Training LLMs to be Better Text Embedders through Bidirectional Reconstruction
Chang Su, Dengliang Shi, Siyuan Huang et al.
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Ziyue Liu, Ruijie Zhang, Zhengyang Wang et al.
Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models
Yilin Wang, Heng Wang, Yuyang Bai et al.
Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs
Hexiang Tan, Fei Sun, Sha Liu et al.
Co-Evolving LLMs and Embedding Models via Density-Guided Preference Optimization for Text Clustering
Zetong Li, Qinliang Su, Minhua Huang et al.
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs
Yidan Zhang, Yu Wan, Boyi Deng et al.
InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles
Zizhen Li, Chuanhao Li, Yibin Wang et al.
SEPS: A Separability Measure for Robust Unlearning in LLMs
Wonje Jeung, Sangyeon Yoon, Albert No
AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs
Xiaopeng Ke, Hexuan Deng, Xuebo Liu et al.
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu, Zhigang Zuo, Ziji Sheng et al.
QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation
Mengze Hong, Wailing Ng, Chen Jason Zhang et al.