Papers
BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes
Md Ayon Mia, Akm Moshiur Rahman Mazumder, Khadiza Sultana Sayma et al.
BannerAgency: Advertising Banner Design with Multimodal LLM Agents
Heng Wang, Yotaro Shimose, Shingo Takamatsu
BannerBench: Benchmarking Vision Language Models for Multi-Ad Selection with Human Preferences
Hiroto Otake, Peinan Zhang, Yusuke Sakai et al.
BAREC Demo: Resources and Tools for Sentence-level Arabic Readability Assessment
Kinda Altarbouch, Khalid N. Elmadani, Ossama Obeid et al.
BAREC Shared Task 2025 on Arabic Readability Assessment
Khalid N. Elmadani, Bashar Alhafni, Hanada Taha et al.
Batched Self-Consistency Improves LLM Relevance Assessment and Ranking
Anton Korikov, Pan Du, Scott Sanner et al.
Batch-wise Convergent Pre-training: Step-by-Step Learning Inspired by Child Language Development
Ko Yoshida, Daiki Shiono, Kai Sato et al.
BBScoreV2: Learning Time-Evolution and Latent Alignment from Stochastic Representation
Tianhao Zhang, Zhecheng Sheng, Zhexiao Lin et al.
BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei
BehaviorSFT: Behavioral Token Conditioning for Health Agents Across the Proactivity Spectrum
Yubin Kim, Zhiyuan Hu, Hyewon Jeong et al.
Benchmarking and Improving LLM Robustness for Personalized Generation
Chimaobi Okite, Naihao Deng, Kiran Bodipati et al.
Benchmarking and Mitigating MCQA Selection Bias of Large Vision-Language Models
Md. Atabuzzaman, Ali Asgarov, Chris Thomas
Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
Qiongqiong Wang, Hardik Bhupendra Sailor, Tianchi Liu et al.
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Blanca Calvo Figueras, Rodrigo Agerri
Benchmarking Debiasing Methods for LLM-based Parameter Estimates
Nicolas Audinet de Pieuchon, Adel Daoud, Connor Thomas Jerzak et al.
Benchmarking Deep Search over Heterogeneous Enterprise Data
Prafulla Kumar Choubey, Xiangyu Peng, Shilpa Bhagavath et al.
Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
Rubing Chen, Jiaxin Wu, Jian Wang et al.
Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
Shunfeng Zheng, Yudi Zhang, Meng Fang et al.
Benchmarking Large Language Models for Cryptanalysis and Side-Channel Vulnerabilities
Utsav Maskey, Chencheng Zhu, Usman Naseem
Benchmarking Large Language Models Under Data Contamination: A Survey from Static to Dynamic Evaluation
Simin Chen, Yiming Chen, Zexin Li et al.
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
Manveer Singh Tamber, Forrest Sheng Bao, Chenyu Xu et al.
Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
Andong Chen, Lianzhang Lou, Kehai Chen et al.
Benchmarking LLMs on Semantic Overlap Summarization
John Salvador, Naman Bansal, Mousumi Akter et al.
Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry
Shanshan Wang, Junchao Wu, Fengying Ye et al.
Benchmarking Uncertainty Metrics for LLM Target-Aware Search
Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin