Exploring Attention Attractors in Large Language Models
Abstract
This paper explores attention attractors, tokens that draw disproportionately high attention, in large language models. We analyze them from three perspectives: (1) Functionality: We demonstrate their role in aggregating information from the preceding context to facilitate future predictions. (2) Distribution: Through layer-wise and token-wise analysis, we show that attention attractors are widely distributed across layers but predominantly originate from low-semantic words such as "_the". (3) Mechanism: We demonstrate a correlation between the attention weights allocated to tokens and the values of specific activation dimensions. We hope these findings provide new insights into the attention mechanisms of large language models and inspire further exploration.