RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services

Fei Zhao; Chonggang Lu; Haofu Qian; Fangcheng Shi; Zijie Meng; Jianzhao Huang; Zheyong Xie; Shaosheng Cao

ACL 2026

Abstract

As a primary medium for human interaction and information exchange, social networking services (SNS) present distinct challenges for large language models (LLMs): rapidly evolving norms and slang, and culturally diverse content that causes knowledge distribution shift. While supervised fine-tuning (SFT) can improve in-domain performance, it often induces a "seesaw" trade-off with out-of-domain robustness, especially for smaller models. To address these challenges, we present RedOne 2.0, an SNS-oriented LLM developed with a progressive, RL-prioritized post-training paradigm for fast and stable adaptation. Our pipeline has three stages: (1) Exploratory Learning on curated SNS corpora to establish initial alignment and surface systematic weaknesses; (2) Targeted Fine-Tuning that applies SFT only to diagnosed gaps while mixing in a small amount of general data to reduce forgetting; and (3) Refinement Learning that re-applies RL with SNS-centric signals to consolidate gains and balance trade-offs across tasks. Across various tasks in three categories, our 4B model improves by 2.41 on average over the prior 7B RedOne baseline. It also yields an 8.74 average gain over its Qwen3-4B base while using less than half the data required by the SFT-centric method, demonstrating superior data efficiency and stability at compact scales. Overall, RedOne 2.0 provides a competitive, cost-effective baseline for SNS-specific LLMs, improving capability without sacrificing robustness.
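The data selection behind stage (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the "no gain after stage 1" diagnosis rule, the `general_ratio` value, and the example record layout (a dict with a `task` key) are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from typing import Dict, List


def diagnose_weak_tasks(
    base_scores: Dict[str, float],
    stage1_scores: Dict[str, float],
    min_gain: float = 0.0,
) -> List[str]:
    """Return tasks whose score gain after stage 1 is at most min_gain.

    Hypothetical diagnosis rule; the paper may use a different criterion.
    """
    return sorted(
        task
        for task in base_scores
        if stage1_scores[task] - base_scores[task] <= min_gain
    )


def build_targeted_sft_mix(
    sns_examples: List[dict],
    general_examples: List[dict],
    weak_tasks: List[str],
    general_ratio: float = 0.1,  # assumed value, not taken from the paper
    seed: int = 0,
) -> List[dict]:
    """Keep only SNS examples from weak tasks, then add a small general sample."""
    rng = random.Random(seed)
    weak = set(weak_tasks)
    targeted = [ex for ex in sns_examples if ex["task"] in weak]
    # Size the general-data slice relative to the targeted set, capped by availability.
    n_general = min(len(general_examples), round(general_ratio * len(targeted)))
    mix = targeted + rng.sample(general_examples, n_general)
    rng.shuffle(mix)
    return mix
```

Under these assumptions, the resulting mix would feed the SFT step of stage (2), with the RL steps of stages (1) and (3) run before and after it.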

Topics

Artificial Intelligence > Core AI > Large Language Models; Artificial Intelligence > Core AI > Reinforcement Learning; Artificial Intelligence > Learning Paradigms > Domain Adaptation

Keywords

reinforcement learning; domain adaptation; supervised fine-tuning; large language model; social networking service; knowledge distribution shift
