Songyang Gao
19 papers · 2022–2025 · 5 top CS/AI conferences
Achievements
Cross-Pollinator (10)
Conference Polyglot (5)
Interdisciplinary Bridge
Keyword Pioneer
Renaissance Researcher (7)
Taxonomy Completionist (49)
Mega-Team (20)
Dynamic Duo (17)
The Questioner
Prolific Year (6)
Century Club (19)
Keyword Collector (107)
Conferences
ACL (9)
EMNLP (6)
COLING (2)
AAAI (1)
ICML (1)
Research topics
Keywords
large language model (10)
reinforcement learning (3)
tool learning (3)
reinforcement learning from human feedback (2)
text classification (2)
benchmark evaluation (2)
ai safety (2)
reward model (2)
distribution shift (2)
agent system (2)
out-of-distribution generalization (2)
adversarial training (1)
preference optimization (1)
language model alignment (1)
text generation (1)
model robustness (1)
model evaluation (1)
computational efficiency (1)
prompt engineering (1)
domain generalization (1)
Papers
Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning
AAAI 2025
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law
ACL 2025
AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments
ACL 2025
Are Your LLMs Capable of Stable Reasoning?
ACL 2025
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
COLING 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
EMNLP 2025
Navigating the OverKill in Large Language Models
ACL 2024
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
EMNLP 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
EMNLP 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
ICML 2024
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
ACL 2024
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
ACL 2024
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
ACL 2023
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms
EMNLP 2023
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
EMNLP 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
ACL 2023
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection
ACL 2023
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding
EMNLP 2022
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective
COLING 2022