conftrace_

Songyang Gao

19 papers · 2022–2025 · 5 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (10) 🌍 Conference Polyglot (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌈 Renaissance Researcher (7)

🗺️ Taxonomy Completionist (49) 👥 Mega-Team (20) 🤝 Dynamic Duo (17) ❓ The Questioner ⚡ Prolific Year (6) 💎 Century Club (19) 🗃️ Keyword Collector (107)

Conferences

ACL (9) EMNLP (6) COLING (2) AAAI (1) ICML (1)

Top co-authors

Qi Zhang (17) Xuanjing Huang (13) Tao Gui (10) Shihan Dou (8) Rui Zheng (7) Xiao Wang (5) Zhiheng Xi (5) Junjie Ye (5) Xiaoran Fan (4) Qiming Ge (4)

Research topics

Keywords

large language model (10) reinforcement learning (3) tool learning (3) reinforcement learning from human feedback (2) text classification (2) benchmark evaluation (2) ai safety (2) reward model (2) distribution shift (2) agent system (2) out-of-distribution generalization (2) adversarial training (1) preference optimization (1) language model alignment (1) text generation (1) model robustness (1) model evaluation (1) computational efficiency (1) prompt engineering (1) domain generalization (1)

Papers

Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning · AAAI 2025
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law · ACL 2025
AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments · ACL 2025
Are Your LLMs Capable of Stable Reasoning? · ACL 2025
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios · COLING 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward · EMNLP 2025
Navigating the OverKill in Large Language Models · ACL 2024
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning · EMNLP 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data · EMNLP 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback · ICML 2024
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin · ACL 2024
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages · ACL 2024
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization · ACL 2023
RealBehavior: A Framework for Faithfully Characterizing Foundation Models’ Human-like Behavior Mechanisms · EMNLP 2023
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement · EMNLP 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model · ACL 2023
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection · ACL 2023
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding · EMNLP 2022
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective · COLING 2022