conftrace_

Guoli Yin

4 papers · 2025 · 3 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🌍 Conference Polyglot (3) · 🐝 Cross-Pollinator (5) · 👥 Mega-Team (24) · ❓ The Questioner

Conferences

NAACL (2) · ACL (1) · ICML (1)

Top co-authors

Ruoming Pang (4) · Shen Ma (2) · Yizhe Zhang (2) · Jiarui Lu (2) · Zirui Wang (2) · Feng Nan (2) · Haoping Bai (2) · Shuang Ma (2) · Bernhard Aumayer (1) · Bairu Hou (1)

Keywords

large language model (3) · tool use (2) · llm evaluation (1) · web search (1) · pairwise preference (1) · annotation quality (1) · agent system (1) · code execution (1) · reasoning task (1) · tool-augmented ai (1) · external validation (1) · agent capability (1) · conversational evaluation (1) · stateful execution (1) · agent capabilities (1) · benchmark evaluation (1) · external validation tool (1) · text generation (1)

Papers

Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge? · ACL 2025
Instruction-Following Pruning for Large Language Models · ICML 2025
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities · NAACL 2025
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains · NAACL 2025