Shi Han

38 papers · 2019–2026 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (6)

🌈 Renaissance Researcher (9) 🤝 Dynamic Duo (34) 🧬 Topic Evolution 💎 Century Club (35) 📈 Trend Setter ⚡ Prolific Year (10) 🚀 Conference Pioneer 🗃️ Keyword Collector (184) 🔥 Unstoppable (7)

Conferences

EMNLP (14) ACL (9) AAAI (8) ICLR (2) IJCAI (2) AISTATS (1) COLING (1) NIPS (1)

Top co-authors

Dongmei Zhang (37) Mengyu Zhou (18) Haoyu Dong (17) Lun Du (11) Xinyi He (7) Qiang Fu (5) Yeye He (5) Ran Jia (5) Zejian Yuan (4) Rui Ding (4)

Keywords

large language model (11) semantic parsing (3) question answering (3) code generation (3) numerical reasoning (3) data analysis (3) table detection (3) table question answering (3) formula prediction (2) self-supervised learning (2) tabular data analysis (2) tabular data (2) table reasoning (2) table processing (2) convolutional neural network (2) language model (2) information retrieval (2) representation learning (2) data augmentation (2) chain-of-thought reasoning (2)

Papers

SheetBrain: A Neuro-Symbolic Agent for Accurate Reasoning over Complex and Large Spreadsheets · AAAI 2026
Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning · ACL 2026
Jupiter: Enhancing LLM Data Analysis Capabilities via Notebook and Inference-Time Value-Guided Search · AAAI 2026
SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection · EMNLP 2025
Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning · AISTATS 2025
TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers’ Guidance · EMNLP 2025
Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Fine-tuning · EMNLP 2025
TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models · ACL 2025
TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models · ACL 2025
CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing · EMNLP 2024
PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning · EMNLP 2024
TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning · EMNLP 2024
Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities · ACL 2024
KET-QA: A Dataset for Knowledge Enhanced Table Question Answering · COLING 2024
Text-to-Image Generation for Abstract Concepts · AAAI 2024
Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries · AAAI 2024
Encoding Spreadsheets for Large Language Models · EMNLP 2024
SheetPT: Spreadsheet Pre-training Based on Hierarchical Attention Network · AAAI 2023
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing · AAAI 2023
CASR: Generating Complex Sequences with Autoregressive Self-Boost Refinement · ICLR 2023
InsightPilot: An LLM-Empowered Automated Data Exploration System · EMNLP 2023
AnaMeta: A Table Understanding Dataset of Field Metadata Knowledge Shared by Multi-dimensional Data Analysis Tasks · ACL 2023
HermEs: Interactive Spreadsheet Formula Prediction via Hierarchical Formulet Expansion · ACL 2023
Causal-Based Supervision of Attention in Graph Neural Network: A Better and Simpler Choice towards Powerful Attention · IJCAI 2023
Out-of-Distribution Detection based on In-Distribution Data Patterns Memorization with Modern Hopfield Energy · ICLR 2023
RACE: Retrieval-augmented Commit Message Generation · EMNLP 2022
HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation · ACL 2022
FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining · ACL 2022
Accelerating Code Search with Deep Hashing and Code Classification · ACL 2022
TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data · EMNLP 2022
Neuron with Steady Response Leads to Better Generalization · NIPS 2022
PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation · EMNLP 2022
Towards Robust Numerical Question Answering: Diagnosing Numerical Capabilities of NLP Systems · EMNLP 2022
FormLM: Recommending Creation Ideas for Online Forms by Modelling Semantic and Structural Information · EMNLP 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks · IJCAI 2022
CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees · EMNLP 2021
Reliable and Efficient Anytime Skeleton Learning · AAAI 2020
TableSense: Spreadsheet Table Detection with Convolutional Neural Networks · AAAI 2019