conftrace_

Sebastian Jaszczur

5 papers · 2021–2025 · 3 top CS/AI conferences

Achievements


🌍 Conference Polyglot (3) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 🐝 Cross-Pollinator (15)

Conferences

ICML (2) · NIPS (2) · AAAI (1)

Top co-authors

Jakub Krajewski (3) · Marek Cygan (3) · Michał Krutul (3) · Jan Ludziejewski (3) · Maciej Pióro (3) · Piotr Sankowski (2) · Kamil Ciebiera (2) · Kamil Adamczewski (2) · Tomasz Odrzygóźdź (2) · Piotr Miłoś (2)

Keywords

transformer architecture (1) · autoregressive generation (1) · language modeling (1) · efficient inference (1) · mixture of expert (1) · context window (1) · sparse attention (1) · long context (1) · context utilization (1) · parameter scaling (1) · large language model (1) · sparse layer (1) · continuous moe (1) · cross-example aggregation (1)

Papers

Structured Packing in LLM Training Improves Long Context Utilization · AAAI 2025
Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient · ICML 2025
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation · NIPS 2024
Scaling Laws for Fine-Grained Mixture of Experts · ICML 2024
Sparse is Enough in Scaling Transformers · NIPS 2021