ICML 2025

BlockDialect: Block-wise Fine-grained Mixed Format Quantization for Energy-Efficient LLM Inference