Calculate the memory savings and performance gains from model quantization (1-bit, 4-bit, 8-bit). Inspired by Microsoft's BitNet b1.58.
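The core memory calculation is straightforward: weight storage scales linearly with bits per weight. A minimal sketch (the 7B parameter count, function name, and FP16 baseline are illustrative assumptions, not values from BitNet):

```python
# Illustrative sketch: estimate weight-memory footprint at different
# quantization bit-widths relative to an FP16 baseline.
# The parameter count and function name are hypothetical examples.

def weight_memory_gb(num_params: int, bits_per_weight: float) -> float:
    """Memory needed to store the weights alone, in gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

params = 7_000_000_000                   # e.g. a 7B-parameter model
baseline = weight_memory_gb(params, 16)  # FP16 reference: 14.00 GB

for bits in (8, 4, 1.58):
    size = weight_memory_gb(params, bits)
    print(f"{bits:>5}-bit: {size:6.2f} GB  ({baseline / size:.1f}x smaller)")
```

For a 7B model this works out to roughly 7 GB at 8-bit, 3.5 GB at 4-bit, and about 1.4 GB at 1.58-bit, versus 14 GB in FP16.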
Microsoft's BitNet b1.58 uses ternary weights (-1, 0, +1), i.e. log2(3) ≈ 1.58 bits per weight, enabling: