Does the Apple Neural Engine support 8-bit integer inference? Quantizing the weights to 8 bits reduces model storage to about a quarter, but the inference speed did not change. Why?
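For reference, a minimal sketch of how the 8-bit weight quantization might have been done with the coremltools weight-quantization API (the model file names here are hypothetical, and this assumes an older `.mlmodel`-format neural network):

```python
import coremltools as ct
from coremltools.models.neural_network import quantization_utils

# Load the original full-precision Core ML model ("model.mlmodel" is a
# hypothetical path).
model = ct.models.MLModel("model.mlmodel")

# Quantize weights to 8 bits with linear quantization. This shrinks the
# on-disk weight size roughly 4x versus FP32. Note this compresses
# weights only: activations stay float, and the weights are typically
# dequantized back to float before compute, which may be why runtime
# speed does not improve.
quantized_model = quantization_utils.quantize_weights(
    model, nbits=8, quantization_mode="linear"
)

quantized_model.save("model_quantized.mlmodel")
```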