Where I work, we need to quantize our models to run them fast enough, and we found that quantization-aware training (QAT) is the only approach that has a chance of retaining the accuracy we need. Post-training quantization loses too much accuracy.
However, QAT is incredibly cumbersome in TF 2, because it only applies to models defined through the Sequential or functional Keras APIs, whereas many interesting models use the object-oriented (subclassing) approach to defining a model.
Does anyone know if there are plans to make QAT easier to use in the future?
submitted by /u/wattnurt