This document describes the evaluation of optimized checkpoints for GPUNet-0.
Please install and set up AIMET before proceeding further. This model was tested with the torch_gpu variant of AIMET 1.25.0.
Add the AIMET Model Zoo repository to your PYTHONPATH:

```bash
export PYTHONPATH=$PYTHONPATH:<aimet_model_zoo_path>
```
Install the required packages:

```bash
pip install -r <path to aimet-model-zoo>/aimet_zoo_torch/gpunet0/requirements.txt
```
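As an optional sanity check (a minimal sketch, not part of the official setup), you can confirm that AIMET imports cleanly and that a CUDA device is visible for the torch_gpu variant:

```python
# Optional sanity check: verify that AIMET imports and that a CUDA device
# is visible for the torch_gpu variant of AIMET.
import torch
import aimet_torch  # raises ImportError here if AIMET is not installed correctly

print("CUDA available:", torch.cuda.is_available())
```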
ImageNet can be downloaded from here: https://www.image-net.org/
The folder structure and format of the ImageNet dataset should be as follows:

```
--ImageNet
  --val
    --n01440764
      --ILSVRC2012_val_00048969.JPEG
  --train
    --n13133613
      --n13133613_7875.JPEG
```
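This layout matches the standard torchvision ImageFolder convention, so the validation split can be loaded as in the sketch below. The path and preprocessing values are illustrative, not necessarily what gpunet0_quanteval.py uses internally:

```python
# Minimal sketch, assuming the standard torchvision ImageFolder layout shown above.
# The transform values are typical ImageNet preprocessing and are assumptions here.
from torchvision import transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
val_dataset = ImageFolder("<path to ImageNet>/val", transform=val_transform)
val_loader = DataLoader(val_dataset, batch_size=200, shuffle=False)
```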
Run the evaluation script as follows:

```bash
python gpunet0_quanteval.py \
  --dataset-path <path to the ImageNet dataset root> \
  --model-config <quantized model configuration to test; 'gpunet0_w8a8' is the default and only choice> \
  --batch-size <batch size for evaluation; default is 200> \
  --use-cuda <whether to run on CUDA or CPU; default is True>
```

Example:

```bash
python gpunet0_quanteval.py --dataset-path <ILSVRC2012_PyTorch_path> --model-config gpunet0_w8a8
```
- The optimized w8a8 checkpoint can be downloaded from here: gpunet0_w8a8_checkpoint.pth
- The optimized w8a8 encodings can be downloaded from here: gpunet0_w8a8_torch.encodings
- The Quantization Simulation (Quantsim) configuration file can be downloaded from here: default_config_per_channel.json (please see this page for more information on this file)
The following quantization configuration was used:

- Weight quantization: 8 bits, per-channel symmetric quantization
- Bias parameters are not quantized
- Activation quantization: 8 bits, asymmetric quantization
- Model inputs are quantized
- Percentile was used as the quantization scheme, with the percentile value set to 99.999
- Adaround and fold_all_batch_norms_to_scale have been applied
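For illustration, the sketch below shows how this configuration and the downloaded artifacts could be assembled with AIMET's QuantizationSimModel. The GPUNet-0 model-loading step and the input shape are assumptions; the actual flow is implemented in gpunet0_quanteval.py:

```python
# Minimal sketch, assuming AIMET 1.25.0 (torch_gpu). The GPUNet-0 loading step
# and the 224x224 input shape are assumptions, not taken from the script.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim

model = ...  # placeholder: load the FP32 GPUNet-0 definition and optimized checkpoint
model.eval()

sim = QuantizationSimModel(
    model,
    dummy_input=torch.rand(1, 3, 224, 224),            # assumed input shape
    quant_scheme=QuantScheme.post_training_percentile,  # percentile scheme, per the list above
    default_param_bw=8,                                 # w8
    default_output_bw=8,                                # a8
    config_file="default_config_per_channel.json",      # downloaded Quantsim config
)
# Reuse the downloaded optimized encodings instead of recomputing them via calibration.
load_encodings_to_sim(sim, "gpunet0_w8a8_torch.encodings")
# sim.model can now be evaluated like a regular PyTorch model.
```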
Below are the top-1 accuracy results of this GPUNet-0 implementation on ImageNet:

| Model Configuration | Top-1 Accuracy (%) |
|---|---|
| GPUNet0_FP32 | 78.86 |
| GPUNet0_FP32 + simple PTQ (w8a8) | 76.87 |
| GPUNet0_W8A8 | 78.42 |
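For reference, top-1 accuracy is the fraction of validation images whose highest-scoring prediction matches the label. A minimal sketch of the computation follows; the actual metric code in gpunet0_quanteval.py may differ in details:

```python
# Minimal sketch of top-1 accuracy over a validation loader (illustrative only).
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cuda"):
    model.eval().to(device)
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total
```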