This document describes how to evaluate the optimized checkpoints of the Vision Transformer (ViT) for image classification.
Please install and set up AIMET (Torch GPU variant) before proceeding further.
NOTE
- All AIMET releases are available here: https://github.com/quic/aimet/releases
- This model has been tested using AIMET version 1.24.0 (i.e. set `release_tag="1.24.0"` in the above instructions).
- This model is compatible with the PyTorch GPU variant of AIMET (i.e. set `AIMET_VARIANT="torch_gpu"` in the above instructions).
Install the following additional dependencies:

```bash
pip install accelerate==0.9.0
pip install transformers==4.21.0
pip install datasets==2.4.0
```
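As a quick sanity check that the pinned libraries work together, a pre-trained ViT checkpoint can be loaded from Hugging Face. This is a minimal sketch; the `google/vit-base-patch16-224` checkpoint name is an illustrative assumption, not necessarily the checkpoint the evaluation script downloads.

```python
# Sanity-check sketch for the pinned dependencies.
# Assumption: "google/vit-base-patch16-224" is an illustrative checkpoint,
# not necessarily the one used by the evaluation script.
from transformers import AutoFeatureExtractor, ViTForImageClassification

checkpoint = "google/vit-base-patch16-224"
feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
model = ViTForImageClassification.from_pretrained(checkpoint)
model.eval()
print(model.config.num_labels)  # 1000 ImageNet classes for this checkpoint
```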
- The original full-precision checkpoints (without downstream training) were downloaded from Hugging Face.
- Full-precision model weight files with downstream training are downloaded automatically by the evaluation script.
- Quantization-optimized model weight files are downloaded automatically by the evaluation script.
- This evaluation was designed for the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC2012). The dataset directory is expected to contain three subdirectories: train, valid, and test. Only the valid subdirectory is used, so it is fine if the other subdirectories are missing. Each of the train, valid, and test directories is in turn expected to contain 1000 subdirectories, one for each of the 1000 classes in the ILSVRC2012 dataset, as in the example below:
```
train/
├── n01440764
│   ├── n01440764_10026.JPEG
│   ├── n01440764_10027.JPEG
│   ├── ......
├── ......
val/
├── n01440764
│   ├── ILSVRC2012_val_00000293.JPEG
│   ├── ILSVRC2012_val_00002138.JPEG
│   ├── ......
├── ......
```
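Because this layout matches torchvision's `ImageFolder` convention, the directory structure can be sanity-checked with a short script. This is only an illustrative sketch with placeholder paths; the evaluation script performs its own data loading via `--train_dir`/`--validation_dir`.

```python
# Sketch: verify the expected ImageNet layout with torchvision's ImageFolder.
# The evaluation script loads data itself; this only checks the structure.
import torch
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),  # ViT-Base expects 224x224 inputs
    transforms.ToTensor(),
])

# Each class subdirectory (n01440764, ...) is mapped to one integer label.
val_dataset = datasets.ImageFolder("<imagenet_val_path>", transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=4, shuffle=False)

assert len(val_dataset.classes) == 1000, "expected 1000 ILSVRC2012 classes"
```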
- To run evaluation with QuantSim in AIMET, use the following command:

```bash
python vit_quanteval.py \
  --model_config <model_config> \
  --per_device_eval_batch_size <evaluation batch size> \
  --train_dir <imagenet training data path> \
  --validation_dir <imagenet validation data path>
```

For example:

```bash
python vit_quanteval.py --model_config vit_w8a8 --per_device_eval_batch_size 4 --train_dir <imagenet_train_path> --validation_dir <imagenet_val_path>
```
- The only supported `model_config` keyword is `vit_w8a8`.
The following configuration was used for the above model for INT8 quantization:
- Weight quantization: 8 bits, symmetric quantization
- Bias parameters are not quantized
- Activation quantization: 8 bits, asymmetric quantization
- Model inputs are quantized
- TF range learning was used as the quantization scheme
- Clamped initialization was adopted
- Quantization-aware training (QAT) was used to obtain optimized quantized weights
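For reference, the sketch below shows how these settings map onto AIMET's `QuantizationSimModel`. It is not the released implementation: the torchvision `vit_b_16` model, input shape, and calibration callback are placeholder assumptions, and the symmetric-weight/asymmetric-activation and input-quantization behavior is normally driven by the quantsim JSON config file rather than the arguments shown here.

```python
# Sketch: mapping the W8A8 settings above onto AIMET's QuantizationSimModel.
# Assumptions: torchvision's vit_b_16 stands in for the actual checkpoint,
# and the calibration callback uses a random batch purely as a placeholder.
import torch
from torchvision.models import vit_b_16
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = vit_b_16().eval()                 # stand-in full-precision ViT
dummy_input = torch.rand(1, 3, 224, 224)  # assumed ViT input shape

sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.training_range_learning_with_tf_init,  # TF range learning
    default_param_bw=8,   # 8-bit weight quantization
    default_output_bw=8,  # 8-bit activation quantization
)

def calibrate(sim_model, _):
    # Forward representative batches so quantizer encodings can be initialized;
    # a single random batch is used here only as a placeholder.
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(calibrate, forward_pass_callback_args=None)
# QAT would then fine-tune sim.model with the quantizers in the training loop.
```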