
PyTorch Transformer model Mobile Vision Transformer (MobileViT) for Image Classification

This document describes the evaluation of optimized checkpoints for the Mobile Vision Transformer (MobileViT) for image classification.

AIMET installation and setup

Please install and setup AIMET (Torch GPU variant) before proceeding further.

NOTE

  • All AIMET releases are available here: https://github.com/quic/aimet/releases
  • This model has been tested using AIMET version 1.26.0 (i.e. set release_tag="1.26.0" in the above instructions).
  • This model is compatible with the PyTorch GPU variant of AIMET (i.e. set AIMET_VARIANT="torch_gpu" in the above instructions).

Install AIMET-Model-Zoo

Clone the AIMET Model Zoo repo into your workspace:

```bash
git clone https://github.com/quic/aimet-model-zoo.git
export PYTHONPATH=$PYTHONPATH:<path to parent of aimet_model_zoo>
```

Additional Setup Dependencies

```bash
pip install accelerate==0.9.0
pip install transformers==4.21.0
pip install datasets==2.4.0
pip install tensorboard==2.13.0
```

Model checkpoint

  • Original full precision checkpoints without downstream training were downloaded through Hugging Face
  • Full precision model weight files with downstream training are automatically downloaded by the evaluation script
  • Quantization-optimized model weight files are automatically downloaded by the evaluation script

Dataset

  • This evaluation was designed for the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC2012). The dataset directory is expected to have three subdirectories: train, valid, and test (only the valid subdirectory is used, so it is fine if the other subdirectories are missing). Each of the train, valid, and test directories is in turn expected to have 1000 subdirectories, one per class in the ILSVRC2012 dataset, each containing the images for that class, as in the example below:
```
train/
├── n01440764
│   ├── n01440764_10026.JPEG
│   ├── n01440764_10027.JPEG
│   ├── ......
├── ......
val/
├── n01440764
│   ├── ILSVRC2012_val_00000293.JPEG
│   ├── ILSVRC2012_val_00002138.JPEG
│   ├── ......
├── ......
```
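Before running evaluation, a layout like the one above can be sanity-checked with a few lines of Python. This helper is not part of the repo; its name and arguments are illustrative:

```python
from pathlib import Path

def check_imagenet_layout(root, split="val", expected_classes=1000):
    """Check that root/split contains one subdirectory per class.

    Hypothetical convenience helper: it only verifies the directory
    structure, not the image contents.
    """
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        return False, f"missing directory: {split_dir}"
    class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
    if len(class_dirs) != expected_classes:
        return False, (f"expected {expected_classes} class subdirectories, "
                       f"found {len(class_dirs)}")
    return True, "ok"
```

For ILSVRC2012 the default `expected_classes=1000` matches the class count described above.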

Usage

  • To run evaluation with QuantSim in AIMET, use the following:

```bash
python mobilevit_quanteval.py \
    --model_config <model_configuration> \
    --per_device_eval_batch_size <batch_size> \
    --train_dir <imagenet_train_path> \
    --validation_dir <imagenet_val_path>

# Example
python mobilevit_quanteval.py --model_config mobilevit_w8a8 --per_device_eval_batch_size 4 --train_dir <imagenet_train_path> --validation_dir <imagenet_val_path>
```

  • The only supported model_config keyword is "mobilevit_w8a8"

Quantization Configuration

The following configuration has been used for the above model for INT8 quantization:

  • Weight quantization: 8 bits, symmetric quantization
  • Bias parameters are not quantized
  • Activation quantization: 8 bits, asymmetric quantization
  • Model inputs are quantized
  • TF range learning was used as the quantization scheme
  • Clamped initialization was adopted
  • Quantization-aware training (QAT) was used to obtain optimized quantized weights
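The difference between the symmetric scheme used for weights and the asymmetric scheme used for activations can be sketched in plain NumPy. This is an illustrative quantize/dequantize round trip under the standard affine-quantization formulas, not AIMET's actual implementation:

```python
import numpy as np

def fake_quant_symmetric(x, bits=8):
    """Symmetric quantization: zero-point fixed at 0, grid centered on zero.

    Typical for weights, which are roughly zero-centered.
    """
    qmax = 2 ** (bits - 1) - 1                     # 127 for 8 bits
    scale = np.abs(x).max() / qmax                 # one scale per tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                               # dequantized approximation

def fake_quant_asymmetric(x, bits=8):
    """Asymmetric quantization: a zero-point shifts the grid to cover [min, max].

    Typical for activations, whose range is often one-sided (e.g. post-ReLU).
    """
    qmax = 2 ** bits - 1                           # 255 for 8 bits
    scale = (x.max() - x.min()) / qmax
    zero_point = np.round(-x.min() / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale                # dequantized approximation
```

With 8 bits, both round trips reproduce the input to within roughly one quantization step (range / 255), which is why W8A8 typically costs little accuracy after QAT.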