
PyTorch Transformer model Roberta-base for Natural Language Classification

This document describes the evaluation of optimized checkpoints of the Roberta-base transformer model for natural language tasks.

AIMET installation and setup

Please install and set up AIMET (Torch GPU variant) before proceeding further.

NOTE

  • All AIMET releases are available here: https://github.com/quic/aimet/releases
  • This model has been tested using AIMET version 1.23.0 (i.e. set release_tag="1.23.0" in the above instructions).
  • This model is compatible with the PyTorch GPU variant of AIMET (i.e. set AIMET_VARIANT="torch_gpu" in the above instructions).

Additional Setup Dependencies

pip install datasets==2.4.0
pip install transformers==4.11.3

Model checkpoint

  • Original full-precision checkpoints without downstream training were downloaded from Hugging Face
  • Full-precision model weight files with downstream training are downloaded automatically by the evaluation script
  • Quantization-optimized model weight files are downloaded automatically by the evaluation script

Dataset

The models are evaluated on the GLUE benchmark tasks (CoLA, SST-2, MRPC, STS-B, QQP, MNLI, QNLI, RTE); the GLUE data is downloaded automatically through the Hugging Face datasets library.

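For illustration, the hypothetical snippet below loads one GLUE task (RTE) with the pinned datasets version; the evaluation script handles dataset download and preprocessing on its own.

    # Illustrative only: pull the RTE task of GLUE with the datasets library.
    from datasets import load_dataset

    rte = load_dataset("glue", "rte")   # DatasetDict with train/validation/test splits
    print(rte["validation"][0])         # fields: sentence1, sentence2, label, idx
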
Usage

To run evaluation with QuantSim in AIMET, use the following command:

python roberta_quanteval.py \
        --model_config <MODEL_CONFIGURATION> \
        --per_device_eval_batch_size 4 \
        --output_dir <OUT_DIR>
  • example

    python roberta_quanteval.py --model_config roberta_w8a8_rte --per_device_eval_batch_size 4 --output_dir ./evaluation_result 
    
  • supported values of model_config are "roberta_w8a8_rte", "roberta_w8a8_stsb", "roberta_w8a8_mrpc", "roberta_w8a8_cola", "roberta_w8a8_sst2", "roberta_w8a8_qnli", "roberta_w8a8_qqp", and "roberta_w8a8_mnli"

Quantization Configuration

The above models use INT8 (W8A8) quantization: 8-bit weights and 8-bit activations.
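For reference, the evaluation script wraps the fine-tuned model in an AIMET QuantizationSimModel before scoring it on GLUE. The sketch below shows a generic W8A8 flow with the public AIMET 1.23 PyTorch API; the model name, dummy-input shapes, and the calibrate callback are illustrative assumptions rather than the exact code in roberta_quanteval.py.

    # Minimal W8A8 QuantSim sketch (assumptions noted above; not the exact script code).
    import torch
    from transformers import RobertaForSequenceClassification, RobertaTokenizer
    from aimet_common.defs import QuantScheme
    from aimet_torch.quantsim import QuantizationSimModel

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
    model = RobertaForSequenceClassification.from_pretrained("roberta-base").eval()

    # Dummy input AIMET uses to trace the model graph (sequence length is an assumption).
    enc = tokenizer("a calibration sentence", return_tensors="pt",
                    padding="max_length", max_length=128)
    dummy_input = (enc["input_ids"], enc["attention_mask"])

    sim = QuantizationSimModel(model,
                               dummy_input=dummy_input,
                               quant_scheme=QuantScheme.post_training_tf_enhanced,
                               default_param_bw=8,   # INT8 weights
                               default_output_bw=8)  # INT8 activations

    def calibrate(quant_model, _):
        # Forward a few batches so AIMET can compute quantization encodings.
        with torch.no_grad():
            quant_model(*dummy_input)

    sim.compute_encodings(forward_pass_callback=calibrate,
                          forward_pass_callback_args=None)

    # sim.model is then evaluated on the GLUE validation split like the FP32 model.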

Results

Below are the results of the PyTorch Roberta transformer model on the GLUE benchmark:

| Configuration | CoLA (corr) | SST-2 (acc) | MRPC (f1) | STS-B (corr) | QQP (acc) | MNLI (acc) | QNLI (acc) | RTE (acc) | GLUE |
|---------------|-------------|-------------|-----------|--------------|-----------|------------|------------|-----------|------|
| FP32          | 60.36       | 94.72       | 91.84     | 90.54        | 91.24     | 87.29      | 92.33      | 72.56     | 85.11 |
| W8A8          | 57.35       | 92.55       | 92.69     | 90.15        | 90.09     | 86.88      | 91.47      | 72.92     | 84.26 |