Releases: quic/aimet
version 1.35.1
Release of the AI Model Efficiency Toolkit package
Documentation
- User guide: https://quic.github.io/aimet-pages/releases/1.35.1/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.35.1/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html
version 1.35.0
What's New
- PyTorch
- Added support for W16A16 in AutoQuant.
- Deprecation Notice
- Support for PyTorch 1.13 is deprecated. It will be removed in the next release.
- ONNX
- Optimized memory and speed utilization (for CPU).
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.35.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.35.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.35.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.35.0/api_docs/index.html
Packages
- aimet_torch-1.35.0.cu121-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 12.x
- aimet_torch-1.35.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.35.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx-1.35.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx-1.35.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tensorflow-1.35.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tensorflow-1.35.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
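The file names in the package list above follow the usual dash-separated wheel layout (name, version, Python tag, ABI tag, platform tag), with the CUDA or CPU variant appended to the version field. A minimal sketch of splitting such a name to check which variant a file is, assuming the naming used from 1.32.0 onward; `WheelInfo` and `parse_wheel_name` are hypothetical helper names, not part of AIMET:

```python
from typing import NamedTuple


class WheelInfo(NamedTuple):
    distribution: str
    version: str
    variant: str        # e.g. "cu121" or "cpu", as it appears in these release assets
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelInfo:
    """Split a wheel file name laid out as name-version.variant-python-abi-platform.whl."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel name layout: {filename}")
    distribution, full_version, python_tag, abi_tag, platform_tag = parts
    # The last dot-separated piece of the version field carries the variant.
    version, _, variant = full_version.rpartition(".")
    return WheelInfo(distribution, version, variant, python_tag, abi_tag, platform_tag)


info = parse_wheel_name("aimet_torch-1.35.0.cu121-cp310-cp310-manylinux_2_34_x86_64.whl")
print(info.distribution, info.version, info.variant, info.python_tag)
```

Older releases (1.31.x and earlier) name their files differently, so this sketch would need adjusting for them.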
version 1.34.0
What's New
- PyTorch
- Added support for WSL2
- CUDA version upgraded for PyTorch 2.1
- Extended QuantAnalyzer functionality for LLM range analysis
- Keras
- Added support for certain TFOpLambda layers created by tf functional calls.
- ONNX
- Upgraded AIMET to support ONNX version 1.16.1 and ONNXRUNTIME version 1.18.1.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.34.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.34.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.34.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.34.0/api_docs/index.html
Packages
- aimet_torch-1.34.0.cu121-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 12.x
- aimet_torch-1.34.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx-1.34.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tensorflow-1.34.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tensorflow-1.34.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
version 1.33.5
What's New
- PyTorch
- Various bug fixes and quality-of-life updates for LoRA
- Updated minimum scale value and registered additional custom quantized ops with QuantSim 2.0
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.33.5
- Installation guide: https://quic.github.io/aimet-pages/releases/1.33.5/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.33.5/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.33.5/api_docs/index.html
Packages
- aimet_torch-1.33.5.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.33.5.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx-1.33.5.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tensorflow-1.33.5.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tensorflow-1.33.5.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
version 1.33.0
What's New
- PyTorch
- Enhanced the export pipeline to optimize GPU memory usage with LLMs.
- [Experimental] Added support for handling LoRA (via the PEFT API) in AIMET and enabled export of the required artifacts for QNN.
- Added examples of a training pipeline for distributed KD-QAT.
- [Experimental] Added support for blockwise quantization (BQ) to support the w4fp16 format, and for low-power blockwise quantization (LPBQ) to support the w4a8 and w4a16 formats. This feature requires QuantSim V2.
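To illustrate the general idea behind the blockwise quantization mentioned above, here is a minimal sketch of symmetric 4-bit quantization with one scale per block of weights. It is a generic illustration under simplifying assumptions (a flat list of weights, a hypothetical `quantize_blockwise` helper) and does not reflect AIMET's implementation or the LPBQ scheme:

```python
def quantize_blockwise(weights, block_size, bitwidth=4):
    """Symmetric quantization with a separate scale for each block of weights."""
    qmax = 2 ** (bitwidth - 1) - 1      # 7 for signed 4-bit
    qmin = -(2 ** (bitwidth - 1))       # -8 for signed 4-bit
    quantized, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Map the largest magnitude in the block to qmax; avoid a zero scale.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        quantized.extend(min(qmax, max(qmin, round(w / scale))) for w in block)
    return quantized, scales


weights = [0.875, -0.5, 0.125, 0.0, 14.0, -8.0, 2.0, 6.0]
quantized, scales = quantize_blockwise(weights, block_size=4)
print(quantized)  # [7, -4, 1, 0, 7, -4, 1, 3]
print(scales)     # [0.125, 2.0]
```

With a single scale of 2.0 shared by all eight weights, the first four values would all round to 0; giving each block its own scale preserves them.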
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.33.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.33.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.33.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.33.0/api_docs/index.html
Packages
- aimet_torch-1.33.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.33.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx-1.33.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tensorflow-1.33.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tensorflow-1.33.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
version 1.32.0
What's New
- PyTorch
- Added multi-GPU support for AdaRound.
- Upgraded AIMET to support PyTorch version 2.1 as a new variant. AIMET with PyTorch version 1.13 remains the default.
- Keras
- For models with SeparableConv2D layers, use model_preparer first before applying any quantization API.
- Common
- Upgraded AIMET to support Ubuntu 22 and Python 3.10 for all AIMET variants.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.32.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.32.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.32.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.32.0/api_docs/index.html
Packages
- aimet_torch_gpu_pt21-1.32.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch_gpu-1.32.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.10 and CUDA 11.x
- aimet_torch_cpu_pt21-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_torch_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- PyTorch 1.13 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx_gpu-1.32.0.cu117-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.14 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tf_gpu-1.32.0.cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tf_cpu-1.32.0.cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
version 1.31.2
What's New
- TODO
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.31.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.31.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.31.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.31.0/api_docs/index.html
Packages
- aimet_torch-gpu_1.31.2_cu117-cp310-cp310-manylinux_2_32_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.8 and CUDA 11.x - Recommended for use with PyTorch models
- aimet_torch-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
- PyTorch 1.13 CPU package with Python 3.8 - If installing on a machine without CUDA
- aimet_tensorflow-gpu_1.31.2_cu118-cp310-cp310-manylinux_2_32_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.8 - Recommended for use with TensorFlow models
- aimet_tensorflow-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.8 - If installing on a machine without CUDA
- aimet_onnx-gpu_1.31.2_cu117-cp310-cp310-manylinux_2_32_x86_64.whl
- ONNX 1.11.0 GPU package with Python 3.8 - Recommended for use with ONNX models
- aimet_onnx-cpu_1.31.2_cpu-cp310-cp310-manylinux_2_32_x86_64.whl
- ONNX 1.11.0 CPU package with Python 3.8 - If installing on a machine without CUDA
version 1.31.0
What's New
- ONNX
- Added support for custom ops in QuantSim, CLE, AdaRound and AMP.
- Added support for Quant Analyzer.
- Keras
- Added support for unrolled quantized LSTM with only QuantSim in PTQ mode.
- Fixed ReLU encoding min going past 0 during QAT.
- Fixed input quantizers for TFOpLambda layers (kwargs).
- Fixed the logic for placing input quantizers.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.31.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.31.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.31.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.31.0/api_docs/index.html
Packages
- aimet_torch-torch_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
- PyTorch 1.13 GPU package with Python 3.8 and CUDA 11.x - Recommended for use with PyTorch models
- aimet_torch-torch_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
- PyTorch 1.13 CPU package with Python 3.8 - If installing on a machine without CUDA
- aimet_torch-torch_cpu_pt19_1.31.0-cp38-cp38-linux_x86_64.whl
- PyTorch 1.9 CPU package with Python 3.8 - If installing on a machine without CUDA
- aimet_tensorflow-tf_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.8 - Recommended for use with TensorFlow models
- aimet_tensorflow-tf_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.8 - If installing on a machine without CUDA
- aimet_onnx-onnx_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
- ONNX 1.11.0 GPU package with Python 3.8 - Recommended for use with ONNX models
- aimet_onnx-onnx_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
- ONNX 1.11.0 CPU package with Python 3.8 - If installing on a machine without CUDA
version 1.30.0
What's New
ONNX
- Upgraded AIMET to support ONNX version 1.14 and ONNXRUNTIME version 1.15.
- Added support for AutoQuant.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.30.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.30.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.30.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.30.0/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html
version 1.29.0
What's New
Keras
- Fixed issues with TFOpLambda layers in the QcQuantizeWrapper call.
PyTorch
- [Experimental] Support for embedding AIMET encodings within the graph using ONNX quantize/dequantize operators. Currently this option is only supported when using 8-bit per-tensor quantization.
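For readers unfamiliar with what a quantize/dequantize pair computes, the sketch below shows the arithmetic for 8-bit per-tensor quantization, assuming the operators in question follow the standard ONNX QuantizeLinear and DequantizeLinear definitions. The scale and zero point are hypothetical values chosen for illustration, not encodings produced by AIMET:

```python
def quantize_linear(values, scale, zero_point, qmin=0, qmax=255):
    """Per-tensor quantize: one scale and one zero point for every element."""
    # Python's round() rounds halves to even, as ONNX QuantizeLinear does.
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


def dequantize_linear(quantized, scale, zero_point):
    """Map integer codes back to floating point."""
    return [(q - zero_point) * scale for q in quantized]


values = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = 1.0 / 128, 128   # hypothetical encoding covering [-1.0, 0.9921875]

quantized = quantize_linear(values, scale, zero_point)
recovered = dequantize_linear(quantized, scale, zero_point)
print(quantized)  # [0, 96, 128, 192, 255]
print(recovered)  # [-1.0, -0.25, 0.0, 0.5, 0.9921875]
```

The last input, 1.0, falls just outside the representable range and saturates at code 255, which is why it comes back as 0.9921875.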
ONNX
- Added support for AdaRound.
TensorFlow
- No significant updates
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.29.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.29.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.29.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.29.0/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html