EP (CPU TensorRT CUDA) accuracy test #22545

jingyanwangms · 2024-10-22T20:14:43Z

Description

This test compares the output of the following Hugging Face models:

"microsoft/resnet-50"
"microsoft/Phi-3.5-mini-instruct"
run in PyTorch on CPU against ONNX Runtime (ORT) with the CPU, TensorRT, and CUDA EPs, under different configurations (fp16, ORT graph optimization disabled, 1-layer transformer vs. full model, ResNet-18 vs. ResNet-50).
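
For readers who want to see what such an output comparison looks like, here is a minimal sketch. It is not the code in this PR: `compare_outputs` is a hypothetical helper, the arrays are stand-in data rather than real model outputs, and the only details taken from the PR are the default tolerances (`rtol=1e-2`, `atol=1e-2`) from the `run_comparison` signature.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-2, atol=1e-2):
    """Hypothetical helper: check a candidate output against a reference.

    Returns (passed, max_abs_diff). The pass criterion is NumPy's allclose
    rule: |candidate - reference| <= atol + rtol * |reference|, elementwise.
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(
            f"shape mismatch: {reference.shape} vs {candidate.shape}"
        )
    # Largest elementwise deviation, useful when reporting a failure.
    max_abs_diff = float(np.max(np.abs(candidate - reference))) if reference.size else 0.0
    passed = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return passed, max_abs_diff


# Stand-in data: the reference plays the role of the PyTorch CPU output,
# the other arrays play the role of an ORT execution provider's output.
reference = np.array([[1.0, 2.0, 3.0]], dtype=np.float32)
close_output = reference + 0.005  # small drift, within tolerance
far_output = reference + 0.5      # large drift, outside tolerance

close_passed, close_diff = compare_outputs(reference, close_output)
far_passed, far_diff = compare_outputs(reference, far_output)
```

Returning the maximum absolute difference alongside the pass/fail flag makes it easier to tell a marginal fp16 deviation from a genuinely wrong result.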

Future work:

Integrate with existing accuracy test such as Adrian's tool
Troubleshoot the error Phi-3.5 hits with more than 1 layer
Add more models

Motivation and Context

github-advanced-security

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+from transformers import AutoModel, AutoTokenizer
+from transformers import AutoModelForCausalLM
+import torch
+from transformers.onnx import export


onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+import numpy as np
+import time
+import unittest
+import onnx


onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+            'attention_mask': pytorch_inputs['attention_mask'].numpy(),
+            'onnx::Neg_2': torch.ones(1, dtype=torch.int64).numpy() # ORT requires this input since it's in the exported graph
+        }
+    return model, pytorch_inputs, ort_inputs


onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+def run_comparison(self, model_name, use_minimal_model=True, use_tensorrt=True, use_fp16=True, use_graph_opt=True, rtol=1e-2, atol=1e-2):
+    start_time = time.time()
+    model, pytorch_inputs, ort_inputs = get_model_and_inputs(model_name, use_minimal_model)
+    pytorch_output = run_model_in_pytorch(model, pytorch_inputs)


onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+    if model_name == "microsoft/Phi-3.5-mini-instruct":
+        fix_phi35_model(model_file)
+    providers = get_ep(use_tensorrt, use_fp16)
+    ort_output = run_model_in_ort(model_file, ort_inputs, providers, use_graph_opt=use_graph_opt)
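
The body of `get_ep` is not shown in this excerpt. As a rough sketch of how a provider list for ORT is commonly assembled, the hypothetical `select_providers` below builds one from boolean flags. The function name and its `use_cuda` parameter are assumptions made for illustration; the provider names and the `trt_fp16_enable` option are standard ONNX Runtime identifiers.

```python
def select_providers(use_tensorrt=True, use_cuda=True, use_fp16=True):
    """Hypothetical sketch, not the PR's get_ep: build an ORT provider list.

    Providers are listed in priority order. CPU is always appended last so
    that nodes unsupported by a GPU provider can still run.
    """
    providers = []
    if use_tensorrt:
        # A provider may be given as a (name, options) tuple.
        providers.append(
            ("TensorrtExecutionProvider", {"trt_fp16_enable": use_fp16})
        )
    if use_cuda:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers


cpu_only = select_providers(use_tensorrt=False, use_cuda=False, use_fp16=False)
trt_fp16 = select_providers()
```

A list built this way would typically be passed as the `providers` argument of `onnxruntime.InferenceSession`. Whether the PR's `run_model_in_ort` does exactly that is not visible in this excerpt.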


onnxruntime/test/python/onnxruntime_test_python_trt_acc.py

+        run_comparison(self, "microsoft/resnet-18", 
+            use_minimal_model=False, use_tensorrt=False, use_fp16=False, use_graph_opt=False)
+
+    def test_resnet18_cpu_fp32(self):


jingyanwangms added 2 commits October 22, 2024 19:36

Add trt accuracy test

d1118e9

Clean up

15f2bb3

jingyanwangms changed the title ~~Add TensorRT accuracy test~~ [WIP] Add TensorRT accuracy test Oct 22, 2024

github-advanced-security bot found potential problems Oct 22, 2024

jingyanwangms added 2 commits October 22, 2024 20:38

Add github issue link

362f9e1

Add cuda EP

3a10abc

jingyanwangms changed the title ~~[WIP] Add TensorRT accuracy test~~ [WIP] EP (CPU TensorRT CUDA) accuracy test Nov 13, 2024

jingyanwangms changed the title ~~[WIP] EP (CPU TensorRT CUDA) accuracy test~~ EP (CPU TensorRT CUDA) accuracy test Nov 13, 2024
