Don't use implicitly elapsed_time in autotuner #3036
base: main
Signed-off-by: Anatoly Myachev <[email protected]>
@whitneywhtsang we can try the changes in #2484 on the DLE runner, but we need to cherry-pick 2a4b818 into Pavel's branch.
benchmarks/triton_kernels_benchmark/gemm_postop_addmatrix_benchmark.py
benchmarks/triton_kernels_benchmark/gemm_postop_gelu_benchmark.py
benchmarks/triton_kernels_benchmark/gemm_preop_exp_benchmark.py
Let's cherry-pick this PR to |
Co-authored-by: Whitney Tsang <[email protected]>
OK, but let's use 2a4b818 (the last commit in #2484), which is compatible with the changes on Pavel's branch.
This reverts commit 2a4b818.
The main idea of this pull request is to stop using elapsed_time, which enables profiling mode for SYCL queues; that mode is not needed when profiling with PyTorch and PTI.
CI runs: