-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[XPU] Enable reduction optimization by default #2846
base: main
Are you sure you want to change the base?
Conversation
Add reduction optimization pass to the pipeline by default.
Performance gains can be seen under @chengjunlu I'd say we go with this for now. You can disable it when doing your current work. This is giving performance now and having this would make testing the followups we have in mind easier. WDYT? |
Looks like the wide vector flag is giving some better performance, but this is noticeable enough. Just a minor bump from that flag. |
LGTM. |
if os.getenv("TRITON_INTEL_OPTIMIZE_REDUCTION_LOCALITY", "0") == "1": | ||
intel.passes.ttgpuir.add_optimize_reduction_locality(pm) | ||
intel.passes.ttgpuir.add_optimize_elementwise_parallelism(pm) | ||
intel.passes.ttgpuir.add_optimize_reduction_locality(pm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use if os.getenv("TRITON_INTEL_OPTIMIZE_REDUCTION_LOCALITY", "1") == "1":
to allow the user to disable it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. I'll add that when I finish the investigation
From https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/12061551469, it appears that the reduction optimization has a negative geomean impact to FA (causal=false, dhead=64) with the new IGC driver (improved instruction scheduling). Can we investigate it further before enabling it by default? |
Yes! I can look into that. |
Waiting till regressions are fixed. #2907 is the best attempt we have currently. |
Add reduction optimization pass to the pipeline by default.