[FEA] Make selector choose appropriate CUDA 12.x versions based on dependencies #471
Comments
Possibly related: #470
#470 fixes the compatible major versions of CUDA for the TensorFlow GPU conda-forge package. It does not impact minor version compatibility. What part of this is dependent on RAPIDS supporting CUDA 12.2? I was able to solve this environment and got a CUDA 12 build of PyTorch from conda-forge: `mamba create -n rapids-23.12 -c rapidsai -c conda-forge -c nvidia rapids=23.12 python=3.10 cuda-version=12.0 pytorch`. I don't think we can offer official compatibility between RAPIDS /
Last I tested it, this environment worked, but we can't offer support for a configuration with CUDA from a mixed set of channels. At some point in the future we are hoping to make the CUDA distributions on the
I agree that this isn't addressable until the nvidia and conda-forge CTK packages are aligned. We should consider how the selector ought to work once that day comes, though. To @MatthiasKohl's point, the pytorch channel is the officially supported medium (by both NVIDIA and PyTorch) for installing the package, so IMHO once the two are aligned we would probably want to encourage installation of PyTorch from the pytorch channel unless and until we see a similar level of support for the conda-forge package as NVIDIA is now providing for the CTK on conda-forge.
It might work as desired, but I don't think it should.
Big relevant news here: pytorch/pytorch#138506
There has not been any substantial effort / progress to become compatible with DLFWs since this was last discussed. |
RAPIDS is often used in conjunction with PyTorch and TensorFlow. Wouldn't it instead make sense to support the conda-forge feedstocks, since they are community driven and pull requests can be made against them? The compatibility changes discussed here can be made for RAPIDS going forward, now that the conda-forge channel is how PyTorch will be distributed on conda.
This does make sense, but it definitely requires support from Cliff Woolley and org, so I'd recommend reaching out to them to see what they can support. This will likely take a long time, especially if we are to support conda-forge officially, so while this effort is going on, I'd still recommend removing the selector.
Is your feature request related to a problem? Please describe.
Once RAPIDS adds support for CUDA 12.2, it will be possible to install conda packages of PyTorch along with RAPIDS from conda. Currently this is not possible because PyTorch supports 12.1 and will likely bump straight to 12.3 for their next set of packages. Since the CUDA 12 lineup of RAPIDS packages is going to leverage CUDA Enhanced Compatibility (CEC) to support arbitrary CUDA minor versions, we will no longer need users to pin a specific minor version for RAPIDS, but dependencies like PyTorch will likely continue to require one.
Describe the solution you'd like
We should update the release selector to include a range of CUDA minor versions and have it automatically select supported ones based on the user's choice of packages to include in their environment.
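The selection logic described above amounts to intersecting the cuda-version ranges of the chosen packages. A minimal sketch, where the package names and version ranges are purely illustrative placeholders (not real package metadata), might look like:

```python
# Hypothetical supported cuda-version ranges per package; real values would
# come from package metadata. Ranges below are made up for illustration.
SUPPORTED = {
    "rapids": ("12.0", "12.5"),      # assumed: any minor via CEC
    "pytorch": ("12.1", "12.1"),     # assumed: pinned to a single minor
    "tensorflow": ("12.3", "12.3"),  # assumed: pinned to a different minor
}

def parse(v):
    """Split '12.1' into a comparable (major, minor) tuple."""
    major, minor = v.split(".")
    return int(major), int(minor)

def pick_cuda_version(packages):
    """Intersect each package's [lo, hi] cuda-version range and return the
    highest version in the intersection, or None if the ranges conflict."""
    lo = max(parse(SUPPORTED[p][0]) for p in packages)
    hi = min(parse(SUPPORTED[p][1]) for p in packages)
    if lo > hi:
        return None  # no CUDA minor version satisfies all packages
    return f"{hi[0]}.{hi[1]}"

print(pick_cuda_version(["rapids"]))                            # "12.5"
print(pick_cuda_version(["rapids", "pytorch"]))                 # "12.1"
print(pick_cuda_version(["rapids", "pytorch", "tensorflow"]))   # None
```

The selector UI would then emit `cuda-version=<result>` in the generated command, or flag the combination as unsatisfiable when the intersection is empty.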
Additional context
For libraries like PyTorch, we will also need to consider what channel the package will be installed from. Officially supported PyTorch builds come from the `pytorch` channel, not `conda-forge`, so unless/until that changes we will need to ensure that our install command accounts for that correctly.