
[Feature Request] Using the cuda dlls installed with pip from official Nvidia python packages in onnxruntime-gpu #19350

Open
martinResearch opened this issue Jan 31, 2024 · 22 comments · May be fixed by #22506
Assignees
Labels
ep:CUDA issues related to the CUDA execution provider feature request request for unsupported feature or enhancement platform:windows issues related to the Windows platform

Comments

@martinResearch

Describe the feature request

Short Description

Enable onnxruntime-gpu to use CUDA DLLs installed via pip from the NVIDIA Python index, or include the CUDA DLLs in the onnxruntime-gpu wheel so that they can be installed with pip install onnxruntime-gpu[cuda_dlls].

Problem

Installing the CUDA DLLs for onnxruntime-gpu currently has several limitations:

  1. Mandatory user account creation on the NVIDIA website.
  2. Dependency on admin rights, restricting installation on machines without such privileges.
  3. Risk of installing incompatible CUDA versions.
  4. Inconvenience of updating CUDA_PATH when switching Python environments with different CUDA versions.

In contrast, PyTorch on Windows includes the CUDA DLLs in its wheels, simplifying the installation process and reducing version-mismatch risks. On Linux, PyTorch seems to use NVIDIA packages from the feed https://pypi.ngc.nvidia.com (installable with pip install nvidia-pyindex), though I did not double-check this.

Possible Solutions

To streamline the installation process for onnxruntime-gpu, the following solutions could be considered:

  1. Packaged CUDA DLLs with onnxruntime-gpu Wheels:

    • Create onnxruntime-gpu wheels that include CUDA DLLs, allowing users to install them conveniently with pip install onnxruntime-gpu[cuda_dlls].
  2. Dependency Configuration via onnxruntime-gpu Wheel:

    • Create an onnxruntime-gpu wheel installable with pip install onnxruntime-gpu[cuda_dlls].
    • This wheel would list packages from the NVIDIA package index as install dependencies (e.g., nvidia-cudnn-cu12), which can be installed with pip install nvidia-pyindex followed by pip install nvidia-cudnn-cu12.
    • Configure onnxruntime to utilize these DLLs instead of those in CUDA_PATH.

The second solution may facilitate reuse of the same CUDA DLLs by other packages like CuPy or PyTorch, potentially reducing the overall size of the Python environment.
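For illustration, here is a hypothetical sketch (not the actual onnxruntime setup.py) of how the cuda_dlls extra from the second solution could be declared, so that pip install onnxruntime-gpu[cuda_dlls] pulls the NVIDIA wheels as dependencies. The exact package list is an assumption based on this thread:

```python
# Hypothetical extras_require entry for a setup.py; the package names
# are the nvidia-* wheels discussed in this thread, and the final list
# onnxruntime would need might differ.
extras_require = {
    "cuda_dlls": [
        "nvidia-cuda-runtime-cu12",
        "nvidia-cublas-cu12",
        "nvidia-cufft-cu12",
        "nvidia-curand-cu12",
        "nvidia-cudnn-cu12",
    ],
}
```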

Describe scenario use case

  • allow full automation of the CUDA DLL installation
  • allow different CUDA DLL versions in different Python environments without editing CUDA_PATH
@martinResearch martinResearch added the feature request request for unsupported feature or enhancement label Jan 31, 2024
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider platform:windows issues related to the Windows platform labels Jan 31, 2024
@snnn
Member

snnn commented Jan 31, 2024

Due to some internal requirements, we are not allowed to use a second feed.

@snnn snnn closed this as completed Jan 31, 2024
@snnn
Member

snnn commented Jan 31, 2024

The CUDA DLLs are so huge that they cannot be hosted on PyPI, so they must be hosted somewhere else. However, due to security concerns we are not allowed to use a second feed, so this is a dead end.

@snnn
Member

snnn commented Jan 31, 2024

I will keep this issue open so we can continue discussing the details offline.

@snnn snnn reopened this Jan 31, 2024
@martinResearch
Author

Is the problem that you are not allowed to consume packages from the external feed https://pypi.ngc.nvidia.com/ in your development environment and/or in your CI test pipelines (GitHub Actions) for security reasons?

@snnn
Member

snnn commented Feb 1, 2024

In CI.

@martinResearch
Author

martinResearch commented Feb 15, 2024

Could you possibly add https://pypi.ngc.nvidia.com/ as an upstream to the feed you are currently using?

@snnn
Member

snnn commented Feb 15, 2024

Azure DevOps Artifacts' upstream feature doesn't support that; it only supports pypi.org. Correct me if I'm wrong.

@martinResearch
Author

Indeed, it seems I cannot use https://pypi.ngc.nvidia.com/ as an upstream in Azure DevOps.
However, it seems that all the NVIDIA packages required to get the DLLs for CUDA 12 are actually on pypi.org, so there is no need to add https://pypi.ngc.nvidia.com/ as an upstream.

Here is a list of packages on pypi.org that allowed me to get all the DLLs I needed to use onnxruntime-gpu (some might not actually be required):

nvidia-nvjitlink-cu12
nvidia-nvtx-cu12
nvidia-cuda-runtime-cu12
nvidia-cublas-cu12
nvidia-cuda-cupti-cu12
nvidia-cuda-nvrtc-cu12
nvidia-cudnn-cu12
nvidia-cufft-cu12
nvidia-curand-cu12
nvidia-cusolver-cu12
nvidia-cusparse-cu12

So for CUDA 12 it seems we could potentially get onnxruntime-gpu to use the DLLs from these packages.
Unfortunately, onnxruntime-gpu for CUDA 12 is not on pypi.org (see #19438), which I hope will be solved soon.
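As a sanity check, here is a small sketch (standard library only, Python 3.8+; the function name is my own) that reports which of the packages above are installed in the current environment:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

# A subset of the CUDA 12 wheels listed above (some might not be required).
cuda12_packages = [
    "nvidia-cuda-runtime-cu12",
    "nvidia-cublas-cu12",
    "nvidia-cudnn-cu12",
]

for name, ver in installed_versions(cuda12_packages).items():
    print(f"{name}: {ver or 'not installed'}")
```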

@snnn
Member

snnn commented Apr 8, 2024

Are they real? The file size of nvidia-cuda-runtime-cu12 is less than 1MB.
https://pypi.org/project/nvidia-cuda-runtime-cu12/12.4.127/#files

@martinResearch
Author

Are they real? The file size of nvidia-cuda-runtime-cu12 is less than 1MB. https://pypi.org/project/nvidia-cuda-runtime-cu12/12.4.127/#files

I think so. The .whl file contains cudart64_12.dll, which is about 540 KB, roughly the same size as the DLL we get from the official CUDA toolkit installer.

@snnn
Member

snnn commented Apr 11, 2024

But the cudnn files look weird. The latest one (9.0) only has binaries for ARM64, while a previous version (8.9) only has binaries for Windows/Linux x64. And we don't know whether the ARM one is for SBSA or Jetson.

@gedoensmax
Contributor

@snnn These cudnn wheels look reasonable to me: https://pypi.org/project/nvidia-cudnn-cu12/#files
Which ones did you look at?

@snnn
Member

snnn commented May 28, 2024

Now it is good. Thanks!

@snnn snnn assigned jchen351 and unassigned snnn and pranavsharma May 28, 2024
@gedoensmax
Contributor

@snnn or @jchen351, did you start any work on this yet? The problem propagates to ORT GenAI as well, which is even more Python focused. pip deployment will be much easier with the nvidia-* packages.

@snnn
Member

snnn commented Oct 14, 2024

Sorry, the work has not been started.

@snnn
Member

snnn commented Oct 14, 2024

The first solution is not available to us

Packaged CUDA DLLs with onnxruntime-gpu Wheels:

Because these DLLs are large and we are very tight on space.

We can try the second one.

Dependency Configuration via onnxruntime-gpu Wheel:

@jchen351
Contributor

@martinResearch Do you know if libcuda.so should be installed via pip? If so, do you know where I can find it?

@martinResearch
Author

@martinResearch Do you know if libcuda.so should be installed via pip? If so, do you know where I can find it?

I don't know. Is it required for onnxruntime-gpu? It is not in the Python environment I use to run onnxruntime-gpu. You can find libcudart.so.12 in https://files.pythonhosted.org/packages/f0/62/65c05e161eeddbafeca24dc461f47de550d9fa8a7e04eb213e32b55cfd99/nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl

@gedoensmax
Contributor

libcuda.so is a driver library and is installed with the NVIDIA driver. In a Docker container it is mounted when using the NVIDIA Container Toolkit.

@martinResearch
Author

martinResearch commented Nov 4, 2024

When installing the packages nvidia-cuda-nvrtc-cu12 and nvidia-cudnn-cu12, the DLLs end up in lib\site-packages\nvidia\cuda_nvrtc\bin and lib\site-packages\nvidia\cudnn\bin respectively, and I wonder if that makes using these DLLs more complicated on the onnxruntime side.
Maybe NVIDIA could structure the published packages using namespace packages so that the DLLs for the different packages end up in the same lib\site-packages\nvidia\bin folder? I don't know whether that solution is pip-compliant and would still allow uninstalling individual NVIDIA packages, though.
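To illustrate the per-package layout, here is a hedged sketch (the function names are my own, not an onnxruntime API) that collects every site-packages\nvidia\<pkg>\bin directory so the set could be registered with os.add_dll_directory before loading onnxruntime:

```python
import os
from pathlib import Path

def nvidia_dll_dirs(site_packages):
    """Return every existing <site-packages>/nvidia/<pkg>/bin directory."""
    root = Path(site_packages) / "nvidia"
    return sorted(str(p) for p in root.glob("*/bin") if p.is_dir())

def register_dll_dirs(dirs):
    # os.add_dll_directory only exists on Windows (Python 3.8+); on Linux
    # the dynamic loader would instead need rpath entries or LD_LIBRARY_PATH.
    if hasattr(os, "add_dll_directory"):
        for d in dirs:
            os.add_dll_directory(d)
```

With the current per-package layout the caller has to enumerate one bin directory per nvidia-* wheel, which is exactly the extra complexity a shared nvidia\bin folder would avoid.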

@martinResearch
Author

A similar effort is planned in the CuPy repository (cupy/cupy#8013). It might provide some valuable information on how to proceed.

@snnn
Member

snnn commented Dec 12, 2024

Configure onnxruntime to utilize these DLLs instead of those in CUDA_PATH.

Would you mind explaining more about this part? Do we need to manually preload the libraries, or do we just need to set up some search paths (e.g. https://docs.python.org/3/library/os.html#os.add_dll_directory)?
