Merge nvidia-gpu-device-plugin and nvidia-device-plugin. #19545
base: master
Hi @RamBITS-AI. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Can one of the admins verify this patch?
Initial Commit.
…plugin. Removing isKVMDriverForNVIDIA from validations for nvidia-gpu-device-plugin. This is being done to allow merger of functionalities of nvidia-gpu-device-plugin and nvidia-device-plugin into nvidia-gpu-device-plugin. This basically means nvidia-gpu-device-plugin will work for both KVM & non-KVM (including docker) drivers.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: RamBITS-AI
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Messages, Warnings and In-line Code Comments w.r.t the merger.
Merged nvidia-device-plugin addon into nvidia-gpu-device-plugin addon.
Hi @RamBITS-AI, thanks for the PR! There might have been a miscommunication, but nvidia-device-plugin is the addon we want users to be using, as it's being kept up to date. nvidia-gpu-device-plugin, on the other hand, hasn't been updated in 2.5 years, so that's the one we want to deprecate and move users off of.
Yes, will do @spowelljr.
Reverting nvidia-gpu-device-plugin to nvidia-device-plugin.
…loy\addons\nvidia-device-plugin\nvidia-device-plugin.yaml.tmpl. Merging deploy\addons\gpu\nvidia-gpu-device-plugin.yaml.tmpl into deploy\addons\nvidia-device-plugin\nvidia-device-plugin.yaml.tmpl.
/ok-to-test
This reverts commit 4be65dc.
dev volume added.
Hi @spowelljr, I have updated the AFTER screenshots for the requested change.
deploy/addons/nvidia-device-plugin/nvidia-device-plugin.yaml.tmpl
These values are likely no longer needed when using the updated image.
kvm2 driver with docker runtime
Times for minikube start: 49.6s 48.6s 50.8s 49.8s 47.1s
Times for minikube ingress: 15.4s 14.5s 14.5s 15.0s 19.0s

docker driver with docker runtime
Times for minikube start: 23.0s 23.1s 20.0s 21.4s 20.4s
Times for minikube ingress: 12.2s 13.8s 13.3s 12.8s 12.8s

docker driver with containerd runtime
Times for minikube start: 19.2s 19.4s 19.2s 20.4s 23.1s
Times for minikube ingress: 31.3s 23.3s 39.3s 38.8s 38.8s
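For comparing the runs above at a glance, the five samples per configuration can be averaged. This is a minimal sketch, not part of the benchmark bot's output; the helper name mean_seconds is hypothetical, and it assumes each value is a number followed by a trailing "s".

```shell
# Hypothetical helper: average a list of timings such as "49.6s 48.6s ...".
# Assumes every argument is a decimal number with a trailing "s".
mean_seconds() {
  printf '%s\n' "$@" \
    | sed 's/s$//' \
    | awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}

# Example: mean "minikube start" time for the kvm2 driver with docker runtime.
mean_seconds 49.6s 48.6s 50.8s 49.8s 47.1s   # prints 49.2
```

With only five samples per configuration, the mean is a rough guide; the containerd ingress times in particular vary widely (23.3s to 39.3s).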
Hi @spowelljr, the corrections have been made.
        resources:
          requests:
            cpu: 50m
            memory: 10Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
          - name: dev
            mountPath: /dev
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: dev
          hostPath:
            path: /dev
Can these be removed as well please?
Fixes #19114.
EDIT:
Merged both addons, nvidia-gpu-device-plugin and nvidia-device-plugin, into the nvidia-device-plugin addon.
BEFORE:
minikube start --driver docker --container-runtime docker --gpus all
minikube status
minikube kubectl -- get pods -A
minikube addons enable nvidia-device-plugin --alsologtostderr
minikube addons enable nvidia-gpu-device-plugin --alsologtostderr
AFTER:
minikube start --driver docker --container-runtime docker --gpus all
minikube status
minikube kubectl -- get pods -A
minikube addons enable nvidia-device-plugin --alsologtostderr
minikube addons enable nvidia-gpu-device-plugin --alsologtostderr
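One way to check the result of the commands above, beyond the screenshots, is to ask the node how many GPUs it advertises. This is a hedged sketch, not something this PR adds: the helper name gpu_count and the KUBECTL variable are hypothetical, and it assumes the device plugin registers GPUs under the nvidia.com/gpu resource name on the first node.

```shell
# Command used to reach the cluster; override KUBECTL to use a plain kubectl.
KUBECTL="${KUBECTL:-minikube kubectl --}"

# Hypothetical helper: print the allocatable GPU count of the first node.
# Assumes the plugin exposes the resource as nvidia.com/gpu; the dot in the
# resource name is escaped so jsonpath does not treat it as a path separator.
gpu_count() {
  $KUBECTL get nodes -o jsonpath='{.items[0].status.allocatable.nvidia\.com/gpu}'
}
```

Running gpu_count after enabling the addon should print a number if the plugin has registered the GPU; empty output suggests the resource is not advertised yet.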