Discussion: custom ansible strategy for rolling update of nodes #10497

Open

VannTen opened this issue Oct 4, 2023 · 3 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

VannTen (Contributor) commented Oct 4, 2023

What would you like to be added:

A custom Ansible strategy plugin, based on the host_pinned strategy, which would be used in the node kubelet upgrade play (and possibly other plays dealing with all the nodes). Described in more detail in ansible/ansible#81736.

Why is this needed:

  1. The linear strategy waits for all hosts to finish the current task before moving on. Unless I'm mistaken, kubelet upgrades are independent between nodes and don't need to wait for each other, so we're losing time busy-waiting.
  2. Using serial gives a batch upgrade rather than a rolling upgrade, even if we were to use host_pinned with the current play (host_pinned only operates within the current batch as defined by serial). A true rolling upgrade would instead start the play on another node as soon as one node has completed it (see the play sketch after this list).
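
To illustrate the difference, here is a rough sketch; the play and role names are placeholders, not actual Kubespray plays:

```yaml
# Today's batched behaviour (linear strategy + serial): every host in the
# batch must finish a task before any host in the batch starts the next one.
- name: Upgrade kubelet on nodes (batched)
  hosts: kube_node
  serial: "20%"           # batch size; the whole batch is drained/upgraded in lockstep
  roles:
    - upgrade_kubelet     # placeholder role name

# Rolling behaviour with host_pinned: each host walks through the play at its
# own pace, with at most "forks" hosts in flight at any given time.
- name: Upgrade kubelet on nodes (rolling)
  hosts: kube_node
  strategy: host_pinned   # strategy shipped with ansible-core
  roles:
    - upgrade_kubelet     # placeholder role name
```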

Consider the following scenarios (which are not hypothetical; we have clusters in exactly this situation):

Scenario A:
We have some pods in the cluster with a long start time (15-30 min), which are constrained (with labels) to a particular set of nodes S. These pods have a PodDisruptionBudget to avoid losing the service (notably during cluster upgrades). Other pods have a more typical startup time (<10s).

Once the first or second batch of nodes is upgraded, some of the pods with a long start time are at their minimal count according to the PodDisruptionBudget. This means that when we try to upgrade a node in S in a later batch, one hosting some of those pods, it blocks for a long time waiting for the other pods to start before it can safely drain the node (which is good). However, all of the other nodes in the batch have essentially finished their upgrades, and we wait for nothing.
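
For context, the kind of budget involved looks roughly like this (names and counts are illustrative, not our actual manifests):

```yaml
# Illustrative PodDisruptionBudget: draining a node is refused while evicting
# one of these pods would drop the ready count below minAvailable.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: slow-starting-service
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: slow-starting-service
```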

Scenario B (worse):
Two or more nodes in S are in the same batch. The first drains successfully, but the second does not (because the PodDisruptionBudget is now at the minimum acceptable number of pods). This results in a stuck upgrade, because the first node waits on the second to complete the task. If it didn't, it could complete its upgrade and become schedulable again, allowing the cluster to place new pods and make room for the second node to drain. -> this would be solved simply by switching the strategy to host_pinned, IMO.

Point 2 is, in my opinion, the more critical one for Kubespray performance in scenarios like those I described, but it implies point 1.
I raised this on the Ansible GitHub, the devel mailing list, and Matrix, but I didn't get many responses besides the automated issue closure.


I would rather have this in Ansible itself and use it from Kubespray. However, if upstream is not interested, what would you think of integrating it in Kubespray? Is the maintenance worth the (presumed; I haven't tested this concretely) performance uplift?

(I can implement this myself, either by copying the free strategy with some tweaks or by starting from scratch.)

@VannTen VannTen added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 4, 2023
VannTen (Contributor, Author) commented Nov 30, 2023

So, I thought of something which is likely to have a faster ROI: instead of trying to retrofit an Ansible strategy with the "slot" concept, I'd use the host_pinned strategy coupled with Kubernetes Leases, which would act as "slot reservations". A rough sketch of what this could look like is below.
This has the advantage of scaling easily to a "slot-per-group" concept (which would natively support #10591) by leveraging group_vars.
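
Rough sketch of the reservation idea, run per node around the drain/upgrade steps. Everything here is hypothetical: the lease name, namespace, `upgrade_slot_id` variable and retry counts are placeholders, and it assumes kubectl with a working kubeconfig on the host the tasks are delegated to.

```yaml
- name: Reserve an upgrade slot (create-only, so it fails while the slot is taken)
  ansible.builtin.command:
    cmd: kubectl create -f -
    stdin: |
      apiVersion: coordination.k8s.io/v1
      kind: Lease
      metadata:
        name: "upgrade-slot-{{ upgrade_slot_id | default(0) }}"
        namespace: kube-system
      spec:
        holderIdentity: "{{ inventory_hostname }}"
  register: slot_reservation
  retries: 60
  delay: 30
  until: slot_reservation.rc == 0
  delegate_to: localhost

# ... drain, upgrade kubelet, uncordon ...

- name: Release the upgrade slot
  ansible.builtin.command:
    cmd: "kubectl delete lease -n kube-system upgrade-slot-{{ upgrade_slot_id | default(0) }}"
  delegate_to: localhost
```

With group_vars, each group could point to its own lease name (or set of lease names), which is how the slot-per-group idea would fall out naturally.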

Opinions welcome!

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 28, 2024
VannTen (Contributor, Author) commented Feb 28, 2024 via email

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 28, 2024