Auto dependency bump #11557

kokyhm · 2024-09-20T06:23:14Z

What type of PR is this?
/kind feature

What this PR does / why we need it:
Auto dependency bump
Kubespray uses newer software and the maintainers can save time.

Which issue(s) this PR fixes:

Fixes #10681

Special notes for your reviewer:
POC:
Actions
PRs

Known issues(YAML):

when crio_archive_checksums are updated, the 2 comments below it are lost
numeric-like version keys (e.g. 20240916 or 1.17) become single-quoted

Does this PR introduce a user-facing change?:

NONE

k8s-ci-robot · 2024-09-20T06:23:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kokyhm
Once this PR has been reviewed and has the lgtm label, please assign floryut for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2024-09-20T06:23:24Z

Hi @kokyhm. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

yankay · 2024-09-20T06:25:30Z

/ok-to-test

VannTen · 2024-09-20T14:34:26Z

First: this needs some comments so a reader can get a rough idea of what the code does without too much trouble.

I have a couple of comments at first glance:

Any reason not to build on what already exists (script/download_hash.py in particular)?
It looks like you're doing one GraphQL request per component? The whole point of GraphQL is to bundle things in one request, so it should be able to fetch multiple things at once. Bundling more into one request also makes it less likely to hit GitHub's rate limits.
It looks like you're only getting the latest release for a component. What we actually need is all the new patch versions for the minor releases listed in the checksums.
How fast is it? 🚀
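To make the patch-version request concrete, here is a minimal sketch of one way to pick, for each minor release already present in the checksums, the upstream patch releases that are newer than the highest one known. The function names and the sample tags are hypothetical and not taken from the PR; it also assumes tags look like `v1.29.2` and simply skips anything else (such as release candidates).

```python
import re
from collections import defaultdict

# Assumption: stable tags look like "v1.29.2" or "1.29.2"; everything else is skipped.
STABLE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse(tag):
    """Return (major, minor, patch) for a stable tag, or None otherwise."""
    m = STABLE.match(tag)
    return tuple(int(p) for p in m.groups()) if m else None


def new_patch_versions(known, upstream_tags):
    """For each minor already in `known`, list the upstream patch releases
    newer than the highest known patch of that minor."""
    highest = {}
    for tag in known:
        v = parse(tag)
        if v:
            highest[v[:2]] = max(highest.get(v[:2], -1), v[2])

    found = defaultdict(list)
    for tag in upstream_tags:
        v = parse(tag)
        if v and v[:2] in highest and v[2] > highest[v[:2]]:
            found[v[:2]].append(v)

    # Sort numerically, then render back to dotted strings (without the "v" prefix).
    return {
        minor: [".".join(map(str, v)) for v in sorted(versions)]
        for minor, versions in found.items()
    }


# Hypothetical sample data.
known = ["v1.29.1", "v1.30.0"]
upstream = ["v1.29.2", "v1.29.10", "v1.30.0", "v1.31.0", "v1.30.1-rc.1"]
result = new_patch_versions(known, upstream)
print(result)
```

In this sketch a brand-new minor (1.31 here) is deliberately ignored, since the comment above only asks for new patch versions of minors that are already listed.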

kokyhm · 2024-09-25T21:23:15Z

Thank you @VannTen for the comments.
I have not built on download.py since it does not handle all components.
However, I did borrow some (+3) findings from there! 👍

I hope my refactor addresses the comments you provided.

Key changes:

added comments
use only one GraphQL request
added support for patch versions
use an explicit sha regex to calculate the sha
make the code more readable
make the workflow actions use run-name, to give a clear view of which action does what
use one branch per component for create-pull-request; it keeps updating until merged
introduce constants
many other improvements that make the code more readable and less error-prone.

Speed:

it varies between local runs and CI
CI: less than a minute (the download is a matter of seconds)
locally: e.g. 4 workers, 2 Mbps uplink, ~20 minutes (kata-containers is 700 MB)

Thanks again and looking forward to your comments.

VannTen · 2024-09-27T17:05:39Z

Hum, 20 minutes is a lot. It's also not certain we'd run this on GitHub workers, given that our CI is in GitLab and the larger kubernetes-sigs CI is in Prow, so I'd rather not take that for granted. (The biggest gain I've seen previously came from downloading the hashes instead of the binaries; are you doing that?) I'll try to look at the code in more detail, but 500 lines of Python is pretty big to review, so it could take a while until I have the time to look at this properly. Also, I wanted to mention this the other time but forgot: I started some WIP work on this subject some time ago, feel free to look at it (if you want): https://github.com/VannTen/kubespray/tree/wip/download_graphql
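As an illustration of the "download the hashes instead of the binaries" idea, here is a minimal sketch that reads a checksum out of a `sha256sum`-style listing. It assumes the project publishes such a listing next to its release artifacts, which not every component does; the file names and digests below are placeholders, not real data.

```python
import re

# One line of sha256sum output: "<64 hex chars>  <filename>" (or " *<filename>" in binary mode).
SHA256_LINE = re.compile(r"^([0-9a-f]{64})\s+\*?(\S+)$")


def checksum_from_listing(listing, filename):
    """Return the SHA-256 listed for `filename`, or None if it is not listed."""
    for line in listing.splitlines():
        m = SHA256_LINE.match(line.strip())
        if m and m.group(2) == filename:
            return m.group(1)
    return None


# Placeholder listing; in practice this would be the body of a small
# checksum file fetched over HTTP.
listing = (
    "a" * 64 + "  tool-linux-amd64.tar.gz\n"
    + "b" * 64 + "  tool-linux-arm64.tar.gz\n"
)
digest = checksum_from_listing(listing, "tool-linux-arm64.tar.gz")
print(digest)
```

A listing like this is typically a few hundred bytes, so fetching it avoids transferring the artifact itself when the upstream project provides one.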

kokyhm · 2024-09-28T16:16:08Z

Let's recap:

The download is there for a reason: I have not found another way to get the sha.
It solves the ticket.
Dependabot for pip is running on GH Actions.
The script handles one component at a time, which means GitLab integration is a matter of a few lines of code.

What I have found: do not hardcode kubespray versions...
But that would be part of another PR...

kokyhm · 2024-09-29T21:20:49Z

I do not get "free to look at it if you want". What does it mean?
I have taken your suggestion not to make one GraphQL request per component.
Nice work with GraphQL, I am happy to see different approaches, but does it solve the issue?
I see not having the latest versions/shas as a big problem.
Running it locally would be the worst-case nightmare, but it does the job: it takes 20 minutes, and if you have slow internet you can set max_workers to 1 and it will take an hour... How do you handle the case where there is no sha file and you must download the artifact and calculate the checksum?
I don't see a use case there.
If CI runs every day and the PRs are handled, it would take 30 minutes max per month, and even that is saying too much.
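For the case where no sha file exists and the artifact has to be downloaded, a rough sketch of computing the checksum in chunks is shown below, so that a large artifact never has to sit in memory as a whole. This is not the PR's code; `sha256_of_stream` is a hypothetical helper, and an in-memory buffer stands in for a streamed HTTP response.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary file-like object chunk by chunk and return the hex digest."""
    digest = hashlib.sha256()
    # iter(callable, sentinel) keeps calling read() until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a downloaded artifact; a real run would pass an open file
# or a streamed response body instead.
payload = b"kubespray" * 1000
streamed = sha256_of_stream(io.BytesIO(payload), chunk_size=64)
print(streamed == hashlib.sha256(payload).hexdigest())
```

The digest is the same as hashing the whole payload at once; only the peak memory use differs.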

VannTen · 2024-09-30T05:33:16Z

It means no more than that: I did that some time ago and figured you might get some inspiration from it. A use case for what? A fast script? The use case is the maintainers who are going to debug the script when it has bugs. Besides that, the script is big, so it's going to take a while to review. That's always the case for big PRs, which is why an iterative approach is usually faster.

kokyhm · 2024-09-30T09:36:34Z

Thank you for the comments. Was rushing it, sorry about that.

VannTen

First some preamble:
This should really be split into several parts (at least 2):

script to download
automation to run the script.

(I can't review everything in detail, there is really too much, but I've left some general comments.)

VannTen · 2024-12-16T09:17:19Z

scripts/dependency_config.py

@@ -0,0 +1,221 @@
+ARCHITECTURES = ['arm', 'arm64', 'amd64', 'ppc64le']


I'd rather source this from the data (aka take the existing keys in the existing checksums)
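A minimal sketch of sourcing the architectures from the data rather than from a constant could look like the following. It assumes the checksums are already loaded into a mapping of component -> architecture -> version -> checksum; that shape, the component names, and the placeholder values are assumptions for illustration only.

```python
def architectures_from_checksums(checksums):
    """Collect every architecture key used by any component, sorted for stable output."""
    archs = set()
    for per_arch in checksums.values():
        archs.update(per_arch.keys())
    return sorted(archs)


# Hypothetical, already-parsed checksum data with placeholder digests.
checksums = {
    "crictl_checksums": {
        "arm64": {"v1.30.0": "<checksum>"},
        "amd64": {"v1.30.0": "<checksum>"},
    },
    "runc_checksums": {
        "amd64": {"v1.1.13": "<checksum>"},
        "ppc64le": {"v1.1.13": "<checksum>"},
    },
}
archs = architectures_from_checksums(checksums)
print(archs)
```

Deriving the list this way means a new architecture added to the checksums is picked up without touching the script.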

VannTen · 2024-12-16T09:20:09Z

scripts/dependency_updater.py

+            version = tag.get('name', '')
+            if stable_version_pattern.match(version):
+                patch_versions.append(version)
+    patch_versions.sort(key=lambda v: list(map(int, re.findall(r'\d+', v)))) # sort for checksum update


It's not clear what you mean by "sort for checksum update".
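One possible reading of the quoted sort key is that it orders tags by their numeric components instead of as plain strings. The sketch below only illustrates that difference; the tags are made up, and whether this ordering is what the checksum update needs is not stated in the PR.

```python
import re


def version_key(tag):
    """Turn "v1.30.10" into [1, 30, 10] so that comparison is numeric."""
    return [int(n) for n in re.findall(r"\d+", tag)]


tags = ["v1.30.10", "v1.30.2", "v1.29.9"]
lexical = sorted(tags)                   # plain string comparison
numeric = sorted(tags, key=version_key)  # same key as in the quoted line

print(lexical)
print(numeric)
```

With plain string sorting, `v1.30.10` lands before `v1.30.2`; with the numeric key it lands after it.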

VannTen · 2024-12-16T09:25:18Z

scripts/dependency_updater.py

+                }}
+            }}
+        """)
+


On the query.
First, you should really put it in a separate file, so it can be edited and run separately.
Second, you should use GraphQL variables instead of templating and joining the query string.
For instance, something like this:

query($repoWithReleases: [ID!]!, $repoWithTags: [ID!]!) {
  with_releases: nodes(ids: $repoWithReleases) {
    ... on Repository {
      nameWithOwner
      releases(first: 100) {
        nodes {
          tagName
          isPrerelease
          releaseAssets {
            totalCount
          }
        }
      }
    }
  }
  with_tags: nodes(ids: $repoWithTags) {
    ... on Repository {
      nameWithOwner
      refs(refPrefix: "refs/tags/", last: 100) {
        nodes {
          name
        }
      }
    }
  }
}
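On the Python side, using variables could look roughly like the sketch below: the query text stays constant and the repository node IDs travel in a separate `variables` object of the JSON request body. The query here is a shortened version of the one above, the node IDs are placeholders, and `build_payload` is a hypothetical helper; actually sending the request (with authentication) is left out.

```python
import json

# Shortened query; in the layout suggested above it would live in its own file.
QUERY = """
query($repoWithReleases: [ID!]!) {
  with_releases: nodes(ids: $repoWithReleases) {
    ... on Repository {
      nameWithOwner
      releases(first: 100) { nodes { tagName isPrerelease } }
    }
  }
}
"""


def build_payload(query, release_ids):
    """Build the JSON body for a GraphQL POST: static query text plus variables."""
    return json.dumps(
        {"query": query, "variables": {"repoWithReleases": list(release_ids)}}
    )


# Placeholder node IDs, not real repositories.
payload = build_payload(QUERY, ["HYPOTHETICAL_NODE_ID_1", "HYPOTHETICAL_NODE_ID_2"])
decoded = json.loads(payload)
print(decoded["variables"])
```

Because the IDs never get interpolated into the query string, the same query file can be pasted into a GraphQL explorer and run with different variables.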

VannTen · 2024-12-16T09:27:41Z

scripts/dependency_updater.py

+    cache_file = f'{component}-{arch}-{version}'
+    if os.path.exists(f'cache/{cache_file}'):
+        logging.info(f'Using cached file for {url_download}')
+        return calculate_checksum(cache_file, sha_regex)
+    try:
+        response = session.get(url_download, timeout=10)
+        response.raise_for_status()
+        with open(f'cache/{cache_file}', 'wb') as f:
+            f.write(response.content)
+        logging.info(f'Downloaded and cached file for {url_download}')


I don't think we want to do caching (at least at the start). It adds some footguns, and I'd rather have predictable output across multiple "runners" (CI, local, etc.).

VannTen · 2024-12-16T09:29:24Z

scripts/dependency_updater.py

+def update_readme(component, version):
+    for i, line in enumerate(readme_data):
+        if component in line and re.search(r'v\d+\.\d+\.\d+', line):
+            readme_data[i] = re.sub(r'v\d+\.\d+\.\d+', version, line)
+            logging.info(f'Updated {component} to {version} in README')
+            break
+    return readme_data


We should first focus on the checksums, the README is another story.

kokyhm · 2024-12-16T14:02:50Z

@VannTen Thank you very much for your time and comments. (Un)fortunately, I have been very busy over the last few months, which is why I did not take part in other tasks. I have just been checking the POC from time to time to see how it works (and found some minor bugs). I will address your comments at the start of next year. Wish you all the best.

Add scripts and github actions for auto dependency bump

ec1a3a2

k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 20, 2024

k8s-ci-robot requested review from cyclinder and ErikJiang September 20, 2024 06:23

k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 20, 2024

kokyhm changed the title ~~Add scripts and github actions for auto dependency bump~~ Auto dependency bump Sep 20, 2024

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 20, 2024

refactor

7b13bac

not trigger an action if latest equals current

01d9fc2

tico88612 mentioned this pull request Oct 17, 2024

Discussion: Upgrade dependencies version policy #11644

Open

VannTen reviewed Dec 16, 2024
