Auto dependency bump #11557

Open: wants to merge 3 commits into base: master

kokyhm
Contributor

@kokyhm kokyhm commented Sep 20, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:
Adds automatic dependency bumps, so that Kubespray picks up newer software versions and the maintainers save time.

Which issue(s) this PR fixes:

Fixes #10681

Special notes for your reviewer:
POC:
Actions
PRs

Known issues (YAML):

  • when crio_archive_checksums are updated, the 2 comments below it are lost
  • numeric-like version keys (e.g. 20240916 or 1.17) become single-quoted

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 20, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kokyhm
Once this PR has been reviewed and has the lgtm label, please assign floryut for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 20, 2024
@k8s-ci-robot
Contributor

Hi @kokyhm. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kokyhm kokyhm changed the title from "Add scripts and github actions for auto dependency bump" to "Auto dependency bump" Sep 20, 2024
@yankay
Member

yankay commented Sep 20, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 20, 2024
@VannTen
Contributor

VannTen commented Sep 20, 2024

First: this needs some comments, so that a reader can get a rough idea of what the code does without too much trouble.

I have a couple of comments at first glance:

  • Any reason not to build on the already existing tooling (script/download_hash.py in particular)?
  • It looks like you're doing one GraphQL request per component? The whole point of GraphQL is to be able to bundle things in one request, so it should be able to fetch multiple things at once. Bundling more things in one request also makes it less likely to hit GitHub's rate limits.
  • It looks like you're only getting the last release for a component. What we actually need is all the new patch versions for the minor releases listed in the checksums.
  • How fast is it? 🚀
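To illustrate the third point, here is a minimal sketch of selecting new patch versions for the minor releases that are already tracked. The function names and the flat version lists are assumptions made for illustration, not code from this PR; "new" is taken to mean a patch number higher than the highest one already known for that minor line.

```python
import re

# Hypothetical helpers; names and data shapes are assumptions for illustration.
STABLE = re.compile(r'^v?(\d+)\.(\d+)\.(\d+)$')  # rejects suffixes such as -rc.1


def parse(version):
    """Return (major, minor, patch) for a stable version string, else None."""
    match = STABLE.match(version)
    return tuple(int(part) for part in match.groups()) if match else None


def new_patch_versions(known, available):
    """Return versions from `available` that are newer patches of a minor
    line already present in `known` (for example, the checksum keys)."""
    newest = {}  # (major, minor) -> highest known patch number
    for version in known:
        parsed = parse(version)
        if parsed:
            line = parsed[:2]
            newest[line] = max(newest.get(line, -1), parsed[2])
    result = []
    for version in available:
        parsed = parse(version)
        if parsed and parsed[:2] in newest and parsed[2] > newest[parsed[:2]]:
            result.append(version)
    return sorted(result, key=parse)
```

A new minor line (one that has no entry in the known versions) is deliberately ignored here, since the comment asks only for patch releases of minors that are already listed.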

@kokyhm
Contributor Author

kokyhm commented Sep 25, 2024

Thank you @VannTen for the comments.
I have not built on download.py since it does not handle all components.
However, I did borrow some (+3) findings from there! 👍

I hope my refactor addresses the comments you have provided.

Key changes:

  • added comments
  • use only one graphql request
  • added support for patch versions
  • use explicit sha regex to calculate sha
  • make code more readable
  • make workflow actions use run-name to give a clear view of which action does what
  • use one branch per component for create-pull-request; the branch keeps being updated until it is merged
  • introduce constants
  • many other improvements that make the code more readable and less error-prone
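As an illustration of the "explicit sha regex" item, here is a minimal sketch of pulling a SHA-256 digest out of checksum-file text. The function name, the pattern, and the assumed file layout (digest followed by a file name on each line) are assumptions for illustration, not the code from this PR.

```python
import re

# Assumption: a SHA-256 digest is written as exactly 64 lowercase hex characters.
SHA256_RE = re.compile(r'\b[a-f0-9]{64}\b')


def extract_sha256(checksum_text, filename=None):
    """Return the first SHA-256 digest found in `checksum_text`.

    If `filename` is given, only lines that mention it are considered.
    Returns None when no digest is found.
    """
    for line in checksum_text.splitlines():
        if filename and filename not in line:
            continue
        match = SHA256_RE.search(line)
        if match:
            return match.group(0)
    return None
```

Anchoring the pattern to exactly 64 hex characters means that a longer digest (for example SHA-512) on the same line is not silently truncated into a wrong value.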

Speed:

  • it varies between local runs and CI
  • CI: less than a minute (the download is a matter of seconds)
  • locally: e.g. 4 workers, 2 Mbps uplink, ~20 minutes (kata-containers alone is 700 MB)

Thanks again and looking forward to your comments.

@VannTen
Contributor

VannTen commented Sep 27, 2024 via email

@kokyhm
Contributor Author

kokyhm commented Sep 28, 2024

Let's recap:

  • the download is there for a reason: I have not found another way to get the sha.
  • it solves the ticket.
  • dependabot for pip is running on GH Actions
  • the script handles one component at a time, which means GitLab integration is a matter of a few lines of code

One thing I have found: Kubespray versions should not be hardcoded...
But that would be part of another PR...

@kokyhm
Contributor Author

kokyhm commented Sep 29, 2024

I do not get "free to look at it if you want". What does it mean?
I have taken your suggestion not to make one GraphQL request per component.
Nice work with GraphQL. I am happy to see different approaches, but does it solve the issue?
I see not having the latest versions/shas as a big problem.
Depending on local runs would be the worst-case scenario, but it still does the job: it takes 20 minutes, and if you have slow internet you can set max_workers to 1 and it will take an hour. How would you handle the case where there is no sha file and you must download the artifact and calculate the sha yourself?
I don't see a use case there.
In CI, if it runs every day and the PRs are handled, it would take 30 minutes per month at most, and even that is said too much.

@VannTen
Contributor

VannTen commented Sep 30, 2024 via email

@kokyhm
Contributor Author

kokyhm commented Sep 30, 2024

Thank you for the comments. I was rushing it, sorry about that.

Contributor

@VannTen VannTen left a comment

First, some preamble:
This should really be split into several parts (at least 2):

  • script to download
  • automation to run the script.

(I can't review everything in detail, there is really too much, but I've left some general comments.)

@@ -0,0 +1,221 @@
ARCHITECTURES = ['arm', 'arm64', 'amd64', 'ppc64le']

I'd rather source this from the data (i.e. take the keys already present in the existing checksums).
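A minimal sketch of that suggestion, assuming the checksums are loaded into a mapping shaped like {"<component>_checksums": {"<arch>": {"<version>": "<sha>"}}}. That layout and the function name are assumptions for illustration; entries whose values are not nested mappings (for example, checksums keyed directly by version) are skipped.

```python
def architectures_from_checksums(checksums):
    """Collect architecture names from the keys of the existing checksum data.

    Assumed layout: {"<component>_checksums": {"<arch>": {"<version>": "<sha>"}}}.
    """
    archs = set()
    for name, per_arch in checksums.items():
        if not name.endswith('_checksums') or not isinstance(per_arch, dict):
            continue
        for arch, per_version in per_arch.items():
            # Only count keys that map to a nested version -> checksum mapping.
            if isinstance(per_version, dict):
                archs.add(arch)
    return sorted(archs)
```

With this approach a newly supported architecture only has to be added to the data, and the script picks it up without a code change.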

    version = tag.get('name', '')
    if stable_version_pattern.match(version):
        patch_versions.append(version)
patch_versions.sort(key=lambda v: list(map(int, re.findall(r'\d+', v)))) # sort for checksum update

It's not clear what you mean by "sort for checksum update".
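For context on what the sort key in the quoted line does, here is a small sketch comparing it with plain string sorting. The version list is made up for illustration; the key function mirrors the lambda in the diff.

```python
import re


def version_key(version):
    """Numeric sort key: 'v1.10.0' -> [1, 10, 0]."""
    return [int(number) for number in re.findall(r'\d+', version)]


versions = ['v1.9.0', 'v1.10.0', 'v1.2.3']

# Plain string sorting compares character by character, so '10' sorts before '2'.
lexicographic = sorted(versions)

# Sorting on the extracted integers gives the order a reader expects.
numeric = sorted(versions, key=version_key)
```

Here `lexicographic` is `['v1.10.0', 'v1.2.3', 'v1.9.0']` while `numeric` is `['v1.2.3', 'v1.9.0', 'v1.10.0']`, which is presumably the ordering wanted when writing versions back into the checksums.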

}}
}}
""")


On the query:
First, you should really put it in a separate file, so that it can be edited and run separately.
Secondly, you should use GraphQL variables instead of templating the query and joining the pieces together.
For instance, something like this:

query($repoWithReleases: [ID!]!, $repoWithTags: [ID!]!) {
  with_releases: nodes(ids: $repoWithReleases) {

    ... on Repository {
      nameWithOwner
      releases(first: 100) {
        nodes {
          tagName
          isPrerelease
          releaseAssets {
            totalCount
          }
        }
      }
    }
  }

  with_tags: nodes(ids: $repoWithTags) {

    ... on Repository {
      nameWithOwner
      refs(refPrefix: "refs/tags/", last: 100) {
        nodes {
          name
        }
      }
    }
  }
}
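A minimal sketch of how such a query could be sent with variables rather than string templating. It uses only the standard library; the function name and the token handling are assumptions for illustration, and the ID lists are expected to hold GitHub global node IDs of repositories. The request is only built here, not sent.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = 'https://api.github.com/graphql'


def build_request(query_text, repo_with_releases, repo_with_tags, token):
    """Build a POST request carrying the query text and its variables.

    The variable names match the $repoWithReleases / $repoWithTags
    parameters declared in the query above.
    """
    payload = {
        'query': query_text,
        'variables': {
            'repoWithReleases': repo_with_releases,
            'repoWithTags': repo_with_tags,
        },
    }
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=json.dumps(payload).encode('utf-8'),
        headers={
            'Authorization': f'bearer {token}',
            'Content-Type': 'application/json',
        },
        method='POST',
    )
```

The query text can then be read from its own file and passed in unchanged, and the request can be sent with `urllib.request.urlopen(request)`; the response body is JSON with a top-level `data` key.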

Comment on lines +200 to +209
cache_file = f'{component}-{arch}-{version}'
if os.path.exists(f'cache/{cache_file}'):
    logging.info(f'Using cached file for {url_download}')
    return calculate_checksum(cache_file, sha_regex)
try:
    response = session.get(url_download, timeout=10)
    response.raise_for_status()
    with open(f'cache/{cache_file}', 'wb') as f:
        f.write(response.content)
    logging.info(f'Downloaded and cached file for {url_download}')

I don't think we want to do caching (at least at the start). It adds some footguns, and I'd rather have more predictable output across multiple "runners" (CI, locally, etc.).
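One cache-free alternative, sketched under assumptions: hash the download as a stream, so that nothing is written to disk and every runner computes the digest from the same bytes. The function name and chunk size are illustrative and not taken from this PR.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a binary file-like object,
    reading it in chunks so that large artifacts are never held in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()


# Usage with an in-memory stream standing in for a download.
example_digest = sha256_of_stream(io.BytesIO(b'abc'))
```

Any object with a binary `read(size)` method works, so the same function could be fed a streamed HTTP response body instead of the in-memory buffer used here.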

Comment on lines +310 to +316
def update_readme(component, version):
    for i, line in enumerate(readme_data):
        if component in line and re.search(r'v\d+\.\d+\.\d+', line):
            readme_data[i] = re.sub(r'v\d+\.\d+\.\d+', version, line)
            logging.info(f'Updated {component} to {version} in README')
            break
    return readme_data

We should first focus on the checksums, the README is another story.

@kokyhm
Contributor Author

kokyhm commented Dec 16, 2024

@VannTen Thank you very much for your time and comments. (Un)fortunately, I have been very busy over the last few months, which is why I did not participate in other tasks. I have only been checking the POC from time to time to see how it works (and found some minor bugs). I will address your comments at the start of next year. Wish you all the best.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Auto dependency dump like dependabot
4 participants