Account for missing ACR images for build that only affects a portion of a shared tag #1511

Open
mthalman opened this issue Dec 4, 2024 · 3 comments

Comments

@mthalman
Member

mthalman commented Dec 4, 2024

The publishManifest command can fail in the following scenario.

Conditions:

  • A .NET version has become EOL and its associated images have been marked with EOL annotations.
  • Dockerfiles for that version have not yet been deleted from the repo.
  • Cleanup pipeline has deleted those EOL images from the ACR.
  • A base image of one of those Dockerfiles (e.g. Debian Bullseye) is updated, but only for a subset of the total set of architectures we have Dockerfiles for (e.g. amd64).
  • AutoBuilder runs a build to rebuild the affected Dockerfiles.

With these conditions, the publishManifest command will fail in the publish stage. This occurs because, when creating the manifest list for the shared tag (e.g. 6.0-bullseye), it needs to query the ACR to get the digests of all the images associated with that tag. Since the pipeline rebuilt only the affected architectures, and the cleanup pipeline had already deleted the older images, the ACR contains no image for the architectures that were not rebuilt. The query for those images' digests therefore fails.
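One way to surface this failure mode early is a pre-flight check that lists which per-architecture images are absent from the registry before the manifest list is created. The following is a minimal sketch, not ImageBuilder's actual logic: `find_missing_images` and `IMAGE_PROBE` are hypothetical names, and it assumes `docker manifest inspect` exits non-zero when an image cannot be resolved (an authentication failure would look the same).

```shell
# Hypothetical pre-flight check; not part of ImageBuilder.
# IMAGE_PROBE is the command used to test whether an image exists.
# It is an assumption of this sketch and can be overridden for offline testing.
IMAGE_PROBE="${IMAGE_PROBE:-docker manifest inspect}"

# Prints each image reference passed as an argument that the probe cannot resolve.
# The body runs in a subshell so the loop variable does not leak to the caller.
find_missing_images() (
  for ref in "$@"; do
    # Left unquoted on purpose so a multi-word probe command splits into words.
    if ! $IMAGE_PROBE "$ref" >/dev/null 2>&1; then
      printf '%s\n' "$ref"
    fi
  done
)
```

When called with the platform-specific tags that make up a shared tag, any reference it prints is one whose digest lookup would be expected to fail during publishManifest.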

The command we currently use to create the manifest list is exemplified by the following:

docker manifest create --amend \
  dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0-alpine \
  dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0.36-alpine3.20-amd64 \
  dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0.36-alpine3.20-arm32v7 \
  dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0.36-alpine3.20-arm64v8

We could investigate whether docker buildx imagetools create --append could be used here instead.
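As a starting point for that investigation, the sketch below only assembles the command line for a partial rebuild so it can be reviewed before anything is pushed. `append_command` is a hypothetical helper name. It assumes that `--append` adds the listed source images to the manifest list already published under the shared tag, so only the rebuilt architectures need to be passed. Whether an existing entry for the same platform is replaced rather than duplicated is exactly what would need to be verified.

```shell
# Hypothetical helper: prints (does not run) an imagetools command for a partial rebuild.
# $1 is the shared tag; the remaining arguments are the images that were rebuilt.
append_command() (
  shared_tag="$1"
  shift
  printf 'docker buildx imagetools create --append --tag %s' "$shared_tag"
  for ref in "$@"; do
    printf ' %s' "$ref"
  done
  printf '\n'
)

# Example: only amd64 was rebuilt.
# append_command \
#   dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0-alpine \
#   dotnetdocker.azurecr.io/public/dotnet/runtime-deps:6.0.36-alpine3.20-amd64
```

Adding `--dry-run` to the printed command should show the resulting image instead of pushing it, which makes it possible to inspect the outcome without touching the registry.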

Another lower-cost solution would be to extend the time period we wait before deleting EOL images from the ACR. If it were extended past 1 month, to cover the 1-month grace period after the .NET version's EOL date, we could avoid this failure.

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@lbussell
Contributor

lbussell commented Dec 9, 2024

[Triage] A good first step here is to investigate whether docker buildx imagetools create --append would be a simple solution to this issue. Whoever works on #1510 should investigate this at the same time.

Projects
Status: Backlog
Development

No branches or pull requests

2 participants