
Prepare to push v0.8.0 to the Operator Hub #287

Merged
merged 15 commits into confidential-containers:main on Nov 16, 2023

Conversation

wainersm
Member

In order to publish our releases on the Operator Hub, the bundle directory needs to be re-generated. The last time we did that was for release 0.5.0, using an older operator SDK. The Operator Hub now relies on a newer operator SDK (1.30), which requires fields that were not present when we last published, so I had to update some files in this repository.

I re-generated the bundle and opened a pull request at k8s-operatorhub/community-operators#3532. All CI jobs passed, but it would be nice to have a careful review of the files to ensure that everything is as expected.

There are some warnings generated by the operator SDK that I did not address:

  • WARN[0000] Warning: Value : (cc-operator.v0.8.0) csv.Spec.minKubeVersion is not informed. It is recommended you provide this information. Otherwise, it would mean that your operator project can be distributed and installed in any cluster version available, which is not necessarily the case for all projects.
  • WARN[0000] Warning: Value cc-operator.v0.8.0: this bundle is using APIs which were deprecated and removed in v1.25. More info: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-25. Migrate the API(s) for runtimeclasses: (["ClusterServiceVersion.Spec.InstallStrategy.StrategySpec.ClusterPermissions[0].Rules[6]"])
  • WARN[0000] Warning: Value cc-operator.v0.8.0: unable to find the resource requests for the container: (kube-rbac-proxy). It is recommended to ensure the resource request for CPU and Memory. Be aware that for some clusters configurations it is required to specify requests or limits for those values. Otherwise, the system or quota may reject Pod creation. More info: https://master.sdk.operatorframework.io/docs/best-practices/managing-resources/

Please advise on what values I should use to silence the warnings completely.

The operator-sdk will be installed if not found in $PATH or the ./bin
directory, just like the other build tools. Because we handle its
installation in the Makefile, it no longer needs to be installed by the
tests' Ansible playbook.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
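For illustration, a minimal sketch of the install-if-missing pattern described above, close to the boilerplate that operator-sdk projects commonly use (the version, paths, and download URL here are assumptions, not copied from this PR):

```makefile
# Sketch only: install operator-sdk into ./bin unless it is already on $PATH or in ./bin.
OPERATOR_SDK_VERSION ?= v1.30.0
LOCALBIN ?= $(shell pwd)/bin
OPERATOR_SDK ?= $(LOCALBIN)/operator-sdk

.PHONY: operator-sdk
operator-sdk: ## Download operator-sdk locally if necessary.
ifeq (,$(wildcard $(OPERATOR_SDK)))
ifeq (,$(shell which operator-sdk 2>/dev/null))
	mkdir -p $(LOCALBIN)
	curl -sSLo $(OPERATOR_SDK) https://github.com/operator-framework/operator-sdk/releases/download/$(OPERATOR_SDK_VERSION)/operator-sdk_$(shell go env GOOS)_$(shell go env GOARCH)
	chmod +x $(OPERATOR_SDK)
else
# A copy already exists on $PATH; reuse it instead of downloading.
OPERATOR_SDK = $(shell which operator-sdk)
endif
endif
```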
Added the 'Operator Hub' section to the `make help` output, and moved the
'opm' target under 'Build Dependencies'.

```
$ make help

[snip]

Build Dependencies
  kustomize        Download kustomize locally if necessary.
  controller-gen   Download controller-gen locally if necessary.
  envtest          Download envtest-setup locally if necessary.
  operator-sdk     Download operator-sdk locally if necessary.
  opm              Download opm locally if necessary.

Operator Hub
  bundle           Generate bundle manifests and metadata, then validate generated files.
  bundle-build     Build the bundle image.
  bundle-push      Push the bundle image.
  catalog-build    Build a catalog image.
  catalog-push     Push a catalog image.
```

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Using operator-sdk v1.30.0, which is the same version used by the
Operator Hub CI.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Updated the config/manifests kustomize configuration to reflect the
following:

 * The config/samples were split into ccruntime and enclave-cc in
commit ee48a38, with config/samples/ccruntime/default being the
default configuration for amd64.

 * The config/release directory was created in commit 1e068ff. It
kustomizes a stable (release) image on top of config/default (see
the sketch after this commit message).

Fixes confidential-containers#238
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
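As a rough illustration of the layering in the second bullet, config/release presumably looks something like this (the file contents below are an assumption based on the commit description, not copied from the repository):

```yaml
# Hypothetical config/release/kustomization.yaml: reuse config/default,
# but pin the released (stable) operator image.
resources:
- ../default
images:
- name: controller
  newName: quay.io/confidential-containers/operator
  newTag: v0.8.0
```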
Just like on the Operator Hub, let's run the full operator framework
validation (i.e. include the optional checks).

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
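For reference, the optional operator framework checks are selected via operator-sdk's `--select-optional` flag; a likely form of the command (an assumption; the exact Makefile invocation may differ):

```sh
operator-sdk bundle validate ./bundle --select-optional suite=operatorframework
```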
Full names and a valid email are required to pass the Operator Hub CI.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
The operator-sdk validation was failing with the error:
```
ERRO[0000] Error: Value : (cc-operator.v0.0.1) csv.Spec.Icon elements should contain both data and mediatype
```

Besides adding the icon media type, this adds the icon data in base64.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
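The fixed CSV icon entry presumably takes the standard shape below (the media type and payload are placeholders, not the actual values):

```yaml
spec:
  icon:
  - base64data: <base64-encoded image data>
    mediatype: image/png
```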
Fixed the following error on the Operator Hub CI:
```
An empty value found - 'metadata.annotations.categories'
```

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
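A minimal sketch of the implied fix in the CSV metadata ("Security" is only an illustrative value, not necessarily the category chosen here):

```yaml
metadata:
  annotations:
    categories: Security
```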
Updated the operator hub bundle for the 0.8.0 release.

First I bumped VERSION in the Makefile, then generated the new bundle with:
```
$ make bundle IMG=quay.io/confidential-containers/operator:v0.8.0
```

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
On the Operator Hub we are configured so that a new version replaces the
previous one; in this case v0.8.0 should replace v0.5.0.

First I added the `replaces` property to bundle/manifests/cc-operator.clusterserviceversion.yaml,
then re-generated the manifests with:
```
make bundle IMG=quay.io/confidential-containers/operator:v0.8.0
```

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
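The `replaces` entry presumably looks like this, assuming the v0.5.0 CSV name follows the same cc-operator.vX.Y.Z pattern seen in the warnings above:

```yaml
spec:
  replaces: cc-operator.v0.5.0
```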
Fixed an error found on the Operator Hub CI:
```
Value of 'metadata.annotations.containerImage' not found, please check.
```

Apparently the operator SDK cannot guess the operator image from the
information provided. Manifests regenerated with:
```
make bundle IMG=quay.io/confidential-containers/operator:v0.8.0
```

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
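A plausible shape of the resulting annotation, reusing the image passed via IMG above (the exact placement in the CSV is an assumption):

```yaml
metadata:
  annotations:
    containerImage: quay.io/confidential-containers/operator:v0.8.0
```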
Created the OPERATOR_HUB.md with instructions to publish a new version
of the operator.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
@portersrc
Member

portersrc commented Nov 14, 2023

Hi @wainersm, Thanks for this PR. I looked through the code and didn't catch anything. Regarding the warnings, some thoughts:

  • Minimum kube version should be based off our test environments, I would think. I cannot speak to that, but our quickstart says v1.24.
  • This one feels risky to tweak at this point. I think the warning is coming from the node.k8s.io lines here, which I believe gets its API version from here. I'm basing this off of the deprecation guide for v1.25 for RuntimeClass. It suggests node.k8s.io/v1beta1 should be changed to node.k8s.io/v1, which has been "available since v1.20". Assuming we/users/tests are >=1.24, that's sort of OK (because at least they can invoke it), but the problem is I have no clue how differently those APIs behave. I would leave this warning until we can test this update to the operator.
  • I do not know what sensible values would be for this, but a hint is from this openshift guide (though the context is different). For comparison, the manager takes these requests and limits.

@wainersm
Member Author

Hi @portersrc thanks for the review and suggestions! Some comments below:

> Hi @wainersm, Thanks for this PR. I looked through the code and didn't catch anything. Regarding the warnings, some thoughts:
>
> * Minimum kube version should be based off our test environments, I would think. I cannot speak to that, but our quickstart says v1.24.

I set minKubeVersion as below:

```diff
diff --git a/config/manifests/bases/cc-operator.clusterserviceversion.yaml b/config/manifests/bases/cc-operator.clusterserviceversion.yaml
index 404cb13..a5a4c4e 100644
--- a/config/manifests/bases/cc-operator.clusterserviceversion.yaml
+++ b/config/manifests/bases/cc-operator.clusterserviceversion.yaml
@@ -52,6 +52,7 @@ spec:
   - email: [email protected]
     name: Pradipta Banerjee
   maturity: alpha
+  minKubeVersion: "1.24"
   provider:
     name: Confidential Containers Community
     url: https://github.com/confidential-containers
```

Then I regenerated the operator and updated k8s-operatorhub/community-operators#3532. However, CI fails... the failure is not clear to me, though; I may be passing the wrong version (e.g. "v1.24" instead of "1.24" or "1.24.0").

> * This one feels risky to tweak at this point. I think the warning is coming from the node.k8s.io lines [here](https://github.com/k8s-operatorhub/community-operators/pull/3532/files#diff-fa1d2566738b128e57c07abe64ae1c9e1afac53262c6f99ef97a54bb7a7fae9bR333-R344), which I believe gets its API version from [here](https://github.com/k8s-operatorhub/community-operators/pull/3532/files#diff-fa1d2566738b128e57c07abe64ae1c9e1afac53262c6f99ef97a54bb7a7fae9bR267). I'm basing this off of the [deprecation guide for v1.25 for RuntimeClass](https://kubernetes.io/docs/reference/using-api/deprecation-guide/#runtimeclass-v125). It suggests `node.k8s.io/v1beta1` should be changed to `node.k8s.io/v1`, which has been "available since v1.20". Assuming we/users/tests are >=1.24, that's sort of OK (because at least they can invoke it), but the problem is I have no clue how differently those APIs behave. I would leave this warning until we can test this update to the operator.

> * I do not know what sensible values would be for this, but a hint is from this [openshift guide](https://docs.openshift.com/serverless/1.29/knative-serving/kube-rbac-proxy-serving.html) (though the context is different). For comparison, the manager takes [these requests and limits](https://github.com/k8s-operatorhub/community-operators/pull/3532/files#diff-fa1d2566738b128e57c07abe64ae1c9e1afac53262c6f99ef97a54bb7a7fae9bR413-R417).

I will open an issue to track fixing those other warnings, but I won't address them here because it is too risky at this point.

Suppress the warning:

```
WARN[0000] Warning: Value : (cc-operator.v0.8.0) csv.Spec.minKubeVersion is not informed. It is recommended you provide this information. Otherwise, it would mean that your operator project can be distributed and installed in any cluster version available, which is not necessarily the case for all projects.
```

Kubernetes 1.24 is the version used in the operator CI, as well as the
minimum recommended version in the CoCo quickstart guide.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
@wainersm
Member Author

> I set minKubeVersion as "1.24" [...] Then I regenerated the operator and updated k8s-operatorhub/community-operators#3532. However, CI fails... it is not clear to me whether I am passing the wrong version (e.g. "v1.24" instead of "1.24" or "1.24.0").

Set to "1.24.0" and now it passes the operator hub CI!

> I will open an issue to track fixing those other warnings, but I won't address them here because it is too risky at this point.

Opened the issues:

#289
#290

@wainersm
Member Author

Set minKubeVersion to "1.24.0". Passing the Operator Hub CI: k8s-operatorhub/community-operators#3532

Member

@stevenhorsman stevenhorsman left a comment


With the disclaimer that I'm far from operator SDK fluent, I have gone through all the commits and they seem to make sense and address the warnings mentioned. Thanks for the documentation too!

@wainersm
Member Author

> With the disclaimer that I'm far from operator SDK fluent, I have gone through all the commits and they seem to make sense and address the warnings mentioned. Thanks for the documentation too!

" I have gone through all the commits and they seem to make sense and address the warnings mentioned" <--- this makes you an expert! j/k Thanks once again @stevenhorsman !

@fidencio
Member

/test

@wainersm
Member Author

It seems that the changes broke the operator:

```
11:02:44 deployment.apps/cc-operator-controller-manager created
11:04:44 ERROR: cc-operator-controller-manager pod is not running
11:04:44 DEBUG: Pod cc-operator-controller-manager-67f7dff85b-m25ms
11:04:44 + kubectl describe pods/cc-operator-controller-manager-67f7dff85b-m25ms '-n confidential-containers-system'
11:04:44 Error from server (NotFound): namespaces " confidential-containers-system" not found
11:04:44 + true
11:04:44 + kubectl logs pods/cc-operator-controller-manager-67f7dff85b-m25ms '-n confidential-containers-system'
11:04:44 Error from server (NotFound): namespaces " confidential-containers-system" not found
```

I suspect it is this change: 570df87#diff-75dd57ab6a8429b6c9a269cff17b4227c474835c62479b76b6531536855a2210L12

@wainersm
Member Author

/test-kata-qemu

@wainersm
Member Author

/test-kata-qemu

@wainersm
Member Author

/test-kata-qemu

The following message is due to an error in the debug_pod() function:
```
16:31:31 DEBUG: Pod cc-operator-controller-manager-67f7dff85b-qfnm4
16:31:31 + kubectl describe pods/cc-operator-controller-manager-67f7dff85b-qfnm4 '-n confidential-containers-system'
16:31:31 Error from server (NotFound): namespaces " confidential-containers-system" not found
```

It's passing '-n confidential-containers-system' as a single argument
to kubectl rather than two (kubectl [...] '-n confidential-containers-system' vs. kubectl [...] -n confidential-containers-system). See the sketch after this commit message.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
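A minimal sketch of this class of shell quoting bug (the variable and pod names are hypothetical, not the actual script):

```sh
ns="confidential-containers-system"
pod="cc-operator-controller-manager-67f7dff85b-qfnm4"

kubectl describe "pods/$pod" "-n $ns"   # wrong: '-n <namespace>' is passed as one argument
kubectl describe "pods/$pod" -n "$ns"   # right: the flag and its value are separate arguments
```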
@wainersm
Copy link
Member Author

I found the problem. There are two issues in the CI scripts.
The first one is in a debug function; I just sent the fix. I won't fix the 2nd issue yet, so I can run CI to check that the first issue is properly solved.

/test-kata-qemu

@wainersm
Member Author

Now it prints the pod debug info correctly:

```
17:54:06 ERROR: cc-operator-controller-manager pod is not running
17:54:06 DEBUG: Pod cc-operator-controller-manager-67f7dff85b-q7s7z
17:54:06 + kubectl describe pods/cc-operator-controller-manager-67f7dff85b-q7s7z -n confidential-containers-system
17:54:06 Name:         cc-operator-controller-manager-67f7dff85b-q7s7z
17:54:06 Namespace:    confidential-containers-system
17:54:06 Priority:     0
17:54:06 Node:         ubuntu20-9abce0/10.0.0.12
17:54:06 Start Time:   Thu, 16 Nov 2023 20:52:05 +0000
17:54:06 Labels:       control-plane=controller-manager
17:54:06               pod-template-hash=67f7dff85b
17:54:06 Annotations:  <none>
17:54:06 Status:       Pending
17:54:06 IP:           10.244.0.4
17:54:06 IPs:
17:54:06   IP:           10.244.0.4
17:54:06 Controlled By:  ReplicaSet/cc-operator-controller-manager-67f7dff85b
17:54:06 Containers:
17:54:06   kube-rbac-proxy:
17:54:06     Container ID:  containerd://d1f95c0df0de5ab41400636d64354ae989c2a0a81837ef196c9b0f38c4a348cc
17:54:06     Image:         gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
17:54:06     Image ID:      gcr.io/kubebuilder/kube-rbac-proxy@sha256:d4883d7c622683b3319b5e6b3a7edfbf2594c18060131a8bf64504805f875522
17:54:06     Port:          8443/TCP
17:54:06     Host Port:     0/TCP
17:54:06     Args:
17:54:06       --secure-listen-address=0.0.0.0:8443
17:54:06       --upstream=http://127.0.0.1:8080/
17:54:06       --logtostderr=true
17:54:06       --v=10
17:54:06     State:          Running
17:54:06       Started:      Thu, 16 Nov 2023 20:52:08 +0000
17:54:06     Ready:          True
17:54:06     Restart Count:  0
17:54:06     Environment:    <none>
17:54:06     Mounts:
17:54:06       /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7mtsg (ro)
17:54:06   manager:
17:54:06     Container ID:  
17:54:06     Image:         localhost:5000/cc-operator:v0.8.0
17:54:06     Image ID:      
17:54:06     Port:          <none>
17:54:06     Host Port:     <none>
17:54:06     Command:
17:54:06       /manager
17:54:06     Args:
17:54:06       --health-probe-bind-address=:8081
17:54:06       --metrics-bind-address=127.0.0.1:8080
17:54:06       --leader-elect
17:54:06     State:          Waiting
17:54:06       Reason:       ErrImagePull
17:54:06     Ready:          False
17:54:06     Restart Count:  0
17:54:06     Limits:
17:54:06       cpu:     200m
17:54:06       memory:  100Mi
17:54:06     Requests:
17:54:06       cpu:      100m
17:54:06       memory:   20Mi
17:54:06     Liveness:   http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
17:54:06     Readiness:  http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
17:54:06     Environment:
17:54:06       CCRUNTIME_NAMESPACE:  confidential-containers-system (v1:metadata.namespace)
17:54:06     Mounts:
17:54:06       /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7mtsg (ro)
17:54:06 Conditions:
17:54:06   Type              Status
17:54:06   Initialized       True 
17:54:06   Ready             False 
17:54:06   ContainersReady   False 
17:54:06   PodScheduled      True 
17:54:06 Volumes:
17:54:06   kube-api-access-7mtsg:
17:54:06     Type:                    Projected (a volume that contains injected data from multiple sources)
17:54:06     TokenExpirationSeconds:  3607
17:54:06     ConfigMapName:           kube-root-ca.crt
17:54:06     ConfigMapOptional:       <nil>
17:54:06     DownwardAPI:             true
17:54:06 QoS Class:                   Burstable
17:54:06 Node-Selectors:              <none>
17:54:06 Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
17:54:06                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
17:54:06 Events:
17:54:06   Type     Reason     Age                 From               Message
17:54:06   ----     ------     ----                ----               -------
17:54:06   Normal   Scheduled  2m1s                default-scheduler  Successfully assigned confidential-containers-system/cc-operator-controller-manager-67f7dff85b-q7s7z to ubuntu20-9abce0
17:54:06   Normal   Pulling    2m                  kubelet            Pulling image "gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1"
17:54:06   Normal   Pulled     119s                kubelet            Successfully pulled image "gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1" in 1.60801282s
17:54:06   Normal   Created    119s                kubelet            Created container kube-rbac-proxy
17:54:06   Normal   Started    118s                kubelet            Started container kube-rbac-proxy
17:54:06   Normal   Pulling    75s (x3 over 118s)  kubelet            Pulling image "localhost:5000/cc-operator:v0.8.0"
17:54:06   Warning  Failed     75s (x3 over 118s)  kubelet            Failed to pull image "localhost:5000/cc-operator:v0.8.0": rpc error: code = NotFound desc = failed to pull and unpack image "localhost:5000/cc-operator:v0.8.0": failed to resolve reference "localhost:5000/cc-operator:v0.8.0": localhost:5000/cc-operator:v0.8.0: not found
17:54:06   Warning  Failed     75s (x3 over 118s)  kubelet            Error: ErrImagePull
17:54:06   Normal   BackOff    40s (x6 over 118s)  kubelet            Back-off pulling image "localhost:5000/cc-operator:v0.8.0"
17:54:06   Warning  Failed     40s (x6 over 118s)  kubelet            Error: ImagePullBackOff
17:54:06 + kubectl logs pods/cc-operator-controller-manager-67f7dff85b-q7s7z -n confidential-containers-system
17:54:06 Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, manager
17:54:06 W1116 20:52:08.064478       1 main.go:165] 
17:54:06 ==== Deprecation Warning ======================
17:54:06 
17:54:06 Insecure listen address will be removed.
17:54:06 Using --insecure-listen-address won't be possible!
17:54:06 
17:54:06 The ability to run kube-rbac-proxy without TLS certificates will be removed.
17:54:06 Not using --tls-cert-file and --tls-private-key-file won't be possible!
17:54:06 
17:54:06 For more information, please go to https://github.com/brancz/kube-rbac-proxy/issues/187
17:54:06 
17:54:06 ===============================================
17:54:06 
17:54:06 		
17:54:06 I1116 20:52:08.064845       1 main.go:218] Valid token audiences: 
17:54:06 I1116 20:52:08.064936       1 main.go:344] Generating self signed cert as no cert is provided
17:54:06 I1116 20:52:09.041159       1 main.go:394] Starting TCP socket on 0.0.0.0:8443
17:54:06 I1116 20:52:09.041529       1 main.go:401] Listening securely on 0.0.0.0:8443
17:54:06 + set +x
17:54:07 Build step 'Execute shell' marked build as failure
17:54:07 Setting status of 02ed4c9db17428a2d8985e99b7d04a54c5ebe97e to FAILURE with url http://jenkins.katacontainers.io/job/confidential-containers-operator-main-ubuntu-20.04-x86_64-containerd_kata-qemu-PR/357/ and message: 'Build finished. '
17:54:07 Using context: tests-e2e-ubuntu-20.04-x86_64-containerd_kata-qemu
17:54:08 Finished: FAILURE
```

The operator image is built and then published to a local registry; in
config/manager/kustomization.yaml its name is changed to localhost:5000/cc-operator
so that it is pulled from that registry at install time. The image tag isn't
set, though, which means whatever `newTag` value is in the kustomization
file gets used. Everything works fine when `newTag` is `latest`, but
when it is set to a release tag (e.g. v0.8.0) the installation will look for
an image like localhost:5000/cc-operator:v0.8.0, whereas the image was
pushed to the local registry as localhost:5000/cc-operator:latest. This
change ensures that `newTag` is always set to `latest`, matching the
name and tag of the image pushed to the local registry.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
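For illustration, the images override presumably ends up like this in the CI flow (a sketch assuming the manager image is named `controller`, as in standard kubebuilder layouts):

```yaml
# config/manager/kustomization.yaml (hypothetical excerpt)
images:
- name: controller
  newName: localhost:5000/cc-operator
  newTag: latest  # pinned so installs match the tag pushed to the local registry
```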
@wainersm
Member Author

The last commit I added to this PR should make the CI green.

/test

Member

@fidencio fidencio left a comment


lgtm, thanks @wainersm!

@fidencio fidencio merged commit a27d7f3 into confidential-containers:main Nov 16, 2023
13 checks passed