Kubeadm join not connected to specified apiservice #3079
the machines must have network connectivity between each other. please check the support link the bot will share below. /support
Hello, @limylily 🤖 👋 You seem to have troubles using Kubernetes and kubeadm. Please see:
Hello, I do not think this is a networking issue, but rather a security one; since an error is raised there, it could even be considered a security feature in this scenario, where it refuses to continue. If it is not, then I think it simply requires a bit of effort going over the cobra parameters, the overrides, and the overall kubeadm configuration logic: whether the cluster-info config, the kubeconfig, or the kubelet admin conf values are read from a file, from ConfigMaps served by the control plane endpoints, or from somewhere else. Unless the presence of 127.0.0.1 in the apiServer certSANs of the initial kubeadm configuration has something to do with it, blocking the init phases even though they complete (I think this is a conversation for another time, though, or even a subject for another ticket).

Regarding the problem mentioned by the author of this ticket, I have experienced the very same issue, almost exactly. My setup: a Lima VM Kubernetes control plane successfully provisioned with kubeadm.k8s.io/v1beta4; I run only this one control plane (as of now), hosted by a Lima VM guest OS with the network type configured as user-v2. That network configuration ensures connectivity among nodes, and indeed I can easily communicate (via ICMP and otherwise) with the other node that I am trying to join as a worker, or as another control plane if needed (mostly running the kubeadm join command and attempting to have this worker connect to kube-apiserver by calling the proper IP while looking up the next ConfigMap from the proper endpoint). I run an Ubuntu 24.04 image powered by vmType: vz (Apple Virtualization.framework) rather than QEMU, which I usually go with, but everything seems to work very well with "vz". I also run everything with cgroup2fs based on non-tmpfs systemd entries; containerd is used as the CRI (and even as the client), although a Docker daemon is also installed on the Lima VM.
Flannel is the CNI and it also provides KubeDNS. I can access kube-api from my host OS without an issue as well. I simply try to run kubeadm join with a token and a sha256 hash, but it only gets to a certain point, failing on the second ConfigMap (roundtripper.go). The problem I noticed: after issuing the kubeadm join command, the init phases start and TLS is validated against the pinned public key during the first requests to fetch data from the ConfigMap. It gets that data, but then, after the first calls finish, it sets "server" (which is https://127.0.0.1:8443) based on the response from request.go and on that entry in the cluster-info ConfigMap in the kube-public namespace. It seems that after the first calls through request.go it sets that config rather than updating it. When the worker node tries to finish the init phases and the client starts fetching more data, this time from the kubeadm-config ConfigMap in the kube-system namespace of the control plane, the roundtripper looks for that ConfigMap again and again and eventually fails with an ugly stack trace and "failed to get config map"; it certainly cannot reach https://127.0.0.1:6443. The reason I found is that the cluster-info ConfigMap in the kube-public namespace contains a key whose value is server: https://127.0.0.1:6443, while I am trying to join a worker at 192.168.104.6 to 192.168.104.5 (server https://192.168.104.5:6443). It gets the server data from cluster-info first and "swaps" the IP address originally provided on the CLI with the kubeadm join command.
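The two ConfigMaps discussed above can be inspected directly to compare the recorded server addresses; a minimal sketch, assuming kubectl access to the control plane (the namespaces and ConfigMap names are the ones named in this thread):

```shell
# The kubeconfig embedded in the cluster-info ConfigMap; kubeadm join
# reads its "server:" field during the discovery phase.
kubectl -n kube-public get configmap cluster-info \
  -o jsonpath='{.data.kubeconfig}' | grep 'server:'

# The endpoint recorded at init time in the kubeadm-config ConfigMap.
kubectl -n kube-system get configmap kubeadm-config \
  -o jsonpath='{.data.ClusterConfiguration}' | grep controlPlaneEndpoint
```

If the first command prints a loopback or node-local address, a joining node will try to reach that address after discovery, which matches the failure described above.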
client.go and the roundtripper repeat as per the code logic and eventually hit the retry limit, logging a stack trace and a misleading message about the ConfigMap, this time kubeadm-config, which it looks for at the wrong IP address even though I went with
My Cluster
@neolit123 I would like to work on this issue, since I have already written code for it (mainly in kubeconfig.go for now) and I have been involved in debugging this for quite some time. I can confirm the issue, and I would like to become a full part of the kubeadm maintainers.
that's how kubeadm works. you can find more about it by checking the source code under cmd/kubeadm/app/discovery. to fix that you cannot manually edit the cluster-info CM, because it uses JWS signatures for a given token over the kubeconfig in the CM. instead you can do this:
and then delete/create tokens with
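The exact commands were not preserved in this thread; a hedged sketch of the usual kubeadm bootstrap-token workflow (the token ID below is a placeholder):

```shell
# List existing bootstrap tokens and their TTLs.
kubeadm token list

# Delete a stale token (replace with a real token ID from the list).
kubeadm token delete abcdef.0123456789abcdef

# Create a fresh token; --print-join-command also prints a full
# "kubeadm join" invocation with the matching CA cert hash.
kubeadm token create --print-join-command
```

Because the cluster-info kubeconfig is JWS-signed per token, recreating the token regenerates a consistent signature rather than requiring a manual edit of the ConfigMap.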
there is no bug to fix here. this is misconfiguration during "init". also please note we don't provide support on github anymore:
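A hedged illustration of the "misconfiguration during init" point: if no shared endpoint is given at init time, kubeadm records the advertise address of the first node in cluster-info, and later joins discover that address. A sketch with a placeholder VIP:

```shell
# Initialize with a stable shared endpoint (a VIP or DNS name reachable
# from every node), so cluster-info advertises that endpoint instead of
# the first node's own address. <VIP> is a placeholder.
kubeadm init \
  --control-plane-endpoint "<VIP>:6443" \
  --upload-certs
```

With --control-plane-endpoint set, additional control planes and workers can join via the shared endpoint rather than a node-local address.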
@neolit123 Thanks for your answer and the tip regarding support. The thing is, I provided the proper IP address during the initial configuration, and yet it changes during the lookup for the second ConfigMap on the control plane, after kubeadm join reads the body of the first response (which contains that 127.0.0.1). I have been into the Kubernetes code, working on the GetClusterFromKubeConfig method in cmd/kubeadm/app/util/kubeconfig/kubeconfig.go, but that is stashed for now since there is something there I want to test further with local changes, a local setup, and more scenarios, so I am aiming to test my own build of kubeadm with some changes there. Regarding the issue: it is possible that I initially created the control plane with the proper IP and it was running fine, but then I had to do a reset and init again with the proper args to get this control plane's kube-api working. Perhaps something got weird in between the initial start, the reset to get the data for the joiner, or another init. Anyway, I am going to dig deeper into this by recreating the environment with a clean setup, so I can also be sure the Lima template is properly written (I plan to create a kubeadm-related PR there for Debian, or update one if it is already committed; I work with the newest v4 as mentioned in my previous comment). Coming back to kubeadm: since I am working on a Lima template for k8s anyway, I will check this on other machines, but something tells me there may be a corner case here. I suspect it may have to do with the fact that I initially created a control plane on the node that is now supposed to join another control plane, because back then I used kubeadm init with defaults, but then I reset and removed the manifests, the /var/lib data, and so on.
So I plan on using another node without any config present, for either a control-plane init or a worker join, to see what the minimal number of steps is, since I plan to have clear lima-vm steps for that in the manifest too (aiming at QEMU due to its generic nature, to include Windows usability in general, so perhaps I will need to create other VMs anyway). After I get the results and have a clear comparison between setups with clean steps, I will join the next kubeadm office hours, which I plan to do anyway on Wednesday next week. cheers!
@neolit123 you're right. No issue here; at least I could not reproduce it.
Versions
kubeadm version (use kubeadm version):
Environment:
- Kubernetes version (use kubectl version):
- Cloud provider or hardware configuration: Bare Metal Server
- Kernel (use uname -a):
- Container runtime (CRI): docker, cri-docker
- CNI: calico
- Others: Highly available virtual IP provided through kube-vip
What happened?
When I used the kubeadm join command to join a control plane, it did not send the request for the configuration to the virtual IP; it mistakenly sent the request to the node being joined.
virtual IP: 10.10.2.243, node: 10.10.2.192
I originally did not use a virtual IP for the single control plane, but later added a virtual IP in order to add more control planes.
Here are the details
What you expected to happen?
I expect it to obtain the configuration information through the virtual IP, rather than through a node that has not yet joined.
How to reproduce it (as minimally and precisely as possible)?
Deploy a single control-plane k8s cluster without using a virtual IP first, then upgrade it to a k8s cluster with multiple control planes.
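The reproduction steps above can be sketched as follows (a hedged outline; the addresses are the ones reported in this issue, and <token>, <hash>, and <key> are placeholders):

```shell
# 1. Single control plane, initialized without a shared endpoint, so
#    cluster-info records the node's own address (10.10.2.192).
kubeadm init --apiserver-advertise-address 10.10.2.192

# 2. Later, set up kube-vip with the VIP 10.10.2.243 and attempt to add
#    a second control plane through the VIP. Because cluster-info (and
#    the kubeadm-config ConfigMap) still point at the original node,
#    the join's discovery step contacts the wrong address.
kubeadm join 10.10.2.243:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>
```

This matches the observed behavior: the request goes to the node being joined rather than to the virtual IP.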