[BUG] Unable to provision downstream k8s v1.25.2-rancher1-1 cluster using Oracle Linux 8.6 #39988
rishabhmsra opened this issue Dec 23, 2022 · 5 comments
Labels: kind/bug-qa (Issues that have not yet hit a real release; bugs introduced by a new feature or enhancement) · team/hostbusters (The team responsible for provisioning/managing downstream clusters and K8s version support)
If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE1, v1.24.8-rancher1-1
Information about the Cluster
Kubernetes version: v1.25.2-rancher1-1
Cluster Type (Local/Downstream): Downstream
If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): Custom
User Information
What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
If custom, define the set of permissions:
Describe the bug
Provisioned a k8s v1.25.2-rancher1-1 cluster with OL 8.6 (used ami-0131316000f02f99f). The cluster came into Active state, but after some time it goes into Error state with the following error:
Cluster health check failed: Failed to communicate with API server during namespace check: Get "https://<REDACTED>:6443/api/v1/namespaces/kube-system?timeout=45s": context deadline exceeded
Also, after the error, it is not possible to SSH into the control plane node.
There's an issue logged for high CPU usage on OL 8.6 with k8s v1.24.
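For triage, a minimal sketch of reproducing the failing namespace check by hand, assuming a host that can reach the control plane and a bearer token with read access (both assumptions; `<CONTROL_PLANE_IP>` stands in for the redacted address from the error):

```
# Hypothetical sketch: hit the same API-server endpoint the Rancher health
# check uses. TOKEN is assumed to be a valid service-account bearer token.
curl -k --max-time 45 \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://<CONTROL_PLANE_IP>:6443/api/v1/namespaces/kube-system?timeout=45s"
```

If this also hangs until the deadline, that would point at the node itself rather than at Rancher's health check.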
To Reproduce
1. Create 3 VMs on AWS using an OL 8.6 AMI (in this case ami-0131316000f02f99f is used).
2. Open the ports and run the firewall commands as mentioned here (see the firewalld sketch after this list).
3. Create a custom cluster (1 control plane, 1 etcd, 1 worker) by running the registration commands on the nodes, and wait for the cluster to come into Active state.
4. After some time the cluster will go into Error state with error: Failed to communicate with API server during namespace check: Get "https://<REDACTED>:6443/api/v1/namespaces/kube-system?timeout=45s": context deadline exceeded
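For step 2, a minimal firewalld sketch on each OL 8.6 node; the port list here is assumed from Rancher's RKE1 port-requirements docs and varies with the network provider, so treat the linked docs as authoritative:

```
# Hypothetical sketch: open commonly required RKE1 ports with firewalld.
sudo firewall-cmd --permanent --add-port=22/tcp          # SSH
sudo firewall-cmd --permanent --add-port=80/tcp          # ingress HTTP
sudo firewall-cmd --permanent --add-port=443/tcp         # ingress HTTPS / agent -> Rancher
sudo firewall-cmd --permanent --add-port=2376/tcp        # Docker daemon TLS
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd client and peer
sudo firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
sudo firewall-cmd --permanent --add-port=8472/udp        # Canal/Flannel VXLAN
sudo firewall-cmd --permanent --add-port=9099/tcp        # Canal/Flannel health checks
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet
sudo firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort range
sudo firewall-cmd --reload
```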
Result
Cluster fails to provision using the OL 8.6 AMI.
Expected Result
Cluster should provision successfully and come into Active state.
@sowmyav27, k8s v1.24 with OL 8.6 already has a known CPU usage issue, for which the issues below were logged:
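To correlate with that CPU issue, a minimal sketch (not from the original report) for logging CPU usage on a node so the data survives after SSH stops working:

```
# Hypothetical sketch: record load average and top CPU consumers every 30s.
# The log remains on disk even if the node later becomes unreachable.
while true; do
  { date; uptime; top -b -n 1 | head -n 15; echo; } >> /tmp/cpu-watch.log
  sleep 30
done
```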
Yes, I'm seeing the same health check error on OL 8.6 with k8s v1.24:
Created a k8s v1.24.8-rancher1-1 custom cluster using the steps mentioned here.
Wait for the cluster to come into Active state, then observe that after some time the cluster goes into Error state with the following error:
Cluster health check failed: Failed to communicate with API server during namespace check: Get "https://<REDACTED>:6443/api/v1/namespaces/kube-system?timeout=45s": context deadline exceeded
Wait some more time and observe that the cluster goes into Unavailable state with error:
Cluster agent is not connected
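A minimal sketch for inspecting the agent from the downstream cluster while it is still reachable, assuming the standard Rancher layout where the agent is the cattle-cluster-agent deployment in the cattle-system namespace:

```
# Hypothetical sketch: check the Rancher cluster agent's state and recent logs.
kubectl -n cattle-system get pods -l app=cattle-cluster-agent
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=100
```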
docker.service logs:
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2023-01-05 13:35:25 GMT; 28min ago
Docs: https://docs.docker.com
Main PID: 11781 (dockerd)
Tasks: 66
Memory: 2.1G
CGroup: /system.slice/docker.service
└─11781 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
time="2023-01-05T13:51:36.631002475Z" level=error msg="Not continuing with pull after error: context canceled"
time="2023-01-05T13:56:55.537716495Z" level=error msg="Handler for GET /v1.41/containers/b8a2e57d0060c0dc9067ea5796668f1678c85e0e13bd071aa4e7c80890628ded/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
http: superfluous response.WriteHeader call from github.com/docker/docker/api/server/httputils.WriteJSON (httputils_write_json.go:11)
time="2023-01-05T13:57:29.060444869Z" level=error msg="Handler for GET /v1.41/containers/bb29e9dd266aa3fda19405fdf715317230a161f2d465fb58e3c9f54ee705dccc/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
time="2023-01-05T13:57:29.432542906Z" level=error msg="Handler for GET /v1.41/containers/8f2adce05071064c9be961270c05d2d7dfebff1ef5055932b866dd782a3fea03/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
http: superfluous response.WriteHeader call from github.com/docker/docker/api/server/httputils.WriteJSON (httputils_write_json.go:11)
http: superfluous response.WriteHeader call from github.com/docker/docker/api/server/httputils.WriteJSON (httputils_write_json.go:11)
time="2023-01-05T13:58:21.956038247Z" level=error msg="Handler for GET /v1.41/containers/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
http: superfluous response.WriteHeader call from github.com/docker/docker/api/server/httputils.WriteJSON (httputils_write_json.go:11)
time="2023-01-05T13:58:29.271516967Z" level=error msg="exit event" container=2d5a3ab4e391cf84cfcc312181749bb6c10bbd08017d2b26f0889536e671e4e7 error="no such exec" module=libcontainerd namespace=moby process=bccfc45bf28efe6aa58765af789ae8c500766b06e2204600c3ba61f590951c73
Kubelet logs on the CP node:
time="2023-01-05T13:52:27Z" level=error msg="operation timeout: context deadline exceeded Failed to get stats from container bb29e9dd266aa3fda19405fdf715317230a161f2d465fb58e3c9f54ee705dccc" time="2023-01-05T13:53:45Z" level=error msg="unable to inspect docker image \"sha256:59daef946c8c6f1a1152d05726e87a4677e8a196ab3045249faad95181f6fafa\" while inspecting docker container \"9cc33f7c5355b8fcfcbb5d1542954c0e3fd925a718c08717b0f4b27543b3f53b\": operation timeout: context deadline exceeded Failed to get stats from container 9cc33f7c5355b8fcfcbb5d1542954c0e3fd925a718c08717b0f4b27543b3f53b"
Also, after the error, it is not possible to SSH into the control plane node.