I got the following errors flooding my system journal. I think the IP 10.101.213.69 refers to the metrics-server pod in my cluster.
Aug 14 11:23:46 l09853 k0s[441]: time="2023-08-14 11:23:46" level=info msg="E0814 11:23:46.339804 705 controller.go:116] loading OpenAPI spec for \"v1beta1.metrics.k8s.io\" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable" component=kube-apiserver stream=stderr
Aug 14 11:23:46 l09853 k0s[441]: time="2023-08-14 11:23:46" level=info msg=", Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]" component=kube-apiserver stream=stderr
Aug 14 11:23:46 l09853 k0s[441]: time="2023-08-14 11:23:46" level=info msg="I0814 11:23:46.341048 705 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue." component=kube-apiserver stream=stderr
Aug 14 11:23:50 l09853 k0s[441]: time="2023-08-14 11:23:50" level=info msg="E0814 11:23:50.340588 705 available_controller.go:460] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.101.213.69:443/apis/metrics.k8s.io/v1beta1: Get \"https://10.101.213.69:443/apis/metrics.k8s.io/v1beta1\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" component=kube-apiserver stream=stderr
Aug 14 11:23:55 l09853 k0s[441]: time="2023-08-14 11:23:55" level=info msg="E0814 11:23:55.347965 705 available_controller.go:460] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.101.213.69:443/apis/metrics.k8s.io/v1beta1: Get \"https://10.101.213.69:443/apis/metrics.k8s.io/v1beta1\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" component=kube-apiserver stream=stderr
Aug 14 11:23:55 l09853 k0s[441]: time="2023-08-14 11:23:55" level=info msg="I0814 11:23:55.396666 705 handler_discovery.go:325] DiscoveryManager: Failed to download discovery for kube-system/metrics-server:443: 503 error trying to reach service: EOF" component=kube-apiserver stream=stderr
Aug 14 11:23:55 l09853 k0s[441]: time="2023-08-14 11:23:55" level=info msg="I0814 11:23:55.396712 705 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager" component=kube-apiserver stream=stderr
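To double-check that 10.101.213.69 really is the metrics-server pod IP, something like this should show it (plain kubectl, nothing k0s-specific assumed):

kubectl -n kube-system get pods -o wide | grep metrics-server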
However, the log level of these messages is info. So, maybe I can ignore them?
But I hope there’s a better way to resolve it.
Thanks.
kube-apiserver is trying to connect to metrics-server, but apparently that is failing. It connects to it via the kube-system/metrics-server ClusterIP service.
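A quick way to inspect that path is to compare the APIService status with the Service and its endpoints. Roughly like this (the APIService and service names are taken from the log above; the deployment name is assumed to match the service):

kubectl get apiservice v1beta1.metrics.k8s.io
kubectl -n kube-system get svc,endpoints metrics-server
kubectl -n kube-system logs deploy/metrics-server --tail=50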
Do you run plain controller node(s) (i.e. no worker enabled on the controller) in your cluster? If yes, then this is often a symptom of the konnectivity agents on the workers not being able to connect to the konnectivity-server on the controllers.
Check the following docs for some hints on the config:
https://docs.k0sproject.io/stable/high-availability/
https://docs.k0sproject.io/stable/nllb/
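If that turns out to be the case, a first check is whether the konnectivity-agent pods on the workers are running and what they log; something along these lines should do (pod names vary, hence the grep and the placeholder):

kubectl -n kube-system get pods -o wide | grep konnectivity
kubectl -n kube-system logs <konnectivity-agent-pod> --tail=50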
No, I am running a single-node configuration. Anyway, after I reset the cluster a few times, the metrics server seems to be working.
Maybe it has something to do with kube-router. I remember in one experiment, I disabled it by setting metricsPort to 0.
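For reference, that knob lives under spec.network.kuberouter in the k0s config, where a value of 0 disables kube-router's metrics endpoint. A rough sketch (field names as documented for recent k0s releases, worth checking against your version):

apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    provider: kuberouter
    kuberouter:
      metricsPort: 0   # 0 disables the kube-router metrics endpoint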
Thanks
It’s common that when the cluster is booting up there are errors like these until all pods etc. are properly up and running.
Metrics server registers itself as an API extension on the api-server, and based on our testing, getting everything “ready” can take some minutes in some cases. The API extension is registered BEFORE the pods etc. are properly running. I think the api-server has some backoff policy when connecting to/discovering the extensions, and thus it takes a while when images are pulled etc., at least the first time.
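One way to watch that settle is to follow the APIService's Available condition until it flips to True, e.g.:

kubectl get apiservice v1beta1.metrics.k8s.io -w
kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'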