

Bug description

The Istio ingressgateway logs show the error below every 30 minutes:

[Envoy (Epoch 0)] [2019-12-02 04:27:30.085][22][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13,

Expected behavior
No such errors in the logs.

Steps to reproduce the bug

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

istio v1.4.0
helm v2.16.1

How was Istio installed?

Helm chart

Environment where bug was observed (cloud vendor, OS, etc)

It would be great if that was not in the log by default. As someone new to Istio, I thought I had a configuration error somewhere because my ingress gateway log has hundreds of lines of this message, and the message is classified as "warning".

A "warning" implies that something might be wrong and needs investigation. I think this is probably more like a debug / verbose kind of message.

It would at least be helpful to cancel the request with a gRPC status CANCELED(1), rather than returning gRPC status INTERNAL(13).

INTERNAL suggests a real problem, whereas CANCELED somewhat explains what happened.

I think it is the go-gRPC implementation that drains the connection with INTERNAL; see https://github.com/grpc/grpc-go/blob/8c50fc25657c1a4e32b99646afa453fcaccd01b2/internal/transport/http2_server.go#L987 and https://github.com/grpc/grpc-go/blob/40ed2eb467471df2bd3c59e66cc5357159062d48/internal/transport/http_util.go#L59

I use Istio 1.6.8 and I see the exact same log, and after that log appears my service stops working. I get

upstream connect error or disconnect/reset before headers. reset reason: connection termination

error. After exactly 30 minutes I get the same warning and the service suddenly starts working again.

I have opened a case for that at SeldonIO/seldon-core#2347

The exact lines are:

2020-09-03T16:37:47.226983Z     warning envoy config    [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamAggregatedResources gRPC config stream closed: 13,
2020-09-03T16:37:47.289783Z     warning envoy filter    [src/envoy/http/authn/http_filter_factory.cc:83] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
2020-09-03T16:37:47.291518Z     warning envoy filter    [src/envoy/http/authn/http_filter_factory.cc:83] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
2020-09-03T16:37:47.298553Z     warning envoy filter    [src/envoy/http/authn/http_filter_factory.cc:83] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
2020-09-03T16:37:47.302983Z     warning envoy filter    [src/envoy/http/authn/http_filter_factory.cc:83] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
  • Installed on AWS EKS (1.17)
  • Installed on Kind Kubernetes (1.17)
  • Installed on Azure AKS (1.16.13)
  • I got the same error on all of them. I also disabled mTLS just in case it was causing the issue, but it was not. I also disabled protocol sniffing as suggested in other issues:

        pilot:
          appNamespaces: []
          autoscaleEnabled: true
          autoscaleMax: 5
          autoscaleMin: 1
          configMap: true
          configNamespace: istio-config
          cpu:
            targetAverageUtilization: 80
          enableProtocolSniffingForInbound: false   # <---- disabled
          enableProtocolSniffingForOutbound: false  # <---- disabled
          env: {}
          image: pilot
          keepaliveMaxServerConnectionAge: 30m
          nodeSelector: {}
          podAntiAffinityLabelSelector: []
          podAntiAffinityTermLabelSelector: []
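
Side note: the exact 30-minute cycle lines up with the keepaliveMaxServerConnectionAge: 30m value above, which as far as I understand caps how long Pilot keeps an xDS connection open before forcing Envoy to reconnect. Purely as an experiment, not as a fix, a longer value should change how often the stream-closed line appears, e.g.:

        pilot:
          # hedged sketch: a longer server connection age should stretch the interval
          # between "gRPC config stream closed: 13" lines; it does not explain why the
          # service stays unreachable afterwards
          keepaliveMaxServerConnectionAge: 60m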

    @howardjohn Can you clarify this issue? It is not only the log line; the service also becomes unreachable for 30 minutes AFTER we see this log.

    We do not always get upstream connect error or disconnect/reset before headers after gRPC config stream closed: 13, but when we do get upstream connect error or disconnect/reset before headers, it definitely starts with gRPC config stream closed: 13.

    As mentioned previously, the log is NOT an error. However, the service being unreachable for 30 minutes afterwards is. If you can get a config dump before and after (https://github.com/istio/istio/wiki/Troubleshooting-Istio), please open an issue with the diff.

    Basically the log means we reconnect to istiod, which should return the same config, but it is possible that due to a bug it returns a different config. Then 30 minutes later it reconnects again and gets the "correct" config.

    Hello @howardjohn

    I think there is a real problem currently with these "gRPC config stream closed: 13" errors every 30 minutes.
    We tried to move away from Istio and searched some layers deeper: OS, MongoDB, or K8s, and we still have this issue.
    We enabled TCP keepalive as described in the MongoDB diagnostics guide https://docs.mongodb.com/manual/faq/diagnostics/#does-tcp-keepalive-time-affect-mongodb-deployments,
    and even created a DestinationRule with this keepalive (see the sketch below). Still the same issue.
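
    A minimal sketch of the kind of DestinationRule keepalive I mean (host and timing values below are illustrative placeholders, not our exact settings):

        apiVersion: networking.istio.io/v1alpha3
        kind: DestinationRule
        metadata:
          name: mongo-keepalive
        spec:
          host: mongodb.default.svc.cluster.local   # placeholder host
          trafficPolicy:
            connectionPool:
              tcp:
                tcpKeepalive:
                  time: 120s      # start probing after 2 minutes of idle
                  interval: 30s   # probe every 30 seconds
                  probes: 3       # give up after 3 failed probes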

    I have also explained in this issue what we see every 30 minutes: we receive a TCP connection drop:
    #17139 (comment)

    Would you please be so kind as to double-check this? The only way we could help ourselves was a dirty workaround: pinging the DB every minute.

    Important: this was not an issue with Istio 1.2.x, and we did not change the application layer; it only started to pop up when we upgraded to 1.6.8.

    I am a bit confused that this issue was closed. Maybe it is right to close it, but as you can see, some people are still hitting this with newer Istio versions (see #17139 (comment)).

    As mentioned previously, the log is NOT an error. However, the service being unreachable for 30 minutes afterwards is. If you can get a config dump before and after (https://github.com/istio/istio/wiki/Troubleshooting-Istio), please open an issue with the diff.

    Basically the log means we reconnect to istiod, which should return the same config, but it is possible that due to a bug it returns a different config. Then 30 minutes later it reconnects again and gets the "correct" config.

    I have just started and luckily got a proxy config dump before and after. I will open a new issue once the connection recovers exactly 30 minutes later.
    @howardjohn

    upstream connect error or disconnect/reset before headers. reset reason: connection termination #27513