Trying to deploy Traefik, I'm facing some issues.

Trying to reproduce something similar to the following configuration: https://github.com/kubernetes-sigs/kubespray/pull/7147/files#diff-8a8f0f9c4954c722a017317b1c39c2371c79b4261db52ea205737ca1b88ffbad
Specifically:

  • expose my ingress using privileged hostNetwork / host ports (81 and 444 in my test, though would usually be 80 / 443)
  • with PodSecurityPolicies enabled
  • having the traefik container run without any specific privileges other than the API and its PSP, allowing for hostNetwork.

    1. hostNetwork enabled, hostPort != containerPort

    additionalArguments:
    - --metrics.prometheus=true
    - --metrics.prometheus.entrypoint=metrics
    - --metrics.prometheus.buckets=0.1,0.3,1.2,5.0
    - --ping=true
    - --providers.file
    - --providers.file.directory=/static # loading some wildcard tls cert ....
    deployment:
      enabled: true
      kind: Deployment
      replicas: 1
    hostNetwork: true
    ingressRoute:
      dashboard:
        enabled: false
    logs:
      general:
        level: INFO
      access:
        enabled: true
    nodeSelector: {}
    persistence:
      enabled: false
    podSecurityPolicy:
      enabled: true
    ports:
      metrics:
        port: 9113
        expose: false
        exposedPort: 9113
        protocol: TCP
      traefik:
        port: 9000
        expose: false
        exposedPort: 9000
        protocol: TCP
      web:
        port: 8081
        hostPort: 81
        expose: true
        exposedPort: 81
        protocol: TCP
      websecure:
        port: 8444
        hostPort: 444
        expose: true
        exposedPort: 444
        protocol: TCP
        tls:
          enabled: true
    providers:
      kubernetesCRD:
        enabled: true
        namespaces: []
      kubernetesIngress:
        enabled: true
        namespaces: []
        publishedService:
          enabled: false
    rbac:
      enabled: true
      namespaced: true
    volumes:
    - name: traefik
      mountPath: /static
      type: configMap
    - name: traefik-default-tls
      mountPath: /tls-secret
      type: secret
    

    The Kubernetes API server refuses the Deployment object when it is applied. Helm output:

    W0114 09:43:47.213804    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.240531    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.269013    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.302539    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.328470    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.355895    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    W0114 09:43:47.383972    6708 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
    Error: Deployment.apps "traefik" is invalid: [spec.template.spec.containers[0].ports[2].containerPort: Invalid value: 8081: must match `hostPort` when `hostNetwork` is true, spec.template.spec.containers[0].ports[3].containerPort: Invalid value: 8444: must match `hostPort` when `hostNetwork` is true]
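
The rule behind this rejection can be sketched in a few lines. This is only an illustration of the check the API server appears to apply, not the actual Kubernetes validation code; the function name `check_host_network_ports` and the dict layout are assumptions made for this example.

```python
def check_host_network_ports(host_network, ports):
    """Return one error string per port that breaks the hostNetwork rule."""
    errors = []
    if not host_network:
        # Without hostNetwork, hostPort -> containerPort translation is allowed.
        return errors
    for index, port in enumerate(ports):
        host_port = port.get("hostPort")
        container_port = port["containerPort"]
        # With hostNetwork the container shares the node's network namespace,
        # so a declared hostPort has to equal the containerPort.
        if host_port is not None and host_port != container_port:
            errors.append(
                f"ports[{index}].containerPort: Invalid value: {container_port}: "
                "must match `hostPort` when `hostNetwork` is true"
            )
    return errors


# The web / websecure ports from the values above.
ports = [
    {"name": "web", "containerPort": 8081, "hostPort": 81},
    {"name": "websecure", "containerPort": 8444, "hostPort": 444},
]
for message in check_host_network_ports(True, ports):
    print(message)
```

With `host_network=False` the same port list passes this check, which is why the next attempt fails at a different stage.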
    

    2. hostNetwork disabled

    Assuming the problem was setting hostNetwork to true in my values, let's try again with it set to false.

    The generated PodSecurityPolicy then no longer allows hostNetwork, which makes sense. However, it also allows no host ports, so PSP admission rejects the Pod when the ReplicaSet controller tries to create it:

    5s          Warning   FailedCreate        replicaset/traefik-7557bcb7f7   Error creating: pods "traefik-7557bcb7f7-" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].hostPort: Invalid value: 9113: Host port 9113 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 9000: Host port 9000 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 81: Host port 81 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 444: Host port 444 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 9113: Host port 9113 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 9000: Host port 9000 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 81: Host port 81 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 444: Host port 444 is not allowed to be used. Allowed ports: []]
    

    3. Re-enable hostNetwork and set containerPort to match hostPort

    Since I can't get port translation working here, let's fall back to having Traefik bind directly to the privileged ports.

    With web and websecure listening on 81 and 444 respectively, the Pod gets created, but the container refuses to start:

    time="2021-01-14T10:17:07Z" level=info msg="Configuration loaded from flags."
    time="2021-01-14T10:17:07Z" level=info msg="Traefik version 2.3.6 built on 2020-12-17T16:34:27Z"
    time="2021-01-14T10:17:07Z" level=info msg="Stats collection is enabled."
    time="2021-01-14T10:17:07Z" level=info msg="Many thanks for contributing to Traefik's improvement by allowing us to receive anonymous information from your configuration."
    time="2021-01-14T10:17:07Z" level=info msg="Help us improve Traefik by leaving this feature on :)"
    time="2021-01-14T10:17:07Z" level=info msg="More details on: https://doc.traefik.io/traefik/contributing/data-collection/"
    2021/01/14 10:17:07 traefik.go:76: command traefik error: error while building entryPoint web: error preparing server: error opening listener: listen tcp :81: bind: permission denied
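
The `bind: permission denied` above can be reproduced outside Traefik. On Linux, ports below 1024 are privileged by default, so an unprivileged process without CAP_NET_BIND_SERVICE in its effective set cannot bind them. A minimal sketch, assuming it is run inside the same container or security context as Traefik; `bind_result` is a hypothetical helper name:

```python
import socket


def bind_result(port, host="0.0.0.0"):
    """Try to bind a TCP socket and report "ok" or the OS error text."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # PermissionError is a subclass of OSError, so it lands here too.
            return exc.strerror or str(exc)
        return "ok"


# Compare a privileged port with an unprivileged one; the result for port 81
# depends on the user and capabilities the process runs with.
for port in (81, 8081):
    print(port, bind_result(port))
```

If port 81 reports a permission error while 8081 binds, the problem is the process's privileges rather than anything specific to Traefik.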
    

    What should be done

    When exposing host ports, regardless of whether PSP is enabled, you should not set the Pod's hostNetwork to true if you intend to set the hostPort attribute in your containers' ports array and any hostPort differs from the corresponding containerPort.
    The following works:

    spec:
      containers:
      - args:
        image: docker.io/traefik:v2.3
        name: traefik
        ports:
        - containerPort: 8080
          hostPort: 80
          name: http
          protocol: TCP
        - containerPort: 8443
          hostPort: 443
          name: https
          protocol: TCP
        - containerPort: 8100
          hostPort: 8100
          name: dashboard
          protocol: TCP
        - containerPort: 10254
          name: metrics
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /metrics
            port: 10254
            scheme: HTTP
        securityContext:
          runAsUser: 1001
      serviceAccount: traefik
      serviceAccountName: traefik

    If PodSecurityPolicies are enabled, you then just need to allow hostNetwork in the PSP, even though it is not mentioned in your containers spec:

      hostNetwork: true
      hostPorts:
      - max: 65535
        min: 0
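
To see why the `hostPorts` range matters, here is a rough sketch of the range check that the "Allowed ports: []" messages from the second test suggest. It is not the actual PSP admission code; `host_port_allowed` and the dict layout are names invented for this example.

```python
def host_port_allowed(port, allowed_ranges):
    """True if port falls inside at least one {min, max} range of the policy."""
    return any(r["min"] <= port <= r["max"] for r in allowed_ranges)


requested = [9113, 9000, 81, 444]

# A policy without hostPorts ranges rejects every requested host port.
print([p for p in requested if not host_port_allowed(p, [])])

# The range shown above (0-65535) admits all of them.
full_range = [{"min": 0, "max": 65535}]
print([p for p in requested if not host_port_allowed(p, full_range)])
```

A narrower range, for instance only the ports actually exposed, would also pass this check and grants less than 0-65535.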
    

    Moreover, when starting traefik binding on privileged ports, I think there's some NET_BIND_SERVICE capability to grant, in your PSP, as well as maybe some allowPrivilegeEscalation (not certain), to avoid the permission denied I had in my third test.

    - fixes hostNetwork configuration, when containerPort != hostPort
    - fixes PodSecurityPolicy, when running privileged
    - fixes when rbac.namespaced=true, access to ingressclass is defined
    - adds `--providers.kubernetescrd.allowcrossnamespace` option to
      podTemplate, defaults to false, can be changed setting
      `providers.kubernetesCRD.allowCrossNamespace=true`

    So... I've tried to address these in the PR above: #337.

    I've been digging into the privilege escalation side: I'm still getting permission denied when binding on privileged ports, even after adding the NET_BIND_SERVICE capability and fixing the Pod's securityContext. That's weird; I'm pretty sure Nginx would work with such a configuration, unless I missed something.
    Never mind. A comment in the chart does say privileged containers are required for binding on privileged ports. I won't try further to prove it wrong...

    In addition to my previous remarks, let's add:

    If I want to run a privileged container with PodSecurityPolicy enabled, I run into another issue: the traefik PSP does not allow privileged containers.

    If I set rbac.namespaced to true, then Traefik cannot query the cluster's IngressClass objects.

    E0114 14:43:11.595524       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:helm-traefik:traefik" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
    E0114 14:43:11.598641       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:helm-traefik:traefik" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
    

    Also, the Role doesn't mention the latest API group for Kubernetes Ingresses.

    All of which should be covered in my PR.
    Though this is my very first time with Helm: I hope I didn't do anything stupid. Let me know if I can clarify / fix anything.

    At least for your scenario no. 3, the issue seems to be this:

    securityContext:
      capabilities:
        drop: [ALL]
    

    The following configuration works for me:

    hostNetwork: true
    ports:
      web:
        port: 80
        redirectTo: websecure
      websecure:
        port: 443
    securityContext:
      capabilities:
        drop: [ALL]
        add: [NET_BIND_SERVICE]
      readOnlyRootFilesystem: true
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
    # as I do not have podSecurityContext enabled, I am not sure this part works.
    podSecurityContext:
      fsGroup: 0

    Point 3 states: having the traefik container run without any specific privileges other than the API and its PSP, allowing for hostNetwork.

    Your sample securityContext does start the container as root (runAsUser: 0), which should not be necessary, and which would require granting more privileges in your PSP if you had those activated. I would like to avoid that.

    FYI, here's the securityContext for the nginx ingress controller DaemonSet:

          hostNetwork: true
          containers:
            ports:
            - containerPort: 80
              hostPort: 80
            - containerPort: 443
              hostPort: 443
            securityContext:
              capabilities:
                add:
                - NET_BIND_SERVICE
                drop:
                - ALL
              runAsUser: 101
    

    This does not run as privileged, yet it still exposes ports on the host and binds to "privileged" ports.
    Here is the corresponding PSP, allowing containers to use such a securityContext:

    spec:
      allowPrivilegeEscalation: true
      allowedCapabilities:
      - NET_BIND_SERVICE
      fsGroup:
        ranges:
        - max: 65535
          min: 1
        rule: MustRunAs
      hostNetwork: true
      hostPorts:
      - max: 65535
        min: 0
      runAsUser:
        rule: MustRunAsNonRoot
      seLinux:
        rule: RunAsAny
      supplementalGroups:
        ranges:
        - max: 65535
          min: 1
        rule: MustRunAs
      volumes:
    

    There is no reason for Traefik to run as root (spec.runAsUser.rule=MustRunAsNonRoot, no spec.privileged=true).
    Root privileges should be granted carefully, especially when going around Kubernetes usual ipc/net/pid isolation (hostNetwork).

    I'm facing the same issue here...
    In retrospect, I think this has to do with the following capability being set on the nginx binary in the nginx-ingress-controller image:

    $ getcap /usr/local/nginx/sbin/nginx
    /usr/local/nginx/sbin/nginx = cap_net_bind_service+ep
    

    The getcap binary is missing from the Traefik image -- not certain right now, but if setcap/getcap are missing, then there's a fair chance the traefik binary doesn't have that capability set at all.
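
When getcap is not available in an image, the same information can be read directly: file capabilities are stored in the `security.capability` extended attribute, which is what getcap inspects. A minimal sketch, assuming Linux and a Python interpreter that can see the binary (for example from a debug container); `has_file_capabilities` is a hypothetical helper name, and it only reports whether the attribute exists without decoding which capabilities it holds.

```python
import os
import sys


def has_file_capabilities(path):
    """True if the file carries a non-empty security.capability xattr."""
    if not hasattr(os, "getxattr"):
        raise NotImplementedError("os.getxattr is only available on Linux")
    try:
        value = os.getxattr(path, "security.capability")
    except OSError:
        # No such attribute, unsupported filesystem, or missing file.
        return False
    return len(value) > 0


# Demonstrated on the running interpreter; point it at the traefik binary
# inside the image to test the guess above.
print(has_file_capabilities(sys.executable))
```

This matters because a process running as a non-root user normally only gets a capability into its effective set through file capabilities (or ambient capabilities), even when the container's securityContext adds NET_BIND_SERVICE.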

    Even after applying the change recommended by Jasper-Ben, my Helm-based deployment is still waiting for a LoadBalancer IP to be assigned. Shouldn't this also require disabling the Service (service.enabled=false) or something similar?

    Basically, I want to deploy Traefik as an Ingress Controller without any LoadBalancer IP assignment, just like it works with the RKE2 ingress-nginx (which listens on all the CP nodes' IPs, so I don't need an LB on top of it).

    WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/rancher/rke2/rke2.yaml
    WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /etc/rancher/rke2/rke2.yaml
    history.go:56: [debug] getting history for release traefik
    Release "traefik" does not exist. Installing it now.
    install.go:192: [debug] Original chart version: ""
    install.go:209: [debug] CHART PATH: /root/external/traefik
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 1 resource(s)
    install.go:165: [debug] Clearing discovery cache
    wait.go:66: [debug] beginning wait for 9 resources with timeout of 1m0s
    client.go:128: [debug] creating 1 resource(s)
    client.go:128: [debug] creating 5 resource(s)
    wait.go:66: [debug] beginning wait for 5 resources with timeout of 15m0s
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    ready.go:258: [debug] Service does not have load balancer ingress IP address: traefik-ingress/traefik
    

    Is there any option for deploying the Traefik ingress as a DaemonSet without a LoadBalancer?

    It works:

    deployment:
      kind: DaemonSet
    hostNetwork: true
    ports:
      web:
        port: 80
        redirectTo: websecure
      websecure:
        port: 443
        tls:
          enabled: true
    # Customize updateStrategy of traefik pods
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
        maxSurge: 0
    service:
      enabled: false
    securityContext:
      capabilities:
        drop: [ALL]
        add: [NET_BIND_SERVICE]
      readOnlyRootFilesystem: true
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
    # as I do not have podSecurityContext enabled, I am not sure this part works.
    podSecurityContext:
      fsGroup: 0

    Referenced in traefik/hub-helm-chart#83: hub-agent-dev-portal pod in status CrashLoopBackOff with error msg "listen tcp :80: bind: permission denied" / "Unable to listen and serve dev portal requests"