Container behaviour should be identical every time in most cases; --privileged could change this, but that's not relevant to this ticket.
Valid Use Case for non-TLS Operation
When running in Kubernetes and using dind within a multi-container pod, the underlying reality is that the containers are on the same host and talk to each other over local loopback, so TLS doesn't really make sense: there is no man in the middle, and anyone already on the box owns you and your entire cluster.
I've converted our stack over to SSL due to the threatened removal of this behaviour and the extra hassle of stopping it sleeping for 15 seconds during startup in a critical application. However, the SSL startup is slower, SSL communication over local loopback is slightly slower and more CPU-intensive, and if someone is on the box and wants to MITM the loopback interface, they have access to the certs/keys on disk anyway.
Thanks to @tianon who suggested putting this here.
Ref docker-library/docker#292
Hmm... so there is an exception in the code for the daemon listening on a loopback interface, but that won't apply in the docker-in-docker case: https://github.com/moby/moby/blob/c4040417b6fe21911dc7ab5e57db27519dd44a6a/cmd/dockerd/daemon.go#L681-L687
Perhaps we need some "i-know-what-im-doing" env-var to skip 🤔
@AkihiroSuda @cyphar @cpuguy83 any thoughts?
@cpuguy83 that is the case, however I will point out that:
--tls=false / --tlsverify=false alone does not result in non-TLS behaviour
DOCKER_TLS_CERTDIR= (empty) does result in non-TLS behaviour, but with an obnoxious 15-second startup delay; either print the warning and go for it, or deprecate and fail fast
To get GOOD behaviour you MUST use both things. I feel like the cert-dir variable should not matter and the argument should be the only control. A directory setting controlling an entire behaviour? Seriously? :-D
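To spell the comment above out as a command (a sketch only; the port mirrors the rest of this thread and nothing here is an official recommendation), both knobs have to be set together:

```
# Empty DOCKER_TLS_CERTDIR skips cert generation in the entrypoint,
# --tls=false makes dockerd itself serve plain TCP.
docker run --privileged --rm \
  -e DOCKER_TLS_CERTDIR= \
  docker:dind \
  dockerd -H tcp://0.0.0.0:2375 --tls=false
```

Leaving out either one brings back the warning sleep or TLS-only listening.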
This has caused similar difficulties for us in moving from docker 19.x to 20.10.x. TLS is not necessary in our use case (using the dind container as part of a k8s pod for ephemeral build nodes linked to Jenkins), but the delay in bringing up the daemon causes our builds to fail because the docker daemon doesn't respond quickly enough when the pod is first brought up. We can configure TLS, but as mentioned by @fredcooke this adds unnecessary overhead where it is not needed.
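For reference, a minimal sketch of the ephemeral-build-node shape described above (the pod name, the `docker:cli` image, and the sleep command are illustrative assumptions, not from this thread):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-build-agent      # hypothetical name
spec:
  containers:
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true
      env:
        - name: DOCKER_TLS_CERTDIR   # empty value disables cert generation
          value: ""
    - name: builder
      image: docker:cli            # any image with the docker CLI works
      command: ["sleep", "infinity"]
      env:
        - name: DOCKER_HOST
          value: tcp://localhost:2375   # same pod => loopback, no TLS
```

Both containers share the pod's network namespace, so the builder reaches the daemon over loopback, which is exactly the scenario where TLS adds only overhead.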
My solution was to run dind with a volume mounted over /etc/docker/daemon.json containing {"tls": false}.
Here's the full solution:
```yaml
version: "3"
services:
  service1:
    image: docker:dind
    privileged: true
    networks:
      mynetwork:
        ipv4_address: 172.16.0.2
    volumes:
      - ./daemon.json:/etc/docker/daemon.json
    environment:
      - DOCKER_TLS_CERTDIR=
      - DOCKER_HOST=tcp://172.16.0.2:2375
    # Add additional configurations for service1 if needed
networks:
  mynetwork:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.0.0/24
```
with the following daemon.json:

```json
{
  "tls": false
}
```
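As a quick sanity check (a small sketch, not part of dind itself), it's worth validating that the file parses as JSON before mounting it, since dockerd refuses to start on a malformed config:

```shell
# Write the minimal non-TLS daemon.json and confirm it is valid JSON;
# python3 -m json.tool exits non-zero on a parse error.
printf '{\n  "tls": false\n}\n' > daemon.json
python3 -m json.tool daemon.json
```

Note that a bare `{tls: false}` (unquoted key) is not valid JSON and dockerd will reject it.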
You can also override the command used. In our case, the GitLab service looks like this:
```yaml
services:
  - name: docker:dind
    # Good
    command: [ "dockerd", "-H", "tcp://0.0.0.0:2375", "--tls=false" ]
    # Bad
    # command: [ "--tls=false" ]
```
Without overriding the whole command being executed (not only the arguments), dind would add the artificial wait time.
Worked, thanks!