Seeing odd DNS behavior in Docker image alpine:3.8. I'm baffled that nslookup complains yet finds the IP address. In comparison, ping works perfectly. See below. This little test runs a docker container to resolve the name of the host VM. If I can get that working the next test will be to have the docker container resolve the name of other running containers.

Traced this back from behavior in an OpenJDK image, in which Java cannot resolve host names. I'd really prefer to use an Alpine version of a Java/JRE image, since it's half the size of a non-Alpine (Debian) Java/JRE image, but this network glitch is kind of a killer.

So far I've run this test under a plain Ubuntu VM running docker 17.05.0-ce and under Kubernetes running docker version 18.09.1. Same behavior in both. I know there are many external variables that might affect this, so it might not be an Alpine issue at all, although issue #255 sure seems to be related.

Would someone possibly take a minute to explain please? Thanks in advance.

me@host-dev1-vm01-core:~$ docker run alpine:3.8 nslookup host-dev1-vm01-core
nslookup: can't resolve '(null)': Name does not resolve
Name:      host-dev1-vm01-core
Address 1: 10.1.0.6
me@host-dev1-vm01-core:~$ docker run alpine:3.8 ping host-dev1-vm01-core
PING host-dev1-vm01-core (10.1.0.6): 56 data bytes
64 bytes from 10.1.0.6: seq=0 ttl=64 time=0.061 ms
64 bytes from 10.1.0.6: seq=1 ttl=64 time=0.133 ms
--- host-dev1-vm01-core ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.061/0.097/0.133 ms
Apparent DNS failure in Docker image alpine:3.8, nslookup: can't resolve '(null)'
Jan 23, 2019

The BusyBox nslookup, which Alpine uses, does two lookups, one for the DNS server and one for the name you asked for. This can be seen here.

In your example nslookup did resolve the name host-dev1-vm01-core to the address 10.1.0.6.

The line can't resolve '(null)' says that, at that point, it didn't know what its DNS server was.
Looking at the code that might initialize it, we see why.
Sorry, I ran out of time chasing down the reason for that. Hope this is of some help.


Thanks @bboreham for the note, could this failure to resolve the DNS server cause a host-name resolution failure in a Java program? I might be asking the wrong questions, but I guess I'm grasping at straws, trying to explain why in my environment & tests the Java image openjdk:8-jre-alpine (derived from alpine:3.8 image) fails but the Java image openjdk:8-jre-slim (derived from debian) works just fine.

It’s not really a failure at all; it’s just a program printing something out that isn’t helpful or interesting.

What Alpine nslookup does has no bearing on what a Java program does.

I'm not saying it is not a reliable indicator, I'm saying the line where it prints can't resolve '(null)' is not related to what you want to know.

Check the return code from nslookup; ignore that line.
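
As a rough sketch of that advice, the exit status can be tested directly while the printed output, including the `(null)` line, is discarded. The function name `resolves` and the `LOOKUP_CMD` override are assumptions added for illustration; they do not come from BusyBox or from this thread. The sketch also assumes that the BusyBox nslookup exits non-zero when the requested name does not resolve.

```shell
#!/bin/sh
# Sketch: decide success from the exit status, not from the printed text.
# LOOKUP_CMD is a hypothetical override so the check can be tried without
# a network; it defaults to the nslookup found on PATH.
resolves() {
    "${LOOKUP_CMD:-nslookup}" "$1" >/dev/null 2>&1
}

# Usage inside the container:
#   if resolves host-dev1-vm01-core; then echo "resolved"; else echo "failed"; fi
```

Discarding both stdout and stderr keeps the noisy first line out of any script that only needs a yes/no answer.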

I have not yet figured out the problem. Please see below for the shortest possible Java debugging material; I hope it helps other people.

file ResolveHostName.java

import java.net.InetAddress;

public class ResolveHostName {
	public static void main(String[] args) throws Exception {
		if (args.length != 1)
			throw new IllegalArgumentException("Usage: program host-name-to-resolve");
		System.out.println("Resolving " + args[0]);
		// Throws UnknownHostException if the name cannot be resolved
		System.out.println(InetAddress.getByName(args[0]).toString());
	}
}

file Dockerfile-alpine

FROM openjdk:8-jre-alpine
COPY res-host-name.jar /
ENTRYPOINT ["java", "-jar", "res-host-name.jar"]

file build.sh

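#!/bin/bash
# Stop on the first error and echo each command as it runs
set -e -x
javac ResolveHostName.java
# Package the class into a jar whose entry point is ResolveHostName
jar cvfe res-host-name.jar ResolveHostName ResolveHostName.class
docker build -f Dockerfile-alpine .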
  • Yes, you can ignore the "... can't resolve '(null)'" message that the Alpine (BusyBox) nslookup prints. If you pass a name server as the second parameter, the message goes away:

    # nslookup wx.qlogo.cn 100.100.2.136
    Server:    100.100.2.136
    Address 1: 100.100.2.136
    Name:      wx.qlogo.cn
    Address 1: 203.205.142.155
    Address 2: 203.205.142.154

    The relevant BusyBox code (e.g. alpine:3.8 uses http://busybox.net/downloads/busybox-1.28.4.tar.bz2, networking/nslookup.c, function nslookup_main) and debugging logs from strace show that the program first runs a DNS query for the name server itself. Whether that is a forward or a reverse query depends on the second parameter: if you give a name server IP, it does a reverse (PTR) query. You get the message because the global name server is still empty at initialization time.

  • My code (workaround for "DNS intermittent delays of 5s", kubernetes/kubernetes#56903 (comment)) removes IPv6 from the default behavior at the Docker image tier to reduce the DNS timeout issue. Please consider the pros and cons: it has worked well for me with Node.js/Java/Go/C++/Python applications that do not need IPv6. The proper fix for the DNS timeouts lies elsewhere (kernel conntrack race fixes, a node-local DNS cache).
  • I have been running most of my applications (Java/Go/Node.js/C/C++/Python) on my customized Alpine Docker image for 1+ years.
    I highly recommend doing some debugging with strace and tcpdump inside the container; you may need to run Docker in privileged mode to get the required capabilities.
    Can you please post some strace / tcpdump logs here?

    thanks,
    harper
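
The point about passing the name server explicitly could be sketched as below: read the first `nameserver` entry from `/etc/resolv.conf` and hand it to nslookup as the second parameter. The helper names `first_nameserver` and `lookup_with_server` are made up for this illustration, and the sketch assumes a conventional resolv.conf layout.

```shell
#!/bin/sh
# Sketch: print the first "nameserver" entry of a resolv.conf-style file.
# The file argument is optional and defaults to /etc/resolv.conf.
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Sketch: look up NAME against that server explicitly, so nslookup does
# not start with an empty server name.
lookup_with_server() {
    server=$(first_nameserver "${2:-/etc/resolv.conf}")
    [ -n "$server" ] || return 2      # no nameserver entry found
    nslookup "$1" "$server"
}

# Usage inside the container: lookup_with_server wx.qlogo.cn
```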

     - There appear to be many reported issues against Alpine DNS. This
       is an attempt to work around the ones we're experiencing.
     - In local testing (specifically under LCOW), DNS resolution under
       Alpine seems to be very problematic.
       `nslookup` may repeatedly fail to perform a DNS resolution against
       another container name like `puppet.local`.
       Lookup failures will resemble something like:
         / # nslookup puppet.local
         nslookup: can't resolve '(null)': Name does not resolve
         nslookup: can't resolve 'puppet.local': Name does not resolve
       Even successes have problems with the DNS server
         / # nslookup puppet.local
         nslookup: can't resolve '(null)': Name does not resolve
         Name:      puppet.local
         Address 1: 172.17.212.25
     - Supposedly the "can't resolve '(null)'" part is innocuous, but it's
       unclear if that is the case.  More info at:
       nicolaka/netshoot#6
       gliderlabs/docker-alpine#476
     - It seems that just having the `bind-tools` package installed will
       increase the reliability, but after running dig once against the
       given host, intermittent DNS resolution problems seem to go away
         / # nslookup puppet.local
         Server:         172.17.208.1
         Address:        172.17.208.1#53
         Non-authoritative answer:
         Name:   puppet.local
         Address: 172.17.212.25
       So the script is changed to query for the postgres hostname
     - We don't use curl here because we're mostly interested in making
       sure a host with a given name *should* exist.
       There are scenarios where host / dig will succeed, but later
       checks with curl may not - and we want to differentiate those
       failure modes as much as possible
       https://serverfault.com/questions/335359/how-is-it-possible-that-i-can-do-a-host-lookup-but-not-a-curl
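
A check along the lines this commit message describes might look like the following sketch. It assumes `dig` from the `bind-tools` package is installed; the function name `wait_for_host`, the default retry count, and the `WAIT_INTERVAL` variable are illustrative and are not taken from the referenced script.

```shell
#!/bin/sh
# Sketch: poll with dig until NAME returns at least one answer record.
wait_for_host() {
    name=$1
    tries=${2:-30}                      # hypothetical default retry count
    while [ "$tries" -gt 0 ]; do
        # "dig +short" prints only the answer data; lines starting with
        # ";" are diagnostics (for example a timeout), so drop them.
        if dig +short "$name" 2>/dev/null | grep -v '^;' | grep -q .; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${WAIT_INTERVAL:-1}"     # hypothetical pause between tries
    done
    return 1
}

# Usage: wait_for_host puppet.local 10 || exit 1
```

Checking only that the name resolves, rather than fetching a URL, keeps a DNS failure distinguishable from a service that resolves but does not answer.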
    Intermittent DNS failures when running Alpine containers in user-defined docker-compose network microsoft/opengcs#303

    I have a similar issue in our Kubernetes cluster when I try to get the IP of the Kubernetes DNS server. I'm using this Docker image: nginx:1.16.0-alpine

    / # nslookup kube-dns.kube-system.svc.cluster.local
    nslookup: can't resolve '(null)': Name does not resolve
    Name:      kube-dns.kube-system.svc.cluster.local
    Address 1: 192.168.0.10 kube-dns.kube-system.svc.cluster.local

    The BusyBox nslookup, which Alpine uses, does two lookups, one for the DNS server and one for the name you asked for. This can be seen here.

    This was very useful. There is apparently a different nslookup implementation available in BusyBox, which can be enabled with CONFIG_FEATURE_NSLOOKUP_BIG, and the comments there say that it is compatible with musl. I will enable that and see if I can backport it to alpine:3.11 at least.

    BTW, the official docker image has moved to https://github.com/alpinelinux/docker-alpine. Since this was a config option in upstream alpine, it would have been good if it was reported upstream to
    https://gitlab.alpinelinux.org/alpine/aports

    This is a bit of an arse, but I have spotted since switching to alpine as the base image the healthchecks no longer work. If I `docker exec` onto the instance I can replicate the wget calls fine. But the healthchecks report the same thing in both
    ```json
      "Start": "2020-02-27T13:02:17.0507153Z",
      "End": "2020-02-27T13:02:17.2414162Z",
      "ExitCode": 1,
      "Output": "wget: bad address '|| exit 1'\n"
    ```
    I believe the problem is a known issue with the DNS in Alpine images. What I'm struggling with right now is trying to find a clean example of what I need to do to resolve it
    - https://medium.com/@xavier.priour/docker-alpine-dns-issue-bad-address-84594d128d9f
    - https://forums.docker.com/t/resolved-service-name-resolution-broken-on-alpine-and-docker-1-11-1-cs1/19307
    - gliderlabs/docker-alpine#476
    - https://unix.stackexchange.com/questions/441664/alpine-linux-sometimes-dns-is-not-resolved
    - docker/for-linux#755
    - https://stackoverflow.com/questions/57202039/resolve-conf-cant-be-changed-docker-alpine
    I have also tried playing with and removing the rails user (in case it was a permissions issue) and carrying out a `apk upgrade -U -a` as part of the build to ensure everything in the image is the latest and greatest but still no joy.
    So as I never actually see these when I'm googling for examples, and I know the apps are currently working, I'm removing them for now. I would like to bring them back in later though once I've got a bit more time to look into the problem.
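
One possible reading of the `wget: bad address '|| exit 1'` output above, not confirmed anywhere in this thread, is that the health check ran without a shell, so `|| exit 1` reached wget as a literal argument and was treated as an address to resolve. The sketch below only demonstrates that difference in argument handling; `show_args` is a hypothetical stand-in for wget and `http://localhost/` is a placeholder URL.

```shell
#!/bin/sh
# Hypothetical stand-in for wget: report each argument it was handed.
show_args() {
    for arg in "$@"; do
        printf 'arg: %s\n' "$arg"
    done
}

# Exec-style invocation: no shell parses the operator, so "|| exit 1"
# arrives as one more argument.
exec_style() {
    show_args "http://localhost/" "|| exit 1"
}

# Shell-style invocation: sh -c parses "||" as an operator, so the
# fallback "exit 1" runs when the command (here: false) fails.
shell_style() {
    sh -c 'false "http://localhost/" || exit 1'
}
```

If that reading applies, running the check through a shell would let `||` act as an operator, but the thread does not verify that this was the cause.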
    akka.discovery.ServiceDiscovery$DiscoveryTimeoutException: Dns resolve did not respond within 5.000 s akka/akka#28948

    A temporary workaround which worked for me: try to create the docker container in host mode:

    --network host
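
For reference, the original test from the top of this thread could be repeated in host mode roughly as follows. The wrapper name `lookup_in_host_network` and the `IMAGE` variable are illustrative. With `--network host` the container shares the host's network stack, so network isolation is lost; it is probably better treated as a diagnostic step than as a fix.

```shell
#!/bin/sh
# Sketch: run the lookup in a throwaway container that shares the host's
# network namespace. IMAGE is a hypothetical override; the default is the
# image used in the original report.
lookup_in_host_network() {
    docker run --rm --network host "${IMAGE:-alpine:3.8}" nslookup "$1"
}

# Usage: lookup_in_host_network host-dev1-vm01-core
```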
          