io.netty.resolver.dns.DnsNameResolverTimeoutException when the application is running in Kubernetes
org.springframework.web.reactive.function.client.WebClientRequestException: Failed to resolve 'cloudconnector.linkup-sage.com'; nested exception is java.net.UnknownHostException: Failed to resolve 'cloudconnector.linkup-sage.com'
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141)
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ Request to GET https://cloudconnector.linkup-sage.com/v1/applications/6bcf0073-be0d-4e7c-9b10-0c75eff8d10a/accountancypractices/a0f37c84-efe9-4b4d-9e53-67d5236c9313/companies/80173bbe-d966-40d3-a995-89841f920be6/accounting/periods [DefaultWebClient]
Original Stack Trace:
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141)
at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:204)
at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:225)
at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:274)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerError(FluxConcatMap.java:309)
at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onError(FluxConcatMap.java:875)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:451)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:251)
at reactor.core.publisher.EmitterProcessor.drain(EmitterProcessor.java:537)
at reactor.core.publisher.EmitterProcessor.tryEmitNext(EmitterProcessor.java:343)
at reactor.core.publisher.SinkManySerialized.tryEmitNext(SinkManySerialized.java:100)
at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:27)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:190)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect$ClientTransportSubscriber.onError(HttpClientConnect.java:308)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.resources.DefaultPooledConnectionProvider$DisposableAcquire.onError(DefaultPooledConnectionProvider.java:158)
at reactor.netty.internal.shaded.reactor.pool.AbstractPool$Borrower.fail(AbstractPool.java:475)
at reactor.netty.internal.shaded.reactor.pool.SimpleDequePool.lambda$drainLoop$9(SimpleDequePool.java:431)
at reactor.core.publisher.FluxDoOnEach$DoOnEachSubscriber.onError(FluxDoOnEach.java:186)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnectionAllocator$PooledConnectionInitializer.onError(DefaultPooledConnectionProvider.java:548)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:106)
at org.springframework.security.config.annotation.web.configuration.SecurityReactorContextConfiguration$SecurityReactorContextSubscriber.onError(SecurityReactorContextConfiguration.java:171)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
at reactor.netty.transport.TransportConnector$MonoChannelPromise.tryFailure(TransportConnector.java:534)
at reactor.netty.transport.TransportConnector.lambda$doResolveAndConnect$11(TransportConnector.java:341)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:109)
at io.netty.resolver.InetSocketAddressResolver$2.operationComplete(InetSocketAddressResolver.java:86)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1055)
at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1000)
at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:418)
at io.netty.resolver.dns.DnsResolveContext.access$600(DnsResolveContext.java:66)
at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:467)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
at io.netty.resolver.dns.DnsQueryContext.tryFailure(DnsQueryContext.java:256)
at io.netty.resolver.dns.DnsQueryContext$4.run(DnsQueryContext.java:208)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:406)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Suppressed: java.lang.Exception: #block terminated with an error
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
at reactor.core.publisher.Mono.block(Mono.java:1707)
(..Project classes..)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.UnknownHostException: Failed to resolve 'cloudconnector.linkup-sage.com'
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1047)
at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1000)
at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:418)
at io.netty.resolver.dns.DnsResolveContext.access$600(DnsResolveContext.java:66)
at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:467)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
at io.netty.resolver.dns.DnsQueryContext.tryFailure(DnsQueryContext.java:256)
at io.netty.resolver.dns.DnsQueryContext$4.run(DnsQueryContext.java:208)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:406)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.netty.resolver.dns.DnsNameResolverTimeoutException: [/10.43.0.10:53] query via UDP timed out after 300000 milliseconds (no stack trace available)
Steps to Reproduce
Instanciation code
private WebClient buildWebClient() {
var httpClient = HttpClient.create()
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 30 * 1000)
.responseTimeout(Duration.ofSeconds(30))
.resolver(spec -> spec.queryTimeout(Duration.ofSeconds(300)));
return WebClient.builder()
.defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
.clientConnector(new ReactorClientHttpConnector(httpClient))
.build();
Web request call code
MultiValueMap<String, String> bodyMap = new LinkedMultiValueMap<>();
bodyMap.add("grant_type", "client_credentials");
bodyMap.add("client_id", myProperties.getClientId());
bodyMap.add("client_secret", myProperties.getClientSecret());
response = webClientDefault.post()
.uri(myProperties.getAuthorizationEndPoint())
.contentType(MediaType.APPLICATION_FORM_URLENCODED)
.body(BodyInserters.fromValue(bodyMap))
.retrieve()
.toEntity(MyResponse.class)
.block();
Possible Solution
We are investing time to look into possible fixes / cache / remediation in the kubernetes infrastructure
Another fix being pursued is upgrading the various component versions, and adding a retry mecanism.
Your Environment
Versions :
JDK 17.0.6
Spring-boot 2.7.4
reactor-netty 1.0.23
netty 4.1.82.Final
The issue is probably well described here : https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts and https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/ and kubernetes/kubernetes#56903
Can reactor-netty set the DNS resolution to use TCP instead of UDP ?
reactor-netty/reactor-netty-core/src/main/java/reactor/netty/transport/NameResolverProvider.java
Line 553
9aa0877
https://github.com/netty/netty/blob/7dba4fcf499425914e09993d1f2bd439794cb3d6/resolver-dns/src/main/java/io/netty/resolver/dns/DnsNameResolverBuilder.java#L145
Also you are running with an old version of Netty and there are fixes in the newer versions that might be related.
Facing similar issue on 4.1.91.Final, seeing high address resolver time 5 - 20s and connect time in multiples of 5 seconds, some requests failing with
io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Failed to resolve '**host**' [A(1), AAAA(28)] and search domain query for configured domains failed as well:
Facing this with EKS 1.24 and older EKS versions as well.
Thanks for looking into this and pointing to multiple ressources.
Upgrading to the reactory-netty 1.0.38 (Spring-boot-webflux 2.7.7, netty 4.1.100.Final) we get the following error :
org.springframework.web.reactive.function.client.WebClientRequestException: Failed to resolve 'id.sage.com' [A(1)]; nested exception is java.net.UnknownHostException: Failed to resolve 'id.sage.com' [A(1)]
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141)
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ Request to POST https://id.sage.com/oauth/token [DefaultWebClient]
Original Stack Trace:
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141)
at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:204)
at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:225)
at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:274)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerError(FluxConcatMap.java:309)
at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onError(FluxConcatMap.java:875)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:451)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:251)
at reactor.core.publisher.EmitterProcessor.drain(EmitterProcessor.java:537)
at reactor.core.publisher.EmitterProcessor.tryEmitNext(EmitterProcessor.java:343)
at reactor.core.publisher.SinkManySerialized.tryEmitNext(SinkManySerialized.java:100)
at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:27)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:190)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect$ClientTransportSubscriber.onError(HttpClientConnect.java:311)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.resources.DefaultPooledConnectionProvider$DisposableAcquire.onError(DefaultPooledConnectionProvider.java:160)
at reactor.netty.internal.shaded.reactor.pool.AbstractPool$Borrower.fail(AbstractPool.java:475)
at reactor.netty.internal.shaded.reactor.pool.SimpleDequePool.lambda$drainLoop$9(SimpleDequePool.java:431)
at reactor.core.publisher.FluxDoOnEach$DoOnEachSubscriber.onError(FluxDoOnEach.java:186)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)
at reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnectionAllocator$PooledConnectionInitializer.onError(DefaultPooledConnectionProvider.java:558)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:106)
at org.springframework.security.config.annotation.web.configuration.SecurityReactorContextConfiguration$SecurityReactorContextSubscriber.onError(SecurityReactorContextConfiguration.java:171)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4455)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
at reactor.netty.transport.TransportConnector$MonoChannelPromise.tryFailure(TransportConnector.java:569)
at reactor.netty.transport.TransportConnector.lambda$doResolveAndConnect$11(TransportConnector.java:368)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:110)
at io.netty.resolver.InetSocketAddressResolver$2.operationComplete(InetSocketAddressResolver.java:86)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1105)
at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1044)
at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:432)
at io.netty.resolver.dns.DnsResolveContext.access$700(DnsResolveContext.java:66)
at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
at io.netty.resolver.dns.DnsQueryContext.finishFailure(DnsQueryContext.java:348)
at io.netty.resolver.dns.DnsQueryContext$5.run(DnsQueryContext.java:290)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:416)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Suppressed: java.lang.Exception: #block terminated with an error
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
at reactor.core.publisher.Mono.block(Mono.java:1707)
[snip]
org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.UnknownHostException: Failed to resolve 'id.sage.com' [A(1)]
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1097)
at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1044)
at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:432)
at io.netty.resolver.dns.DnsResolveContext.access$700(DnsResolveContext.java:66)
at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
at io.netty.resolver.dns.DnsQueryContext.finishFailure(DnsQueryContext.java:348)
at io.netty.resolver.dns.DnsQueryContext$5.run(DnsQueryContext.java:290)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:416)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.netty.resolver.dns.DnsNameResolverTimeoutException: [42865: /10.43.0.10:53] DefaultDnsQuestion(id.sage.com. IN A) query '42865' via UDP timed out after 300000 milliseconds (no stack trace available)
io.netty.resolver.dns.DnsNameResolverTimeoutException when application is running Kubernetes
netty/netty#13705
reactor-netty/reactor-netty-core/src/main/java/reactor/netty/transport/NameResolverProvider.java
Line 553
9aa0877
https://github.com/netty/netty/blob/7dba4fcf499425914e09993d1f2bd439794cb3d6/resolver-dns/src/main/java/io/netty/resolver/dns/DnsNameResolverBuilder.java#L145
@violetagg I found this issue report while I have been investigating a similar problem where we have been seeing a WebClientRequestException
-> DnsResolveContext$SearchDomainUnknownHostException
-> DnsNameResolverTimeoutException
exception chain while using Netty 4.1.105.Final with Spring Boot 3.
I think that the TCP fallback is only turned on by default in reactor-netty in situations where the UDP DNS response has been truncated by the network.
Looking at the code in DnsQueryContext
it looks like the TCP fallback will occur for situations where the DNS response is truncated because the socketBootstrap
is set via the DnsNameResolver
(which occurs because socketChannelFactory
is also set).
https://github.com/netty/netty/blob/151dfa083d28e995a18f7d2c73d4a7d3b7ab73b2/resolver-dns/src/main/java/io/netty/resolver/dns/DnsQueryContext.java#L328
https://github.com/netty/netty/blob/151dfa083d28e995a18f7d2c73d4a7d3b7ab73b2/resolver-dns/src/main/java/io/netty/resolver/dns/DnsNameResolver.java#L437
In situations where the DNS response times out it looks like there is another variable, retryWithTcpOnTimeout
which controls whether a TCP retry is attempted.
https://github.com/netty/netty/blob/151dfa083d28e995a18f7d2c73d4a7d3b7ab73b2/resolver-dns/src/main/java/io/netty/resolver/dns/DnsQueryContext.java#L373
As far as I can tell retryWithTcpOnTimeout
has to be set in the DnsNameResolverBuilder
when calling the socketChannelFactory
or socketChannelType
methods. It's a second parameter to those methods. However NameResolverProvider
doesn't set this to true, so I believe that TCP fallback might not be enabled by default in this situation.
https://github.com/netty/netty/blob/151dfa083d28e995a18f7d2c73d4a7d3b7ab73b2/resolver-dns/src/main/java/io/netty/resolver/dns/DnsNameResolverBuilder.java#L186