SEVERE: JMX scrape failed: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException
: Connection refused to host: localhost; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
at io.prometheus.jmx.JmxScraper.doScrape(JmxScraper.java:94)
at io.prometheus.jmx.JmxCollector.collect(JmxCollector.java:456)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:183)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.<init>(CollectorRegistry.java:147)
at io.prometheus.client.CollectorRegistry.filteredMetricFamilySamples(CollectorRegistry.java:134)
at io.prometheus.client.exporter.HTTPServer$HTTPMetricHandler.handle(HTTPServer.java:60)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: localhost; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)]
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:136)
at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205)
at javax.naming.InitialContext.lookup(InitialContext.java:417)
at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955)
at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287)
... 16 more
Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338)
at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112)
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132)
... 21 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:211)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
... 26 more
In brokers we see errors like
ERROR [Controller id=2 epoch=145] Controller 2 epoch 145 failed to change state for partition topic-name from OfflinePartition to OnlinePartition (state.change.logger)
kafka.common.StateChangeFailedException: Failed to elect leader for partition topic-name under strategy OfflinePartitionLeaderElectionStrategy(false)
ERROR [Controller id=2 epoch=145] Controller 2 epoch 145 failed to change state for partition topic from OfflinePartition to OnlinePartition (state.change.logger)
liveLeaders=(cp-kafka-headless.kafka:9092 (id: 0 rack: null))) to broker cp-kafka-headless.kafka:9092 (id: 0 rack: null). Reconnecting to broker. (kafka.controller.RequestSendThread)
java.io.IOException: Connection to 0 was disconnected before the response was read
at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:253)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
but when I check from AKHQ everything is working fine, I can publish and consume messages from all 3 brokers.
How to check why Prometheus is not able to connect to broker?
Thanks
when I check from AKHQ everything is working fine, I can publish and consume messages from all 3 brokers.
How to check why Prometheus is not able to connect to broker?
Prometheus doesn’t connect to kafka. It connects to the exporter. You will want to exec into the pod, run ps aux
and look at the Java process args to verify the JVM arguments are defined as you expect. Then you can use curl to hit the exporter’s metrics endpoint on your own.
After that’s verified as working, go look at the ServiceMonitor definition.