添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

Hello,

YARN Timeline Service Reader is not starting anymore due to following error:

2018-12-08 12:59:18,852 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4859 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:22,895 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=7, retries=7, started=8902 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:32,955 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=8, retries=8, started=18962 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:42,965 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=9, retries=9, started=28972 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:53,064 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=10, started=39071 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 13:00:03,101 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=11, retries=11, started=49108 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.th

It seems that HBase has a problem (although I am not using this service on Ambari).

Then I checked following log file hadoop-yarn-timelinereader-foodscience-01.log

Caused by: java.net.ConnectException: Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:165)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
    at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
    at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
    at org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:307)
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1377)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:929)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$500(NettyRpcConnection.java:71)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:269)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:263)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    ... 1 more
Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hbase.thirdparty.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
    ... 7 more
Caused by: java.net.ConnectException: Connection refused
    ... 11 more
2018-12-06 13:03:33,051 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:run(315)) - 0x4d465b11 no activities for 60000 ms, close active connection. Will reconnect next time when there are new requests.
2018-12-06 13:03:57,614 INFO  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(170)) - Running HBase liveness monitor
2018-12-06 13:04:24,100 ERROR reader.TimelineReaderServer (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM
2018-12-06 13:04:24,116 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.w.WebAppContext@12299890{/,null,UNAVAILABLE}{/timeline}
2018-12-06 13:04:24,125 INFO  server.AbstractConnector (AbstractConnector.java:doStop(318)) - Stopped ServerConnector@328af33d{HTTP/1.1,[http/1.1]}{0.0.0.0:8198}
2018-12-06 13:04:24,128 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@7d3e8655{/static,jar:file:/usr/hdp/3.0.0.0-1634/hadoop-yarn/hadoop-yarn-common-3.1.0.3.0.0.0-1634.jar!/webapps/static,UNAVAILABLE}
2018-12-06 13:04:24,128 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@7dfd3c81{/logs,file:///var/log/hadoop-yarn/yarn/,UNAVAILABLE}
2018-12-06 13:04:24,142 INFO  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:serviceStop(108)) - closing the hbase Connection
2018-12-06 13:04:24,143 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x4d465b11 to examples.foodscience-01.de:2181,examples.foodscience-02.de:2181,examples.foodscience-03.de:2181
2018-12-06 13:04:24,143 WARN  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(183)) - Got failure attempting to read from timeline storage, assuming HBase down
java.io.UncheckedIOException: java.io.InterruptedIOException
    at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
    at org.apache.hadoop.yarn.server.timelineservice.storage.reader.TimelineEntityReader.readEntities(TimelineEntityReader.java:283)
    at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl$HBaseMonitor.run(HBaseTimelineReaderImpl.java:174)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.InterruptedIOException
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:246)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:834)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:732)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
    at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:53)
    ... 9 more
2018-12-06 13:04:24,153 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x5b7a5baa to examples.foodscience-01.de:2181,examples.foodscience-02.de:2181,examples.foodscience-03.de:2181
2018-12-06 13:04:24,155 INFO  reader.TimelineReaderServer (LogAdapter.java:info(51)) - SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down TimelineReaderServer at examples.foodscience-01.de/163.49.39.115

I dont know why this error appears when starting the timeline service. How can this be fixed?

@Andreas Kühnert

You will need to have HBase installed because the error snippet

" details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta, "

YARN Timeline Service v.2 uses a set of collectors (writers) to write data to the backend storage it uses Apache HBase as the primary backing storage, as Apache HBase scales well to a large size while maintaining good response times for reads and writes. The collectors are distributed and co-located with the application masters to which they are dedicated.

On the nodes running the region servers, you should be able to see the software installed on that node.

HTH


Hi Andreas Kühnert

We are facing same issue, we don't want to install hbase as our VM's as we don't have capacity to handle HDFS and Hbase on same cluster. Do we have any work around for this

thanks in advance for the suggestion

Apache Hadoop and associated open source project names are trademarks of the Apache Software Foundation. For a complete list of trademarks, click here.