OpenTSDB/asynchbase

What's problem:Scan RPC response was for scanner ID 0 but we expected99

Closed this issue · 7 comments

Hi,I meet one problem that I don't know how to solve.

  1. Setup a opentsdb 2.3.0 and hbase 1.3.1 on my server(ubuntu 14.04).

  2. Insert one record into tsdb by REST API and the result is success.
    curl -i -H "Content-Type: application/json" -X POST -d '{"metric": "sys.cpu.nice", "timestamp": 1346846400,"value": 18, "tags": { "host": "web01"}}' http://localhost:4242/api/put/?details

  3. check it successfully in hbase.
    hbase(main):011:0> scan "tsdb-uid"
    ROW COLUMN+CELL
    \x00 column=id:metrics, timestamp=1493263774622, value=\x00\x00\x00\x00\x00\x00\x00\x01
    \x00 column=id:tagk, timestamp=1493263774714, value=\x00\x00\x00\x00\x00\x00\x00\x01
    \x00 column=id:tagv, timestamp=1493263774751, value=\x00\x00\x00\x00\x00\x00\x00\x01
    \x00\x00\x01 column=name:metrics, timestamp=1493263774694, value=sys.cpu.nice
    \x00\x00\x01 column=name:tagk, timestamp=1493263774722, value=host
    \x00\x00\x01 column=name:tagv, timestamp=1493263774753, value=web01
    host column=id:tagk, timestamp=1493263774725, value=\x00\x00\x01
    sys.cpu.nice column=id:metrics, timestamp=1493263774706, value=\x00\x00\x01
    web01 column=id:tagv, timestamp=1493263774755, value=\x00\x00\x01
    5 row(s) in 0.0490 seconds

  4. I select datetime and input metric "sys.cpu.nice" in web UI of opentsdb, then give me one ERROR.
    09:00:23.394 INFO [ConnectionManager.handleUpstream] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] OPEN
    09:00:23.399 INFO [ConnectionManager.handleUpstream] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] BOUND: /192.168.21.45:4242
    09:00:23.399 INFO [ConnectionManager.handleUpstream] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] CONNECTED: /10.142.91.196:40546
    09:00:23.403 INFO [ZooKeeper.] - Initiating client connection, connectString=192.168.21.45 sessionTimeout=5000 watcher=org.hbase.async.HBaseClient$ZKClient@4b1c1ea0
    09:00:23.408 INFO [HBaseClient.getDeferredRoot] - Need to find the .META. region
    09:00:23.411 INFO [ClientCnxn.logStartConnect] - Opening socket connection to server 192.168.21.45/192.168.21.45:2181. Will not attempt to authenticate using SASL (unknown error)
    09:00:23.416 INFO [ClientCnxn.primeConnection] - Socket connection established to 192.168.21.45/192.168.21.45:2181, initiating session
    09:00:23.421 INFO [ClientCnxn.onConnected] - Session establishment complete on server 192.168.21.45/192.168.21.45:2181, sessionid = 0x2d5ba819b79a0072, negotiated timeout = 5000
    09:00:23.424 INFO [HBaseClient.handleMetaZnode] - Connecting to .META. region @ 127.0.0.1:16201
    09:00:23.440 INFO [HBaseClient.discoverRegion] - Added client for region RegionInfo(table="tsdb", region_name="tsdb,,1493177138520.f63db24372e77f9cf2925ae121d796d6.", stop_key=""), which was added to the regions cache. Now we know that RegionClient@1802202875(chan=[id: 0x33549b11, /127.0.0.1:36157 => /127.0.0.1:16201], #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 1 region.
    09:00:23.448 INFO [ZooKeeper.close] - Session: 0x2d5ba819b79a0072 closed
    09:00:23.448 INFO [ClientCnxn.run] - EventThread shut down
    09:00:23.455 ERROR [RegionClient.decode] - Uncaught error during de-serialization of CloseScannerRequest(scanner_id=0x0000000000000003, attempt=0), rpcid=2
    09:00:23.455 ERROR [TsdbQuery.call] - Scanner Scanner(table="tsdb", start_key="", stop_key="", columns={}, populate_blockcache=true, max_num_rows=128, max_num_kvs=4096, region=none, filter=null, scanner_id=0x0EAD000CC000DEAD) threw an exception
    org.hbase.async.InvalidResponseException: Scan RPC response was for scanner ID 0 but we expected99
    at org.hbase.async.Scanner$CloseScannerRequest.deserialize(Scanner.java:1466) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:1497) [asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:88) [asynchbase-1.7.2.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.RegionClient.handleUpstream(RegionClient.java:1223) [asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:36) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.HBaseClient$RegionClientPipeline.sendUpstream(HBaseClient.java:3121) [asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.9.4.Final.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
    09:00:23.460 INFO [GraphHandler.logInfo] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] Query failed (stack trace coming): TsdbQuery(start_time=1493164800, end_time=1493298000, metric=[0, 0, 1], filters=[], rate=false, aggregator=sum, group_bys=())
    09:00:23.464 ERROR [HttpQuery.logError] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] Internal Server Error on /q?start=2017/04/26-00:00:00&end=2017/04/27-13:00:00&m=sum:sys.cpu.nice&o=&yrange=%5B0:%5D&wxh=1900x792&style=linespoint&json
    org.hbase.async.InvalidResponseException: Scan RPC response was for scanner ID 0 but we expected99
    at org.hbase.async.Scanner$CloseScannerRequest.deserialize(Scanner.java:1466) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:1497) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:88) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[netty-3.9.4.Final.jar:na]
    at org.hbase.async.RegionClient.handleUpstream(RegionClient.java:1223) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:36) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[netty-3.9.4.Final.jar:na]
    at org.hbase.async.HBaseClient$RegionClientPipeline.sendUpstream(HBaseClient.java:3121) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.9.4.Final.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
    09:00:23.467 INFO [HttpQuery.logInfo] - [id: 0xbba69285, /10.142.91.196:40546 => /192.168.21.45:4242] HTTP /q?start=2017/04/26-00:00:00&end=2017/04/27-13:00:00&m=sum:sys.cpu.nice&o=&yrange=%5B0:%5D&wxh=1900x792&style=linespoint&json done in 65ms
    09:00:23.468 ERROR [RegionClient.exceptionCaught] - Unexpected exception from downstream on [id: 0x33549b11, /127.0.0.1:36157 => /127.0.0.1:16201]
    org.hbase.async.InvalidResponseException: Scan RPC response was for scanner ID 0 but we expected99
    at org.hbase.async.Scanner$CloseScannerRequest.deserialize(Scanner.java:1466) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:1497) ~[asynchbase-1.7.2.jar:na]
    at org.hbase.async.RegionClient.decode(RegionClient.java:88) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) ~[netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.RegionClient.handleUpstream(RegionClient.java:1223) ~[asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.messageReceived(SimpleChannelHandler.java:142) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:36) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.9.4.Final.jar:na]
    at org.hbase.async.HBaseClient$RegionClientPipeline.sendUpstream(HBaseClient.java:3121) [asynchbase-1.7.2.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.9.4.Final.jar:na]
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.9.4.Final.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
    09:00:23.477 INFO [HBaseClient.handleDisconnect] - Channel [id: 0x33549b11, /127.0.0.1:36157 => /127.0.0.1:16201] is disconnecting: [id: 0x33549b11, /127.0.0.1:36157 => /127.0.0.1:16201] CLOSE
    09:00:23.477 INFO [HBaseClient.removeClientFromCache] - Lost connection with the .META. region

What's happend? How to solve it?
Can anyone help me, thanks!

Fix in https://issues.apache.org/jira/browse/HBASE-18000 (will ship with HBase 1.3.2)

FYI, @longxuanjackey
You will have to backport the patch to 1.3.1 to make this work or use another version.

@lhofhansl
OK, I will try to upgrade HBase 1.3.2 and check this problem is resolved or not.

@longxuanjackey
There are few other issues as well associated with this. Currently AsyncHBase is not compatible with 1.3.1 or above. https://issues.apache.org/jira/browse/HBASE-18042 tracks the progress of that JIRA. It will decide if a patch is needed in HBase or AsyncHBase.

Merged @karanmehta93 fix in 80c6b86 so this should be good. Also commented out the scanner close exception when Hbase returns a scanner ID of zero. Will see if that's fixed in 1.3.2. 509d717