3601314/hbase-python

hbase.exceptions.RequestError: org.apache.hadoop.hbase.exceptions.UnknownProtocolException

hbase 2.1.0
python 3.6

My code:

import hbase

zk = '172.25.33.230:2181,172.25.33.231:2181,172.25.33.232:2181'


def test():
    with hbase.ConnectionPool(zk).connect() as conn:
        table = conn['hbase']['TraceV2']

        table.count()
    exit()

if __name__ == '__main__':
    test()

The exception is as follows:

Traceback (most recent call last):
  File "D:/wehotel_product/untitled/test-hbase/main.py", line 18, in <module>
    test()
  File "D:/wehotel_product/untitled/test-hbase/main.py", line 13, in test
    table.count()
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\table.py", line 204, in count
    batch_size=500
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\table.py", line 477, in __next__
    batch = self._client.iter_scanner(self.scanner)
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\client\client.py", line 1259, in iter_scanner
    region = self._region_manager.get_region(scanner.__table__, start_key)
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\client\region.py", line 199, in get_region
    region = self._region_lookup(meta_key)
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\client\region.py", line 258, in _region_lookup
    resp = self._meta_service.request(req)
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\services\services.py", line 61, in request
    return self._request.call(pb_req)
  File "D:\wehotel_product\untitled\venv\lib\site-packages\hbase\services\request.py", line 192, in call
    raise exceptions.RequestError(error)
hbase.exceptions.RequestError: org.apache.hadoop.hbase.exceptions.UnknownProtocolException

When I check the response, I find this error information in the header:

call_id: 0
exception {
  exception_class_name: "org.apache.hadoop.hbase.exceptions.UnknownProtocolException"
  stack_trace: "org.apache.hadoop.hbase.exceptions.UnknownProtocolException: Is this a pre-hbase-1.0.0 or asynchbase client? Client is invoking getClosestRowBefore removed in hbase-2.0.0 replaced by reverse Scan.\n\tat org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2445)\n\tat org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)\n\tat org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)\n\tat org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)\n"
  do_not_retry: true
}

Is there a problem with my setup? Please help me.

This problem is caused by protocol changes introduced in HBase 2.0.
To be specific, when the client tries to look up the region server for your requested row, it calls the "getClosestRowBefore" method to query the "meta server". However, this method was removed in version 2.0 and replaced by a reverse "Scan". The incompatibility can be fixed by making the client use the new protocol.
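
To illustrate why a reverse scan can stand in for "getClosestRowBefore", here is a minimal sketch over an in-memory sorted list. It is a toy model, not HBase code: the function names and the simplified meta keys are made up for illustration, and the start row is assumed to be inclusive.

```python
import bisect


def closest_row_before(sorted_rows, key):
    """Toy model of the removed call: greatest row <= key, or None."""
    idx = bisect.bisect_right(sorted_rows, key)
    return sorted_rows[idx - 1] if idx > 0 else None


def reverse_scan(sorted_rows, start_row, limit=1):
    """Toy model of a reverse scan: walk backwards from start_row
    (inclusive) and return at most `limit` rows."""
    idx = bisect.bisect_right(sorted_rows, start_row)
    return sorted_rows[:idx][::-1][:limit]


# Hypothetical, simplified meta keys; they only need to be sorted.
meta_rows = [b"t,,1", b"t,g,2", b"t,p,3"]
lookup_key = b"t,k,9"

old_style = closest_row_before(meta_rows, lookup_key)
new_style = reverse_scan(meta_rows, lookup_key, limit=1)
```

In this model both calls select the same row, which is why a reverse scan limited to one row can replace the old lookup.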

Solutions:

  1. Change your HBase server to version 1.2.6 (or some other version older than 2.0) if possible.
  2. I will add compatibility to the client, but that will take some time, since our PhDs are very busy in the lab.
    Thanks for your understanding!

Any idea when we can get this cool thing done for HBase 2.1.0? @XoriieInpottn

@mohamedniyaz1996
I solved this problem recently.

As @XoriieInpottn and the error message said, this problem occurs because getClosestRowBefore is called when looking up the region server. However, this method was removed in version 2.0 and replaced by a reverse scan.

The solution is to modify the RegionManager._region_lookup function in hbase/client/region.py. The code is as follows:

    def _region_lookup(self, meta_key):
        column = protobuf.Column()
        column.family = b'info'
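        # Removed: the old Get request relied on closest_row_before,
        # which HBase 2.0 no longer supports.
        # req = protobuf.GetRequest()
        # req.get.row = meta_key.encode()
        # req.get.column.extend([column])
        # req.get.closest_row_before = True
        # req.region.type = 1
        # req.region.value = b'hbase:meta,,1'

        # Added: a reverse Scan starting at meta_key that returns one row.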
        req = protobuf.ScanRequest()
        req.scan.column.extend([column])
        req.scan.start_row = meta_key.encode()
        req.scan.reversed = True
        req.region.type = 1
        req.region.value = b'hbase:meta,,1'
        req.number_of_rows = 1

        try:
            resp = self._meta_service.request(req)
        except exceptions.RegionError:
            while True:
                time.sleep(3)
                try:
                    resp = self._meta_service.request(req)
                    break
                except exceptions.RegionError:
                    continue
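        # Added: the scan response holds a list of results; take the cells of the first one.
        cells = []
        for result in resp.results:
            cells = result.cell
            break
        # Removed: the old Get response held a single result.
        # cells = resp.result.cell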
        if len(cells) == 0:
            return None

        region_name = cells[0].row.decode()
        server_info = None
        region_info = None
        for cell in cells:
            qualifier = cell.qualifier.decode()
            if qualifier == 'server':
                server_info = cell.value.decode()
            elif qualifier == 'regioninfo':
                region_info_bytes = cell.value
                magic = struct.unpack(">4s", region_info_bytes[:4])[0]
                if magic != b'PBUF':
                    raise exceptions.ProtocolError(
                        'Meta region server returned an invalid response. b\'PBUF\' expected, got %s.' % magic
                    )
                region_info = protobuf.RegionInfo()
                region_info.ParseFromString(region_info_bytes[4:-4])

        if server_info is None:
            raise exceptions.ProtocolError(
                'Server host information not found.'
            )
        if region_info is None:
            raise exceptions.ProtocolError(
                'Region information not found.'
            )

        host, port = server_info.split(':')
        port = int(port)
        table = region_info.table_name.namespace.decode() + ':' + region_info.table_name.qualifier.decode()
        start_key = region_info.start_key.decode()
        end_key = region_info.end_key.decode()
        return Region(region_name, table, start_key, end_key, host, port)
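
The retry loop in the function above keeps retrying every 3 seconds for as long as a RegionError is raised. If you prefer to give up after a number of attempts, the following is one possible sketch. The helper name, its parameters, and the local RegionError stand-in are assumptions for illustration and are not part of hbase-python.

```python
import time


class RegionError(Exception):
    """Stand-in for hbase.exceptions.RegionError so the sketch runs on its own."""


def request_with_retry(send, req, max_attempts=5, delay=3.0, sleep=time.sleep):
    """Call send(req), retrying on RegionError up to max_attempts times.

    `send` is any callable that performs the request, for example a bound
    `request` method. The last RegionError is re-raised if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(req)
        except RegionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
```

Inside _region_lookup this could be used as `resp = request_with_retry(self._meta_service.request, req)`, with the except clause changed to catch exceptions.RegionError instead of the stand-in class.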

Furthermore, you can update the .proto and .py files in hbase/protobuf to support more features of newer HBase versions. The .proto files come from the HBase project, for example https://github.com/apache/hbase/tree/rel/2.1.4/hbase-protocol-shaded/src/main/protobuf. The .py files can be generated using the grpc tool.