dnandha/miauth

Xiaomi Pro 2 with DisplayDash: Hang in `COMM` state in `BluePy.wait_notify`

Closed this issue · 11 comments

I'm trying to connect to my Xiaomi Pro 2 (BLE 1.3.6, though I tried 1.2.9 too) using miauth. However, it seems to hang after successfully getting to comms state:

$ python3 -m miauth.cli  ... --serial -d -r
Namespace(mac='...', m365=False, nb=False, command=None, serial=True, version=False, debug=True, register=True, token_file='./mi_token')
Using Mi
Connecting
enabling notifications for: 6e400003-b5a3-f393-e0a9-e50e24dcca9e
enabling notifications for: 00000010-0000-1000-8000-00805f9b34fb
enabling notifications for: 00000019-0000-1000-8000-00805f9b34fb
Registering
Private Key (Val): ...
Public Key (Hex): bb 2c 8c 7e 81 16 66 92 7d 45 64 f5 42 5a 27 c2 d8 e3 79 e8 29 a5 c7 cd 2b 46 a7 1c 40 51 fb fd e0 62 ae 60 18 0e 4d 19 0c 23 15 c0 0e 11 c9 70 45 69 4c 6a cd c0 6d 98 53 66 b3 27 fe 7f d3 17
new state: 0
<- 00 00 00 00 02 00 0
Expecting 2 frames
<- 01 00 01 00 00 00 00 62 6c 74 2e 33 2e 31 37 74 65 6b 67 33 0
<- 02 00 70 30 65 38 30 30 0
All frames received:  01 00 00 00 00 62 6c 74 2e 33 2e 31 37 74 65 6b 67 33 70 30 65 38 30 30
new state: 1
<- 00 00 01 01 1
Mi ready to receive key
<- 00 00 01 00 1
Mi confirmed key receive
new state: 2
<- 00 00 00 03 04 00 2
Expecting 4 frames
<- 01 00 fa 2c ca ef ea 71 40 a0 29 8e b9 60 13 92 a9 41 54 59 2
<- 02 00 fc 0a 26 86 78 5d b9 1d 5b 85 ed 75 3d 9d 8c 26 fc e0 2
<- 03 00 35 08 72 c5 6e 76 18 77 a0 bb 37 7b b7 69 1b 50 1b 6a 2
<- 04 00 e9 19 42 82 64 37 b0 a8 cc 4b 2
All frames received:  fa 2c ca ef ea 71 40 a0 29 8e b9 60 13 92 a9 41 54 59 fc 0a 26 86 78 5d b9 1d 5b 85 ed 75 3d 9d 8c 26 fc e0 35 08 72 c5 6e 76 18 77 a0 bb 37 7b b7 69 1b 50 1b 6a e9 19 42 82 64 37 b0 a8 cc 4b
new state: 3
eShareKey: ...
HKDF result:  ...
token: ...
bind_key: ...
A: ...
AES did CT:  ...
<- 00 00 01 01 3
Mi ready to receive key
<- 00 00 01 00 3
Mi confirmed key receive
new state: 4
<- 11 00 00 00 4
Mi authentication successful!
new state: 5

It hangs here:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/florian/proj/xiaomi-garmin/py/miauth/lib/python/miauth/cli.py", line 158, in <module>
    main()
  File "/home/florian/proj/xiaomi-garmin/py/miauth/lib/python/miauth/cli.py", line 154, in main
    mi_main(ble)
  File "/home/florian/proj/xiaomi-garmin/py/miauth/lib/python/miauth/cli.py", line 107, in mi_main
    mc.register()
  File "/home/florian/proj/xiaomi-garmin/py/miauth/lib/python/miauth/mi/miclient.py", line 243, in register
    self.ble.wait_notify(secs=3.0)
  File "/home/florian/proj/xiaomi-garmin/py/miauth/lib/python/miauth/ble/blue.py", line 95, in wait_notify
    while self.p.waitForNotifications(secs):
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 560, in waitForNotifications
    resp = self._getResp(['ntfy','ind'], timeout)
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 407, in _getResp
    resp = self._waitResp(wantType + ['ntfy', 'ind'], timeout)
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 342, in _waitResp
    fds = self._poller.poll(timeout*1000)
KeyboardInterrupt

which makes sense to me after reading through the code: MiClient.register never gets past this line:

self.ble.wait_notify(secs=3.0)

Because BluePy.wait_notify is actually an endless loop, calling bluepy's waitForNotifications over and over again:

def wait_notify(self, secs=1.0):
    while self.p.waitForNotifications(secs):
        continue

So I don't see how this could possibly work. Surely I must be missing something?

Err, and as usual, as soon as you post something, things get clearer of course 😅 - this should work assuming that bluepy's waitForNotifications returns False at some point, i.e. the timeout elapsed without getting a notification.

However, in my case, it seems to always ~immediately return with True when called. If I print handle and data in handleNotification(), I get 11 and b''. From what I understand, that's UUID 6e400003-b5a3-f393-e0a9-e50e24dcca9e, i.e. the RX channel.
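A minimal sketch of one way to keep that loop from spinning forever, assuming it is acceptable to give up after an overall deadline. The name wait_notify_bounded, the max_total parameter, and the FakePeripheral test double are hypothetical and not part of miauth or bluepy; only waitForNotifications is the real bluepy call.

```python
import time


def wait_notify_bounded(peripheral, secs=1.0, max_total=10.0):
    """Drain notifications until a poll times out or max_total seconds elapse.

    Returns True if the link went quiet (a poll returned False), or False if
    notifications were still arriving when the overall deadline passed.
    """
    deadline = time.monotonic() + max_total
    while peripheral.waitForNotifications(secs):
        if time.monotonic() >= deadline:
            return False
    return True


class FakePeripheral:
    """Test double standing in for bluepy's Peripheral (hypothetical)."""

    def __init__(self, notifications):
        # None means "notifications never stop", like the chatty scooter here.
        self.remaining = notifications

    def waitForNotifications(self, timeout):
        if self.remaining is None:
            return True
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


# Two notifications, then silence: the loop exits normally.
quiet = wait_notify_bounded(FakePeripheral(2), secs=0.01, max_total=1.0)
# Notifications never stop: the deadline ends the loop instead.
chatty = wait_notify_bounded(FakePeripheral(None), secs=0.01, max_total=0.05)
print(quiet, chatty)
```

This only bounds the wait. It does not explain where the extra notifications come from, so the caller still has to decide what a False return means for the handshake.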

@The-Compiler The waitForNotifications loop should exit once you stop receiving messages, which normally is the case after confirmation ("11 00 00 00"). So you're saying that after confirmation you instead keep receiving empty messages b'' in a loop?

It looks like it, yes - though I don't see them being logged anywhere, only when I print them in handleNotification (which seems to ignore them and return).

That's odd. For now you could add a new function on_comm_state that disables notifications (self.ble.disable_notify) for the RX channel and hook that to the COMM state in self.seq. This should get you out of the loop.

To find the actual solution you would need to find out why it keeps receiving messages out of nowhere.

So I tried this:

diff --git i/lib/python/miauth/mi/miclient.py w/lib/python/miauth/mi/miclient.py
index bc6eff8..c135c2d 100644
--- i/lib/python/miauth/mi/miclient.py
+++ w/lib/python/miauth/mi/miclient.py
@@ -229,13 +229,16 @@ def on_send_did_state():
         def on_confirm_state():
             self.ble.write(UUID.UPNP, MiCommand.CMD_AUTH)
 
+        def on_comm_state():
+            self.ble.disable_notify(self.ble.channels[UUID.RX])
+
         self.seq = ((MiClient.State.INIT, None),
                     (MiClient.State.RECV_INFO, on_recv_info_state()),
                     (MiClient.State.SEND_KEY, on_send_key_state),
                     (MiClient.State.RECV_KEY, None),
                     (MiClient.State.SEND_DID, on_send_did_state),
                     (MiClient.State.CONFIRM, on_confirm_state),
-                    (MiClient.State.COMM, None),
+                    (MiClient.State.COMM, on_comm_state),
                     )
         self.seq_idx = 0
 
@@ -255,6 +258,7 @@ def on_confirm_state():
 
                 return self.register()  # return because of recursion
             else:
+                self.ble.enable_notify(self.ble.channels[UUID.RX])
                 break
 
     def save_token(self, filename):
@@ -284,6 +288,9 @@ def on_send_did_state():
                 f"{self.remote_info.hex(' ')} != {expected_remote_info.hex(' ')}"
             self.ble.write(UUID.AVDTP, MiCommand.CMD_SEND_INFO)
 
+        def on_comm_state():
+            self.ble.disable_notify(self.ble.channels[UUID.RX])
+
         self.seq = (
             (MiClient.State.INIT, None),
             (MiClient.State.SEND_KEY, on_send_key_state),
@@ -291,7 +298,7 @@ def on_send_did_state():
             (MiClient.State.RECV_INFO, on_recv_info_state),
             (MiClient.State.SEND_DID, on_send_did_state),
             (MiClient.State.CONFIRM, None),
-            (MiClient.State.COMM, None),
+            (MiClient.State.COMM, on_comm_state),
         )
         self.seq_idx = 0
 
@@ -299,6 +306,8 @@ def on_send_did_state():
         while self.get_state() != MiClient.State.COMM:
             self.ble.wait_notify(secs=3.0)
 
+        self.ble.enable_notify(self.ble.channels[UUID.RX])
+
     def comm(self, cmd):
         if self.get_state() != MiClient.State.COMM:
             raise Exception("Not in COMM state. Retry maybe.")

but with that, I now seem to get an endless stream of messages after the login:

Mi login successful!
new state: State.COMM
Retrieving serial number
<- 55 ab 10 01 00 bc ad c7 96 11 44 ba 8a ae 83 af 31 19 c9 ae State.COMM
<- 2d 3f b4 a0 29 d8 fe 91 4a e2 72 f2 State.COMM
<- 55 ab 06 02 00 8a d2 42 d5 89 f9 44 9b fb c7 f9 6f b1 ea 68 State.COMM
<- f6 f5 State.COMM
<- 55 ab 20 03 00 20 46 b7 5d 54 96 80 09 a9 6f bb 6a 0f 92 2b State.COMM
<- 6f da 60 b9 9d 0d b2 fd 2c b2 3c 23 ce e8 99 c9 07 de 3e 5e State.COMM
<- fc f7 da 2b c2 a8 f3 ea State.COMM
<- 55 ab 0c 04 00 2d d8 39 c1 cd 9d cf 4b eb a6 0b 19 42 d0 3f State.COMM
<- 89 76 31 07 12 16 07 f7 State.COMM
<- 55 ab 06 05 00 30 ac 12 aa e8 29 1c a3 db f6 83 5a 55 5a 23 State.COMM
<- 0c f9 State.COMM
<- 55 ab 20 06 00 70 cb 78 78 73 19 02 02 76 03 b6 89 24 c9 cc State.COMM
<- 88 8b cc c7 3c de ad c0 41 0a f4 5b f0 da 14 a5 dd 6b 4c ba State.COMM
<- cc 56 df c0 0b 7c cd ea State.COMM
<- 55 ab 0c 07 00 76 7d a2 39 06 16 00 09 9d 9e 39 2b 87 ee 99 State.COMM
<- b6 ae 38 49 f1 dc 9a f6 State.COMM
<- 55 ab 22 08 00 af d7 d9 be 4d 89 7b 08 23 6d cb 92 c1 73 31 State.COMM
<- 4b 4c 71 37 4b 86 02 0d 71 e8 c1 e9 7f a2 08 f6 ac c7 d1 1e State.COMM
<- e5 d9 65 9e 98 78 75 3b e9 e9 State.COMM
<- 55 ab 06 09 00 d8 39 a6 2c 82 6d 7d b7 59 ee 03 06 34 d8 47 State.COMM
<- 47 f9 State.COMM
<- 55 ab 0c 0a 00 af 5b 33 66 0b bb f0 c7 d6 77 fb dd 38 a3 aa State.COMM
<- ef 01 9f 0d 25 f2 6c f4 State.COMM
<- 55 ab 22 0b 00 a5 ad 96 99 5a c4 1f b7 61 7a 1f a4 9b 92 55 State.COMM
<- dc 75 c8 32 f4 ff 3b 35 18 0e ae b9 c8 29 c1 22 4f 56 b7 38 State.COMM
<- 81 19 da 8b b6 e8 83 e3 97 e9 State.COMM
<- 55 ab 06 0c 00 95 4c 47 25 2b 7e 80 55 b9 94 0a b7 86 01 bc State.COMM
<- d1 f9 State.COMM
<- 55 ab 22 0d 00 da 4a c6 e0 53 7a 85 38 78 33 85 0c 56 af 87 State.COMM
<- 31 57 ed 30 f2 3f 96 0b 5f e4 80 9f b1 ef 2c b9 81 fb 19 11 State.COMM
<- 89 d5 68 e9 e5 57 40 b6 cf e9 State.COMM
<- 55 ab 06 0e 00 7f 7b 83 23 73 8b ad 71 87 3a 4c eb 7a 52 c0 State.COMM
<- ab f8 State.COMM
<- 55 ab 20 0f 00 60 de cd 30 7f 33 0b 95 1b d6 e3 72 fe b6 22 State.COMM
<- ab 64 cc d6 d9 7c 5a c9 a7 63 3f 61 d8 da 15 75 8b 8d 8f ef State.COMM
<- 5d 1e 9d 2b 55 cc 1e ea State.COMM
<- 55 ab 0c 10 00 0f 84 75 85 cd f8 76 6e e6 e8 09 52 9b 62 92 State.COMM
<- 7b d1 09 22 11 8d e0 f5 State.COMM
<- 55 ab 22 11 00 01 02 e7 6b 86 bf d7 34 f1 64 59 da 74 9e 5f State.COMM
<- ab 82 f2 86 89 ef ab 8e f7 2d e2 8c a1 ab 4d 87 11 e4 36 88 State.COMM
<- d7 26 5f fe b2 4b 9c 87 f9 e7 State.COMM
<- 55 ab 06 12 00 e7 1b 82 ee 94 9b 8d 73 a5 83 4f 0d 40 aa 05 State.COMM
<- d3 f8 State.COMM

Any idea what could be going on there, or how I could debug this? Maybe I should try to sniff/log the comms ScooterHackingUtility does (which works just fine), and see if I notice something compared to miauth?

The only exotic thing about my scooter - other than SHFW - is that I have a Display Dash built in. From what I understood, that only uses the UART bus, but maybe that causes those BLE messages somehow too?

Since it's a single-wire UART line, the first thing you should do is sniff the bus to see whether the Display Dash is sending messages on it. Given that these issues don't appear on my stock scooter, I strongly suspect this to be the cause.

In my Python implementation I don't expect any rogue messages to appear; the state machine can't handle them. It would be interesting to know whether my Java implementation is able to deal with that (I think it is). Regarding SHU, we don't have access to the source code to check how they designed the timeouts and filters, so I don't bother with it. All I can say is that the EC protocol described here has been modeled 1:1 after the official app, a sniff of which you can find here.

Edit: I had a look at the stats Display Dash shows, and it must be sending messages on the bus, because most of these stats need to be read from the ESC and aren't transported over the heartbeat command. So to fix the endless-message bug in miauth after login, a filter that matches each response to its request should be added.

@The-Compiler This kind of response handling could get you out of the endless message loop during COMM states. Decrypting each packet like this will also give you insight into what the commands are; I could imagine dropping all packets not addressed to the client device (identified by the destination byte in the command).
You could adopt the relevant part and if it runs on your scooter please open a PR. Thanks for supporting this project :)
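A rough sketch of such a request/response filter, under assumptions inferred only from the logs in this thread: the request "55aa032001 10 0e" is taken to be laid out as 55 aa, length, address, command, register, payload, and a decrypted reply is taken to start with address, command, register (as in the "230110" lines of the dump output). The 0x20 to 0x23 reply address pair, the helper names, and the sample traffic are all hypothetical, not miauth API.

```python
# Reply address per request address. Only the 0x20 -> 0x23 pair is inferred
# from the logs in this thread; treat the whole table as an assumption.
REPLY_ADDR = {0x20: 0x23}


def parse_request(hex_cmd):
    """Split an unencrypted request like '55aa032001 10 0e' into its header."""
    raw = bytes.fromhex(hex_cmd)  # fromhex skips the spaces
    if len(raw) < 6 or raw[:2] != b"\x55\xaa":
        raise ValueError("not a 55 aa framed request")
    # Assumed layout: 55 aa <len> <addr> <cmd> <reg> <payload...>
    return raw[3], raw[4], raw[5]


def is_reply_to(hex_cmd, decrypted):
    """True if a decrypted packet looks like the answer to hex_cmd."""
    addr, cmd, reg = parse_request(hex_cmd)
    if addr not in REPLY_ADDR:
        return False
    return decrypted[:3] == bytes([REPLY_ADDR[addr], cmd, reg])


def first_reply(hex_cmd, packets):
    """Return the payload of the first matching packet, dropping the rest."""
    for pkt in packets:
        if is_reply_to(hex_cmd, pkt):
            return pkt[3:]
    return None


# Made-up decrypted traffic: two unsolicited packets, then the real answer.
traffic = [
    bytes.fromhex("250131") + b"\x0e\x41",
    bytes.fromhex("2301b0") + b"\x00\x01",
    bytes.fromhex("230110") + b"EXAMPLE-SERIAL",
]
serial = first_reply("55aa032001 10 0e", traffic)
print(serial)
```

Matching on command and register as well as the address means packets that a third party such as the Display Dash requested for other registers are ignored rather than mistaken for the answer.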

I ended up just trying to run the m365-mi mqttdump.py example with the mqtt bits hacked out (and some other bugfixes):

diff --git i/examples/mqttdump.py w/examples/mqttdump.py
index 788dd86..a09f06b 100644
--- i/examples/mqttdump.py
+++ w/examples/mqttdump.py
@@ -26,7 +26,7 @@
 
 from mim365mi.m365scooter import M365Scooter
 
-from message import *
+#from message import *
 
 import struct
 
@@ -47,7 +47,7 @@
 
 from miauth.mi.micrypto import MiCrypto
 
-from paho.mqtt import client as mqtt_client
+#from paho.mqtt import client as mqtt_client
 port = 1883
 broker = '127.0.0.1'
 topic = "m365/test/"
@@ -67,7 +67,7 @@ def on_connect(client, userdata, flags, rc):
     return client
 
 def main():
-    mqtt = connect_mqtt()
+    #mqtt = connect_mqtt()
     mc = M365Scooter(btle.Peripheral(), args.mac, debug=args.debug)
 
     def lol(reg, payload):
@@ -165,13 +165,13 @@ def lol(reg, payload):
 
     print("Retrieving serial number")
 
-    mc.comm_simplex("55aa032001 10 0e")
-    #print("Serial number:", resp.decode())
+    resp = mc.comm("55aa032001 10 0e")
+    print("Serial number:", resp.decode())
     time.sleep(3)
 
-    #print("Retrieving firmware version")
-    #resp = mc.comm("55aa032001 1a 10")
-    #print("Firmware version:", f"{resp[0]}.{resp[1]}")
+    print("Retrieving firmware version")
+    resp = mc.comm("55aa032001 1a 10")
+    print("Firmware version:", f"{resp[0]}.{resp[1]}")
 
     cmd = str(battery_info._raw_bytes.hex())
     print("Sending command:", cmd)

And that seems to connect and start dumping things immediately:

Connecting
Loading token from: ./mi_token
Logging in...
-> 24000000
-> 0000000b0100
Mi ready to receive key
-> 010094780d8ad0e2ba43efd7ae6af2167f5e
Mi confirmed key receive
Expecting 1 frames
-> 00000101
All frames received:  dc7992197baacf24c51b7cf3e85ba0bd
-> 00000100
Expecting 2 frames
-> 00000101
All frames received:  d4c293407354062a3307560ce8ba6fa5868bb95acd996ebcf7fd5e6c4c80dff7
-> 00000100
HKDF result: ...
DEV_KEY: ...
APP_KEY: ...
DEV_IV: ...
APP_IV: ...
-> 0000000a0200
Mi ready to receive key
-> 01008e4739eef78e6244f824b830be5b8479f9f0
-> 0200e44da599ce600768e4ba8759a66f
Mi confirmed key receive
Mi login successful!
Retrieving serial number
-> 55ab0300004efdbd55b34b111632f60079d9fa
230110 1099070470737260410021662072190258
triptime1 5
bms_cell_voltages [3.649, 3.647, 3.649, 3.65, 3.647, 3.648, 3.652, 3.649, 3.651, 3.646]
250131 265486567529551500613918
2301b0 29692108411784786777639893063440094801751860518424936448
triptime1 5
bms_cell_voltages [3.649, 3.646, 3.649, 3.65, 3.647, 3.649, 3.652, 3.649, 3.651, 3.646]
250131 265486567529551500613918
2301b0 29692108411784786777639893063440094801751860518424936448
triptime1 5
bms_cell_voltages [3.649, 3.647, 3.649, 3.65, 3.647, 3.648, 3.652, 3.649, 3.651, 3.647]
250131 265486567529551500613918
2301b0 29692109873286424108542811267124927518034880174357479424
triptime1 6
bms_cell_voltages [3.649, 3.646, 3.649, 3.65, 3.647, 3.649, 3.653, 3.65, 3.651, 3.647]
250131 265486567529551500613918
2301b0 29692109873286424108542811267124927518034880174357479424
triptime1 6
bms_cell_voltages [3.649, 3.647, 3.649, 3.65, 3.647, 3.649, 3.653, 3.65, 3.652, 3.647]
250131 265486567529551500613918
2301b0 29692109873286424108542811267124927518034880174357479424

Note, however, that it seems to hang waiting for an answer to the serial number request:

^CTraceback (most recent call last):
  File "/home/florian/proj/xiaomi-garmin/py/m365-mi/examples/mqttdump.py", line 266, in <module>
    main()
  File "/home/florian/proj/xiaomi-garmin/py/m365-mi/examples/mqttdump.py", line 168, in main
    resp = mc.comm("55aa032001 10 0e")
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/miauth/mi/miclient.py", line 356, in comm
    while self.p.waitForNotifications(3.0):
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 560, in waitForNotifications
    resp = self._getResp(['ntfy','ind'], timeout)
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 407, in _getResp
    resp = self._waitResp(wantType + ['ntfy', 'ind'], timeout)
  File "/home/florian/proj/xiaomi-garmin/py/.venv/lib/python3.10/site-packages/bluepy/btle.py", line 342, in _waitResp
    fds = self._poller.poll(timeout*1000)
KeyboardInterrupt

So yeah, it would probably be a good strategy to somehow drop whatever isn't what we actually requested. However, I guess that won't fix waitForNotifications, right? We can't tell it what exactly to wait for, so we would need some custom logic there too...
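One possible shape for that custom logic, as a sketch only: keep calling waitForNotifications, but judge each received packet against a predicate and stop on the first match or when a deadline passes. The inbox queue (assumed to be filled by the delegate's handleNotification), wait_for_response, and the FakePeripheral test double are hypothetical names, not existing miauth or bluepy code.

```python
import time
from collections import deque


def wait_for_response(peripheral, inbox, is_match, timeout=3.0):
    """Poll until a packet satisfying is_match arrives or timeout elapses.

    inbox is a deque that the notification handler appends packets to.
    Returns the matching packet, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Discard unsolicited packets; stop at the first one we asked for.
        while inbox:
            pkt = inbox.popleft()
            if is_match(pkt):
                return pkt
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        peripheral.waitForNotifications(remaining)


class FakePeripheral:
    """Test double: each poll delivers the next scripted packet to inbox."""

    def __init__(self, inbox, script):
        self.inbox = inbox
        self.script = deque(script)

    def waitForNotifications(self, timeout):
        if not self.script:
            return False
        self.inbox.append(self.script.popleft())
        return True


def wanted(pkt):
    # Made-up match rule: packet starts with the bytes 23 01 10.
    return pkt[:3] == bytes.fromhex("230110")


inbox = deque()
dev = FakePeripheral(inbox, [bytes.fromhex("250131aa"),
                             bytes.fromhex("2301b0bb"),
                             bytes.fromhex("230110cc")])
hit = wait_for_response(dev, inbox, wanted, timeout=1.0)

noisy_inbox = deque()
noisy = FakePeripheral(noisy_inbox, [bytes.fromhex("250131aa")] * 3)
miss = wait_for_response(noisy, noisy_inbox, wanted, timeout=0.05)
print(hit.hex(), miss)
```

The deadline is what lets the caller return even when unrelated traffic never stops, which a loop that waits for silence cannot do.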

Right now I don't feel like I really understand how this all works (only really started with all this recently). Given that I now have a (more or less) working communication example, I'll probably first focus on trying to port it all to the Garmin ecosystem. After that I will probably have a better understanding of it all, and might come back here - but no promises I'm afraid, as I do have a lot on my plate with other projects as well...

Quick update: It's indeed the DisplayDash... With it disconnected, miauth.cli with --serial does seem to work without any modifications...

@The-Compiler Alright, that makes sense so far. Made some changes to the handler since that was a TODO after all. Can you test if #7 solves the issue?

Closing with #7. For future issues, please open a new issue.