Bitmessage/PyBitmessage

msgid is stored as text type

kashikoibumi opened this issue · 5 comments

msgid is stored as text type.
It should be binary blob.

$ sqlite3 messages.dat
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> SELECT TYPEOF(msgid) FROM inbox LIMIT 1;
text

hash in inventory is OK.

sqlite> SELECT TYPEOF(hash) FROM inventory LIMIT 1;
blob

This is true, but there are potential problems with upgrades. That's why the sql thread should be refactored first and added tests. These are the PRs that tried that but they are too big and need cleaning up: #1794 #1999 #2150

The type of values whose declared type is blob:

  • inbox.msgid : text
  • inbox.sighash : text
  • inventory.hash : blob
  • inventory.payload : blob
  • inventory.tag : blob
  • objectprocessorqueue.data : (empty)
  • pubkeys.transmitdata : text
  • sent.msgid : text
  • sent.toripe : text
  • sent.ackdata : text
  • settings.key : text
  • settings.value : text, integer

It actually isn't consistent across versions, there may be older databases which are different. That's why we need tests for this, even more for the data type change.

How about this:

  • Change all code at once to store all blob data as blob by using sqlite3.Binary(). Then,
  • Search any blob data:
    • First, try SELECT by blob keys using sqlite3.Binary(key).
    • When there is no match found, as a fallback, try SELECT by text key using CAST(? AS TEXT) in SQLite sintax.

I think this should be compatible to any past versions and future updates, including migrations to Python3.

I have implemented a quick workaround: #2248