Debugger state restoration
JohanMabille opened this issue · 4 comments
The frontend should be able to request the state of the debugger so that it can restore its own state, for instance after a reload. This implies adding a new message to the debug protocol.
Besides, a cell with breakpoints needs to be dumped to a real file, so that the debugger can break when the execution hits a breakpoint. The current solution is to send the content of the cell to the backend in the `dumpCell` request, and let the backend compute the hash of this content and then dump the content to a file whose name contains this hash. Notice that the backend also needs to compute this file name (and therefore the hash of the content) in the implementation of `execute_request`, to "map" the code of the cell to the file. The names computed in `execute_request` and `dumpCell` must match, otherwise the debugger cannot break.
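The matching constraint can be sketched by routing both requests through a single helper. This is only an illustration, not the xeus-python code: the names `cell_file_name`, `handle_dump_cell` and `handle_execute_request` are hypothetical, and `zlib.crc32` is a stand-in for whatever hash the backend really uses.

```python
import os
import tempfile
import zlib


def cell_file_name(code: str, tmp_dir: str) -> str:
    """Derive the dump file name from the cell content (hypothetical helper)."""
    # zlib.crc32 is only a stand-in for the real hash method.
    digest = zlib.crc32(code.encode("utf-8"))
    return os.path.join(tmp_dir, f"{digest}.py")


def handle_dump_cell(code: str, tmp_dir: str) -> str:
    """Write the cell to disk so the debugger has a real file to break in."""
    path = cell_file_name(code, tmp_dir)
    with open(path, "w", encoding="utf-8") as f:
        f.write(code)
    return path


def handle_execute_request(code: str, tmp_dir: str):
    """Compile the cell against the same file name as the one used for the dump."""
    path = cell_file_name(code, tmp_dir)
    return compile(code, path, "exec")


tmp_dir = tempfile.mkdtemp(prefix="debug_cells_")
cell = "x = 1\ny = x + 1\n"
dumped_path = handle_dump_cell(cell, tmp_dir)
code_obj = handle_execute_request(cell, tmp_dir)
```

Because both handlers call the same helper, the file name attached to the compiled code is the path of the dumped file, which is the condition the debugger needs in order to break.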
With the current implementation, it is impossible to restore the breakpoints in the frontend after a reload. The frontend needs to ask the kernel which breakpoints were set, and how files map to cells. After in-person conversations with @jtpio and @SylvainCorlay, the following solutions are considered:
- the hash is computed in the backend only. In that case, upon reload, the frontend needs to send the content of each cell to get the associated filename, and can then ask the kernel for the breakpoints.
- the hash is computed in the frontend only. In that case, it must be sent to the backend in the `dumpCell` request, but also as an additional parameter of the `execute_request`.
- the frontend and the backend can both compute the hash when they need it. In that case, they must agree on a hash method (this can be done in the `debug_info_request` message, or hardcoded) and the full state of the debugger can be retrieved with a single request.
I think solution 2 should be avoided since it requires modifying the current protocol with an additional parameter that makes sense for debugging only. Solution 1 is simpler than solution 3 since the hash method remains an implementation detail of the backend; however, it sends many messages upon reload (one per cell), while solution 3 requires only one request.
Also, since many notebooks can be opened at the same time, it could be useful to "cache" the state and the breakpoints in the DebugSession objects to avoid requesting the backend each time we switch from one notebook to the other (or to a console).
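The caching idea could look roughly like the sketch below. It is written in Python for illustration only; `DebuggerState`, `DebugStateCache` and the `fetch` callback are hypothetical names, not part of the existing DebugSession code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DebuggerState:
    """State kept per session: started flag and breakpoints per source file."""
    is_started: bool = False
    breakpoints: Dict[str, List[int]] = field(default_factory=dict)


class DebugStateCache:
    """Ask the backend once per session, then reuse the stored answer."""

    def __init__(self, fetch: Callable[[str], DebuggerState]) -> None:
        self._fetch = fetch  # hypothetical callback that queries the kernel
        self._states: Dict[str, DebuggerState] = {}

    def get(self, session_id: str) -> DebuggerState:
        if session_id not in self._states:
            self._states[session_id] = self._fetch(session_id)
        return self._states[session_id]

    def invalidate(self, session_id: str) -> None:
        # Call this after a reload or a kernel restart to force a new request.
        self._states.pop(session_id, None)


calls: List[str] = []


def fake_fetch(session_id: str) -> DebuggerState:
    # Stand-in for the real request to the backend; records each call.
    calls.append(session_id)
    return DebuggerState(is_started=True, breakpoints={"cell.py": [2, 4]})


cache = DebugStateCache(fake_fetch)
cache.get("notebook-a")
cache.get("notebook-a")  # second access is served from the cache
```

Switching back and forth between notebooks then costs one request per session instead of one per switch, at the price of having to invalidate the entry whenever the kernel state may have changed.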
EDIT: reformulated and exposed the three possible solutions
After an in-person discussion with @SylvainCorlay and @jtpio, the decision is to make it possible to compute the hash from both the frontend and the backend.
The hash method used in the backend can be retrieved with the `debug_info_request` message.
As a first implementation (for testing), we can use MurmurHash2. The drawback of this method is that it does not give the same results on little-endian and big-endian architectures, so this might be problematic in the case of a JupyterHub deployment.
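To make the endianness concern concrete, here is a sketch of the 32-bit MurmurHash2 variant in Python. The reference C code reads each 4-byte block as a native integer, which is where the platform dependence comes from; the `byteorder` parameter below is an addition for illustration that simulates that native read. Which MurmurHash2 variant the backend uses is not stated in this thread, so treat the 32-bit choice as an assumption.

```python
def murmurhash2_32(data: bytes, seed: int, byteorder: str = "little") -> int:
    """32-bit MurmurHash2; `byteorder` simulates the host's native block read."""
    m = 0x5BD1E995
    r = 24
    mask = 0xFFFFFFFF

    h = (seed ^ len(data)) & mask
    n_blocks = len(data) // 4

    # Mix the input four bytes at a time.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], byteorder)
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Handle the last few bytes; these are read one by one, so they do not
    # depend on the byte order.
    tail = data[4 * n_blocks:]
    if len(tail) == 3:
        h ^= tail[2] << 16
    if len(tail) >= 2:
        h ^= tail[1] << 8
    if len(tail) >= 1:
        h ^= tail[0]
        h = (h * m) & mask

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


code = b"abcd"
as_little_endian = murmurhash2_32(code, 1234, "little")
as_big_endian = murmurhash2_32(code, 1234, "big")
```

The two values differ for the same input and seed, so a frontend and a kernel running on hosts with different byte orders would compute different file names. Decoding the blocks with a fixed byte order on both sides, as the default argument does here, would be one way to avoid that.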
Thanks @JohanMabille for the summary.
I also think it's fine to choose MurmurHash2 for now (for testing), which could be changed later if needed.
Solution 1 would have been simpler and more transparent for the frontend (no need to care about the kernel implementation details). But it doesn't solve the problem of restoring the state.
So solution 3 sounds like a good compromise at the moment.
We can choose this package to calculate the Murmur2 hash on the frontend: https://www.npmjs.com/package/murmurhash-js
Putting the response for the `debugInfo` request here for reference:
"content": {
  "body": {
    "breakpoints": [
      {
        "lines": [
          3,
          5
        ],
        "source": "/home/yoyo/dev/quantstack/xeus-python/build/test/external_code.py"
      },
      {
        "lines": [
          2,
          4
        ],
        "source": "/tmp/xpython_17036/10865146876830579964.py"
      }
    ],
    "hashMethod": "Murmur2",
    "hashSeed": 1234,
    "isStarted": true
  },
  "command": "debugInfo",
  "request_seq": 12,
  "success": true,
  "type": "response"
}
(@JohanMabille feel free to edit if something is missing)
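With a response of this shape, restoring breakpoints on the frontend side could be sketched as follows. This is a hedged illustration in Python rather than frontend code: `map_breakpoints_to_cells` and the cell ids are hypothetical, the hash function is passed in as a parameter, and matching on the file name without its directory and extension is an assumption about how dumped files are named.

```python
import os
from typing import Callable, Dict, List


def map_breakpoints_to_cells(
    debug_info_body: dict,
    cells: Dict[str, str],
    hash_fn: Callable[[str, int], int],
) -> Dict[str, List[int]]:
    """Return {cell id: breakpoint lines} for cells found in the response."""
    seed = debug_info_body["hashSeed"]
    # Index the cells by the file stem the kernel would have produced for them.
    stem_to_cell = {
        str(hash_fn(code, seed)): cell_id for cell_id, code in cells.items()
    }
    restored: Dict[str, List[int]] = {}
    for bp in debug_info_body["breakpoints"]:
        stem = os.path.splitext(os.path.basename(bp["source"]))[0]
        cell_id = stem_to_cell.get(stem)
        if cell_id is not None:  # sources that are not cells are skipped
            restored[cell_id] = list(bp["lines"])
    return restored


def toy_hash(code: str, seed: int) -> int:
    # Placeholder only; a real frontend would use the method named in hashMethod.
    return sum(code.encode("utf-8")) + seed


cells = {"cell-1": "a = 1\nb = 2\n", "cell-2": "print('hi')\n"}
body = {
    "breakpoints": [
        {"lines": [3, 5], "source": "/some/project/external_code.py"},
        {
            "lines": [2, 4],
            "source": f"/tmp/kernel_tmp/{toy_hash(cells['cell-1'], 1234)}.py",
        },
    ],
    "hashMethod": "toy",
    "hashSeed": 1234,
    "isStarted": True,
}
restored = map_breakpoints_to_cells(body, cells, toy_hash)
```

Only one request to the kernel is needed here; the per-cell work is done locally, which is the property that made solution 3 attractive.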