pytorch/kineto

GPU traces fail to load when using PyTorch Lightning due to square brackets in traceName

agkphysics opened this issue · 2 comments

PyTorch Lightning saves traces with names of the form fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json, so the value of the traceName key contains a closing square bracket ]. Because traceName is the final key in the JSON file, the trace fails to load in the TensorBoard viewer when GPU ops are present. The cause is this line, which truncates the data at the last ] in the file. That ] sits inside the traceName string, so the result is invalid JSON:

raw_data_without_tail = raw_data[: raw_data.rfind(b']')]

This results in the error Uncaught (in promise) SyntaxError: JSON.parse, ... in the web browser.
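A minimal sketch of the failure, independent of the plugin itself. The trace contents, the appended counter event, and the helper names (make_trace, append_after_last_bracket, is_valid_json) are made up for illustration. Only the rfind slice is taken from the line quoted above, and the way the extra bytes are joined back on is an assumption about what the surrounding code does.

```python
import json


def make_trace(trace_name: str) -> bytes:
    # Hypothetical minimal trace with traceName as the final key.
    trace = {
        "traceEvents": [{"name": "op", "ph": "X", "ts": 0, "dur": 1}],
        "traceName": trace_name,
    }
    return json.dumps(trace).encode("utf-8")


def append_after_last_bracket(raw_data: bytes, extra: bytes) -> bytes:
    # Same slice as the line quoted in this issue: cut at the last ']'.
    raw_data_without_tail = raw_data[: raw_data.rfind(b']')]
    # Assumed follow-up step: append extra events, then close array and object.
    return raw_data_without_tail + extra


def is_valid_json(data: bytes) -> bool:
    try:
        json.loads(data)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical GPU counter event to append.
extra = b', {"name": "gpu_util", "ph": "C", "ts": 0, "args": {"value": 1}}]}'

plain = append_after_last_bracket(make_trace("plain.pt.trace.json"), extra)
bracketed = append_after_last_bracket(
    make_trace("fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json"),
    extra,
)

print(is_valid_json(plain))      # name without brackets
print(is_valid_json(bracketed))  # name with brackets, as saved by PyTorch Lightning
```

With a bracket-free name the last ] is the end of the traceEvents array, so the splice lands in the right place. With the Lightning name the last ] is inside the traceName string, and the appended bytes end up in the middle of that string.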

@agkphysics would this commit catch the issue?
51fd6e6

@briancoutinho No, I don't think this would help, because it only seems to replace forward slashes and remove newlines from JSON strings. The issue is how the GPU stats are appended to the file before it is passed to the trace viewer.
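For illustration, one way the append step could avoid depending on where the last ] falls is to parse the trace, extend the event list, and serialize it again. This is only a sketch, not the plugin's code: append_trace_events and the sample events are hypothetical, and it assumes the trace is a JSON object with a traceEvents array.

```python
import json


def append_trace_events(raw_data: bytes, extra_events: list) -> bytes:
    # Parse the whole trace instead of slicing bytes at the last ']'.
    trace = json.loads(raw_data)
    # Assumes a top-level "traceEvents" array; create it if it is missing.
    trace.setdefault("traceEvents", []).extend(extra_events)
    return json.dumps(trace).encode("utf-8")


# Hypothetical trace whose traceName contains square brackets.
raw = json.dumps({
    "traceEvents": [{"name": "op", "ph": "X", "ts": 0, "dur": 1}],
    "traceName": "fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json",
}).encode("utf-8")

# Hypothetical GPU counter event.
gpu_events = [{"name": "gpu_util", "ph": "C", "ts": 0, "args": {"value": 1}}]

result = json.loads(append_trace_events(raw, gpu_events))
print(len(result["traceEvents"]), result["traceName"])
```

The trade-off is that a full parse and re-serialize costs more time and memory on large traces than a byte slice does, which may be why the slice was used in the first place.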