This is a tool to automatically track which piece of code created your data.
In particular, every time you save a numpy array or a torch tensor, this tool will automatically create a file with metadata about the code that created the data. For example:
.
├── data.npy
├── .data.npy.who
├── ...
├── weights.pt
└── .weights.pt.who
The `who` dotfiles are stored in plaintext, so you can open them in your favorite IDE or directly `cat`/`bat` them:
bat .data.npy.who
───────┬─────────────────────────────────────────────────────────────────────
│ File: .data.npy.who
───────┼─────────────────────────────────────────────────────────────────────
1 │ Script Path: /path/to/example_numpy.py
2 │ -------------------------------------------------------
3 │ Code:
4 │ -------------------------------------------------------
5 │ import numpy as np
6 │ import whocreatedme as wcme
7 │ wcme.trace(numpy=True)
8 │
9 │ np.save("./test.npy", np.random.randn(50, 50))
10 │ -------------------------------------------------------
───────┴─────────────────────────────────────────────────────────────────────
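Because each sidecar sits next to its data file under the name `.<filename>.who`, it can also be located from Python. The following is a minimal sketch based only on the naming convention shown in the listing above; `who_path` and `read_who` are hypothetical helper names and are not part of `whocreatedme`.

```python
from pathlib import Path
from typing import Optional


def who_path(data_file: str) -> Path:
    """Map dir/name to dir/.name.who, following the layout shown above."""
    p = Path(data_file)
    return p.with_name(f".{p.name}.who")


def read_who(data_file: str) -> Optional[str]:
    """Return the sidecar's text, or None if the data file has no sidecar."""
    sidecar = who_path(data_file)
    return sidecar.read_text() if sidecar.exists() else None
```

For instance, `read_who("data.npy")` would return the contents of `.data.npy.who` if that file exists in the current directory, and `None` otherwise.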
You can either use the CLI or the Python API:
python -m whocreatedme.cli <your-script> [--npsave] [--torchsave]
For example, to trace both the `numpy` and `torch` `save` methods for a script `script.py`, simply run:
python -m whocreatedme.cli script.py --npsave --torchsave
Alternatively, you can trace the `numpy` and `torch` `save` methods from your code by using the Python API:
import numpy as np
import whocreatedme
whocreatedme.trace()
# this will automatically create a file `.data.npy.who` in the same directory as `data.npy`
np.save("data.npy", np.array([1, 2, 3]))
pip install whocreatedme
For now, this lightweight module only supports monkey-patching the `.save()` methods in `numpy` and `torch`.
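To illustrate the general idea of monkey-patching a save function, here is a minimal sketch that uses only the standard library. It is not the actual implementation of `whocreatedme`: `with_who_sidecar` and `fakelib` are hypothetical names, and `fakelib` is a stand-in for a library module so that the example runs without `numpy` or `torch` installed.

```python
import functools
import inspect
import os
import tempfile
import types
from pathlib import Path


def with_who_sidecar(save_fn):
    """Wrap a save(path, ...) function so each call also writes a .who file."""

    @functools.wraps(save_fn)
    def wrapper(path, *args, **kwargs):
        result = save_fn(path, *args, **kwargs)
        # Index 1 of the stack is the frame that called the patched save().
        caller = inspect.stack(context=0)[1]
        script = os.path.abspath(caller.filename)
        target = Path(path)
        sidecar = target.with_name(f".{target.name}.who")
        sidecar.write_text(f"Script Path: {script}\n")
        return result

    return wrapper


# Hypothetical stand-in for a library module exposing save(); not numpy or torch.
fakelib = types.SimpleNamespace(
    save=lambda path, data: Path(path).write_bytes(bytes(data))
)

# The monkey-patch itself: rebind the attribute to the wrapped function.
fakelib.save = with_who_sidecar(fakelib.save)

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "data.bin")
    fakelib.save(out, [1, 2, 3])  # also creates .data.bin.who next to data.bin
    sidecar_text = Path(tmp, ".data.bin.who").read_text()
```

A patch for a real library would need more care than this sketch. For example, `np.save` appends `.npy` to a filename that lacks it and also accepts open file objects, so the sidecar name cannot always be derived directly from the first argument.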
This was me before I started using `whocreatedme`:
All images in the README were created with DALL-E.