π§ Lab 3.0
This library helps you organize machine learning experiments. It is a quite small library, and most of the modules can be used independently of each other. This doesn't have any user interface. Experiment results are maintained in a folder structure, and there is a Python API to access them.
Features
Organize Experiments
Maintains logs, summaries and checkpoints of all the experiment runs in a folder structure.
logs
βββ experiment1
β βββ run1
β β βββ run.yaml
β β βββ configs.yaml
β β βββ indicators.yaml
β β βββ source.diff
β β βββ checkpoints
β β β βββ π Saved checkpoints
β β βββ tensorboard
β β βββ π TensorBoard summaries
β βββ run1
β ...
βββ experiment2...
βββ
...
π Dashboard to browse experiments
The web dashboard helps navigate experiments and multiple runs. You can checkout the configs and a summary of performance. You can launch TensorBoard directly from there.
Eventually, we want to let you edit configs and run new experiments and analyse outputs on the dashboard.
Logger
Logger has a simple API to produce pretty console outputs.
Manage configurations and hyper-parameters
You can setup configs/hyper-parameters with functions. π§ͺlab would identify the dependencies and run them in topological order.
@Configs.calc()
def model(c: Configs):
return Net().to(c.device)
You can setup multiple options for configuration functions. So you don't have to write a bunch if statements to handle configs.
@Configs.calc('optimizer')
def sgd(c: Configs):
return optim.SGD(c.model.parameters(), lr=c.learning_rate, momentum=c.momentum)
@Configs.calc('optimizer')
def adam(c: Configs):
return optim.Adam(c.model.parameters())
Getting Started
pip
directly from github.
Install it via pip install -e git+git@github.com:vpj/lab.git#egg=lab
.lab.yaml
file.
Create a An empty file at the root of the project should be enough. You can set project level configs for 'check_repo_dirty' and 'path' in the config file.
Lab will store all experiment data in folder logs/
relative to .lab.yaml
file.
If path
is set in .lab.yaml
then it will be stored in [path]logs/
relative to .lab.yaml
file.
You don't need the .lab.yaml
file if you only plan on using the logger.
Samples
samples folder contains a bunch of examples of using π§ͺ lab.
Usage
Logger
from lab import logger
from lab.logger import colors
Logging with colors
logger.log("Red", color=colors.BrightColor.red)
logger.log_color([
('Red', colors.BrightColor.red),
'Normal',
('Cyan', colors.BrightColor.cyan)
])
Logging debug info
logger.info(a=2, b=1)
logger.info(dict(name='Name', price=22))
Log indicators
Here you specify indicators and the logger stores them temporarily and write in batches. It can aggregate and write them as means or histograms.
Example
for i in range(1, N):
logger.add_global_step()
loss = train()
logger.store(loss=loss)
if i % 10 == 0:
logger.write()
if i % 1000 == 0:
logger.new_line()
This stores all the loss values and writes the logs the mean on every tenth iteration.
Console output line is replaced until new_line
is called.
Indicator settings
add_indicator(name: str,
type_: IndicatorType = IndicatorType.scalar,
options: IndicatorOptions = None)
class IndicatorType(Enum):
queue = 'queue'
histogram = 'histogram'
scalar = 'scalar'
pair = 'pair'
class IndicatorOptions(NamedTuple):
is_print: bool = False
queue_size: int = 10
You can specify details of an indicator using add_indicator
.
If this is not called like the example above, it assumes a scalar
with is_print=True
.
histogram
indicators will log a histogram of data.
queue
will store data in a deque
of size queue_size
, and log histograms.
Both of these will log the means too. And if is_print
is True
it will print the mean.
pair
is still under development.
Sections
with logger.section("Load data"):
# code to load data
with logger.section("Create model"):
# code to create model
Sections let you monitor time taken for different tasks and also helps keep the code clean by separating different blocks of code.
Progress
with logger.section("train", total_steps=100):
for i in range(100):
# Multiple training steps in the inner loop
logger.progress(i)
This shows the progress for code within the section.
Iterator and Enumerator
for data, target in logger.iterator("Test", test_loader):
...
for i, (data, target) in logger.enumerator("Train", train_loader):
...
This combines section
and progress
Loop
for step in logger.loop(range(0, total_steps)):
# training code ...
The loop
keeps track of the time taken and time remaining for the loop.
You can use sections, iterators and enumerators within loop.
Experiment
exp = Experiment(name="mnist_pytorch",
comment="Test")
name
: Name of the experiment; If not provided, this defaults to the calling python filenamecomment
: Comment about the current experiment run
Starting an expriemnt
exp.start()
This does the initialization work like creating log folders, writing run details, etc.
exp.start(run=-1)
This starts the experiment from where it was left in the last run. You can provide a specific the run index to start from.
Save Checkpoint
logger.save_checkpoint()
This saves a checkpoint and you can start from the saved checkpoint with
exp.start(run=-1)
.
π§ Configs
π§ Training Loop
Background
I was coding existing reinforcement learning algorithms to play Atari games for fun. It was not easy to keep track of things when I started trying variations, fixing bugs etc. Then I wrote some tools to organize my experiment runs. I found it important to keep track of git commits to make sure I can reproduce results.
I also wrote a logger to display pretty results on screen and to make it easy to write TensorBoard summaries. It also keeps track of training times which makes it easy to spot what's taking up most resources.
This library is was made by combining these bunch of tools.