Python module to create logs of program input options.

Inlog

Inlog is your solution to managing and tracking results from your Python programs. It simplifies the process of logging parameters and results, making your work more reproducible and organized.

Motivation

Imagine you have a Python program with some parameters. You run a few experiments with it, producing some results. Traditionally, people encode the parameters in the filename:

Result1_day2_3deg_4km.txt

There are a few problems with this:

  • The filenames get very long with more parameters.
  • The list of parameters is often incomplete.
  • The list of parameters is not machine readable.

Things are usually fine, until you try to reproduce your results a few months later...

inlog ("Input Logger") addresses those problems by creating a log file with all parameters next to your results. All you need to do is put your parameters into a separate file (for example in YAML or INI format). With just three additional lines in your Python code, inlog will create a comprehensive log file:

import inlog
config=inlog.load_yaml('Params.yaml', '1.0')
#do your stuff, using the parameters in config and save results into 'Results1.txt'
#...
config.write_log('Results1.txt') #this creates a file 'Results1.log' in the same folder as 'Results1.txt'

But there is more: inlog stores additional information such as the date, the runtime and a hash of your data. It can keep track of complex histories of your results, and it can even visualize them as a flowchart! That's how, in the long term, inlog helps you to remember:

  • Where did my results come from?
  • What do the results depend on?
  • What programs did I execute to get the results?

For a more comprehensive example, look at the examples folder.

Installation

The basic version of inlog only depends on the Python standard library.

python3 -m pip install inlog

If you want YAML support, install the extras version of inlog:

python3 -m pip install inlog[extras]

Usage

Input Parameters

It is not necessary to read input parameters from a separate file. However, in most cases, this is desirable. inlog currently supports the .ini, .json and .yaml file formats.

Ini Format

Parsing is done using the configparser module from the Python standard library. INI files consist of sections with options. All options are treated as strings. INI files allow for value interpolation; see the manual for further information. Example of an .ini file:

[Filepaths_General]
home=/home
bin=/bin
[Filepaths_User]
documents=~/Documents

import inlog
config=inlog.load_ini('config.ini',version='1.0')
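
The "all options are strings" and interpolation behaviour comes from configparser itself, so it can be tried without inlog. The following is a minimal sketch using only the standard library. The options `base` and `max_depth` are hypothetical additions to the example file above, and how inlog exposes interpolated values is not shown here.

```python
import configparser

# Same layout as the example above, plus two hypothetical options
# (base, max_depth) to demonstrate interpolation and string values.
INI_TEXT = """
[Filepaths_General]
home=/home
bin=/bin
[Filepaths_User]
base=/home/alice
documents=%(base)s/Documents
max_depth=3
"""

parser = configparser.ConfigParser()  # uses basic interpolation by default
parser.read_string(INI_TEXT)

home = parser["Filepaths_General"]["home"]
# %(base)s is replaced by the value of 'base' from the same section.
documents = parser["Filepaths_User"]["documents"]
# Numbers are returned as strings and must be converted explicitly.
max_depth_raw = parser["Filepaths_User"]["max_depth"]
max_depth = int(max_depth_raw)

print(home, documents, repr(max_depth_raw), max_depth)
```

This is why a type conversion step is needed for values read from .ini files.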

JSON Format

Example of a .json file:

{
  "Filepaths": {
    "Common": {
      "home": "/home",
      "bin": "/bin"
    },
    "User": {
      "documents": "~/Documents"
    }
  }
}

import inlog
config=inlog.load_json('config.json',version='1.0')

YAML Format

Parsing YAML files requires the pyyaml library. Example of a .yaml file:

Filepaths:
  Common:
    home: /home
    bin: /bin
  User:
    documents: ~/Documents

import inlog
config=inlog.load_yaml('config.yaml',version='1.0')

Dictionary

You can also pass a dictionary directly to the Logger class:

import inlog
dictionary={'Filepaths': {'Common': {'home': '/home','bin': '/bin'},'User': {'documents': '~/Documents'}}}
config=inlog.Logger(dictionary,version='1.0')

Accessing Parameters

inlog stores all parameters in a tree data structure. You can access them using the .get() method. This allows inlog to keep track of which options you used in your program and to write only those to the log file. Take the YAML example from above:

import inlog
config=inlog.load_yaml('config.yaml',version='1.0')
config.get('Filepaths', 'User', 'documents')

Similarly, use .set() to assign a value to a parameter:

config.set('~/MyDocs', 'Filepaths', 'User', 'documents')

You can change all the parameters in a subtree at once with set_subtree(), passing a nested dictionary followed by the path to the subtree:

config.set_subtree({'documents':'~/MyDocs', 'pictures':'~/MyPictures'}, 'Filepaths', 'User')

As a shortcut for accessing parameters, you can use bracket notation []:

config['Filepaths', 'User', 'documents']

In this case, inlog actually performs a depth-first search in the config tree, so with brackets you can shorten the path even further. All of the following commands yield the same result:

config['Filepaths', 'User', 'documents']
config['User', 'documents']
config['documents']
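
To illustrate why the shortened forms resolve to the same value, here is a minimal sketch of a depth-first lookup on a plain nested dictionary. The helper names `find`, `_search` and `_follow` are hypothetical and this is not inlog's actual implementation. In particular, this sketch simply returns the first match, and how inlog handles keys that occur in several subtrees is an open assumption.

```python
_MISSING = object()  # sentinel, so that None can be a legitimate value


def _follow(tree, keys):
    """Follow the key sequence starting exactly at this node."""
    node = tree
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def _search(tree, keys):
    """Try the keys at this node first, then in each child subtree."""
    found = _follow(tree, keys)
    if found is not _MISSING:
        return found
    for child in tree.values():
        if isinstance(child, dict):
            found = _search(child, keys)
            if found is not _MISSING:
                return found
    return _MISSING


def find(tree, *keys):
    """Depth-first search for a (possibly shortened) key path."""
    found = _search(tree, keys)
    if found is _MISSING:
        raise KeyError(keys)
    return found


params = {'Filepaths': {'Common': {'home': '/home', 'bin': '/bin'},
                        'User': {'documents': '~/Documents'}}}

full = find(params, 'Filepaths', 'User', 'documents')
shorter = find(params, 'User', 'documents')
shortest = find(params, 'documents')
print(full, shorter, shortest)
```

With unique option names the short form is convenient. If a name could appear in more than one section, spelling out the full path avoids ambiguity.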

Type conversion

For the .ini file format, all options are treated as strings. inlog provides the convert_type() and convert_array() methods as shortcuts for type conversions:

import pathlib
config.convert_type(int, 'option1')
config.convert_type(pathlib.Path, 'option1') # you can provide an arbitrary conversion function
config.convert_array(int, 'option1', removeSpaces=True, sep=",") # splits the string and converts the elements, returning a list
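
The split-and-convert step can be pictured with a small standalone helper. This is only a sketch of the idea described in the comment above; the function `split_and_convert` and its parameter names are hypothetical and do not reproduce inlog's own code.

```python
import pathlib


def split_and_convert(raw, converter, sep=",", remove_spaces=True):
    """Split a string at sep and apply converter to every element."""
    if remove_spaces:
        raw = raw.replace(" ", "")
    return [converter(part) for part in raw.split(sep)]


# A string as it would come out of an .ini file.
numbers = split_and_convert("1, 2, 3", int)
# Any callable works as converter, for example pathlib.Path.
paths = split_and_convert("/home, /bin", pathlib.Path)
print(numbers, paths)
```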

Hashes

You can give inlog the names of the result files your program produced. In this case, inlog will store SHA256 hash values of your results in the log, so you can later verify that your results truly belong to the parameters given in the log. Use set_outfile() to set the list of filenames and add_outfile() to append to it.

config.set_outfile('Results1.txt')
config.add_outfile('Results2.txt')
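
To check a result file against a hash recorded in a log, you can recompute the SHA256 digest with the standard library. The sketch below assumes that the logged hash is the plain SHA256 hex digest of the file contents; the helper `sha256_of_file` is hypothetical, and a temporary file stands in for a real result file.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    result_file = Path(tmp) / "Results1.txt"
    result_file.write_bytes(b"1 3 5 7 9\n")

    # Pretend this value was stored in the log when the file was written.
    recorded = sha256_of_file(result_file)
    unchanged = sha256_of_file(result_file) == recorded

    # Any later modification of the file changes the digest.
    result_file.write_bytes(b"1 3 5 7 10\n")
    still_matches = sha256_of_file(result_file) == recorded

print(unchanged, still_matches)
```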

Writing Logs

To write a log, you need to specify the file path (or multiple paths) of the new log. Optionally, you can specify existing log files, which will be included in the new log as dependencies. There are two different formats for logs: txt and json. By default, write_log() appends .log to each of the given filenames.

config.write_log('Results1.txt', old_logs=['Dependency1.txt'])

JSON Format

A JSON file. This format is the recommended default, since it can capture the tree-like structure of dependencies. Example:

{
    "date": "2023-10-08 20:59:51.019125",
    "program": "inlog/examples/Script2.py",
    "version": "1.0",
    "input": "Config2.ini",
    "runtime": "0:00:00.006449",
    "options": {
        "section1": {
            "factor": 2,
            "intermediate": "intermediateResult.dat",
            "result": "FinalResult.dat"
        }
    },
    "output_files": [
        {
            "path": "inlog/examples/FinalResult.dat",
            "hash": "6542c8602f59c351652e382f0448b2caba8c6404a133fca7b137ccd679bd7f4b"
        }
    ],
    "dependencies": {
        "inlog/examples/intermediateResult.dat.log": {
            "date": "2023-10-08 20:59:49.026310",
            "program": "inlog/examples/Script1.py",
            "version": "1.0",
            "input": "Config1.ini",
            "runtime": "0:00:00.005438",
            "options": {
                "section1": {
                    "start": 1,
                    "stop": 10,
                    "increment": 2,
                    "intermediate": "intermediateResult.dat"
                }
            },
            "output_files": [
                {
                    "path": "inlog/examples/intermediateResult.dat",
                    "hash": "22a23bcb0798a2b67902a51faad1d04fca6489abdc7c3f1ced983ac22658a721"
                }
            ],
            "dependencies": {}
        }
    }
}
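
Because the dependencies are nested under the "dependencies" key, such a log can be traversed recursively with the standard json module. The following sketch uses a trimmed copy of the example above, keeping only the "program" and "dependencies" keys. The function `walk` and the top-level name "FinalResult.dat.log" are assumptions made for illustration.

```python
import json

# Trimmed version of the example log: only the keys used below are kept.
LOG_TEXT = """
{
    "program": "inlog/examples/Script2.py",
    "dependencies": {
        "inlog/examples/intermediateResult.dat.log": {
            "program": "inlog/examples/Script1.py",
            "dependencies": {}
        }
    }
}
"""


def walk(log, name, depth=0):
    """Yield (depth, log name, program) for a log and all its dependencies."""
    yield depth, name, log["program"]
    for dep_name, dep_log in log["dependencies"].items():
        yield from walk(dep_log, dep_name, depth + 1)


log = json.loads(LOG_TEXT)
steps = list(walk(log, "FinalResult.dat.log"))
for depth, name, program in steps:
    print("  " * depth + f"{name} <- {program}")
```

In practice, the log would be read from the file with json.load() instead of from a string.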

Text Format

A linear text file, where dependencies are listed first and the new log information is appended at the end of the file. This format is straightforward and easy to read, but gets messy if you have multiple (sub-)dependencies. You can execute such a log as a bash script to reproduce the data. Example:

cd inlog/examples
python3 Script1.py Config1.ini
# <Date> 2023-10-08 20:57:47.744149
# <Program> inlog/examples/Script1.py
# <Version> 1.0
# <Input> Config1.ini
# <Runtime> 0:00:00.007015
#**************************
#{
#    "section1": {
#        "start": 1,
#        "stop": 10,
#        "increment": 2,
#        "intermediate": "intermediateResult.dat"
#    }
#}
#**************************
#Output files created:
# <PATH> inlog/examples/intermediateResult.dat
# <HASH> 22a23bcb0798a2b67902a51faad1d04fca6489abdc7c3f1ced983ac22658a721
# <Logfile> intermediateResult.dat.log_txt
#=========================================
cd inlog/examples
python3 Script2.py Config2.ini
# <Date> 2023-10-08 20:57:54.775511
# <Program> inlog/examples/Script2.py
# <Version> 1.0
# <Input> Config2.ini
# <Runtime> 0:00:00.007339
#**************************
#{
#    "section1": {
#        "factor": 2,
#        "intermediate": "intermediateResult.dat",
#        "result": "FinalResult.dat"
#    }
#}
#**************************
#Output files created:
# <PATH> inlog/examples/FinalResult.dat
# <HASH> 6542c8602f59c351652e382f0448b2caba8c6404a133fca7b137ccd679bd7f4b

Visualization

Printing the logger object will yield a text version of the current log. Equivalently, you can call show_data().

print(config)
config.show_data()

Flowchart

Calling inlog-flowchart logfile.dat.log will convert a log "logfile.dat.log" in JSON format to a Mermaid flowchart. Mermaid is a charting application based on a Markdown-like syntax. You can paste the output of flowchart.py into the Mermaid Live Editor to obtain a flowchart. Just call:

inlog/flowchart.py examples/FinalResult.log

This will yield the following output:

flowchart TD
    id_435d36b4a8d3c591b456a027fe49efa3[FinalResult.log]
    id_89f9132e25d1cada0f37669c8e3a2faa[intermediateResult.log]
    id_f1add11e031d031e9f8694f5591c1fc2[No Dependencies]
    id_f1add11e031d031e9f8694f5591c1fc2 --> |Script1.py| id_89f9132e25d1cada0f37669c8e3a2faa
    id_89f9132e25d1cada0f37669c8e3a2faa --> |Script2.py| id_435d36b4a8d3c591b456a027fe49efa3

Pasting this output into the Mermaid Live Editor renders it as a top-down flowchart: "No Dependencies" leads via Script1.py to intermediateResult.log, which leads via Script2.py to FinalResult.log.