The combination of datasets, questions, and nature of analysis is growing everyday. Data scientists find it hard to keep track of all the different datasets they dealt with, what they did with those datasets, and what they presented to the model-audience (business etc)
pydatasentry package allows auditability of modeling code and data by logging all relevant information for every single model run (e.g., a regression) You could use this for audit past results for correctness, share models and results with peers, search past results to avoid repition of work.
Note that code is very alpha. Expect it to break often. Please try it out and give me feedback/create issues.
Please see docs for detailed documentation.