batch_scoring_wrapper: A Python repository from nimnull

Batch scoring wrapper

This is a wrapper around the datarobot_batch_scoring script from the batch-scoring tool. It follows the same syntax, except for the -y and -n options: the wrapper always answers "Yes" to every question the underlying batch-scoring tool asks.

The wrapper takes a dataset, shuffles its rows, and makes two runs through the batch-scoring tool. If the prediction results of the two runs diverge, it raises an alert and produces a merged prediction result in the following format:

	ref_id	predicted_x	predicted_y	diff

Where:

ref_id: the row number in the original, unshuffled dataset passed to the wrapper script
predicted_x: prediction result from the first shuffled attempt
predicted_y: prediction result from the second shuffled attempt
diff: predicted_x - predicted_y
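
The shuffle-and-compare idea described above can be sketched roughly as follows. This is only an illustration, not the wrapper's actual code: the helper names shuffle_with_ref_ids, merge_passes and fake_score are hypothetical, fake_score stands in for a real batch-scoring run, and 0-based row numbers are an assumption.

```python
import random


def shuffle_with_ref_ids(rows, seed=None):
    """Pair every row with its original row number (0-based here, an assumption),
    then return the pairs in random order."""
    indexed = list(enumerate(rows))
    random.Random(seed).shuffle(indexed)
    return indexed


def fake_score(indexed_rows, noise=0.0):
    """Hypothetical stand-in for one batch-scoring run (not the real tool).
    Returns {ref_id: prediction}; `noise` perturbs row 1 to simulate divergence."""
    predictions = {}
    for ref_id, row in indexed_rows:
        prediction = row["value"] * 2
        if ref_id == 1:
            prediction += noise
        predictions[ref_id] = prediction
    return predictions


def merge_passes(first, second):
    """Join both passes on ref_id and compute diff = predicted_x - predicted_y."""
    merged = []
    for ref_id in sorted(first):
        predicted_x = first[ref_id]
        predicted_y = second[ref_id]
        merged.append({
            "ref_id": ref_id,
            "predicted_x": predicted_x,
            "predicted_y": predicted_y,
            "diff": predicted_x - predicted_y,
        })
    return merged


if __name__ == "__main__":
    rows = [{"value": 1.0}, {"value": 2.0}, {"value": 3.0}]
    first = fake_score(shuffle_with_ref_ids(rows, seed=1))
    second = fake_score(shuffle_with_ref_ids(rows, seed=2), noise=0.5)
    for line in merge_passes(first, second):
        print(line)
```

Because each prediction is keyed by ref_id, the order in which the rows were shuffled does not affect the comparison.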

If there were no differences between the predictions, out.csv is produced and you will see a message like "First and second pass predictions matched". Otherwise, the predictions and differences are saved to the diverged.csv file.
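
A minimal sketch of this reporting step is shown below. It rests on several assumptions: the function name write_report is hypothetical, both files are written comma-separated with the same four columns, all merged rows go into whichever file is produced, and predictions are compared exactly, without a tolerance. The wrapper itself may differ on each of these points.

```python
import csv

# Column layout taken from the merged result format described in this README.
FIELDS = ["ref_id", "predicted_x", "predicted_y", "diff"]


def write_report(merged, out_path="out.csv", diverged_path="diverged.csv"):
    """Write merged rows to out_path if both passes agree, else to diverged_path.

    `merged` is a list of dicts with the keys listed in FIELDS.
    Returns the path of the file that was written.
    """
    # Assumption: exact comparison; a real tool might allow a small tolerance.
    matched = all(row["diff"] == 0 for row in merged)
    path = out_path if matched else diverged_path
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(merged)
    if matched:
        print("First and second pass predictions matched")
    else:
        print("Predictions diverged, see " + path)
    return path
```

Returning the path lets a caller find out which of the two outcomes occurred without parsing the printed message.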

Usage

Activate an empty virtualenv.
Manually install the batch-scoring tool if you want to work with a specific version. Otherwise, the latest version will be installed automatically.
Run pip install https://github.com/nimnull/batch_scoring_wrapper/archive/master.zip. If everything went smoothly, you are ready to go.

Call:

wrap_scoring --user=someone@somewhere.com --api_token=token-token --host=http://localhost project_id model_id dataset_path

just as you would with datarobot_batch_scoring.
