In this repository I'm taking TFX for a spin.
Aside from the packages listed in requirements.txt, this project requires Python 3.8.12. The original content of the fraud directory was populated with the following command:
tfx template copy \
--model=taxi \
--pipeline_name="fraud" \
--destination_path="fraud"
Since the dataset is fairly large, it is not committed to git, but it can be downloaded from here and saved in data/. We are using a credit card fraud detection dataset.
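Because the dataset is not in git, it can be handy to confirm it is in place before running anything. The following is a minimal sketch using only the standard library; it assumes the file is named application_data.csv and lives in data/, and the function name `dataset_status` is made up for illustration.

```python
from pathlib import Path

# Assumed location: application_data.csv saved in the top-level data/ directory.
DATA_FILE = Path("data") / "application_data.csv"


def dataset_status(path: Path = DATA_FILE) -> str:
    """Return a short message saying whether the dataset file is in place."""
    if not path.is_file():
        return f"missing: {path} (download it and save it in {path.parent}/)"
    size_mb = path.stat().st_size / 1_000_000
    return f"found: {path} ({size_mb:.1f} MB)"


if __name__ == "__main__":
    # Run from the project folder so the relative path resolves.
    print(dataset_status())
```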
Here is a list of relevant TFX resources that were used for this exercise:
- Building TFX Pipeline Locally
- Create a TFX pipeline using templates with local orchestrator - Colab Notebook
- TFX in interactive context
- TFX in a notebook
- Creating a custom TFX executor
- Creating a custom TFX component
- Example of a custom component using the Slack API
Steps to run the project:
- clone this repository
- download application_data.csv from Kaggle to the top-level data/ directory
- create a new folder pipeline_outputs in the project folder
- run make sample_data
- run make create_pipeline
- run make update_and_run
- run make tensorboard to check out the training logs
- check out the notebooks (work in progress)
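The make steps above can also be chained from Python. This is only a sketch: the target names come from the list above, while the helper names `build_commands` and `run_all` are hypothetical, and the tensorboard target is left out on the assumption that it starts a long-running server.

```python
import subprocess
from typing import Callable, List, Sequence

# Make targets in the order listed in the steps above.
# "tensorboard" is omitted here (assumed to block while serving the logs).
MAKE_TARGETS = ("sample_data", "create_pipeline", "update_and_run")


def build_commands(targets: Sequence[str] = MAKE_TARGETS) -> List[List[str]]:
    """Turn target names into argument lists such as ["make", "sample_data"]."""
    return [["make", target] for target in targets]


def run_all(commands: Sequence[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order; check=True makes subprocess.run raise on failure."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        runner(cmd, check=True)
```

Calling `run_all(build_commands())` from the project folder would run the three targets in sequence and stop at the first one that fails.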
Note that with every run you are accumulating output data.
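To keep an eye on how much output has piled up, something like the following could be used. It is a sketch that assumes the outputs land in the pipeline_outputs folder; `directory_size_bytes` is a made-up helper name.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (0 if root does not exist)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = Path(dirpath) / name
            if file_path.is_file():  # skips broken symlinks
                total += file_path.stat().st_size
    return total


if __name__ == "__main__":
    # Assumption: pipeline runs write into pipeline_outputs in the project folder.
    out_dir = Path("pipeline_outputs")
    print(f"{out_dir}: {directory_size_bytes(out_dir) / 1_000_000:.1f} MB")
```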