ML-zoomcamp-Homework-5
Homework
In this homework, we will use Credit Card Data from the previous homework.
Question 1
- Install Pipenv
- What's the version of pipenv you installed?
Answer : 2022.10.10
Question 2
- Use Pipenv to install Scikit-Learn version 1.0.2
- What's the first hash for scikit-learn you get in Pipfile.lock?
Answer : 62db916eaa3ba201789358b59e73eb7630266ef79c9c3f4d67236779aaf5f04a
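The hash can also be read programmatically instead of by eye. This is a minimal sketch, assuming the lock file is the standard JSON Pipfile.lock and that scikit-learn is listed under its "default" section; the helper name first_hash is hypothetical.

```python
import json


def first_hash(lock_path, package="scikit-learn", section="default"):
    """Return the first hash listed for `package` in a Pipfile.lock."""
    with open(lock_path) as f:
        lock = json.load(f)
    # Each locked package carries a "hashes" list of "sha256:<digest>" strings.
    hashes = lock[section][package]["hashes"]
    return hashes[0]
```

Calling first_hash("Pipfile.lock") from the project directory returns the first entry as a string prefixed with "sha256:".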
Models
We've prepared a dictionary vectorizer and a model.
They were trained (roughly) using this code:
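from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# df and y (the training DataFrame and target) come from the previous homework
features = ['reports', 'share', 'expenditure', 'owner']
dicts = df[features].to_dict(orient='records')
dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)
model = LogisticRegression(solver='liblinear').fit(X, y)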
Note: You don't need to train the model. This code is just for your reference.
The model and the vectorizer were then saved with Pickle. Download them:
With wget:
PREFIX=https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework
wget $PREFIX/model1.bin
wget $PREFIX/dv.bin
Question 3
Let's use these models!
- Write a script for loading these models with pickle
- Score this client:
{"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}
What's the probability that this client will get a credit card?
- 0.162
- 0.391
- 0.601
- 0.993
Answer : 0.162
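The loading and scoring steps for this question could be sketched as follows. This is one possible layout, not the reference solution: the helper names load_pickle and score_client are hypothetical, and the sketch assumes that column 1 of predict_proba corresponds to the "gets a card" class.

```python
import pickle


def load_pickle(path):
    """Load a single pickled object from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def score_client(dv, model, client):
    """Return the positive-class probability for one client dict."""
    # The vectorizer expects a list of dicts, so wrap the single client.
    X = dv.transform([client])
    # Assumption: column 1 holds the probability of the positive class.
    return model.predict_proba(X)[0][1]
```

With the files downloaded, dv = load_pickle("dv.bin") and model = load_pickle("model1.bin") give the two objects to pass to score_client together with the client dictionary. Unpickling them requires scikit-learn to be installed, ideally the same 1.0.2 version pinned in Question 2.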
Question 4
Now let's serve this model as a web service
- Install Flask and gunicorn (or waitress, if you're on Windows)
- Write Flask code for serving the model
- Now score this client using requests:
import requests

url = "YOUR_URL"
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()
What's the probability that this client will get a credit card?
- 0.274
- 0.484
- 0.698
- 0.928
Answer : 0.928
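The Flask service for this question might look roughly like the sketch below. The /predict route, the "probability" response key, the app name, and the create_app factory are all assumptions made for illustration; the homework does not prescribe them.

```python
from flask import Flask, jsonify, request


def create_app(dv, model):
    """Build a Flask app that scores one client per POST request."""
    app = Flask("credit_card")  # hypothetical app name

    @app.route("/predict", methods=["POST"])  # hypothetical route
    def predict():
        client = request.get_json()
        X = dv.transform([client])
        # Cast to a plain float so the value is always JSON-serializable.
        proba = float(model.predict_proba(X)[0][1])
        return jsonify({"probability": proba})

    return app
```

Taking dv and model as arguments keeps the app testable without the pickled files. For serving, a module-level app = create_app(dv, model), built from the unpickled objects, is what a WSGI server such as gunicorn or waitress would be pointed at.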
Docker
Install Docker. We will use it for the next two questions.
For these questions, we prepared a base image: svizor/zoomcamp-model:3.9.12-slim.
You'll need to use it (see Question 5 for an example).
This image is based on python:3.9.12-slim and has a logistic regression model (a different one) as well as a dictionary vectorizer inside.
This is what the Dockerfile for this image looks like:
FROM python:3.9.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]
We already built it and pushed it to svizor/zoomcamp-model:3.9.12-slim.
Question 5
Download the base image svizor/zoomcamp-model:3.9.12-slim. You can do this with the docker pull command.
So what's the size of this base image?
- 15 MB
- 125 MB
- 275 MB
- 415 MB
Answer: 125 MB
You can get this information by running docker images; it'll be in the "SIZE" column.
Dockerfile
Now create your own Dockerfile based on the image we prepared.
It should start like this:
FROM svizor/zoomcamp-model:3.9.12-slim
# add your stuff here
Now complete it:
- Install all the dependencies from the Pipenv file
- Copy your Flask script
- Run it with Gunicorn
After that, you can build your docker image.
Question 6
Let's run your docker container!
After running it, score this client once again:
import requests

url = "YOUR_URL"
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()
What's the probability that this client will get a credit card now?
- 0.289
- 0.502
- 0.769
- 0.972
Answer : 0.769