Arthena-Data-Challenge

Environment: Python 3, Jupyter Notebook.

Part I - Web crawler

Primary task: Write a script that parses the HTML files in the HTML data directory, extracts the artist, works, currency, and price amount, and writes the result to stdout.

Output format: A JSON array of objects
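
A minimal sketch of the parsing step, using only the standard library. The HTML layout is an assumption made for illustration (artist in `<h2>`, work title in `<h3>`, price in `<div class="price">` formatted like `USD 1,500`), as are the names `LotParser`, `parse_lot`, `main`, and the one-object-per-file output shape. The real files may need different selectors.

```python
import json
import re
import sys
from html.parser import HTMLParser
from pathlib import Path


class LotParser(HTMLParser):
    """Collects text from the ASSUMED tags: <h2>, <h3>, <div class="price">."""

    def __init__(self):
        super().__init__()
        self._field = None   # name of the field currently being read, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._field = "artist"
        elif tag == "h3":
            self._field = "work"
        elif tag == "div" and dict(attrs).get("class") == "price":
            self._field = "price"

    def handle_endtag(self, tag):
        # Any closing tag ends the current field (good enough for flat markup).
        self._field = None

    def handle_data(self, data):
        if self._field and data.strip():
            self.fields[self._field] = data.strip()


def parse_lot(html):
    """Return one dict with artist, work, currency, and amount for one page."""
    parser = LotParser()
    parser.feed(html)
    currency, amount = None, None
    # Assumed price format: three-letter currency code, then digits with commas.
    match = re.search(r"([A-Z]{3})\s*(\d[\d,]*)", parser.fields.get("price", ""))
    if match:
        currency = match.group(1)
        amount = int(match.group(2).replace(",", ""))
    return {
        "artist": parser.fields.get("artist"),
        "work": parser.fields.get("work"),
        "currency": currency,
        "amount": amount,
    }


def main(data_dir):
    """Parse every .html file in data_dir and print a JSON array to stdout."""
    paths = sorted(Path(data_dir).glob("*.html"))
    lots = [parse_lot(p.read_text(encoding="utf-8")) for p in paths]
    json.dump(lots, sys.stdout, indent=2)
```

Calling `main(...)` with the path of the HTML data directory prints the array; an empty directory prints `[]`.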

Part II - Price prediction model

Primary task: Train a machine learning model that predicts the price of a work of art from its 19 variables, including artist_name, auction_date, location, and size (depth, height, width).

Target variable: hammer_price

Metric: Root mean squared error (RMSE)
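
For reference, RMSE is the square root of the mean of the squared differences between the true and predicted prices. A small standard-library sketch (the function name `rmse` is illustrative, not part of the challenge):

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large misses on expensive works weigh much more heavily than many small ones.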

Final file: "model.py", containing an importable predict function.
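
One possible shape for such a file, shown here with a deliberately simple baseline rather than the actual trained model: predict the median hammer_price of the artist, falling back to the overall median for unseen artists. The record format (a list of dicts keyed by column name) and the `fit` helper are assumptions for illustration; only `predict`, `artist_name`, and `hammer_price` come from the task description.

```python
from collections import defaultdict
from statistics import median

# Module-level state filled in by fit(); a real model.py might load this from disk.
_artist_median = {}
_global_median = 0.0


def fit(records):
    """Learn per-artist and overall median prices from training records (dicts)."""
    global _global_median
    prices_by_artist = defaultdict(list)
    for record in records:
        prices_by_artist[record["artist_name"]].append(record["hammer_price"])
    _artist_median.clear()
    _artist_median.update({a: median(p) for a, p in prices_by_artist.items()})
    _global_median = median(r["hammer_price"] for r in records)


def predict(records):
    """Return one predicted hammer_price per input record."""
    return [
        _artist_median.get(record.get("artist_name"), _global_median)
        for record in records
    ]
```

With this layout, `from model import predict` works from a notebook or a grading script, and a baseline like this gives an RMSE floor that a real model should beat.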