Your task is to create a project that loads and transforms the data from the provided dataset. This is not exactly a data analysis task.
PLEASE DO NOT SPEND MORE THAN 3 HOURS OF YOUR TIME
- Prepare a basic structure for your project. You might want to include:
  - Package requirements using your favourite package manager (pip, poetry etc.)
  - .gitignore
  - README
- Write code that will:
  - Load the provided CSV files
  - Create the following aggregations:
    - Age distribution
    - Annual income distribution
    - Annual income correlated with age
  - Save the results in a binary format of your choice.
  - Save the results in a serialization format of your choice.
- Create unit tests using your favourite testing library.
- Create a Dockerfile that will run the tests inside the container.
- Create a public git repository and provide us with a link.
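
For orientation, the aggregation and saving steps above could be sketched roughly as follows. This is only an illustration under several assumptions, not a reference solution: the column names `Age` and `Annual Income (k$)`, the bin widths, the reading of "correlated with age" as a Pearson coefficient, and the choice of pickle and JSON as output formats are all placeholders, and the sample rows are made up.

```python
import json
import math
import pickle
from collections import Counter

# Made-up sample rows; the column names are an assumption about the CSV header.
SAMPLE_ROWS = [
    {"Age": "19", "Annual Income (k$)": "15"},
    {"Age": "21", "Annual Income (k$)": "15"},
    {"Age": "35", "Annual Income (k$)": "40"},
    {"Age": "52", "Annual Income (k$)": "60"},
]


def histogram(values, bin_width):
    """Count values per bin, keyed by the lower edge of each bin."""
    counts = Counter((int(v) // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def aggregate(rows, age_bin=10, income_bin=20):
    """Build the three aggregations from a list of row dicts."""
    ages = [float(r["Age"]) for r in rows]
    incomes = [float(r["Annual Income (k$)"]) for r in rows]
    return {
        "age_distribution": histogram(ages, age_bin),
        "income_distribution": histogram(incomes, income_bin),
        "age_income_correlation": pearson(ages, incomes),
    }


def to_binary(results):
    """Serialize to bytes with pickle (one possible binary format)."""
    return pickle.dumps(results)


def to_json(results):
    """Serialize to JSON text; integer bin keys become strings."""
    return json.dumps(results, indent=2)


results = aggregate(SAMPLE_ROWS)
```

The bytes from `to_binary(results)` can be written to a file opened in `"wb"` mode and the string from `to_json(results)` to a file opened in `"w"` mode. Small pure functions like `histogram` and `pearson` are also easy to cover with unit tests.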
Link to Mall Customers Dataset on Kaggle
The dataset is provided in the `data` folder as a zipped CSV file.
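
Reading a zipped CSV does not require unpacking it to disk first. A minimal sketch using only the standard library is shown below; the function name `load_zipped_csv` is hypothetical, and the UTF-8 encoding and the presence of a header row are assumptions about the file.

```python
import csv
import io
import zipfile


def load_zipped_csv(source):
    """Return rows as dicts from every .csv member of a zip archive.

    `source` may be a filesystem path or a binary file-like object.
    """
    rows = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything in the archive that is not a CSV
            with archive.open(name) as raw:
                # Assumed encoding; utf-8-sig also tolerates a leading BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                rows.extend(csv.DictReader(text))
    return rows
```

Calling it with the path of the archive in the `data` folder returns a list of dictionaries keyed by the CSV header, with all values still as strings.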
If you have any questions regarding the tasks, please do not hesitate to contact us.