A data analysis project looking at public bathing spots in Berlin.
See the publication that accompanies this analysis on Towards Data Science.
The purpose of this project was to take publicly available data on bathing spots in the city of Berlin, Germany, and use exploratory data analysis (EDA) to identify which area is most attractive if one's only criterion for choosing where to live is the availability of bathing spots. The intention was to perform some basic EDA that could then be documented in the accompanying publication, giving beginners in the Python data science ecosystem an introduction to some of the most useful features of pandas and matplotlib.
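As a rough illustration of the kind of analysis described above, the sketch below counts bathing spots per district and relates the counts to population and area. It is not the code from the notebook: the function name `rank_districts`, the column names `district`, `population` and `area_km2`, and the toy rows are all assumptions made for this example, and the real CSV files may use different headers.

```python
import pandas as pd


def rank_districts(spots: pd.DataFrame, districts: pd.DataFrame) -> pd.DataFrame:
    """Rank districts by number of bathing spots.

    Assumed (hypothetical) columns:
      spots:     one row per bathing spot, with a 'district' column
      districts: one row per district, with 'district', 'population', 'area_km2'
    """
    # Count how many bathing spots fall in each district.
    counts = spots.groupby("district").size().rename("spots")

    # Attach the counts to the district table; districts without spots get 0.
    ranked = districts.set_index("district").join(counts).fillna({"spots": 0})
    ranked["spots"] = ranked["spots"].astype(int)

    # Normalise by population and by area so districts of different size compare fairly.
    ranked["spots_per_100k"] = ranked["spots"] / ranked["population"] * 100_000
    ranked["spots_per_km2"] = ranked["spots"] / ranked["area_km2"]

    return ranked.sort_values("spots", ascending=False)


# Toy placeholder data, invented purely for illustration (not real Berlin figures).
toy_spots = pd.DataFrame({"district": ["A", "A", "B"]})
toy_districts = pd.DataFrame(
    {
        "district": ["A", "B", "C"],
        "population": [100_000, 200_000, 50_000],
        "area_km2": [10.0, 20.0, 5.0],
    }
)
print(rank_districts(toy_spots, toy_districts))
```

Sorting by the raw count is only one possible choice; sorting by `spots_per_100k` or `spots_per_km2` instead can change which area comes out on top.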
- EDA
- Data visualisation
- Jupyter Notebook
- pandas
- NumPy
- Matplotlib
Badestellen data from: Berlin Open Data - stored here as badestellen.csv
Badestelle map from berlin.de
Einwohner / Inhabitant data from: Berlin Open Data - stored here as EWR201812E_Matrix.csv
Bezirk / District number data from: Amt für Statistik Berlin-Brandenburg
Bezirk area data from: Amt für Statistik Berlin-Brandenburg / Wikipedia
- Clone this repo (for help see this tutorial).
- Raw data is kept in the CSV files in the root folder of this repo.
- All code is contained within the Jupyter Notebook for this project, stored in the root folder as Badestellen.ipynb.
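For a quick look at the raw data outside the notebook, a minimal sketch along the following lines may help. The helper name `load_table` is hypothetical, and because the delimiter of the CSV files is not stated here, the sketch asks pandas to guess it (`sep=None` with the Python parser engine). This is a heuristic and may need replacing with an explicit `sep` argument. The in-memory sample is made up for illustration.

```python
import io

import pandas as pd


def load_table(source):
    """Read a CSV from a path or file-like object, letting pandas guess the delimiter."""
    # sep=None makes the Python engine sniff the delimiter (e.g. ',' or ';').
    return pd.read_csv(source, sep=None, engine="python")


# Made-up two-line sample standing in for a real file such as "badestellen.csv".
sample = io.StringIO("name;bezirk\nExample spot;Example district\n")
df = load_table(sample)
print(df.columns.tolist())

# With the repo cloned, the same helper could be pointed at the real file:
# df = load_table("badestellen.csv")
```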
All feedback is warmly received. Craig Dickson can be contacted via Twitter as @craigdoesdata