This repository contains example code and documentation for clustering geospatial data with the DBSCAN (density-based spatial clustering of applications with noise) algorithm. It covers importing data in different formats (e.g. shapefile, GeoJSON), visualizing, combining and tidying the data for analysis, and exploring spatial relationships, among other steps. It uses libraries such as pandas, geopandas, shapely, pyproj and matplotlib, and displays the final output as a map generated with Folium.
Contents: Directory Layout | Installation | 🚀 Quick Start | Reference | FAQ
.
├── data
│ ├── geospatial
│ │ ├── DSFRS_Service_Area.cpg
│ │ ├── DSFRS_Service_Area.dbf
│ │ ├── DSFRS_Service_Area.prj
│ │ ├── DSFRS_Service_Area.qpj
│ │ ├── DSFRS_Service_Area.shp
│ │ └── DSFRS_Service_Area.shx
│ └── dsfrs_stations.csv
├── example
│ ├── example.png
│ └── example_data.csv
├── .gitignore
├── Licence
├── README.md
├── ers_failures.sql
├── requirements.txt
└── spatial_clustering.ipynb
First clone the repository and navigate to the project's root directory:
git clone https://github.com/PhilPearson83/density_based_spatial_clustering.git
# navigate to the downloaded (or git cloned) material
cd ./density_based_spatial_clustering/
# creating a virtual environment called "env"
python -m venv env
# activating the environment (Windows, e.g. Git Bash)
source env/Scripts/activate
# on Linux/macOS the activation script is at env/bin/activate instead
This project is written in Python and depends on a number of packages. You can install them by running the following command in the project's root directory:
pip install -r requirements.txt
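To confirm the environment is ready before opening the notebook, a quick import check can help. The sketch below is not part of the repository; the helper name `missing_packages` is hypothetical, and the package list is only the set of libraries named in this README, so `requirements.txt` remains the authoritative list.

```python
from importlib.util import find_spec

# Libraries named in this README (an assumption: requirements.txt may list more).
PACKAGES = ["pandas", "geopandas", "shapely", "pyproj", "matplotlib", "folium"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Run it with the virtual environment activated; any name it reports as missing can then be installed with pip.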
jupyter notebook
Open the spatial_clustering.ipynb file and run all cells.
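For readers who want a feel for the technique before running the notebook, here is a minimal, dependency-free sketch of DBSCAN over latitude/longitude points using the haversine (great-circle) distance. It is an illustration only, not the notebook's code: this README does not state which implementation the notebook uses, and the function names, the `eps_km` and `min_samples` parameters, and the sample coordinates are all made up for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def dbscan(points, eps_km, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Brute-force neighbourhoods; each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if haversine_km(points[i], points[j]) <= eps_km]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue  # already assigned; border points are not expanded
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point, keep growing the cluster
    return labels


# Illustrative coordinates only (two tight groups and one isolated point).
sample_points = [
    (50.7200, -3.5300), (50.7210, -3.5310), (50.7195, -3.5290),
    (50.3700, -4.1400), (50.3710, -4.1390), (50.3705, -4.1410),
    (51.0000, -3.0000),
]
sample_labels = dbscan(sample_points, eps_km=0.5, min_samples=3)
print(sample_labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The brute-force neighbour search is quadratic in the number of points, which is fine for a toy example; for real datasets a library implementation with a spatial index is the more practical choice.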
max_value_str_len
max length of each variable string, -1 to disable, default=1000
max_exc_str_len
max length of exception, should variable print fail, -1 to disable, default=10000