We intend to explore the Chicago crime data to find interesting insights in the dataset. We chose this dataset not only because of its popularity but also because it is updated daily.
Infra:
- AWS S3: data storage
- AWS Lambda: data preprocessor; converts the data to native HDF5 format for faster indexing and stores it back in S3
- AWS EC2 (4-core, 8 GB): bare machine for application deployment
Programming: Python + Dash + Flask + Plotly
The default editor is Visual Studio Code.
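The Lambda preprocessing step listed under Infra might look roughly like the sketch below. It assumes the function is triggered by an S3 upload event and that `boto3` and `vaex` are packaged with it. The `processed/` prefix, the handler name, and the `/tmp` paths are illustrative assumptions, not the project's actual layout.

```python
import os
from urllib.parse import unquote_plus


def hdf5_key_for(csv_key, prefix="processed/"):
    """Map an uploaded CSV key to the key of its HDF5 counterpart.

    The "processed/" prefix is a hypothetical naming choice.
    """
    stem, _ext = os.path.splitext(os.path.basename(csv_key))
    return f"{prefix}{stem}.hdf5"


def handler(event, context):
    """Hypothetical S3-triggered entry point: CSV in, HDF5 back to the same bucket."""
    # Imported lazily so the helper above works without these packages installed.
    import boto3
    import vaex

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    csv_key = unquote_plus(record["object"]["key"])  # keys arrive URL-encoded

    # /tmp is the writable scratch space inside a Lambda function.
    local_csv = os.path.join("/tmp", os.path.basename(csv_key))
    local_hdf5 = os.path.splitext(local_csv)[0] + ".hdf5"

    s3 = boto3.client("s3")
    s3.download_file(bucket, csv_key, local_csv)

    # Read the CSV and write it out in vaex's native HDF5 layout.
    df = vaex.from_csv(local_csv)
    df.export_hdf5(local_hdf5)

    out_key = hdf5_key_for(csv_key)
    s3.upload_file(local_hdf5, bucket, out_key)
    return {"bucket": bucket, "key": out_key}
```

A full export of the crime data may not fit in Lambda's default memory and `/tmp` limits, so a real function would likely need larger limits or a chunked conversion.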
- Install Anaconda as the package manager (high priority).
- Clone the repo.
- Change the working directory to the app directory.
conda create -n chicago python=3.8.10 pip
(creates the virtual env)
conda activate chicago
(activates the env)
conda install -y -c conda-forge --file requirements.txt
(installs vaex to solve dependency issues)
- That is all for setup.
python3 app.py
(runs the app.py file in the app directory)
Note: the default port is 8080; keep it open or switch to another port in the app.py file.
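One way to make the port switchable without editing the file each time is to read it from the environment and fall back to 8080. This is only a sketch: the `PORT` variable name and the `resolve_port` helper are assumptions and are not part of the existing app.py.

```python
import os

DEFAULT_PORT = 8080  # default port stated in this README


def resolve_port(environ=None):
    """Return the port from a PORT variable (hypothetical name), else the default."""
    env = os.environ if environ is None else environ
    raw = env.get("PORT", "")
    if not raw:
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

If app.py starts the server through Dash, the result could be passed along as `app.run_server(port=resolve_port())`.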
Note: download the dataset from the public S3 bucket before starting the app.
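The manual download can be scripted with the standard library alone. The sketch below assumes the bucket allows anonymous reads over HTTPS. The bucket name, object key, and region are placeholders to be supplied by the reader, since this README does not state them.

```python
import os
import urllib.request
from urllib.parse import quote


def public_s3_url(bucket, key, region="us-east-1"):
    """Build the virtual-hosted-style URL of an object in a public bucket.

    bucket, key, and region are placeholders; use the project's real values.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists. Return True if downloaded."""
    if os.path.exists(dest):
        return False
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    tmp = dest + ".part"  # write to a temp name so a partial file is never used
    urllib.request.urlretrieve(url, tmp)
    os.replace(tmp, dest)
    return True
```

Calling `download_if_missing(public_s3_url(bucket, key), local_path)` before `python3 app.py` skips the download when the file is already present.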
The Docker setup downloads the datasets from S3 automatically.
- Install Docker
- Navigate to the app directory
cd ./app
- Build the image
docker build -t crimalytics .
(the dot at the end of the command is important; this builds the image with all dependencies)
- Run the image
docker run -p 8080:80 crimalytics
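To confirm that the container is serving on the mapped host port, a small polling check can help. This is a sketch, not part of the repo: the function names are hypothetical, and the URL assumes the `-p 8080:80` mapping above.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # refused, unreachable, or timed out


def wait_for_app(url="http://localhost:8080/", attempts=10, delay=1.0):
    """Poll the URL until it responds or the attempts run out."""
    for _ in range(attempts):
        if is_up(url):
            return True
        time.sleep(delay)
    return False
```

Running `wait_for_app()` right after `docker run` gives the app a few seconds to start before reporting a failure.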
- Launch an EC2 instance
- Clone the repo onto the machine
- Build the Docker image as instructed
- Expose port 8080 on the machine for public access, or use the default port 80
- Run the Docker image in detached mode
docker run -d -p 8080:80 crimalytics
- See the app running
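On EC2, exposing a port for public access usually means adding an inbound rule to the instance's security group. The sketch below shows one way to do that with `boto3`, assuming AWS credentials are configured and the security group ID is known. The helper names and the open `0.0.0.0/0` range are illustrative assumptions; a narrower CIDR is safer where possible.

```python
def ingress_rule(port, cidr="0.0.0.0/0"):
    """Build an inbound TCP rule for a single port (cidr default is an assumption)."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": "crimalytics web app"}],
    }


def open_port(group_id, port=8080, cidr="0.0.0.0/0"):
    """Add the rule to a security group; group_id must be supplied by the reader."""
    import boto3  # imported lazily so ingress_rule works without boto3 installed

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ingress_rule(port, cidr)],
    )
```

The same rule can also be added by hand in the EC2 console under the instance's security group.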