# Building classification for natural disaster relief efforts
The following software should be pre-installed on the system that will use this repository:
Create the following keys if you don't have them:

- **GitHub Developer Access Token**: to push and pull code to and from GitHub. This should be saved in your terminal or command line.
- **AWS IAM Credentials**: to access AWS resources remotely.
  - Access Key
  - Secret Access Key
The following commands need to be run only once, during the initial setup process.
- Create the conda environment (the name `alivio` is optional; any name can be used):

  ```shell
  conda create --name alivio python=3.10 -y
  ```

- Activate the environment:

  ```shell
  conda activate alivio
  ```

- Create a Jupyter Notebook kernel:

  ```shell
  conda install -c anaconda ipykernel -y
  python -m ipykernel install --user --name=alivio
  ```

- Install Poetry for Python dependency management:

  ```shell
  pip install poetry
  ```
- Configure AWS credentials. The `aws configure` command will open a terminal-based prompt with four inputs:

  ```shell
  pip install awscli
  aws configure
  ```

  - Access Key: your AWS access key
  - Secret Access Key: your AWS secret access key
  - region: `us-east-1`
  - format: `json`
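For reference, `aws configure` persists these values to two plain-text files under `~/.aws/` in your home directory; the key values shown below are placeholders:

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY

# ~/.aws/config
[default]
region = us-east-1
output = json
```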
The following commands need to be run only once, during the initial setup process.
- Clone the GitHub repository:

  ```shell
  git clone https://github.com/cricksmaidiene/alivio
  ```

- Change into the repository directory:

  ```shell
  cd alivio
  ```

- Install the Python dependencies (make sure the `alivio` conda environment is active):

  ```shell
  poetry install --no-root
  ```
You can now start executing notebooks and code within this virtual environment.
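A quick sanity check that the right interpreter is active (the expected path is an assumption based on a default conda install location):

```shell
# Print the active Python interpreter; when the alivio environment is
# active, the path should end in something like envs/alivio/bin/python.
python -c "import sys; print(sys.executable)"
```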
This utility downloads the relevant datasets used by the project into the `/data` directory.

First, install the additional package below:

```shell
pip install chardet
```

Then run the following command:

```shell
python src/utils/sync_data.py
```
- `/src`: for all source code and notebook files
  - `src/01_data_ingestion`: for notebooks and source related to ingesting raw data
  - `src/02_data_analysis`: for EDA, visualization, and other analysis tasks
  - `src/03_data_engineering`: for preprocessing the dataset or performing feature engineering
  - `src/04_models`: for model training, fine-tuning, and experimentation
- `/data`: for all data extracts saved locally
- `/docs`: for internal team documentation
- `/app`: for the web interface