Using AWK and Makefile to analyze a CSV file

Use AWK to write a script that analyzes a CSV file, and use a makefile to run the script.

Introduction

In this project I create a zsh shell script to understand data that is downloaded. To make the script easy to run, I also created a makefile that downloads the data and executes the script, so the whole process runs with a single command in the terminal.

This repository uses a makefile to download a .csv file containing information about NYC inmates discharged from prison in 2018, and to run a shell script that uses AWK commands to analyze that information.

The shell script

  • counts the discharged prisoners by gender
  • counts the discharged prisoners by age
  • calculates the average age of the discharged prisoners
  • prints all of this information to a new file
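The four tasks above can be sketched with a single AWK program. This is a minimal illustration, not the contents of nycScript.sh: the file names sample.csv and summary.csv, the header names, and the data rows are all made up for the example. It assumes a header row, plain comma-separated fields with no quoted commas, and gender and age in columns 5 and 6.

```shell
#!/bin/sh
# Hypothetical sample data standing in for inmateFile.csv (made-up rows).
cat > sample.csv <<'EOF'
id,admitted,released,race,gender,age,status,charge
1,2018-01-05,2018-02-01,W,M,30,X,A
2,2018-01-09,2018-03-11,B,F,25,X,B
3,2018-02-14,2018-02-20,O,M,30,X,C
4,2018-03-02,2018-04-15,B,M,41,X,D
EOF

awk -F',' '
NR > 1 {                 # skip the header row
    gender[$5]++         # tally rows per gender value
    age[$6]++            # tally rows per age value
    total += $6          # running sum of ages
    n++                  # number of data rows
}
END {
    print "gender,count"
    for (g in gender) print g "," gender[g]
    print "age,count"
    for (a in age) print a "," age[a]
    if (n > 0) printf "average age,%.1f\n", total / n
}' sample.csv > summary.csv   # write the results to a new file

cat summary.csv
```

AWK does not guarantee the order in which `for (key in array)` visits keys, so the rows inside each group may appear in any order; pipe the output through sort if a stable order matters.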

The raw data is downloaded from NYCOpenData. The dataset includes the inmate ID, date admitted, date released, race, gender, age, inmate status, and convicted charge. This project uses only the gender and age of each inmate (columns 5 and 6).
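As a small illustration of how AWK addresses those two columns, the sketch below sets the field separator to a comma and prints fields 5 and 6 for every row after the header. The header names and rows are invented for the example, and splitting on every comma is an assumption that breaks if a field contains a quoted comma.

```shell
#!/bin/sh
# Made-up rows with the same column order as described above.
pairs=$(
  printf '%s\n' \
    'id,admitted,released,race,gender,age,status,charge' \
    '1,2018-01-05,2018-02-01,W,M,30,X,A' \
    '2,2018-01-09,2018-03-11,B,F,25,X,B' |
  awk -F',' 'NR > 1 { print $5 "," $6 }'   # $5 = gender, $6 = age
)

echo "$pairs"
# prints:
# M,30
# F,25
```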

How To Use

Please be sure that curl is installed on your computer before doing this exercise.

  1. Open the terminal.
  2. Make sure you are in the directory you would like to clone the repository into; if you are not, cd into the appropriate directory.
  3. Clone this repo by typing in the terminal: git clone https://github.com/RachMink/CISC3140Lab3Task1.git
  4. After cloning, move into the project folder using cd CISC3140Lab3Task1
  5. Use the ls command to check that the folder contains README.md, makefile, and nycScript.sh
  6. Run the command make
  7. If you run ls again, you should see two new files: inmateFile.csv, which contains the original data, and newFile.csv, which contains the filtered data.
  8. To stay in the terminal for this exercise, type cat newFile.csv, which should print the number of inmates discharged per gender and age. Otherwise, you can open the .csv file through Finder on your computer.

inmateFile.csv

[Screenshot of inmateFile.csv]

newFile.csv

[Screenshot of newFile.csv]

Notes

  • Please read the comments in the script to understand how it works.
  • Any blank spots in the output file occur because those fields were not specified in the input file.
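If the blank entries are distracting, one possible variation is to group empty fields under an explicit label. The sketch below is not how nycScript.sh behaves; the label "unspecified", the file name labeled.csv, and the sample rows are all hypothetical.

```shell
#!/bin/sh
# Made-up rows; two of them leave the gender column (field 5) empty.
printf '%s\n' \
  'id,admitted,released,race,gender,age,status,charge' \
  '1,2018-01-05,2018-02-01,W,M,30,X,A' \
  '2,2018-01-09,2018-03-11,B,,25,X,B' \
  '3,2018-02-14,2018-02-20,O,,41,X,C' |
awk -F',' '
NR > 1 {
    # Replace an empty gender field with a readable label before counting.
    key = ($5 == "") ? "unspecified" : $5
    count[key]++
}
END { for (k in count) print k "," count[k] }' > labeled.csv

cat labeled.csv
```

With a single-character separator such as a comma, AWK keeps empty fields in place, so a missing gender shows up as an empty `$5` and the remaining columns are not shifted.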