/ca-state-workforce-data

Management and analysis of data on demographics of the California state government workforce.

Primary LanguageHTML

Project Description / Background

This project is part of CalEPA's racial equity work group (REWG) data sub-team, and aims to explore the demographics of California's state government workforce. The project is regularly evolving to fit the data needs of CalEPA's REWG workforce equity sub-team. There are several different analyses within the project, each supporting the larger goal of operationalizing equity throughout all phases of workforce development and supporting a culture across CalEPA where all feel they belong. CalEPA's REWG is guided by equity principles from the Government Alliance on Racial Equity (GARE). (we can potentially add more information about the project here, e.g., a description of the project background and goals, documentation of where the raw/source data came from, instructions for running the scripts, description of intermediate and final outputs, etc.).

Source Data

The workforce data used in this project comes from CalHR's Statewide 5102 report. A cleaned and compiled version of that data is available on the California Open Data Portal at: https://data.ca.gov/dataset/calhr-civil-rights-data-for-gare-capital-cohort-2019

Dataset Description

Description of the dataset coming soon...

Instructions

If using RStudio, open the workforce_data.Rproj file to open the project within RStudio. Otherwise, set your working directory in R to the folder that contains this readme file.

Updating 5102 Dataset

To update the compiled 5102 dataset with data for an additional year:

  1. Get the raw data for the new year from CalHR.
  2. Save the new data to an excel file named "calhr-5102-statewide-YYYY.xlsx" (where YYYY is the year the dataset covers), and put that file in the 02_data_raw/5102 folder.
  3. Run the 01_scripts/data_processing.R script. This script compiles the data from each of the individual years' data files (assuming they are saved with the file naming convention described in step 2), and saves the compiled dataset to a zipped csv file in the 03_data_processed folder. It also updates the compiled dataset on the CA Open Data Portal (the script assumes that you have a data portal key saved to your local environment, in a variable named data_portal_key; as an alternative to using this script to update the data portal, you can manually update the portal by extracting the zipped csv in the 03_data_processed folder and loading it to the portal).

Exploratory Data Analysis

To run the exploratory data analysis script, run the workforce_data_exploration.R script in the 01_scripts folder.

Create Visualizations For Your Agency

To create graphs visualizing your department's workforce demographics, use the script in the 06_reports folder. Your results will be saved locally in the 07_slides folder.

Reports / Slides