elz-harri
Data and design for good. I am a design-minded data analyst who approaches complex problems with a human-centered approach.
Open to Work
Atlanta, GA
Pinned Repositories
01-Excel-Challenge
Homework for Week 1 - Excel Challenge
02-VBA-Challenge
Week 2 Homework
03-Python-Challenge
Welcome to the world of programming with Python. In this homework assignment, you'll be using the concepts you've learned to complete the **two** Python Challenges, PyBank and PyPoll. Both of these challenges encompass a real-world situation where your newfound Python scripting skills can come in handy.
04-Pandas-Challenge
Heroes of Pymoli
05-Matplotlib-Challenge
While your data companions rushed off to jobs in finance and government, you remained adamant that science was the way for you. Staying true to your mission, you've joined Pymaceuticals Inc., a burgeoning pharmaceutical company based out of San Diego. Pymaceuticals specializes in anti-cancer pharmaceuticals. In its most recent efforts, it began screening for potential treatments for squamous cell carcinoma (SCC), a commonly occurring form of skin cancer.

As a senior data analyst at the company, you've been given access to the complete data from their most recent animal study. In this study, 249 mice identified with SCC tumor growth were treated through a variety of drug regimens. Over the course of 45 days, tumor development was observed and measured. The purpose of this study was to compare the performance of Pymaceuticals' drug of interest, Capomulin, versus the other treatment regimens. You have been tasked by the executive team to generate all of the tables and figures needed for the technical report of the study. The executive team also has asked for a top-level summary of the study results.

## Instructions

Your tasks are to do the following:

* Before beginning the analysis, check the data for any mouse ID with duplicate time points and remove any data associated with that mouse ID.
* Use the cleaned data for the remaining steps.
* Generate a summary statistics table consisting of the mean, median, variance, standard deviation, and SEM of the tumor volume for each drug regimen.
* Generate a bar plot using both Pandas's `DataFrame.plot()` and Matplotlib's `pyplot` that shows the number of total mice for each treatment regimen throughout the course of the study.
  * **NOTE:** These plots should look identical.
* Generate a pie plot using both Pandas's `DataFrame.plot()` and Matplotlib's `pyplot` that shows the distribution of female or male mice in the study.
  * **NOTE:** These plots should look identical.
* Calculate the final tumor volume of each mouse across four of the most promising treatment regimens: Capomulin, Ramicane, Infubinol, and Ceftamin. Calculate the quartiles and IQR and quantitatively determine if there are any potential outliers across all four treatment regimens.
* Using Matplotlib, generate a box and whisker plot of the final tumor volume for all four treatment regimens and highlight any potential outliers in the plot by changing their color and style. **Hint**: All four box plots should be within the same figure. Use this [Matplotlib documentation page](https://matplotlib.org/gallery/pyplots/boxplot_demo_pyplot.html#sphx-glr-gallery-pyplots-boxplot-demo-pyplot-py) for help with changing the style of the outliers.
* Select a mouse that was treated with Capomulin and generate a line plot of tumor volume vs. time point for that mouse.
* Generate a scatter plot of mouse weight versus average tumor volume for the Capomulin treatment regimen.
* Calculate the correlation coefficient and linear regression model between mouse weight and average tumor volume for the Capomulin treatment. Plot the linear regression model on top of the previous scatter plot.
* Look across all previously generated figures and tables and write at least three observations or inferences that can be made from the data. Include these observations at the top of the notebook.

Here are some final considerations:

* You must use proper labeling of your plots, to include properties such as: plot titles, axis labels, legend labels, _x_-axis and _y_-axis limits, etc.
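The cleaning step and the quartile/IQR outlier check from the instructions above can be sketched roughly as follows. This is a minimal illustration, not the assignment's solution: it uses only the Python standard library instead of Pandas so that it stays self-contained, and the field names (`mouse_id`, `timepoint`, `tumor_volume`) and sample numbers are made up for the example. It also assumes the common 1.5 × IQR rule for flagging potential outliers, which the instructions do not spell out.

```python
from collections import Counter
from statistics import quantiles


def drop_mice_with_duplicate_timepoints(rows):
    """Remove every row belonging to a mouse that has a repeated time point."""
    seen = Counter((r["mouse_id"], r["timepoint"]) for r in rows)
    bad_ids = {mouse_id for (mouse_id, _), n in seen.items() if n > 1}
    return [r for r in rows if r["mouse_id"] not in bad_ids]


def iqr_outliers(values):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 * IQR rule."""
    # method="inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return lower, upper, [v for v in values if v < lower or v > upper]


# Hypothetical records: mouse "b2" has two rows for time point 0.
rows = [
    {"mouse_id": "a1", "timepoint": 0, "tumor_volume": 45.0},
    {"mouse_id": "a1", "timepoint": 5, "tumor_volume": 46.0},
    {"mouse_id": "b2", "timepoint": 0, "tumor_volume": 45.0},
    {"mouse_id": "b2", "timepoint": 0, "tumor_volume": 45.5},
]
clean = drop_mice_with_duplicate_timepoints(rows)

# Hypothetical final tumor volumes for one regimen.
final_volumes = [38.0, 40.0, 41.0, 42.0, 43.0, 44.0, 70.0]
lower, upper, outliers = iqr_outliers(final_volumes)
print(lower, upper, outliers)
```

In a notebook the same logic would more likely be written with Pandas grouping and `quantile` calls; the point here is only the order of operations: drop the affected mouse entirely, then compute the bounds per regimen.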
06-Python-API-Challenge
API Challenge
07-SQL-Challenge
08-SQLAlchemy-Challenge
Congratulations! You've decided to treat yourself to a long holiday vacation in Honolulu, Hawaii! To help with your trip planning, you need to do some climate analysis on the area.
09-Web-Design-Challenge
Take what we've learned about HTML and CSS to create a dashboard showing off the analysis we've done.
10-Javascript-Challenge
WAKE UP SHEEPLE! The extra-terrestrial menace has come to Earth and we here at ALIENS-R-REAL have collected all of the eye-witness reports we could to prove it! All we need to do now is put this information online for the world to see and then the matter will finally be put to rest. There is just one tiny problem though... our collection is too large to search through manually. Even our most dedicated followers are complaining that they are having trouble locating specific reports in this mess. That's why we are hiring you. We need you to write code that will create a table dynamically based upon a dataset we provide. We also need to allow our users to filter the table data for specific values. There's a catch though... we only use pure JavaScript, HTML, and CSS, and D3.js on our web pages. They are the only coding languages which can be trusted. You can handle this... right? The planet Earth needs to know what we have found!
elz-harri's Repositories
elz-harri/01-Excel-Challenge
Homework for Week 1 - Excel Challenge
elz-harri/02-VBA-Challenge
Week 2 Homework
elz-harri/03-Python-Challenge
Welcome to the world of programming with Python. In this homework assignment, you'll be using the concepts you've learned to complete the **two** Python Challenges, PyBank and PyPoll. Both of these challenges encompass a real-world situation where your newfound Python scripting skills can come in handy.
elz-harri/04-Pandas-Challenge
Heroes of Pymoli
elz-harri/05-Matplotlib-Challenge
elz-harri/06-Python-API-Challenge
API Challenge
elz-harri/07-SQL-Challenge
elz-harri/08-SQLAlchemy-Challenge
Congratulations! You've decided to treat yourself to a long holiday vacation in Honolulu, Hawaii! To help with your trip planning, you need to do some climate analysis on the area.
elz-harri/09-Web-Design-Challenge
Take what we've learned about HTML and CSS to create a dashboard showing off the analysis we've done.
elz-harri/10-Javascript-Challenge
elz-harri/11-Plotly-Challenge
# Plot.ly Homework - Belly Button Biodiversity

![Bacteria by filterforge.com](Images/bacteria.jpg)

In this assignment, you will build an interactive dashboard to explore the [Belly Button Biodiversity dataset](http://robdunnlab.com/projects/belly-button-biodiversity/), which catalogs the microbes that colonize human navels.

The dataset reveals that a small handful of microbial species (also called operational taxonomic units, or OTUs, in the study) were present in more than 70% of people, while the rest were relatively rare.

## Step 1: Plotly

1. Use the D3 library to read in `samples.json`.

2. Create a horizontal bar chart with a dropdown menu to display the top 10 OTUs found in that individual.
   * Use `sample_values` as the values for the bar chart.
   * Use `otu_ids` as the labels for the bar chart.
   * Use `otu_labels` as the hovertext for the chart.

   ![bar Chart](Images/hw01.png)

3. Create a bubble chart that displays each sample.
   * Use `otu_ids` for the x values.
   * Use `sample_values` for the y values.
   * Use `sample_values` for the marker size.
   * Use `otu_ids` for the marker colors.
   * Use `otu_labels` for the text values.

   ![Bubble Chart](Images/bubble_chart.png)

4. Display the sample metadata, i.e., an individual's demographic information.

5. Display each key-value pair from the metadata JSON object somewhere on the page.

   ![hw](Images/hw03.png)

6. Update all of the plots any time that a new sample is selected.

Additionally, you are welcome to create any layout that you would like for your dashboard. An example dashboard is shown below:

![hw](Images/hw02.png)

## Advanced Challenge Assignment (Optional)

The following task is advanced and therefore optional.

* Adapt the Gauge Chart from <https://plot.ly/javascript/gauge-charts/> to plot the weekly washing frequency of the individual.
* You will need to modify the example gauge code to account for values ranging from 0 through 9.
* Update the chart whenever a new sample is selected.

![Weekly Washing Frequency Gauge](Images/gauge.png)

## Deployment

* Deploy your app to a free static page hosting service, such as GitHub Pages. Submit the links to your deployment and your GitHub repo.
* Ensure your repository has regular commits (i.e. 20+ commits) and a thorough README.md file.
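The "top 10 OTUs" selection behind the bar chart in Step 1 can be sketched independently of the charting library. The assignment itself is written in JavaScript with D3 and Plotly; the sketch below uses Python only to show the data-shaping logic. It assumes each sample is a dictionary holding three parallel lists named `sample_values`, `otu_ids`, and `otu_labels` (the key names come from the assignment text; the dictionary layout, the `top_otus` helper, and the sample numbers are hypothetical). It also assumes the chart draws its first category at the bottom, which is why the result is reversed.

```python
def top_otus(sample, n=10):
    """Return the n largest OTUs of one sample, shaped for a horizontal bar chart."""
    triples = zip(sample["sample_values"], sample["otu_ids"], sample["otu_labels"])
    # Sort by sample value, largest first, and keep the first n entries.
    top = sorted(triples, key=lambda t: t[0], reverse=True)[:n]
    # Reverse so the largest bar ends up last (drawn at the top under the assumption above).
    top.reverse()
    return {
        "x": [value for value, _, _ in top],
        "y": [f"OTU {otu_id}" for _, otu_id, _ in top],
        "text": [label for _, _, label in top],
    }


# Hypothetical sample with made-up values and labels.
sample = {
    "otu_ids": [482, 1167, 2859],
    "sample_values": [113, 163, 126],
    "otu_labels": ["label C", "label A", "label B"],
}
bars = top_otus(sample, n=2)
print(bars)
```

Prefixing the IDs with a string such as `OTU ` keeps a charting library from treating them as numbers on a continuous axis; whether that is needed depends on how the chart is configured.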
elz-harri/12-Leaflet-Challenge
elz-harri/antioxidant-data
An analysis of antioxidant data used for nutritional research using Postgres, SQL, Tableau, & Excel
elz-harri/bootcampFinalProject
This is the Final Project for our Data Analytics Bootcamp
elz-harri/D3-Challenge
D3 Homework - Data Journalism and D3
elz-harri/DataViz
elz-harri/elz-harri.github.io
Elizabeth's Personal Page
elz-harri/ETL-Project
ETL Mini-Project
elz-harri/github-slideshow
A robot powered training repository :robot:
elz-harri/HBS_Impact-Weighted
Data and Information: Harvard Business School, Impact-Weighted Accounts - Data Analysis
elz-harri/oreilly-first-steps-january-2021
elz-harri/PlantWorld
elz-harri/project1
elz-harri/Project1_Human-Trafficking
Group project using Python, Pandas, Matplotlib in Jupyter Notebook