Pymaceuticals, a burgeoning pharmaceutical company based out of San Diego, specializes in anti-cancer treatments. In its most recent animal study, it began screening for potential treatments for squamous cell carcinoma (SCC), a commonly occuring form of skin cancer. 249 mice identified with SCC tumor growth were treated through a variety of drug regimens. Over the course of 45 days, tumor development was observed and measured. The study compared the performance of Pymaceuticals' drug of interest, Capomulin, versus the other treatment regimens.
This project uses Python Pandas to read the CSV data files into DataFrames and Matplotlib to produce all of the tables and figures needed to visualize the results of the study, in addition to a written summary.
- Initially checked the data for any mouse ID with duplicate time points and remove any data associated with that mouse ID.
- Generated a summary statistics table consisting of the mean, median, variance, standard deviation, and SEM of the tumor volume for each drug regimen.
- Generated a bar plot using both Pandas's
DataFrame.plot()
and Matplotlib'spyplot
that shows the number of total mice for each treatment regimen throughout the course of the study.
- Calculated the final tumor volume of each mouse across four of the most promising treatment regimens: Capomulin, Ramicane, Infubinol, and Ceftamin.
- Calculated the quartiles and IQR and quantitatively determine if there are any potential outliers across all four treatment regimens.
- Used Matplotlib to generate a box and whisker plot of the final tumor volume for all four treatment regimens and highlight any potential outliers in the plot by changing their color and style. Plotted all four treatment regimens within the same figure.
- Selected "Mouse s185" that was treated with Capomulin and generate a line plot of time point versus tumor volume for that mouse.
- Generated a scatter plot of mouse weight versus average tumor volume for the Capomulin treatment regimen.
- Calculated the correlation coefficient and linear regression model between mouse weight and average tumor volume for the Capomulin treatment. Plot the linear regression model on top of the previous scatter plot.
Visualizing the data in the figures below helps to illustrate the efficacy of the drug Capomulin at reducing the growth of SCC tumors. During the study's 45 day timeframe, the Capomulin treatment group incurred less mice deaths than the other groups, with the exception of the Ramicane group. The Capomulin group also completed the study with, on average, lower tumor volumes (mm3) than the other groups, again with the exception of the Ramicane group which nearly mirrored the final tumor volumes of the Capomulin mice. When the average tumor volume (mm3) of "Mouse s185" was measured over the course of the study, the tumor consistently reduced in size as the mouse received the Capomulin drug. Finally, the average tumor volume (mm3) of the Capomulin mice correlated strongly with their weight.
Data
References
- Python - Pandas, Matplotlib, SciPy.stats
Kiran Rangaraj - 2021
- LinkedIn: @Kiran Rangaraj