This repository contains statistical analyses conducted on various datasets related to medical and health factors. The analyses include Spearman correlation, Chi-square test, and Linear Regression to explore relationships and predictive models related to heart attacks and other medical conditions.
Dataset Description: The dataset used for correlation analysis contains characteristics related to heart attacks or factors contributing to heart attacks.
Results: Based on Spearman correlation analysis, the following relationships were observed:
- Strong Positive Correlation: Pressurelow and Pressurehight. Higher Pressurelow correlates strongly with higher Pressurehight.
- Strong Positive Correlation: Troppin and Class. Higher Troppin values correlate strongly with higher Class values.
- Weak Positive Correlation: Age with Pressurehight and Tropponin. Weak negative correlation between Age and Kcm, indicating older age correlates with higher Pressurehight and Tropponin levels, and lower Kcm.
- Weak Positive Correlation: Gender with Pressurehight and Tropponin.
- Weak Positive Correlation: Impulse with Pressurelow.
- Weak Positive Correlation: Kcm with Glucose. Weak negative correlation between Kcm and Troppin.
Dataset Description: The dataset includes symptoms and patient profiles across various diseases.
Result Interpretations:
- Fever, Cough, Fatigue: Acceptance of H0 (p-value > 0.05) indicates no significant relationship with Disease.
- Difficulty Breathing: Rejection of H0 (p-value < 0.05) indicates a significant relationship with Disease.
- Cough, Difficulty Breathing: Acceptance of H0 (p-value > 0.05) suggests no significant relationship with Diagnosis (Outcome Variable).
- Fever, Fatigue: Rejection of H0 (p-value < 0.05) indicates a significant relationship with Diagnosis.
- Age, Gender: Acceptance of H0 (p-value > 0.05) indicates no significant relationship with Disease.
- Blood Pressure, Cholesterol Level: Acceptance of H0 (p-value > 0.05) indicates no significant relationship with Disease. Rejection of H0 (p-value < 0.05) indicates a significant relationship with Diagnosis.
Dataset Description: The dataset consists of features aimed at predicting patients at high risk of heart disease.
Results: The linear regression model demonstrates statistical significance (F-statistic = 17.65, p-value < 0.05) and explains a moderate portion of variance in Heart Disease (R-squared = 53.2%). Independent variables collectively contribute to predicting Heart Disease.
This repository provides insights into statistical analyses using Spearman correlation, Chi-square test, and Linear Regression methods on medical datasets. These analyses enhance understanding of relationships between medical factors and conditions, facilitating informed decision-making in healthcare contexts.