Data Analysis

This repo contains various projects that involve data analysis and modeling.

  1. Inferential Data Analysis

    • Statistics and Data Visualisation
      • Code for bar plots, histograms, box plots and scatterplots
      • Hypothesis testing on continuous variables using the Welch two-sample t-test
    • Linear Regression
      • Code to train a linear regression model on collected survey data
      • Analysis of the model based on significance values and diagnostic plots
    • Logistic Regression
      • Code for logistic regression models, including an interaction term
      • Analysis and interpretation of results, model fit and visualization
  2. Projects

    • Atlanta Urban Networks HIV Transmission Study
      • Code for network analysis of HIV transmission in the Atlanta urban population, a metastudy from 1988-2001
      • A stochastic model without covariates was developed to estimate the probability of a connection based on group membership
      • The network was created based on mode of connection and sex
  3. Statistical Analysis of Networks

    • Network Visualization and Partitioning
      • Code for network data visualization on the UKfaculty data
      • The graph is also partitioned using a spectral method and the Girvan-Newman algorithm
    • Network Edges and Relationship
      • Code for network edge and spillover effects
      • Examination of bias in Horvitz-Thompson estimators for population mean outcomes
    • Exponential Random Graph Models (ERGMs)
      • Code for fitting an ERGM on the faux.mesa.high dataset
      • Interpretation of the model and estimates obtained
    • Stochastic Block Model (SBM)
      • Code for fitting an SBM on the lazega dataset, with and without covariates
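
Example: connection probabilities from group membership

Several entries above estimate the probability of a connection from group membership alone (the model without covariates). The sketch below illustrates that idea and is not the code in this repo: it assumes an undirected simple graph with known group labels, and estimates each between-group and within-group probability as observed edges divided by possible pairs. The function name, input format and toy data are all hypothetical.

```python
from collections import Counter


def block_connection_probabilities(edges, membership):
    """Estimate P(edge) for each pair of groups in an undirected simple graph.

    edges: iterable of (u, v) node pairs (hypothetical input format)
    membership: dict mapping each node to its group label
    """
    # Number of nodes in each group.
    sizes = Counter(membership.values())

    # Number of observed edges for each unordered pair of groups.
    observed = Counter()
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue  # ignore self-loops and duplicate edges
        seen.add(key)
        pair = tuple(sorted((membership[u], membership[v])))
        observed[pair] += 1

    # Divide observed edges by the number of possible node pairs.
    probs = {}
    groups = sorted(sizes)
    for i, a in enumerate(groups):
        for b in groups[i:]:
            if a == b:
                possible = sizes[a] * (sizes[a] - 1) // 2
            else:
                possible = sizes[a] * sizes[b]
            if possible:
                probs[(a, b)] = observed[(a, b)] / possible
    return probs


# Toy example with made-up data (not taken from any dataset in this repo).
membership = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
edges = [(1, 2), (2, 3), (1, 4), (3, 5)]
probs = block_connection_probabilities(edges, membership)
for pair, p in sorted(probs.items()):
    print(pair, round(p, 3))
```

With the toy data, group A has 3 possible within-group pairs and 2 observed edges, so the A-A estimate is 2/3. Models with covariates, or with unknown group labels, need a proper fitting routine rather than these simple ratios.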