Data-Science-Internship-Projects

1. Spatial Data Science For Covid-19 Disease Prediction [Link]

Data Set Preview: [Link]

| id | case_in_country | reporting date | summary | location | country | gender | age | symptom_onset | If_onset_approximated | hosp_visit_date | exposure_start | exposure_end | visiting Wuhan | from Wuhan | death | recovered | symptom | source | link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 765 | 15 | 02-10-20 | new confirmed COVID-19 ... | Vinh Phuc | Vietnam | NA | 0.25 | NA | NA | NA | NA | NA | 0 | 0 | 0 | 1 | | Vietnam News | https://vietnamnews.vn/society/591803/viet-nam-confirms-9th-coronavirus-case-hong-kong-reports-first-death-from-infection.html |
| 477 | 27 | 02-05-20 | new confirmed COVID-19 ... | Singapore | Singapore | male | 0.5 | NA | NA | NA | 1/23/2020 | 02-03-20 | 0 | 0 | 0 | 1 | | Straits Times | https://www.straitstimes.com/singapore/health/coronavirus-4-more-confirmed-cases-in-singapore-28-cases-so-far |
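
The preview rows mix two date styles (`02-10-20` and `1/23/2020`), so date columns need normalizing before any time-based analysis. The following is a minimal sketch of one way to do that with the standard library. The helper name `parse_date` is hypothetical, and month-first ordering is an assumption that the source does not state.

```python
from datetime import date, datetime

# Candidate formats, tried in order. Month-first ordering is an assumption.
DATE_FORMATS = ("%m-%d-%y", "%m/%d/%Y")

def parse_date(text):
    """Return a date for a recognized format, or None for NA or unparseable input."""
    text = text.strip()
    if text in ("", "NA"):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

# Exposure window of the second preview row (id 477).
exposure_start = parse_date("1/23/2020")
exposure_end = parse_date("02-03-20")
print((exposure_end - exposure_start).days)  # 11
```

Returning `None` for `NA` keeps missing dates distinguishable from parsed ones, which matters here because most date fields in the preview are empty.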

Analysis:


2. Parkinson’s Disease Prediction – XGBoost Classifier [Link]

Data Set Preview: [Link]

name MDVP:Fo(Hz) MDVP:Fhi(Hz) MDVP:Flo(Hz) MDVP:Jitter(%) MDVP:Jitter(Abs) MDVP:RAP MDVP:PPQ Jitter:DDP MDVP:Shimmer MDVP:Shimmer(dB) Shimmer:APQ3 Shimmer:APQ5 MDVP:APQ Shimmer:DDA NHR HNR status RPDE DFA spread1 spread2 D2 PPE
phon_R01_S01_1 119.992 157.302 74.997 0.00784 0.00007 0.0037 0.00554 0.01109 0.04374 0.426 0.02182 0.0313 0.02971 0.06545 0.02211 21.033 1 0.414783 0.815285 -4.813031 0.266482 2.301442 0.284654
phon_R01_S01_2 122.4 148.65 113.819 0.00968 0.00008 0.00465 0.00696 0.01394 0.06134 0.626 0.03134 0.04518 0.04368 0.09403 0.01929 19.085 1 0.458359 0.819521 -4.075192 0.33559 2.486855 0.368674

Analysis:


3. House Price Prediction Using Random Forest Regression [Link]

Data Set Preview: [Link]

Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities LotConfig LandSlope Neighborhood Condition1 Condition2 BldgType HouseStyle OverallQual OverallCond YearBuilt YearRemodAdd RoofStyle RoofMatl Exterior1st Exterior2nd MasVnrType MasVnrArea ExterQual ExterCond Foundation BsmtQual BsmtCond BsmtExposure BsmtFinType1 BsmtFinSF1 BsmtFinType2 BsmtFinSF2 BsmtUnfSF TotalBsmtSF Heating HeatingQC CentralAir Electrical 1stFlrSF 2ndFlrSF LowQualFinSF GrLivArea BsmtFullBath BsmtHalfBath FullBath HalfBath BedroomAbvGr KitchenAbvGr KitchenQual TotRmsAbvGrd Functional Fireplaces FireplaceQu GarageType GarageYrBlt GarageFinish GarageCars GarageArea GarageQual GarageCond PavedDrive WoodDeckSF OpenPorchSF EnclosedPorch 3SsnPorch ScreenPorch PoolArea PoolQC Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition SalePrice
1 60 RL 65 8450 Pave NA Reg Lvl AllPub Inside Gtl CollgCr Norm Norm 1Fam 2Story 7 5 2003 2003 Gable CompShg VinylSd VinylSd BrkFace 196 Gd TA PConc Gd TA No GLQ 706 Unf 0 150 856 GasA Ex Y SBrkr 856 854 0 1710 1 0 2 1 3 1 Gd 8 Typ 0 NA Attchd 2003 RFn 2 548 TA TA Y 0 61 0 0 0 0 NA NA NA 0 2 2008 WD Normal 208500
2 20 RL 80 9600 Pave NA Reg Lvl AllPub FR2 Gtl Veenker Feedr Norm 1Fam 1Story 6 8 1976 1976 Gable CompShg MetalSd MetalSd None 0 TA TA CBlock Gd TA Gd ALQ 978 Unf 0 284 1262 GasA Ex Y SBrkr 1262 0 0 1262 0 1 2 0 3 1 TA 6 Typ 1 TA Attchd 1976 RFn 2 460 TA TA Y 298 0 0 0 0 0 NA NA NA 0 5 2007 WD Normal 181500

Analysis:


4. Home Loan Prediction Using Decision Tree Classifier [Link]

Data Set Preview: [Link]

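| Loan_ID | Gender | Married | Dependents | Education | Self_Employed | ApplicantIncome | CoapplicantIncome | LoanAmount | Loan_Amount_Term | Credit_History | Property_Area | Loan_Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LP001002 | Male | No | 0 | Graduate | No | 5849 | 0 | | 360 | 1 | Urban | Y |
| LP001003 | Male | Yes | 1 | Graduate | No | 4583 | 1508 | 128 | 360 | 1 | Rural | N |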

Analysis:


Data Set Preview: [Link]

| v1 | v2 |
| --- | --- |
| ham | Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat... |
| ham | Ok lar... Joking wif u oni... |
| spam | Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's |

Analysis:

  • K-Nearest Neighbors Accuracy: 94.19%
  • Decision Tree Accuracy: 97.20%
  • Random Forest Accuracy: 98.35%
  • Logistic Regression Accuracy: 98.56%

6. Hand-Written Digit Recognition Using CNN [Link]

Data Set Preview: [Link]


Analysis:

Test loss: 0.7878
Test accuracy: 0.8400

7. Churn Prediction Using Tensorflow [Link]

Data Set Preview: [Link]

RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
1 15634602 Hargrave 619 France Female 42 2 0 1 1 1 101348.88 1
2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0

Analysis:

Model: "sequential_1"

Layer (type) Output Shape Param#
dense_3 (Dense) (None, 6) 72
dense_4 (Dense) (None, 6) 42
dense_5 (Dense) (None, 1) 7
Total params: 121
Trainable params: 121
Non-trainable params: 0
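
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit, so `(inputs + 1) * units`. The sketch below reproduces the three counts. The helper name `dense_params` is hypothetical, and the figure of 11 input features is inferred from `72 = (11 + 1) * 6` rather than stated in the source.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (n_inputs + 1) * n_units

# Units per layer as listed in the summary. The 11 input features are an
# assumption inferred from 72 = (11 + 1) * 6, not stated in the source.
n_features = 11
layer_units = [6, 6, 1]

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_params(n_in, units))
    n_in = units  # the output width of one layer feeds the next

print(counts, sum(counts))  # [72, 42, 7] 121
```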

Epoch -> loss: 0.3331 - accuracy: 0.8668

Confusion Matrix:

1471 98
211 220
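
The confusion matrix supports more than a single accuracy figure. The sketch below derives the usual metrics from the four counts. It assumes the scikit-learn layout (rows are actual classes, columns are predicted classes, class 0 first); the source does not label the matrix, so treat the precision and recall figures as conditional on that assumption.

```python
# Confusion matrix as printed above. Assumed layout (scikit-learn convention):
# rows are actual classes, columns are predicted classes, class 0 first.
tn, fp = 1471, 98
fn, tp = 211, 220

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of customers predicted to exit, share that did
recall = tp / (tp + fn)     # of customers that exited, share that was caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

Accuracy works out to 1691/2000 = 0.8455 regardless of layout. Under the assumed layout, recall on the exited class is only about 0.51, so the model misses roughly half of the churners even though overall accuracy looks high.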

Data Set Preview: [Link]

| Invoice | StockCode | Description | Quantity | InvoiceDate | Price | Customer ID | Country |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 489434 | 85048 | 15CM CHRISTMAS GLASS BALL 20 LIGHTS | 12 | 12/1/09 7:45 | 6.95 | 13085 | United Kingdom |
| 489434 | 79323P | PINK CHERRY LIGHTS | 12 | 12/1/09 7:45 | 6.75 | 13085 | United Kingdom |
| 489434 | 79323W | WHITE CHERRY LIGHTS | 12 | 12/1/09 7:45 | 6.75 | 13085 | United Kingdom |
| 489434 | 22041 | RECORD FRAME 7" SINGLE SIZE | 48 | 12/1/09 7:45 | 2.1 | 13085 | United Kingdom |
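
The analysis for this dataset is not shown here. As an illustration only, a common first step with transactional rows like these is to compute a line total (`Quantity` times `Price`) and sum it per invoice. The sketch below does that for the four preview rows; the variable names are hypothetical and `Decimal` is used to avoid floating-point rounding on prices.

```python
from collections import defaultdict
from decimal import Decimal

# The four preview rows, keeping only the fields used here.
rows = [
    {"Invoice": "489434", "StockCode": "85048", "Quantity": 12, "Price": "6.95"},
    {"Invoice": "489434", "StockCode": "79323P", "Quantity": 12, "Price": "6.75"},
    {"Invoice": "489434", "StockCode": "79323W", "Quantity": 12, "Price": "6.75"},
    {"Invoice": "489434", "StockCode": "22041", "Quantity": 48, "Price": "2.1"},
]

# Sum Quantity * Price per invoice; Decimal() starts each total at zero.
invoice_totals = defaultdict(Decimal)
for row in rows:
    invoice_totals[row["Invoice"]] += row["Quantity"] * Decimal(row["Price"])

for invoice, total in invoice_totals.items():
    print(invoice, total)
```

For the preview rows this gives a single invoice, 489434, with a total of 346.20.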