This repository contains the initiation project I developed to join the JEST 2020/1 IT team.
The project has two parts: an optional introductory part (part 1) and a mandatory part (part 2).
Use Python 3.7 or higher.
Install the required packages with pip install -r requirements.txt
├── 1) Predict Pulsar star [2-5] # Classify stars as Pulsar or non-Pulsar
├── 2) Leukemia Detect [1, 4, 5] # Classify patients as having leukemia or not
This is a classification problem: each star in the dataset must be classified as Pulsar or non-Pulsar. The sample data has a total of 17,898 entries (rows) and 8 features (columns).
Note: The dataset is unbalanced (non-Pulsar: 16259; Pulsar: 1639).
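With roughly nine non-Pulsar stars for every Pulsar, plain accuracy can look good even for a model that never predicts the minority class. The sketch below is only an illustration, not the code used in this repository: it rebuilds a label vector from the class counts quoted above, computes the accuracy of a majority-class baseline, and derives "balanced" class weights with the common formula n_samples / (n_classes * class_count). The function names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of a model that always predicts the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Label vector rebuilt from the counts in the note above
# (assumed encoding: 0 = non-Pulsar, 1 = Pulsar).
labels = [0] * 16259 + [1] * 1639

weights = balanced_class_weights(labels)
baseline = majority_baseline_accuracy(labels)

print(f"majority-class baseline accuracy: {baseline:.3f}")
print(f"balanced class weights: {weights}")
```

Any classifier should be compared against that baseline, preferably with metrics such as recall or F1 on the Pulsar class. In scikit-learn, the same weighting is available by passing class_weight="balanced" to estimators such as RandomForestClassifier.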
This is also a classification problem: each patient in the dataset must be classified as having leukemia (1) or not (0). The sample data has a total of 178 entries (rows) and 186 features (columns).
Note 1: Only the first 128 entries are labeled; I use this slice for training and testing. The remaining 50 unlabeled entries are used for prediction.
Note 2: Again, the dataset is unbalanced (0: 111 patients; 1: 17 patients).
Note 3: Feature selection techniques were applied.
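The notes above describe two steps that are easy to get wrong on such a small dataset: separating the 128 labeled rows from the 50 unlabeled ones, and splitting the labeled rows so that both classes appear in train and test. The following is a minimal sketch using only the standard library. The placeholder data, the function names, and the 25% test fraction are assumptions for illustration, not values taken from this project.

```python
import random
from collections import defaultdict

N_LABELED = 128  # from Note 1: only the first 128 rows carry a label


def split_labeled(rows, n_labeled=N_LABELED):
    """Return (labeled_rows, unlabeled_rows) by position."""
    return rows[:n_labeled], rows[n_labeled:]


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample of every class in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder data with the shape described above: 178 rows, 128 labels.
rows = list(range(178))
labels = [0] * 111 + [1] * 17  # class counts from Note 2; real row order is unknown

labeled, unlabeled = split_labeled(rows)
train_idx, test_idx = stratified_split(labels)

print(len(labeled), len(unlabeled), len(train_idx), len(test_idx))
```

With only 17 positive patients, a non-stratified split could leave very few positives in the test set, which is why the split is done per class. scikit-learn's train_test_split offers the same behaviour through its stratify argument.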
Project 1: Predict Pulsar
Project 2: Leukemia Detect - Disabled
Tamagusko, T. (2020). Initiation Project JEST 2020/1. Retrieved from https://github.com/tamagusko/jest20201
@misc{TamaguskoJest20201,
author = {Tamagusko, Tiago},
title = {Initiation Project JEST 2020/1},
year = {2020},
url = {https://github.com/tamagusko/jest20201}
}
[1]
Dataset to support the study. (2020, May 31).
Retrieved from https://github.com/spingegod/ProjetoTI_part2
[2]
Dataset to support the study. (2020, Jun 03).
Retrieved from https://github.com/spingegod/ProjetoTI_part1
[3]
Kaggle (2021).
Predicting a Pulsar Star (2021, Apr 15).
Retrieved from https://www.kaggle.com/colearninglounge/predicting-pulsar-starintermediate
[4]
van Rossum, G. (1995).
Python tutorial, May 1995.
CWI Report CS-R9526, 1–65.
[5]
Breiman, L. (2001).
Random forests. Machine Learning, 45(1), 5–32.
https://doi.org/10.1023/A:1010933404324
Please direct bug reports and pull requests to the GitHub page. To contact me directly, send an email to tamagusko@gmail.com.
-- Tiago
CC-BY-NC-ND-4.0 (c) 2020, Tiago Tamagusko.