Instructions


System hardware and OS:

Windows Server 2016 Datacenter
Processor: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz 2.29 GHz
RAM: 32.0 GB
System type: 64-bit operating system, x64-based processor

Anaconda version: Anaconda2 5.0.0 64-bit

Python version: Python 3.5.4

Package requirements

numpy == 1.15.4
pandas == 0.20.3
matplotlib == 2.1.0
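csv == 1.0 (standard library)
os (standard library)
time (standard library)
sklearn == 0.20.2 (installed as scikit-learn)
pickle (standard library)
imblearn == 0.4.3 (installed as imbalanced-learn)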
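
A quick way to confirm that an environment matches the pinned versions above is to compare each module's `__version__` attribute against the list. The following is a minimal sketch, not part of the original notebooks; the names `PINNED` and `check_versions` are hypothetical, and it assumes each third-party package exposes `__version__`.

```python
import importlib

# Pinned versions copied from the list above; keys are import names.
PINNED = {
    "numpy": "1.15.4",
    "pandas": "0.20.3",
    "matplotlib": "2.1.0",
    "sklearn": "0.20.2",
    "imblearn": "0.4.3",
}


def check_versions(pinned):
    """Return {module: (expected, found, status)} for each pinned module.

    status is "ok", "mismatch", or "missing" (module could not be imported).
    """
    report = {}
    for name, expected in pinned.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (expected, None, "missing")
            continue
        # Assumption: the package exposes its version as __version__.
        found = getattr(module, "__version__", None)
        status = "ok" if found == expected else "mismatch"
        report[name] = (expected, found, status)
    return report


if __name__ == "__main__":
    for name, (expected, found, status) in sorted(check_versions(PINNED).items()):
        print("{}: expected {}, found {} -> {}".format(name, expected, found, status))
```

A "mismatch" does not necessarily mean the notebooks will fail, only that results may differ from those produced with the versions listed here.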



Dataset

Problem 1: Bank Marketing Data Set (bank.csv)
Original URL: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
my GitHub repository: https://github.com/LUSAQX/Assignment1/tree/master/Data
training dataset: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/trainset_p1.csv
test dataset: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/testset_p1.csv
training feature set after resampling: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/X_train_smote_p1.csv
training target class after resampling: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/y_train_smote_p1.csv
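
The "smote" file names above suggest the training set was rebalanced with SMOTE (imblearn provides this as the `SMOTE` class with a `fit_resample` method). To illustrate the core idea only, here is a pure-Python sketch that creates synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbours. It is not imblearn's implementation and not necessarily what Problem_1.ipynb does; `smote_sketch` and its parameters are hypothetical, and it assumes all features are numeric (categorical columns would need encoding first).

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of numeric minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Toy minority class: the four corners of the unit square.
    corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_sketch(corners, n_new=3, k=2):
        print(point)
```

Only the training split is resampled in this scheme; the test set keeps its original class balance so that evaluation reflects the real distribution.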

Problem 2: Large Movie Review Dataset (aclImdb)
Original URL: http://ai.stanford.edu/~amaas/data/sentiment/
training dataset: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/trainset_p2.csv
test dataset: https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/testset_p2.csv
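
The dataset links above are GitHub "blob" pages, which return HTML rather than the CSV itself. A rough sketch of how the files might be read directly: rewrite each link to its raw.githubusercontent.com form, then pass it to pandas. The helpers `blob_to_raw` and `load_split` are hypothetical names, and the delimiter and header layout of the CSV files are not stated here, so any `read_csv` options are left to the caller.

```python
GITHUB_PREFIX = "https://github.com/"
RAW_PREFIX = "https://raw.githubusercontent.com/"

TRAIN_P1 = "https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/trainset_p1.csv"
TEST_P1 = "https://github.com/LUSAQX/Assignment1/blob/master/Data/train_test_split/testset_p1.csv"


def blob_to_raw(url):
    """Convert a github.com '/blob/' page URL into its raw-content URL."""
    if not url.startswith(GITHUB_PREFIX) or "/blob/" not in url:
        raise ValueError("not a GitHub blob URL: " + url)
    # "owner/repo" comes before "/blob/", "branch/path/to/file" after it.
    owner_repo, branch_and_path = url[len(GITHUB_PREFIX):].split("/blob/", 1)
    return RAW_PREFIX + owner_repo + "/" + branch_and_path


def load_split(url, **read_csv_kwargs):
    """Read one CSV split into a DataFrame (requires pandas and network access)."""
    import pandas as pd  # imported lazily so blob_to_raw works without pandas

    # Assumption: the file is a plain CSV; pass sep=... or header=... if it is not.
    return pd.read_csv(blob_to_raw(url), **read_csv_kwargs)
```

For example, `load_split(TRAIN_P1)` would fetch the Problem 1 training set; the same helper applies to the Problem 2 links.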


Code

Problem 1: https://github.com/LUSAQX/Assignment1/blob/master/Code/Problem_1.ipynb


Problem 2: https://github.com/LUSAQX/Assignment1/blob/master/Code/Problem_2.ipynb
Problem 2 (Neural Net): https://github.com/LUSAQX/Assignment1/blob/master/Code/Problem_2(Neural%20Net).ipynb