Parallel-self-learning-model

This is a project to implement a machine learning system that can self-learning by collecting data from google using crawler. This system can be split into three major stages, the crawler stage, feature extraction stage, and SVM modeling stage. Each stage takes a long time, so we using lots of parallel methods to speed up them. Finally, we get approximately 85 times speedup over the serial program.

Hardware

CPU: Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz
Core: 2 * (12 cores 24 threads)
GPU: RTX 2080 Ti 12GB

OS:

CentOS 8

Reproducing implementation

To reproduct our implementation without do the following steps:

Installation
Dataset Preparation
Train models
Make Submission

Installation

Using Anaconda is strongly recommended.

Build environment

conda create -n pp_smls python=3.6
conda activate pp_smls
git clone https://github.com/vbnmzxc9513/Parallel-self-learning-model.git
pip install -r requirement.txt

yen-junyu/Parallel-self-learning-model

Parallel-self-learning-model

Hardware

Reproducing implementation

Installation

Build environment

Dataset Preparation

Train model

Modification guide