
Crawl, match, and explore data about jobs in Viet Nam.

Primary language: Jupyter Notebook

Job Center

@team:  Luong (greyhub)
        Nam (Namdv99)
        Minh (MinhPN101)
        Manh (htlmm99)

Main workflow

  1. Crawling
  2. Matching
  3. EDA
  4. Prediction
  5. Ranking
  6. Suggesting
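
The stages above can be read as functions chained over a list of job postings. The following is only a minimal sketch of that idea, not the code in `workflow/pipeline.py`: the names `crawl`, `match`, `rank`, and `run_pipeline`, the record fields, and the sample postings are all hypothetical and made up for illustration, and only three of the six stages are shown.

```python
from typing import Callable, Dict, List, Optional

Job = Dict[str, object]
Stage = Callable[[List[Job]], List[Job]]


def crawl(_: List[Job]) -> List[Job]:
    # Stand-in for the crawling stage: returns made-up sample postings
    # instead of requesting any real site.
    return [
        {"source": "site_a", "title": "Data Engineer ", "salary": 1500},
        {"source": "site_b", "title": "data engineer", "salary": 1700},
        {"source": "site_a", "title": "QA Tester", "salary": 900},
    ]


def match(jobs: List[Job]) -> List[Job]:
    # Stand-in for the matching stage: postings whose titles are equal
    # after trimming and lowercasing are merged into one entry.
    merged: Dict[str, Job] = {}
    for job in jobs:
        key = str(job["title"]).strip().lower()
        entry = merged.setdefault(
            key, {"title": key, "sources": [], "salaries": []}
        )
        entry["sources"].append(job["source"])
        entry["salaries"].append(job["salary"])
    return list(merged.values())


def rank(jobs: List[Job]) -> List[Job]:
    # Stand-in for the ranking stage: order merged entries by mean
    # salary, highest first.
    for job in jobs:
        salaries = job["salaries"]
        job["mean_salary"] = sum(salaries) / len(salaries)
    return sorted(jobs, key=lambda j: j["mean_salary"], reverse=True)


def run_pipeline(stages: List[Stage], jobs: Optional[List[Job]] = None) -> List[Job]:
    # Feed the output of each stage into the next one.
    data = list(jobs or [])
    for stage in stages:
        data = stage(data)
    return data


result = run_pipeline([crawl, match, rank])
for entry in result:
    print(entry["title"], entry["sources"], entry["mean_salary"])
```

Keeping every stage to the same list-in, list-out shape means further stages (EDA, prediction, suggesting) could be appended to the list passed to `run_pipeline` without changing the runner.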

Structure

.
├── app
│   ├── README.md
│   └── website
├── database
│   ├── data_center
│   ├── raw_data
│   └── temp_storage
├── README.md
└── workflow
    ├── config.py
    ├── crawler
    ├── exploratory
    ├── __init__.py
    ├── matching
    ├── pipeline.py
    ├── __pycache__
    ├── README.md
    ├── requirements.txt
    ├── scrapy.cfg
    ├── test_scrapy
    ├── timer.py
    └── utils.py

Requirements

  • Conda
  • Python
  • Scrapy

Guideline

  • Run pipeline & more
  • Run exploratory & more

Job domains

We focus on the domains below:

  • Topcv
  • MyWork
  • CareerBuilder
  • TimViecNhanh
  • ViecLam24h

References

  • Crawl with Scrapy
  • Schedule in Python
  • Matplotlib: Visualization with Python