Silicon Valley Team 7 daily repo

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0)

Kumamon

Silicon Valley Team 7 project repo

1st Project: Web Crawler via Scrapy

Pacing

[2016/02/08 - 2016/02/14]
First Stage: Create a Scrapy project to crawl the content of the Xiaomi Appstore homepage (or any other app store homepage).
[2016/02/15 - 2016/02/21]
Second Stage: Save the crawled content in MongoDB[2]. Install the Python MongoDB driver (PyMongo) and modify pipelines.py to insert the crawled data into MongoDB.
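One possible shape for the pipeline in pipelines.py is sketched below. The class name, connection URI, database name, and collection name are all placeholders, and the sketch assumes a MongoDB server is reachable at that URI.

```python
class MongoPipeline:
    """Scrapy item pipeline that stores every crawled item in MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="appstore", collection_name="apps"):
        # All three defaults are placeholders for illustration.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the connection.
        # Imported here so the class can be exercised with a stand-in
        # collection on a machine without the driver.
        import pymongo

        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Called once when the spider finishes: release the connection.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Store a copy so the "_id" key that insert_one adds to the
        # stored document does not leak into the item passed on.
        self.collection.insert_one(dict(item))
        return item
```

Scrapy only calls the pipeline once it is listed in the `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.MongoPipeline": 300}`, where `myproject` stands in for the actual project name.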
[2016/02/22 - 2016/02/29]
Third Stage: Crawl more content by following next-page links. So far you have likely crawled only the home page. If the next-page link is generated by JavaScript, use Splash[3] and ScrapyJS[4] to render the page so that the dynamic part becomes static content that Scrapy can parse.
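To get a feel for what the rendering step does, Splash can also be called directly over its HTTP API, without ScrapyJS. The sketch below uses only the standard library and assumes a Splash instance is running locally on its default port 8050; the helper names are made up for illustration. ScrapyJS connects the same service to Scrapy requests, so this is a way to inspect the rendered HTML by hand, not a replacement for it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splash_render_url(page_url, splash_base="http://localhost:8050", wait=0.5):
    """Build the URL of Splash's render.html endpoint for page_url.

    `wait` is the number of seconds Splash waits after the page loads,
    which gives the page's JavaScript time to run.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, **kwargs):
    """Return the page's HTML after Splash has executed its JavaScript."""
    with urlopen(splash_render_url(page_url, **kwargs)) as resp:
        # Assumes the response body is UTF-8 encoded.
        return resp.read().decode("utf-8")
```

If the next-page link shows up in the HTML returned by `fetch_rendered_html` but not in the raw page source, the link is JavaScript-generated and the spider needs the Splash route.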
Bonus Round

  1. Pull the results from MongoDB and show them in the browser via Flask
  2. Multiprocessing (TBD)
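For the first bonus item, a minimal sketch of a Flask view over a MongoDB collection might look like this. The route, the function names, and the 50-document cap are assumptions; the only real subtlety is that MongoDB's `_id` values are not JSON-serializable and have to be converted first.

```python
def to_json_safe(doc):
    """Copy a MongoDB document, turning its "_id" value into a string."""
    safe = dict(doc)
    if "_id" in safe:
        safe["_id"] = str(safe["_id"])
    return safe


def create_app(collection):
    """Build a Flask app that lists documents from a pymongo collection."""
    # Local import: Flask is only needed when the app is actually built.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/apps")  # hypothetical route
    def list_apps():
        docs = collection.find().limit(50)  # cap the response size
        return jsonify([to_json_safe(d) for d in docs])

    return app
```

With PyMongo installed, something like `create_app(MongoClient()["appstore"]["apps"]).run()` would serve the list at `/apps` on Flask's development server; the database and collection names here are placeholders.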

What is next?

  1. 1st project - Crawler (Python)
  2. 2nd project - Recommender (Python / Spark)
  3. 3rd project - Website (Meteor/React)

Learn programming via projects

Nowadays we spend a lot of time getting a good grasp of data structures and algorithms by solving problems on CC, LC and GFG. But that alone probably will not get us a good result in the job search, since the CS job market is so hot that there are many competitors...

Quality beats quantity. Instead of going through a lot of questions, make the best use of your knowledge to build a product. After some practice you can easily extend what you learned to similar problems (this is what employers look for: your problem-solving ability).