- Implemented a data pipeline that monitors, scrapes, and deduplicates the latest news (MongoDB, Redis, RabbitMQ).
- Designed data monitors that pick up the latest news from popular websites and pass it on to the web server.
- Built news scrapers that fetch article content from the original news websites.
- Built dedupers that filter out duplicate news, using TF-IDF (NLP) to measure the similarity of articles scraped from news websites.
- Used TensorFlow to train a machine learning model that shows news according to users' interests. Built a single-page web application.
- Implemented a data visualization system (D3.js).
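
The TF-IDF deduplication step mentioned above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `dedupe`), the smoothed IDF formula, and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Smoothed IDF (an assumed variant) so that no term gets a zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe(articles, threshold=0.8):
    """Keep an article only if it is not too similar to any article already kept.

    The threshold is a hypothetical value; a real system would tune it.
    """
    vectors = tfidf_vectors(articles)
    kept = []  # indices of articles retained so far
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [articles[i] for i in kept]


if __name__ == "__main__":
    news = [
        "Stocks rally as tech shares surge on strong earnings",
        "Stocks rally as tech shares surge on strong earnings today",
        "Local team wins championship after dramatic overtime",
    ]
    for article in dedupe(news):
        print(article)
```

The greedy pass compares each article against those already kept, which is quadratic in the worst case; that is acceptable for small batches, and a production pipeline would more likely rely on a library vectorizer and an index.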