- Pri Oberoi, Data Scientist, Commerce Data Service
- Star Ying, Data Scientist, Commerce Data Service
This is a quick introduction to data science and a short example of topic clustering using the National Institute of Standards and Technology news feed.
To follow the example in the workshop, Python 2.7 and pip are required. Follow these steps to get started:
- You can use `sudo easy_install pip` or `brew install python` to install pip.
- Clone or download a copy of this repo to your local machine.
- Install the required packages through pip with this command: `pip install -r requirements.txt`.
- Open a local Jupyter notebook instance with this command: `jupyter-notebook <dir_of_cloned_repo>`.
- An instance of Jupyter should launch in your default browser. Open kMeansClustering.ipynb.
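
If the notebook does not open as expected, a quick sanity check of the setup can help narrow down the problem. The following is a minimal sketch, not part of the workshop materials: the helper name `check_setup` is hypothetical, and it assumes that `requirements.txt` and `kMeansClustering.ipynb` both sit at the top level of the cloned repo.

```python
import importlib
import os
import sys


def check_setup(repo_dir, modules=("pip",)):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper. Assumes requirements.txt and kMeansClustering.ipynb
    live at the top level of repo_dir.
    """
    problems = []

    # Files the getting-started steps rely on.
    for filename in ("requirements.txt", "kMeansClustering.ipynb"):
        if not os.path.isfile(os.path.join(repo_dir, filename)):
            problems.append("missing file: " + filename)

    # Modules that should be importable by the current interpreter.
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            problems.append("module not importable: " + name)

    return problems


if __name__ == "__main__":
    # Usage: python check_setup.py <dir_of_cloned_repo>
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    print("Python version: {}".format(sys.version.split()[0]))
    for problem in check_setup(repo):
        print("PROBLEM: {}".format(problem))
```

Saved under a name of your choosing (for example `check_setup.py`) and run with the path to the cloned repo as its argument, the script prints the interpreter version followed by one line per problem it finds. No problem lines means the files and modules it looks for are in place.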