/fcc-comment-analysis

Code to index and analyze FCC comments

Primary Language: Python

FCC Comment Analysis

This repository contains Python code that downloads FCC data and stores it in an Elasticsearch instance. An additional command tags and analyzes the data further.

After a first pass in a Jupyter Notebook, I used Kibana on AWS to do most of my digging.

To install the package and run tests:

$ pip install -e .
$ python setup.py test

To crawl the comments, make sure you have an Elasticsearch server set up, and then run:

$ fcc index --endpoint=http://localhost:9200

This will take anywhere from 2 to 4 hours (or won't work at all if the API is down).
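
Since the crawl runs for hours, it can be worth confirming that the endpoint answers before starting. The following is a minimal sketch, not part of this package: the helper name `elasticsearch_is_reachable` is hypothetical, and it assumes an unsecured Elasticsearch node whose root URL returns a JSON object with a `version` key.

```python
import json
import urllib.error
import urllib.request


def elasticsearch_is_reachable(endpoint, timeout=5.0):
    """Return True if `endpoint` answers with Elasticsearch-style JSON.

    Hypothetical helper, not part of the fcc package. A node that
    requires authentication will answer with an HTTP error and is
    reported here as unreachable.
    """
    try:
        with urllib.request.urlopen(endpoint, timeout=timeout) as response:
            info = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, bad URL, or non-JSON body.
        return False
    # The root endpoint of an Elasticsearch node reports a "version" object.
    return isinstance(info, dict) and "version" in info


if __name__ == "__main__":
    url = "http://localhost:9200"
    state = "reachable" if elasticsearch_is_reachable(url) else "not reachable"
    print(f"{url} is {state}")
```

This only checks the Elasticsearch side. It says nothing about whether the FCC API itself is up.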

I then take another pass on the data, appending "analysis" variables to all of the documents. This makes it a lot easier to spot trends in Kibana.

To analyze the comments:

$ fcc analyze --endpoint=http://localhost:9200
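
To give a feel for what appending "analysis" variables to a document means, here is an illustrative sketch. It is not the code behind `fcc analyze`: the function name `add_analysis`, the `text` field, and the two derived variables (`word_count`, `is_all_caps`) are all assumptions made up for this example.

```python
def add_analysis(document, text_field="text"):
    """Return a copy of `document` with an added "analysis" dict.

    Hypothetical example: the field names and derived variables are
    illustrative and do not reflect the real index mapping.
    """
    text = document.get(text_field) or ""
    analysis = {
        # Number of whitespace-separated tokens in the comment.
        "word_count": len(text.split()),
        # True when every cased character in the comment is uppercase.
        "is_all_caps": text.isupper(),
    }
    tagged = dict(document)  # shallow copy, so the input is left untouched
    tagged["analysis"] = analysis
    return tagged


if __name__ == "__main__":
    comment = {"text": "Please keep the current rules in place."}
    print(add_analysis(comment)["analysis"])
```

Storing derived values like these on each document is what lets Kibana filter and aggregate on them directly.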