MichielOHerne/BigData5

To Do's for 5 Dec

5-minute presentation:

  • Our subject
  • How we plan to achieve our goals
  • What we have done so far
  • What we have yet to do

To do's for next week:

  • Finish the dataset importing code - Tim
  • Create a new dataset maker (NLTK) - Casper
  • Try setting up a sentiment analyser - Michiel
  • Draft a rough presentation - Michiel

Presentation:
Our subject:

  • Tweet analysis
    • Getting the data into a usable format
    • Analysing the data
    • Presenting the analysis in a logical manner

What we have done so far:

  • Obtained a Twitter “spritzer” dataset from the year 2011
    • Spritzer: a random 1% of all tweets
  • Wrote a piece of code to extract the tweets (along with username etc.) from the data
  • (Jokingly-ish: failed to get Apache working with the Twitter data stream)
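A minimal sketch of what the extraction step could look like. It assumes the spritzer dump is stored as one JSON object per line, with the tweet body under "text" and the username under "user" / "screen_name"; the actual format of our dataset may differ, and the function name `extract_tweets` is only illustrative.

```python
import json


def extract_tweets(lines):
    """Yield (username, text) pairs from an iterable of JSON lines.

    Assumption: one JSON object per line, using the "text" and
    "user" -> "screen_name" fields. Adjust to the real dataset layout.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the import
        if not isinstance(record, dict) or "text" not in record:
            continue  # skip records that are not tweets (e.g. delete notices)
        user = record.get("user") or {}
        yield user.get("screen_name"), record["text"]


# Hypothetical sample lines, made up for illustration only.
sample = [
    '{"text": "Hello #bigdata", "user": {"screen_name": "alice"}}',
    '{"delete": {"status": {"id": 1}}}',
    "not json",
]
tweets = list(extract_tweets(sample))
```

Because the function takes any iterable of lines, it can be fed an open file handle directly, so the whole dump never has to be held in memory.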

What we have yet to do:

  • Create a filter that splits the datasets into subsets containing only certain topics, based on hashtags or words used
  • Create visualisers, or import existing ones, to display the tweet analysis
    • Positive/negative sentiment
    • Popularity of given topics
    • Location data per topic
  • Optional: move from pre-built Twitter datasets to a live stream of tweets as input
    • Limiting factor: the maximum number of tweets obtainable per 24 hours
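
The topic filter and the "popularity of given topics" count could be sketched roughly as follows. This is only one possible approach, assuming tweets are available as plain text strings; the helper names (`hashtags`, `filter_by_topic`, `hashtag_popularity`) and the sample tweets are made up for illustration.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"\w+")


def hashtags(text):
    """Return the set of lowercased hashtags (without '#') in a tweet."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def matches_topic(text, topic_hashtags=(), topic_words=()):
    """True if the tweet uses any of the given hashtags or words."""
    wanted_tags = {t.lower() for t in topic_hashtags}
    wanted_words = {w.lower() for w in topic_words}
    words = set(WORD_RE.findall(text.lower()))
    return bool(hashtags(text) & wanted_tags) or bool(words & wanted_words)


def filter_by_topic(texts, topic_hashtags=(), topic_words=()):
    """Keep only the tweets that match the topic, preserving order."""
    return [t for t in texts if matches_topic(t, topic_hashtags, topic_words)]


def hashtag_popularity(texts):
    """Count in how many tweets each hashtag appears."""
    counts = Counter()
    for text in texts:
        counts.update(hashtags(text))
    return counts


# Hypothetical sample tweets, made up for illustration only.
texts = [
    "Loving the new phone #tech",
    "Rainy day in Delft",
    "Big #Tech news today",
    "phone battery died again",
]
tech_tweets = filter_by_topic(texts, topic_hashtags={"tech"}, topic_words={"phone"})
popularity = hashtag_popularity(texts)
```

Matching is case-insensitive and works on whole words, so "phone" does not match "phones"; stemming (for example via NLTK) would be a possible later refinement.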