pablobarbera/streamR

StreamR with mongoDB

Hi Pablo,

This is Cyrille (@Soc_Net_Intel on Twitter), from France. I am a father of three with a job, and I am going back to university for a PhD... or trying to, at least!

I need your help with your "streamR with MongoDB" repository, which I think could be an alternative solution for me.

For a few months I used Java code, run from an IDE, to stream Twitter data into a MongoDB database, and then used the rmongodb package to extract the tweets from the MongoDB collections. My main problem was that my Java code, based on the Twitter4J library, was incomplete, so the data I collected from Twitter was never complete.

Now that I have discovered R, especially for data analytics and machine learning, I would like to explore this possibility using your "streamR with MongoDB" repository.

I have tested your filterStream() function, with the format.twitter.date() function added inside it, but I have run into some problems:

  • only collecting tweets to a file works properly for me (small tests with a few tweets, using the track argument);
  • collecting tweets to a MongoDB database/collection does not work properly (it stops before a single tweet has been collected entirely); mongod and mongo are both running (mongod shows listening and established connections, mongo shows an established one); a specific port is also specified in the filterStream() call; R reports no alert from MongoDB;
  • collecting tweets to a file at the same time as collecting to a MongoDB database/collection does not work properly either (it stops before a single tweet has been collected entirely);
  • after a few test requests in RStudio, I get the "Exceeded connection limit for user" alert in the file; I wonder how I can check, through R, how many connection requests are in progress for my Twitter account.

I may have made a mistake or done something wrong with MongoDB, although it was working well with my Java code. Maybe you could help me solve this.

Cyrille.