carlomazzaferro/kryptoflow

How to leverage reddit news data?

Opened this issue · 2 comments

Hi Carlo,

I have been trying to deploy the repo, and I'm curious about one thing:
so far, are we only using Kafka to save the news stream data, without training the network on it?

Since the Reddit data has no sentiment labels (positive/negative) or other feature labels,
we are only using "polarity" and "sentence_count" as information.
But polarity doesn't seem to help much.
We might get more information out of the text with an embedding approach such as doc2vec or fastText.
What are your thoughts on leveraging the Reddit and Twitter stream data?

Thanks for the kind support and the awesome work! Much appreciated.

For example, a strategy like this:
BUY when 'sentiment_score' >= 'sentiment_cutoff', then SELL 'time_to_close_position' minutes later.
SHORT SELL when 'sentiment_score' <= -'sentiment_cutoff', then BUY 'time_to_close_position' minutes later.
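
Roughly, the rule above could be sketched like this. It is only an illustration, not code from the repo: the `Order` class and the `sentiment_orders` function are hypothetical names, and holding at most one open position at a time is my own assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Order:
    time: datetime
    side: str    # "BUY" or "SELL"
    reason: str  # "open" or "close"


def sentiment_orders(
    scores: List[Tuple[datetime, float]],
    sentiment_cutoff: float,
    time_to_close_position: int,
) -> List[Order]:
    """Turn (timestamp, sentiment_score) pairs, sorted by time, into orders.

    Assumption (not from the repo): signals that arrive while a position
    is still open are ignored.
    """
    hold = timedelta(minutes=time_to_close_position)
    busy_until: Optional[datetime] = None
    orders: List[Order] = []

    for ts, score in scores:
        if busy_until is not None and ts < busy_until:
            continue  # a position is still open, skip this signal

        if score >= sentiment_cutoff:
            # Go long now, close the position `hold` later.
            orders.append(Order(ts, "BUY", "open"))
            orders.append(Order(ts + hold, "SELL", "close"))
            busy_until = ts + hold
        elif score <= -sentiment_cutoff:
            # Short sell now, buy back `hold` later.
            orders.append(Order(ts, "SELL", "open"))
            orders.append(Order(ts + hold, "BUY", "close"))
            busy_until = ts + hold

    return orders
```

The function only emits the intended orders; fees, slippage, and actual execution are left out on purpose.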

Hi there! The Twitter/Reddit data has the polarity measure, which is indeed a measure of sentiment, as you can see from the code here: https://github.com/carlomazzaferro/kryptoflow/blob/master/kryptoflow/scrapers/transforms/sent_analysis.py

But moving forward, the goal will be to use proper feature extraction techniques. As you correctly hinted, implementing doc2vec seems to fit the bill here and is on the roadmap. I think one way of making it work with the rest of the data is to use the price levels as labels and feed the text features to the model as additional features.
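
To make the labeling idea a bit more concrete, here is a minimal sketch, assuming the text has already been turned into fixed-length feature vectors (by doc2vec or anything else) and that prices are available per time window. None of this exists in the repo yet: `floor_to_window`, `build_dataset`, the 5-minute window, and using the next window's price as the label are all placeholders for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def floor_to_window(ts: datetime, minutes: int) -> datetime:
    """Round a timestamp down to the start of its window.

    Assumes `minutes` divides 60 evenly (e.g. 1, 5, 15).
    """
    return ts - timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )


def build_dataset(
    text_rows: Iterable[Tuple[datetime, List[float]]],
    prices: Dict[datetime, float],
    minutes: int = 5,
) -> Tuple[List[List[float]], List[float]]:
    """Pair averaged text features per window with a price label.

    text_rows: (timestamp, feature_vector) for each post or tweet.
    prices:    window start -> price for that window.
    Returns X (mean feature vector per window) and y (price one window ahead).
    """
    buckets = defaultdict(list)
    for ts, vec in text_rows:
        buckets[floor_to_window(ts, minutes)].append(vec)

    step = timedelta(minutes=minutes)
    X, y = [], []
    for window in sorted(buckets):
        if window + step not in prices:
            continue  # no future price to use as a label
        vecs = buckets[window]
        # Column-wise mean over all text vectors in this window.
        X.append([sum(col) / len(vecs) for col in zip(*vecs)])
        y.append(prices[window + step])
    return X, y
```

The label could just as easily be the price change or its direction; the point is only that the text features and the price series get aligned on the same time windows.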

As far as developing a trading strategy goes, I'm open to suggestions. If you are willing to take a stab at it, I could use some help. Here is where I was going to start adding rules for a trading agent: https://github.com/carlomazzaferro/kryptoflow/tree/master/kryptoflow/trading