zrxbeijing/NewsTrader

Great project

Opened this issue · 4 comments

Hi, I have been exploring your project, and I really like it! I cloned your repo into Colab and tried the following example:


from NewsTrader.newsfeed.gdeltdatabase import GdeltDatabase
from NewsTrader.newsfeature import extract_title, extract_symbols

df = GdeltDatabase(date='2021-10-01', table='events').query()

df['title'] = extract_title(list(df.SOURCEURL), mode='url', check_english=False, num_process=16)
df = df.dropna(subset=['title']).reset_index(drop=True)

titles = list(df.title)
result = extract_symbols(titles,
                         min_score=0.65, 
                         min_similarity=0.80, 
                         only_preferred=False)

# Keep only the ticker symbols and company names from each candidate list.
df['possible_symbols'] = [
    None if candidate_list is None else [candidate['symbol'] for candidate in candidate_list]
    for candidate_list in result
]
df['company_names'] = [
    None if candidate_list is None else [candidate['long_name'] for candidate in candidate_list]
    for candidate_list in result
]
df = df[['SOURCEURL', 'title',
       'possible_symbols', 'company_names']]

df = df.dropna(subset=['possible_symbols'])

I have tried numerous fixes but still cannot get any output. I would love your recommendation on what to try next.

Hi, thanks for the interest! I will check when I have some time and get back to you.

With kind regards,
Rongxin

Thanks, Rongxin (@zrxbeijing)! I am looking forward to using your repo!

Hi Rongxin, any progress on the project? I am really looking forward to using it and writing about it in my ml-quant project.

Hi Derek, sorry, I have not been able to make more progress or provide an update.

The basic idea of this project is to make full use of transformer models in analyzing financial news, for example to extract company names and to run sentiment analysis. The ultimate goal is to trade based on financial news.

I guess the problem comes from incompatibilities between different packages, especially when GPU computation is necessary. I didn't document my project well or make a Docker image or something like that for reproducibility. I was not able to continue this because of my full-time job.

I think it makes sense to go through the example line by line, see which package doesn't work, and figure out how to fix it. I spent two hours on it today but also failed. I really can't promise when I will have time to continue this, but if I have an update, I will definitely let you know!

Rongxin
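
A first step for the line-by-line check suggested above could be to confirm that each dependency imports cleanly before running the example. The sketch below is only an illustration: the thread does not name the packages involved, so the module list (pandas, torch, transformers) is an assumption and should be replaced with whatever the repo's requirements actually list. `check_imports` is a hypothetical helper, not part of NewsTrader.

```python
import importlib

# Assumed dependency list; the thread does not say which packages are involved.
CANDIDATE_MODULES = ["pandas", "torch", "transformers", "NewsTrader"]


def check_imports(module_names):
    """Try to import each module and return {name: (ok, detail)}.

    detail is the module's __version__ (if it defines one) on success,
    or the exception type and message on failure.
    """
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # covers ImportError and errors raised during import
            report[name] = (False, f"{type(exc).__name__}: {exc}")
        else:
            report[name] = (True, getattr(module, "__version__", "unknown version"))
    return report


if __name__ == "__main__":
    for name, (ok, detail) in check_imports(CANDIDATE_MODULES).items():
        status = "OK  " if ok else "FAIL"
        print(f"{status} {name}: {detail}")
```

Any module reported as FAIL is a candidate for the incompatibility; the printed versions of the others can then be compared against the versions the project was developed with, if those are known.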