sohomghosh
Sr. Data Scientist | Financial Natural Language Processing Researcher | Deep Learning, Machine Learning & AI | Large Language Models
Jadavpur University, India
Pinned Repositories
AV_MLWARE1_sarcasm_detect_in_tweets
Solutions to MLWARE1 organised by Analytics Vidhya (https://datahack.analyticsvidhya.com/contest/mlware-1/)
company_clustering
Clustering of companies based on their names
Data_Visualization_FDP
FiNCAT_Financial_Numeral_Claim_Analysis_Tool
A tool to detect whether numerals present in Financial Texts are in-claim or out-of-claim
FinRAD_Financial_Readability_Assessment_Dataset
FinRAD: Financial Readability Assessment Dataset - 13,000+ Definitions of Financial Terms for Measuring Readability
Finsim4_ESG
FinSim_Financial_Hypernym_detection
Codes and models to extract hypernyms of Financial Terms
Python_Machine-Learning_Codes
It contains various ways to deal with data using Pandas and PySpark dataframes. It further includes implementation of several ML Algorithms using Python.
Solutions_of_Data_Science_Hackathons
Finally, I am creating a single repository of solutions for all the data science competitions I am presently participating in. So far, I have been maintaining separate repositories, which I feel is quite tedious.
Natural-Language-Processing-Fundamentals
Use Python and NLTK to build out your own text classifiers and solve common NLP problems
sohomghosh's Repositories
sohomghosh/FinRAD_Financial_Readability_Assessment_Dataset
FinRAD: Financial Readability Assessment Dataset - 13,000+ Definitions of Financial Terms for Measuring Readability
sohomghosh/FinCausal-2020_2022
sohomghosh/Finsim4_ESG
sohomghosh/FinSim_Financial_Hypernym_detection
Codes and models to extract hypernyms of Financial Terms
sohomghosh/TO-DO
My to do list
sohomghosh/Data_Visualization_FDP
sohomghosh/BERT-Relation-Extraction
PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
sohomghosh/CEPN
Code for causality extraction.
sohomghosh/CryptoBubbles-NAACL
sohomghosh/deep-finance
Datasets, papers and books on AI & Finance.
sohomghosh/ECTSum
ECTSum Dataset and Codes
sohomghosh/Evaluating-Impact-of-Social-Media-Posts-by-Executives-on-Stock-Price
sohomghosh/FAAB_Financial_Argument_Ananlysis_Bengali
sohomghosh/FENCE_Financial_Exaggerated_Numeral_ClassifiEr
sohomghosh/Finance-FinNum
Numerals are a crucial part of financial documents. To understand the details of opinions in financial documents, we should not only analyze the text but also examine the numeric information in depth. Because of its informal writing style, social media data is more challenging to analyze than news and official documents. FinNum is a dataset for fine-grained numeral understanding in financial social media data, where the task is to identify the category of a numeral.
sohomghosh/Finance-FinProLex
FinProLex provides 5,162 tokens from professional analysts' reports and financial social media posts, each with an expert-like score. The expert-like scores are calculated from pointwise mutual information (PMI).
sohomghosh/Finance-NTUSD-Fin
NTUSD-Fin provides various scoring methods for its tokens, including frequency, CFIDF, chi-squared value, market sentiment score and word vector. Only tokens that appear at least ten times and show a significant difference between expected and observed frequency under a chi-squared test are retained in the dictionary. The predetermined significance level is 0.05. The market sentiment score is calculated by subtracting the bearish PMI from the bullish PMI. The constructed dictionary, NTUSD-Fin, contains 8,331 words, 112 hashtags and 115 emojis.
sohomghosh/Finance-Numeracy-600K
Numerals are a crucial part of narratives, especially in financial documents. We should not only analyze the text but also examine the numeric information in depth. Numeracy-600K is a dataset for testing the numeracy of machines.
sohomghosh/FinNLP_Multi-Lingual_ESG_Impact_Type_Identification_ML-ESG-2
sohomghosh/Generator-Guided-Crowd-Reaction-Assessment
sohomghosh/Indian_IPO
sohomghosh/IndicFinNLP
sohomghosh/LIPI_ERAI_FinNLP_EMNLP-2022
Code for the system developed by team LIPI while participating in the ERAI shared task of FinNLP, co-located with EMNLP 2022
sohomghosh/ML-ESG3_LIPI
Code for the system developed by team LIPI while participating in the ML-ESG3 shared task of FinNLP-KDF@LREC-COLING-2024.
sohomghosh/REFinD
sohomghosh/sohom2ghosh.github.io
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
sohomghosh/sohomghosh
sohomghosh/sohomghosh.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
sohomghosh/tezansahu-website
The repository for my website
sohomghosh/tweetfinsent
TweetFinSent: A Dataset of Stock Sentiments on Twitter