Pinned Repositories
academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget"
covid-berts
BERT models pretrained on the CORD-19 Kaggle dataset
hs-survey-cultural-bias
Resources for WOAH 2024 paper: "From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets"
twitter-unemployment
Resources for ACL 2022 paper "Multilingual Detection of Personal Employment Status on Twitter".
TXTfpbsupervisedBERT
Sentiment classification with BERT
TXTmedia_sentiment_stock_market_prediction
WorldBankDSCompetition
Data Science competition at the World Bank
simpletransformers
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
NaijaHate
Resources for ACL 2024 paper: "NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data"
TwitterEconomicMonitoring
Collection of training materials to download and draw insights from Twitter data.
manueltonneau's Repositories
manueltonneau/covid-berts
BERT models pretrained on the CORD-19 Kaggle dataset
manueltonneau/twitter-unemployment
Resources for ACL 2022 paper "Multilingual Detection of Personal Employment Status on Twitter".
manueltonneau/academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget"
manueltonneau/emoji
emoji terminal output for Python
manueltonneau/hs-survey-cultural-bias
Resources for WOAH 2024 paper: "From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets"
manueltonneau/bert
TensorFlow code and pre-trained models for BERT
manueltonneau/BotPercent
Implementation of "BotPercent: Estimating Twitter Bot Populations from Groups to Crowds"
manueltonneau/colabtools
Python libraries for Google Colaboratory
manueltonneau/covid-papers-browser
Browse Covid-19 & SARS-CoV-2 Scientific Papers with Transformers 🦠 📖
manueltonneau/CS224N-Project
manueltonneau/ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets).
manueltonneau/electra-1
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
manueltonneau/embedded-topic-model
A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM
manueltonneau/fast-bert
Super easy library for BERT based NLP models
manueltonneau/glove-text-cnn
A text classification model with pretrained GloVe embeddings
manueltonneau/hateday_public
manueltonneau/manueltonneau.github.io
manueltonneau/material-bread
Cross Platform React Native Material Design Components
manueltonneau/NeuralNLP-NeuralClassifier
An Open-source Neural Hierarchical Multi-label Text Classification Toolkit
manueltonneau/nitter_scraper
Scrape Twitter API without authentication using Nitter.
manueltonneau/nTSNTM
Code for ACL 2021 paper "Tree-Structured Topic Modeling with Nonparametric Neural Variational Inference"
manueltonneau/OCTIS
OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
manueltonneau/pytorch-Deep-Learning
Deep Learning (with PyTorch)
manueltonneau/simpletransformers
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
manueltonneau/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
manueltonneau/tweet2vec
Twitter hashtag prediction
manueltonneau/tweetedat
TweetedAt tells the time of a tweet based on its tweet id
manueltonneau/tweeteval
Repository for TweetEval
manueltonneau/TwiBot-22
Official repository of TwiBot-22 @ NeurIPS 2022, Datasets and Benchmarks Track.
manueltonneau/twitter-hoover
Collect data from filtered Twitter streams.