BirdiD
Avid reader and results-oriented data professional. Building decolonial NLP technologies for African languages.
Télécom SudParis
Pinned Repositories
awesome_fula_nl_resources
List of Fula language (Peul) resources for natural language applications
Bias-Detection-and-Mitigation-in-AI
This notebook shows that machine learning models are exposed to risks such as bias, and introduces some simple techniques to detect and mitigate it. In this case study, we suppose that we are working for a credit institution. The client wants to automate the loan eligibility process with a binary classification ML model trained on historical data. The dataset used in this notebook comes from a public challenge organized by the company Analytics Vidhya. It contains customer details such as Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others, together with the related loan status. Given this customer information, the purpose of the binary classification model is to predict the loan status: yes or no. In this notebook, we apply a logistic regression model as the binary classifier.
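As a rough illustration of what a simple bias check can look like, the sketch below computes the disparate impact ratio, one common fairness metric, on a model's yes/no predictions. It is not taken from the notebook: the function names, the group labels and the toy predictions are all assumptions made for this example.

```python
def selection_rate(predictions, groups, group):
    """Share of positive (1) predictions among members of `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group!r}")
    return sum(selected) / len(selected)


def disparate_impact(predictions, groups, unprivileged, privileged):
    """Ratio of selection rates: unprivileged group over privileged group."""
    privileged_rate = selection_rate(predictions, groups, privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no positive predictions")
    return selection_rate(predictions, groups, unprivileged) / privileged_rate


# Hypothetical toy predictions (1 = loan approved), not real data.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["Male"] * 4 + ["Female"] * 4

ratio = disparate_impact(predictions, groups, unprivileged="Female", privileged="Male")
print(f"Disparate impact: {ratio:.2f}")  # prints "Disparate impact: 0.33"
```

A ratio well below 1 suggests that the unprivileged group receives positive predictions less often. The 0.8 threshold often quoted for this metric (the four-fifths rule) is a convention, not a guarantee of fairness.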
BirdiDQ
BirdiDQ leverages the power of the Python Great Expectations open-source library and combines it with the simplicity of natural language queries to identify and report data quality issues, all at your fingertips.
hexanonyme
A Python package for PII data anonymization in French
NLP-Email-analysis
The aim of this study is to identify customer-oriented behavior by analyzing emails sent by companies to users. An intuitive way to highlight such behavior is to suppose that companies use a discourse specifically tailored to the customer’s identity, belongings and interests. This is a group project in which we analyze email contents and subjects, the time distribution of email sending, etc.
Portfolio-Management
In this repository, you will find a credit portfolio model that uses a Monte Carlo simulation-based method to compute credit portfolio loss distributions and to estimate various risk measures. Four principal risk measures are taken into account: Value at Risk (VaR), Expected Shortfall (ES), Expected Loss (EL) and Unexpected Loss (UL). The notebook also covers credit derivatives, which are synthetic contracts to buy or sell protection against credit-related losses. There is also an interactive dashboard that lets you choose the parameters of your credit portfolio model.
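The four risk measures can be sketched from simulated losses as follows. This is a minimal illustration, not the repository's code: it assumes independent defaults, a single loss-given-default value, and defines UL as VaR minus EL (other definitions, such as the standard deviation of losses, are also in use). All names and parameter values are hypothetical.

```python
import math
import random


def simulate_losses(exposures, default_probs, lgd, n_scenarios, seed=0):
    """Simulate portfolio losses, assuming independent defaults (a simplification)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_scenarios):
        # A loan defaults in this scenario when its uniform draw falls below its PD.
        loss = sum(
            exposure * lgd
            for exposure, prob in zip(exposures, default_probs)
            if rng.random() < prob
        )
        losses.append(loss)
    return losses


def risk_measures(losses, alpha=0.99):
    """Return EL, VaR, ES and UL computed from a list of simulated losses."""
    ordered = sorted(losses)
    n = len(ordered)
    expected_loss = sum(ordered) / n
    # Empirical quantile: smallest loss with at least a share alpha of scenarios at or below it.
    index = max(0, min(n - 1, math.ceil(alpha * n) - 1))
    value_at_risk = ordered[index]
    # Expected shortfall: average of the losses at or beyond the VaR position.
    tail = ordered[index:]
    expected_shortfall = sum(tail) / len(tail)
    return {
        "EL": expected_loss,
        "VaR": value_at_risk,
        "ES": expected_shortfall,
        "UL": value_at_risk - expected_loss,  # assumed definition, see lead-in
    }


# Hypothetical portfolio: 50 loans of 100 each, 2% default probability, 60% LGD.
losses = simulate_losses([100.0] * 50, [0.02] * 50, lgd=0.6, n_scenarios=2000)
measures = risk_measures(losses, alpha=0.99)
print(measures)
```

Fixing the seed keeps the simulation reproducible. A more realistic model would introduce default correlation, for example through a common risk factor.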
Recommender_System
Collaborative filtering for recommender systems
Sentiment_Analysis
NLP
Spelling-corrector-Pulaar
In this notebook, we implement an auto-correct system for the Pulaar language. Pulaar is spoken by millions of people across about 20 countries in West and Central Africa. This notebook is inspired by Peter Norvig, who published his spelling corrector in 2007, and by the DeepLearning.AI NLP specialization I took last year. Peter Norvig's original article is linked below.
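The core idea of a Norvig-style corrector can be sketched as: generate every string one edit away from the input and keep the known word with the highest count. The sketch below is a minimal illustration under assumptions: the tiny word list is a toy stand-in for a real Pulaar corpus, the alphabet is derived from that list, and the function names are hypothetical.

```python
from collections import Counter


def edits1(word, alphabet):
    """All strings one delete, transpose, replace or insert away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [left + c + right[1:] for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts, alphabet):
    """Return `word` if known, else the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word, alphabet) if w in counts]
    if not candidates:
        return word  # nothing close enough: leave the input unchanged
    return max(candidates, key=counts.get)


# Toy word list standing in for a real corpus (an assumption for this example).
counts = Counter(["jam", "jam", "ndiyam", "pullo", "pulaar"])
# Deriving the alphabet from the corpus avoids hard-coding a character set.
alphabet = sorted(set("".join(counts)))

print(correct("ndiyan", counts, alphabet))  # prints "ndiyam"
```

A fuller corrector would also consider candidates two edits away and weight them with a language model, which this sketch leaves out.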
Stock-trends-prediction-with-macroeconomic-indicators
Stock markets are an essential component of the economy. Their prediction naturally arouses fascination in the academic and financial world. Indeed, financial time series, due to their wide range of application fields, have been the subject of numerous published prediction studies. Some of these studies aim to establish whether there is a strong and predictive link between macroeconomic indicators and stock market trends, and thus to predict market returns. Stock market prediction, however, remains a challenging task due to uncertain noise. To what extent can macroeconomic indicators be strong predictors of stock prices? Can they be used for stock trend modeling? To answer these questions, we focus on several time series forecasting models. On the one hand, we use statistical time series models, more specifically the most commonly used time series approaches for stock prediction: the Autoregressive Integrated Moving Average (ARIMA), the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and the Vector Autoregressive (VAR) approach. On the other hand, we use two deep learning models for our prediction task: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). In the final section of this paper, we look directly at companies to detect trends.
BirdiD's Repositories
BirdiD/Stock-trends-prediction-with-macroeconomic-indicators
Stock markets are an essential component of the economy. Their prediction naturally arouses fascination in the academic and financial world. Indeed, financial time series, due to their wide range of application fields, have been the subject of numerous published prediction studies. Some of these studies aim to establish whether there is a strong and predictive link between macroeconomic indicators and stock market trends, and thus to predict market returns. Stock market prediction, however, remains a challenging task due to uncertain noise. To what extent can macroeconomic indicators be strong predictors of stock prices? Can they be used for stock trend modeling? To answer these questions, we focus on several time series forecasting models. On the one hand, we use statistical time series models, more specifically the most commonly used time series approaches for stock prediction: the Autoregressive Integrated Moving Average (ARIMA), the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and the Vector Autoregressive (VAR) approach. On the other hand, we use two deep learning models for our prediction task: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). In the final section of this paper, we look directly at companies to detect trends.
BirdiD/BirdiDQ
BirdiDQ leverages the power of the Python Great Expectations open-source library and combines it with the simplicity of natural language queries to identify and report data quality issues, all at your fingertips.
BirdiD/Portfolio-Management
In this repository, you will find a credit portfolio model that uses a Monte Carlo simulation-based method to compute credit portfolio loss distributions and to estimate various risk measures. Four principal risk measures are taken into account: Value at Risk (VaR), Expected Shortfall (ES), Expected Loss (EL) and Unexpected Loss (UL). The notebook also covers credit derivatives, which are synthetic contracts to buy or sell protection against credit-related losses. There is also an interactive dashboard that lets you choose the parameters of your credit portfolio model.
BirdiD/hexanonyme
A Python package for PII data anonymization in French
BirdiD/NLP-Email-analysis
The aim of this study is to identify customer-oriented behavior by analyzing emails sent by companies to users. An intuitive way to highlight such behavior is to suppose that companies use a discourse specifically tailored to the customer’s identity, belongings and interests. This is a group project in which we analyze email contents and subjects, the time distribution of email sending, etc.
BirdiD/Recommender_System
Collaborative filtering for recommender systems
BirdiD/Sentiment_Analysis
NLP
BirdiD/Spelling-corrector-Pulaar
In this notebook, we implement an auto-correct system for the Pulaar language. Pulaar is spoken by millions of people across about 20 countries in West and Central Africa. This notebook is inspired by Peter Norvig, who published his spelling corrector in 2007, and by the DeepLearning.AI NLP specialization I took last year. Peter Norvig's original article is linked below.
BirdiD/awesome_fula_nl_resources
List of Fula language (Peul) resources for natural language applications
BirdiD/Bias-Detection-and-Mitigation-in-AI
This notebook shows that machine learning models are exposed to risks such as bias, and introduces some simple techniques to detect and mitigate it. In this case study, we suppose that we are working for a credit institution. The client wants to automate the loan eligibility process with a binary classification ML model trained on historical data. The dataset used in this notebook comes from a public challenge organized by the company Analytics Vidhya. It contains customer details such as Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others, together with the related loan status. Given this customer information, the purpose of the binary classification model is to predict the loan status: yes or no. In this notebook, we apply a logistic regression model as the binary classifier.
BirdiD/ChatWithScientificPaper
BirdiD/deep-learning-v2-pytorch
Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101
BirdiD/Deep-Neural-Networks-RBM-DBN-
In this project, I build from scratch a deep neural network, pre-trained or not, for the classification of handwritten digits. The main goal is to compare the performance, in terms of correct classification rates, of a pre-trained network and a randomly initialized network, according to the input data, the number of network layers and the number of neurons per layer. In the last section, we try to determine the architecture and hyperparameters giving the lowest classification error rate.
BirdiD/faster-whisper-demo
Demo of a fine-tuned faster-whisper checkpoint with Gradio
BirdiD/From-scratch---ML-algorithms
In this repository, you'll find several machine learning algorithms coded from scratch.
BirdiD/Graph-Neural-Networks
BirdiD/Hiring-Challenge
BirdiD/LLMs-from-scratch
Implementing a ChatGPT-like LLM from scratch, step by step
BirdiD/School-projects
This repository contains some small school projects
BirdiD/TextClassifier
BirdiD/vigogne
French instruction-following and chat models
BirdiD/windanam
Multi-dialectal speech recognition models for Fula varieties
BirdiD/windanam-app