Pinned Repositories
Available-bikes-prediction-with-LSTM
We are going to study bicycle movements in Dublin city. We will retrieve the station data using the JCDeacaux API, store them in a document oriented, NO SQL database: MongoDB. Then study the static and dynamic data in this database, and we will predict for each station, the number of bicycles available per station, with LSTM neural network, to display at the end to the user, the stations that are close to him with the approximate number of bicycles available in each station.
Benchmarking-study-of-income-classification-with-regards-to-data-collected-in-2010-US-census
Modelized and predicted the income level for a person per year , i.e which people save more or less than $50,000 / year. I started with a basic algorithm: logistic regression, then added the SMOTE technique to deal with the high imbalance in the data. I tried other algorithms based on decision trees ( such as Ada boost, Random Forest, Gradient boosting) , and Naive Bayes classifier. To assess and compare the performance of the classifiers, I used precision, recall, ROC auc, accuracy and confusion matrix .
BNA-Internship
- Collected the data (credits of 2016 ) , Identified low quality data , Imputed missing values ,Performed descreptive statistics and Visualized the data using ggplot2, rbokeh, plotly packages in R.
Internship-Credit-Scoring
Recent Methods from Statistics and Machine Learning for Credit Scoring When a bank receives a loan application, based on the applicant’s profile the bank has to make a decision regarding whether to go ahead with the loan approval or not. To minimize loss from the bank’s perspective, the bank needs a decision system regarding who to give approval of the loan and who not to. 1) Develop classification model using Logistic Regression (LR) 2)- Assess the performance of the model using the confusion matrix and the ROC curve
Monte-Carlo-based-estimation-project
In this project, we estimated a stochastic-process-based ( verifiying a Stochastic differential equation) expected value of a given function using numerical methods namely Monte Carlo method, and we tried to reduce the variance of the estimator using the antithetic method, control variates method and stratification method.
News-classification-based-on-description
Multi-class news classification based on description into 42 categories
Principal-Components-Analysis-Stroke-and-consequences
Performed a principal component analysis on a data set that treats patients who have had a stroke in order to follow their condition following the accident and compare assessment tools to evaluate their recovery.
Radial-Basis-Function-Network
Implemented Radial Basis Function Network (clusteringbased approach) with Python(Anaconda) and using it to perform both classification to resolve a credit scoring issue (on an imbalanced data set) , and regression for one-month-ahead and 4-month-ahead forecasting values relevant to Sales/Demands quantity of banking products.
Statistical-study-for-classification-of-products-according-to-quality-parameter
Performed descriptive statistical study and classification of products according to their quality parameter using hierarchical clustering and preference mapping using R software
stopwords-fr
French stopwords collection
YesmineBellalah's Repositories
YesmineBellalah/Radial-Basis-Function-Network
Implemented Radial Basis Function Network (clusteringbased approach) with Python(Anaconda) and using it to perform both classification to resolve a credit scoring issue (on an imbalanced data set) , and regression for one-month-ahead and 4-month-ahead forecasting values relevant to Sales/Demands quantity of banking products.
YesmineBellalah/Benchmarking-study-of-income-classification-with-regards-to-data-collected-in-2010-US-census
Modelized and predicted the income level for a person per year , i.e which people save more or less than $50,000 / year. I started with a basic algorithm: logistic regression, then added the SMOTE technique to deal with the high imbalance in the data. I tried other algorithms based on decision trees ( such as Ada boost, Random Forest, Gradient boosting) , and Naive Bayes classifier. To assess and compare the performance of the classifiers, I used precision, recall, ROC auc, accuracy and confusion matrix .
YesmineBellalah/BNA-Internship
- Collected the data (credits of 2016 ) , Identified low quality data , Imputed missing values ,Performed descreptive statistics and Visualized the data using ggplot2, rbokeh, plotly packages in R.
YesmineBellalah/Internship-Credit-Scoring
Recent Methods from Statistics and Machine Learning for Credit Scoring When a bank receives a loan application, based on the applicant’s profile the bank has to make a decision regarding whether to go ahead with the loan approval or not. To minimize loss from the bank’s perspective, the bank needs a decision system regarding who to give approval of the loan and who not to. 1) Develop classification model using Logistic Regression (LR) 2)- Assess the performance of the model using the confusion matrix and the ROC curve
YesmineBellalah/Monte-Carlo-based-estimation-project
In this project, we estimated a stochastic-process-based ( verifiying a Stochastic differential equation) expected value of a given function using numerical methods namely Monte Carlo method, and we tried to reduce the variance of the estimator using the antithetic method, control variates method and stratification method.
YesmineBellalah/Statistical-study-for-classification-of-products-according-to-quality-parameter
Performed descriptive statistical study and classification of products according to their quality parameter using hierarchical clustering and preference mapping using R software
YesmineBellalah/Available-bikes-prediction-with-LSTM
We are going to study bicycle movements in Dublin city. We will retrieve the station data using the JCDeacaux API, store them in a document oriented, NO SQL database: MongoDB. Then study the static and dynamic data in this database, and we will predict for each station, the number of bicycles available per station, with LSTM neural network, to display at the end to the user, the stations that are close to him with the approximate number of bicycles available in each station.
YesmineBellalah/News-classification-based-on-description
Multi-class news classification based on description into 42 categories
YesmineBellalah/Principal-Components-Analysis-Stroke-and-consequences
Performed a principal component analysis on a data set that treats patients who have had a stroke in order to follow their condition following the accident and compare assessment tools to evaluate their recovery.
YesmineBellalah/stopwords-fr
French stopwords collection