zie225

Statistician, agroeconomist, data scientist; machine learning and artificial intelligence researcher; blockchain; trader. Email: coulibalyziemamadou@gmail.com

France

Pinned Repositories

Ab5_Consulting_Agritech
Studies on modeling and optimizing the use of sensors to achieve the best results for farmers. The objective is to find the best low-cost sensor for smallholders, depending on the type of soil. We use a Raspberry Pi and a 4-channel 16-bit ADC microchip for calibration and instrumentation.
Language:Jupyter Notebook1 1 01
Bayesian-Regression-and-Bitcoin
# Bayesian-Regression-to-Predict-Bitcoin-Price-Variations

Predicting the price variations of bitcoin, a virtual cryptographic currency. These predictions could be used as the foundation of a bitcoin trading strategy. To make these predictions, we will have to familiarize ourselves with a machine learning technique, Bayesian Regression, and implement it in Python.

# Datasets

The datasets are in the data folder. The original raw data can be found here: http://api.bitcoincharts.com/v1/csv/. The datasets from this site have three attributes: (1) time in epoch, (2) price in USD per bitcoin, and (3) bitcoin amount in a transaction (buy/sell). Only the first two attributes are relevant to this project.

To obtain evenly spaced records, we took all the records within each 20-second window and replaced them with a single record holding the average of all the transaction prices in that window. Not every 20-second window had a record; those missing entries were filled using the prices of the previous 20 observations and assuming a Gaussian distribution. The cleaned raw data is given in the file dataset.csv.

Finally, as discussed in the paper, the data was divided into a total of 9 different datasets. The whole dataset is partitioned into three equally sized subsets (50 price variations in each): train1, train2, and test. The train sets are used for training a linear model, while the test set is used for evaluating the model. There are three csv files associated with each subset of data: *_90.csv, *_180.csv, and *_360.csv. In *_90.csv, for example, each line represents a vector of length 90 whose elements are 30 minutes' worth of bitcoin price variations (since we have 20-second intervals), followed by a price variation in the 91st column. Similarly, *_180.csv represents 60 minutes of prices and *_360.csv represents 120 minutes of prices.
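The 20-second averaging step described in the Datasets section can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `resample_to_windows` and the `(epoch_seconds, price)` input shape are assumptions, and the Gaussian fill for empty windows is deliberately left out.

```python
from collections import defaultdict


def resample_to_windows(records, window=20):
    """Average prices inside fixed-width time windows.

    records: iterable of (epoch_seconds, price) pairs (assumed input shape).
    Returns a dict mapping each window's start time to the mean price.
    Windows with no trades are simply absent; filling them is not sketched here.
    """
    buckets = defaultdict(list)
    for timestamp, price in records:
        # Integer division snaps each timestamp to the start of its window.
        window_start = (timestamp // window) * window
        buckets[window_start].append(price)
    return {
        start: sum(prices) / len(prices)
        for start, prices in sorted(buckets.items())
    }


if __name__ == "__main__":
    trades = [(0, 10.0), (5, 20.0), (20, 30.0), (65, 40.0)]
    print(resample_to_windows(trades))
```

In this toy input the window starting at 40 seconds has no trade, so it is missing from the result; that is the kind of gap the description says was filled from the previous 20 observations.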
# Project Requirements

We are expected to implement the Bayesian Regression model to predict the future price variation of bitcoin as described in the reference paper. The main parts to focus on are Equation 6 and the Predicting Price Change section.

# Logic in bitcoin.py

1. Compute the price variations (Δp1, Δp2, and Δp3) for train2 using train1 as input to the Bayesian Regression equation (Equation 6). Make sure to use the similarity metric (Equation 9) in place of the Euclidean distance in Bayesian Regression (Equation 6).
2. Compute the linear regression parameters (w0, w1, w2, w3) by finding the best linear fit (Equation 8). Here you will need to use the ols function of statsmodels.formula.api. Your model should be fit using Δp1, Δp2, and Δp3 as the covariates. Note: the bitcoin order book data was not available, so you do not have to worry about the rw4 term.
3. Use the linear regression model computed in Step 2, together with the Bayesian Regression estimates, to predict the price variations for the test dataset. Bayesian Regression estimates for the test dataset are computed in the same way as for the train2 dataset: using train1 as input.
4. Once the price variations are predicted, compute the mean squared error (MSE) for the test dataset (the test dataset has 50 vectors, so 50 predictions).
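Steps 1 and 4 above can be sketched in plain Python. The exact forms of Equation 6 and Equation 9 are not reproduced in this description, so the code below assumes a kernel-weighted average of training outcomes with weights `exp(c * similarity)`, and a normalized-correlation similarity. The names `similarity`, `bayesian_estimate`, `mse`, and the constant `c` are illustrative. The linear fit of Step 2 is not shown; the description delegates it to `ols` from statsmodels.formula.api.

```python
import math


def similarity(a, b):
    """Normalized correlation between two equal-length vectors.

    Assumed stand-in for the similarity metric of Equation 9.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a constant vector carries no shape information
    return cov / (norm_a * norm_b)


def bayesian_estimate(x, train, c=1.0):
    """Similarity-weighted average of training price variations.

    train: list of (vector, price_variation) pairs, e.g. rows of train1.
    Assumed form of Equation 6 with similarity replacing Euclidean distance.
    """
    weights = [math.exp(c * similarity(x, xi)) for xi, _ in train]
    weighted_sum = sum(w * dp for w, (_, dp) in zip(weights, train))
    return weighted_sum / sum(weights)


def mse(predicted, actual):
    """Mean squared error between two equal-length sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    train1 = [([1.0, 2.0, 3.0], 1.0), ([3.0, 2.0, 1.0], -1.0)]
    print(bayesian_estimate([1.0, 2.0, 3.0], train1))
    print(mse([1.0, 2.0], [1.0, 4.0]))
```

Under these assumptions, a query vector that rises like the first training vector is pulled toward that vector's price variation, which is the behaviour the weighting is meant to give.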
Language:Python2 1 00
etl_airflow
Language:Jupyter Notebook0 1 00
FINANCIAL-TRADING-PROJECT-FOR-QUANTAI
A project for our trading group, quantAI. We will build a screener model for analysis and combine evolutionary computation with AI and ML models to predict the stock market.
Language:Jupyter Notebook2 1 00
kafka-stream-medical
Language:Jupyter Notebook1 1 01
ML-et-AI-project-with-3-datasets
This three-part project is intended to familiarize us with Machine Learning (ML). The three parts are as follows.

In the first part, we implemented an attribute selection algorithm. Given a dataset with 𝑚 attributes, the algorithm simply computes the gain ratio of each attribute and keeps the top ⌈𝑚⌉ attributes. This part was to be implemented on Alen Shapiro's chess dataset. This dataset has 36 attributes, so our algorithm had to select the 4 with the highest gain ratio and store the resulting dataset (with only these 4 attributes) in a separate file.

In the second part, we implemented the k-nearest neighbors (k-NN) algorithm for classification, using Euclidean distance and k = 1, and applied it to the Breast Cancer Wisconsin (Diagnostic) dataset. Before implementing the algorithm, we split the data into a training set and a test set. The training set consists of the first 90% of the instances, and the test set of the remaining 10%. The algorithm must store its predictions in a separate file and report the accuracy of those predictions.

In the last part, we implemented a simple clustering technique that uses two versions of the Diabetes dataset, a discretized version and a non-discretized (original) version. More precisely, we use the Pima Indians diabetes dataset discretized by mangrove. The dataset has many attributes, but we focus only on 5 non-discretized attributes (age, BMI, glucose, insulin, pregnancies) and 5 discretized ones (LabelPAge, LabelPBMI, LabelPGlucose, LabelPInsulin, Labelpgrossesses). The first thing to do is therefore to remove everything except these 10 attributes.

The algorithm starts by computing the correlation between each pair of non-discretized attributes and picks the pair with the lowest correlation (i.e., with the correlation coefficient closest to 0). Call this pair AX and AY. Then, for these two attributes, it creates one cluster for each possible combination of values of the discretized versions of AX and AY. For example, suppose the discretized version of AX has the values high and low, and the discretized version of AY has the values large and small. Then there will be the following 4 clusters:

C1: records with the values high and large for AX and AY, respectively.
C2: records with the values high and small for AX and AY, respectively.
C3: records with the values low and large for AX and AY, respectively.
C4: records with the values low and small for AX and AY, respectively.

Our algorithm had to create a separate file containing the records of each cluster. It also had to evaluate the resulting clustering by computing the maximum Euclidean distance between two records in the same cluster and the minimum Euclidean distance between two records in different clusters. Note that these distances must be computed over the 5 non-discretized attributes.
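The clustering procedure of the third part can be sketched as below. This is an illustrative outline under stated assumptions, not the notebook's code: records are assumed to be dicts, and the helper names (`pearson`, `least_correlated_pair`, `cluster_by_labels`, `evaluate_clustering`) and the toy column names are hypothetical. Writing each cluster to its own file is omitted.

```python
import math
from collections import defaultdict
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def least_correlated_pair(rows, attrs):
    """Return the attribute pair whose correlation is closest to 0."""
    def abs_corr(pair):
        a, b = pair
        return abs(pearson([r[a] for r in rows], [r[b] for r in rows]))
    return min(combinations(attrs, 2), key=abs_corr)


def cluster_by_labels(rows, label_x, label_y):
    """Group records by the combination of their two discretized labels."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[(row[label_x], row[label_y])].append(row)
    return dict(clusters)


def euclidean(a, b, attrs):
    """Euclidean distance between two records over the given attributes."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in attrs))


def evaluate_clustering(clusters, attrs):
    """Return (max intra-cluster distance, min inter-cluster distance)."""
    groups = list(clusters.values())
    intra = [euclidean(a, b, attrs)
             for g in groups for a, b in combinations(g, 2)]
    inter = [euclidean(a, b, attrs)
             for g1, g2 in combinations(groups, 2) for a in g1 for b in g2]
    return (max(intra) if intra else 0.0,
            min(inter) if inter else math.inf)
```

A small max intra-cluster distance together with a large min inter-cluster distance would indicate compact, well-separated clusters, which is what the two measures in the description are meant to capture.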
Language:Jupyter Notebook4 1 01
ml-pipeline-airflow
Language:Python1 1 00
mlflow-main
Language:Jupyter Notebook0 1 00
mlflow_prefect_docker
Language:Jupyter Notebook1 1 00
quantecon-notebooks-datascience
Jupyter Notebooks for https://datascience.quantecon.org
Language:Jupyter Notebook2 0 01

zie225's Repositories

zie225/Zie225
Machine Learning project
Language:Jupyter Notebook2 1 02
zie225/airflow_kafka_cassandra_mongodb
Produce Kafka messages, consume them and upload into Cassandra, MongoDB.
Language:Python1 0 0
zie225/data_engineering_project_openweathermap_api_airflow_etl_aws
Language:Python1 0 0
zie225/Deploy-ML-Model-FastAPI-MLFlow-MINIO-MySQL
This repository shows how to deploy and serve a machine learning model with FastAPI, using MLFlow-MINIO
Language:Jupyter Notebook1 0 02
zie225/detect-data-drift-pipeline
A pipeline to detect data drift and retrain the model when there is drift
Language:Python1 0 01
zie225/God-Level-Data-Science-ML-Full-Stack
A collection of scientific methods, processes, algorithms, and systems to build stories & models. This roadmap contains 16 Chapters, whether you are a fresher in the field or an experienced professional who wants to transition into Data Science & AI
Language:Jupyter Notebook1 0 01
zie225/Human-Face-Detection-using-CNN
1 0 01
zie225/metaanalyse
Language:TeX1 1 01
zie225/sarima_dashboard
Introducing a Dash web app that guides the analysis of time series datasets, using sARIMA models
1
zie225/vscode-python-template
A template for a dockerized Python development environment for VScode
Language:JavaScript1 0 01
zie225/etl_airflow
Language:Jupyter Notebook0 1 00
zie225/mlflow-main
Language:Jupyter Notebook0 1 00
zie225/-Voting-Bagging-random-forest
Used Python’s Voting and Bagging classifiers along with KNN, logistic regression, decision tree & random forest to study their accuracy. Voting gave the best result.
Language:Jupyter Notebook0 0
zie225/apache_airflow_pipeline_postgres_to_s3
Language:Python0 0
zie225/Case-Studies
Language:Jupyter Notebook0 0
zie225/Conformal-Prediction-Regression
Implementation of conformal prediction with MAPIE based on the MAPIE doc example and adapted to another regression dataset.
Language:Python0 0
zie225/csv_to_kinesis_streams
This repo writes a CSV file to Amazon Kinesis Data Streams
Language:Python0 01
zie225/forecast_dash
A website for viewing forecasting results of commonly used time series.
zie225/hubeau
Hub'Eau, the platform for easily working with open data on water
zie225/kafka_spark_structured_streaming
Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra
Language:Python0 01
zie225/lang2sql
Language to SQL Translator
Language:Jupyter Notebook0 01
zie225/MLOPS_BootCamp_Projects
MLOps Bootcamp Projects - Explore practical MLOps projects showcasing version control, CI/CD, monitoring, scalability, and more. Get hands-on experience with popular tools and frameworks. Clone, contribute, and enhance your MLOps skills. Drive reliable and scalable machine learning deployments.
Language:Jupyter Notebook0 0
zie225/notebooks
Notebooks using the Hugging Face libraries 🤗
Language:Jupyter Notebook0 0
zie225/pygwalker
PyGWalker: Turn your pandas dataframe into a Tableau-style User Interface for visual analysis
Language:Python0 0
zie225/segment-anything
An unofficial Python package for Meta AI's Segment Anything Model
Language:Jupyter Notebook0 01
zie225/shiny-express-poc
Running Shiny Express App Inside a Container
Language:JavaScript0 0
zie225/spring-petclinic
A sample Spring-based application
Language:CSS0 0
zie225/war-in-europe-report-2023-app
1
zie225/ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
Language:Python0 0
zie225/zie
1 0