AmiriMc
Data Engineering | Machine Learning | Data Science | Python | AWS | SQL
https://www.ahead.com/
Phoenix, AZ
Pinned Repositories
Data_Engineering_Data_Lake_with_Spark
This project uses Spark to build an ETL pipeline for a data lake hosted on S3.
Data_Engineering_Data_Modeling_with_Cassandra
This project applies data modeling techniques with Apache Cassandra and completes an ETL pipeline using Python.
Data_Engineering_Data_Modeling_with_Postgres
The goal of this project was to model user activity data of the music company "Sparkify" to create a database and ETL pipeline in Postgres using Python.
Data_Engineering_Data_Warehouse
In this project, I used AWS to build an ETL pipeline for a database hosted on Redshift to create a data warehouse. The data was loaded from S3 to staging tables on Redshift and then SQL statements were used to create analytics tables.
Data_Engineering_Professional
This repo highlights a small sample of professional data engineering work I have done.
Linear_Algebra
Code challenge .ipynb files from Mike X Cohen's Udemy course Complete Linear Algebra Theory and Implementation. Each file contains a description of the code challenge and my solution to that challenge.
Machine_Learning_Customer_Segmentation_Arvato_Financial_Solutions
Udacity Machine Learning Nanodegree (MLND) capstone project: Arvato Financial Solutions customer segmentation report using both unsupervised and supervised learning techniques.
Machine_Learning_Mini-Projects_XGBoost
A series of three mini-projects on IMDB sentiment analysis data using Python, AWS SageMaker and XGBoost, covering batch transform, hyperparameter tuning and updating a model.
Machine_Learning_Moon_Data
Moon Data Classification.
Machine_Learning_Plagiarism_and_Feature_Engineering_Train_Model
Plagiarism detection, feature engineering and model training. An application that uses Python, AWS SageMaker, Amazon S3 and scikit-learn to compare student answers to questions about Google against the source answers and decide whether each student answer plagiarizes its source answer.
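As a rough illustration of the feature engineering a plagiarism detector like the one above might rely on, the sketch below computes n-gram containment: the fraction of an answer's word n-grams that also appear in the source text. Whether the repository uses this exact feature is an assumption, and the function names and sample sentences are hypothetical.

```python
from collections import Counter


def ngrams(text, n):
    """Count the word n-grams in a lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also occur in the source."""
    answer_counts = ngrams(answer, n)
    source_counts = ngrams(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0  # answer is shorter than n words
    # Clip each n-gram's count by how often it appears in the source.
    shared = sum(min(count, source_counts[gram])
                 for gram, count in answer_counts.items())
    return shared / total


# Hypothetical sample texts, for illustration only.
source_text = "pagerank ranks web pages by counting links"
copied_answer = "pagerank ranks web pages by counting links"
partial_answer = "pagerank ranks web pages quickly"
original_answer = "the algorithm scores sites using incoming references"

for label, answer in [("copied", copied_answer),
                      ("partial", partial_answer),
                      ("original", original_answer)]:
    print(label, containment(answer, source_text, n=1))
```

A high containment score suggests heavy overlap with the source. A value like this could serve as one input column to a classifier, alongside other similarity features.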
AmiriMc's Repositories
AmiriMc/Data_Engineering_Data_Modeling_with_Cassandra
This project applies data modeling techniques with Apache Cassandra and completes an ETL pipeline using Python.
AmiriMc/Linear_Algebra
Code challenge .ipynb files from Mike X Cohen's Udemy course Complete Linear Algebra Theory and Implementation. Each file contains a description of the code challenge and my solution to that challenge.
AmiriMc/Data_Engineering_Data_Modeling_with_Postgres
The goal of this project was to model user activity data of the music company "Sparkify" to create a database and ETL pipeline in Postgres using Python.
AmiriMc/Data_Engineering_Data_Warehouse
In this project, I used AWS to build an ETL pipeline for a database hosted on Redshift to create a data warehouse. The data was loaded from S3 to staging tables on Redshift and then SQL statements were used to create analytics tables.
AmiriMc/Data_Engineering_Professional
This repo highlights a small sample of professional data engineering work I have done.
AmiriMc/Machine_Learning_Mini-Projects_XGBoost
A series of three mini-projects on IMDB sentiment analysis data using Python, AWS SageMaker and XGBoost, covering batch transform, hyperparameter tuning and updating a model.
AmiriMc/Machine_Learning_Moon_Data
Moon Data Classification.
AmiriMc/Machine_Learning_Plagiarism_and_Feature_Engineering_Train_Model
Plagiarism detection, feature engineering and model training. An application that uses Python, AWS SageMaker, Amazon S3 and scikit-learn to compare student answers to questions about Google against the source answers and decide whether each student answer plagiarizes its source answer.
AmiriMc/Python_Push_Pull_to_GitHub
This code performs a few simple push/pull actions against GitHub: pull to get information about your repos, and push to create repos programmatically and add files to a repo.
AmiriMc/codebuild-demo
Codebuild Demo
AmiriMc/Data_Engineering_Data_Lake_with_Spark
This project uses Spark to build an ETL pipeline for a data lake hosted on S3.
AmiriMc/Machine_Learning_Customer_Segmentation_Arvato_Financial_Solutions
Udacity Machine Learning Nanodegree (MLND) capstone project: Arvato Financial Solutions customer segmentation report using both unsupervised and supervised learning techniques.
AmiriMc/git-repository
demo repository
AmiriMc/Machine_Learning_IMDB_Sentiment_Analysis
Uses Python, Python scripts, AWS SageMaker, AWS Lambda, Amazon S3, Amazon API Gateway and PyTorch to take in a user's movie review and determine whether the review is positive or negative.
AmiriMc/Machine_Learning_Payment_Fraud_Detection
Detecting Payment Card Fraud. This project takes in a credit card fraud detection dataset and uses a binary classification model to identify transactions as either fraudulent or valid, based on provided historical data.
AmiriMc/Machine_Learning_Population_Segmentation
Population segmentation with SageMaker using principal component analysis (PCA) and k-means clustering.
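To make the clustering idea above concrete, here is a minimal plain-Python sketch of the k-means step only. It does not use SageMaker's built-in algorithms and omits the PCA stage entirely; the function names, parameters and toy data points are all hypothetical and chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Basic Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated groups in two dimensions.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

In a pipeline like the one described, the inputs to this step would be the PCA-reduced component scores rather than raw features, so that clustering runs in a lower-dimensional space.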