Mahadulearpit

Mahadulearpit's Stars

wilfredinni/python-cheatsheet
All-inclusive Python cheatsheet
Language: Vue 4.3k 97 721.3k
OBenner/data-engineering-interview-questions
More than 2,000 data engineering interview questions.
1k 18 2379
alanchn31/Data-Engineering-Projects
Personal Data Engineering Projects
Language: Jupyter Notebook 820 9 0185
JagadeeshwaranM/Data_Engineering_Simplified
Language: Python 626 28 2143
ayush714/ML001-Project-Sources-Code-and-Learning-Materials
ML001 Sources Code and Learning Materials
Language: Jupyter Notebook 487 9 5255
jimdevops19/PythonOOP
The Original Code repository for my Python OOP Series
Language: Python 481 9 4359
bcafferky/shared
Code and shared files
Language: Jupyter Notebook 418 68 4335
ArjanCodes/examples
All the code examples I use in my videos
Language: HTML 407 16 11140
Snowflake-Labs/snowpark-python-demos
This repository provides various demos/examples of using Snowpark for Python.
Language: Jupyter Notebook 263 14 12151
in28minutes/roadmaps
Roadmaps of in28minutes courses!
249 12 095
darshilparmar/python-for-data-engineering
This repo contains all the code used in the Python for Data Engineering Course
Language: Jupyter Notebook 206 13 4574
raveendratal/PysparkRaveendra
Git Repository
Language: Jupyter Notebook 124 7 0272
amartinson193/The-Ultimate-List-of-Free-SQL-Resources
Contains a lot of useful free resources for learning SQL
103 2 127
itversity/data-engineering-spark
Language: Jupyter Notebook 84 5 3122
carbotton/IBMDataEngineeringCoursera
IBM Data Engineering Courses from Coursera
Language: JavaScript 67 5 049
Snowflake-Labs/snowpark-python-template
Python project template for Snowpark development
Language: Python 62 5 520
anandjha90/analyticswithanand
This repository contains all the code, slides, projects and interview questions I have used in my live classes on YouTube, along with other relevant documents and assignments related to the course.
49 1 022
bcafferky/masterazuredatabrickssbs
37 4 117
Snowflake-Labs/sfguide-getting-started-snowpark-python
Quickstart: Getting Started with Snowpark Python
Language: Jupyter Notebook 32 4 349
PacktPublishing/Simplifying-Data-Engineering-and-Analytics-with-Delta
Simplifying Data Engineering and Analytics with Delta, published by Packt
Language: Jupyter Notebook 20 4 023
dennislamcv1/IBMDataEngineering
IBM Data Engineering Professional Certificate
Language: Jupyter Notebook 1812
Code360In/data-engineering-with-databricks
Language: Python 171.1k
anandjha90/anandjha90
15 8 01
MariamGado0/Starbucks-Capstone-Project-ML-Udacity-aws
# Starbucks Promotions Project

### This project is the Capstone Project of Udacity's Machine Learning Engineering Nanodegree program.

## Problem Statement

This data set contains simulated data that mimics customer behavior on the Starbucks rewards mobile app. Once every few days, Starbucks sends out an offer to users of the mobile app. An offer can be merely an advertisement for a drink or an actual offer such as a discount or BOGO (buy one get one free). Some users might not receive any offer during certain weeks. Not all users receive the same offer, and that is the challenge to solve with this data set. The task is to combine transaction, demographic and offer data to determine which demographic groups respond best to which offer type. This data set is a simplified version of the real Starbucks app because the underlying simulator only has one product whereas Starbucks actually sells dozens of products.

Starbucks collects customer data to understand how customers behave in response to the rewards and offers sent via the mobile app. Once every few days, Starbucks sends personalised offers to its customers. These customers can respond positively, negatively or neutrally. A key thing to note is that not all customers receive the same offer. The task of this project is to combine past transaction, demographic and offer data (which is already provided) to determine which demographic groups respond best to which offer types.

To develop this project, we used a number of tools, packages, systems and services.

#### Libraries

First of all, we used **Python** to write our scripts, not only for algorithm training and serving but also for the orchestration of the whole process. This project is developed in Python 3.6. You will need to install some libraries in order to run the code.

Libraries are:

* `pandas` so we could work with tabular data in dataframes;
* `plotly` so we could visualize our dataset;
* `matplotlib` for dataset visualization;
* `numpy` so we could easily manipulate arrays and data structures;
* `seaborn` and `matplotlib` so we could generate insightful visualizations;
* `sklearn` so we could build and develop our model pipeline;
* `imblearn` so we could apply SMOTE to our training data;
* `xgboost` so we could have our main classifier;
* `sagemaker` so we could easily interact with AWS;
* `json` for reading our dataset files;
* `boto3`.

Finally, we used the AWS environment to launch training jobs, deploy our model and serve predictions. The main services used are listed below:

* __AWS SageMaker__: training, hyperparameter tuning and endpoint serving;
* __Amazon S3__: saving our data and model artifacts.

## Files Descriptions

This project is structured as follows:

#### 01. Proposal

Project proposal documentation.

#### 02. Data_Cleaning_[Dataset]

Folder for data preparation and cleaning, producing the final data used by the model algorithms.

#### 03. Pre-processing Dataset Visualization

Folder for the final pre-processing of the dataset used in visualization and exploration.

#### 04. Dataset_Visualization

Folder for visualizations of the pre-processed dataset.

#### 06. ORG_Starbucks_Capstone_Project.ipynb

Jupyter notebook that deploys the final model, creates an endpoint, orchestrates the end-to-end process in AWS SageMaker and interacts with other services.
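
The task in the problem statement above (combine transaction, demographic and offer data, then see which demographic groups respond best to which offer type) can be illustrated with a minimal sketch. It uses only the Python standard library and hand-made records. The field names (`age_group`, `offer_type`), the IDs and the `response_rates` helper are all hypothetical and do not reflect the actual dataset schema or the project's code.

```python
from collections import defaultdict

# Hypothetical demographic data, keyed by user id (not the real schema).
profiles = {
    "u1": {"age_group": "18-34"},
    "u2": {"age_group": "35-54"},
    "u3": {"age_group": "18-34"},
}

# Hypothetical offer data, keyed by offer id.
offers = {
    "o1": {"offer_type": "bogo"},
    "o2": {"offer_type": "discount"},
}

# Hypothetical event data: one row per offer sent,
# as (user id, offer id, whether the user completed the offer).
events = [
    ("u1", "o1", True),
    ("u1", "o2", False),
    ("u2", "o2", True),
    ("u3", "o1", True),
    ("u3", "o2", False),
    ("u2", "o1", False),
]


def response_rates(profiles, offers, events):
    """Return {(age_group, offer_type): completed offers / offers sent}."""
    sent = defaultdict(int)
    completed = defaultdict(int)
    for user_id, offer_id, was_completed in events:
        # Join each event to its demographic group and offer type.
        key = (profiles[user_id]["age_group"], offers[offer_id]["offer_type"])
        sent[key] += 1
        completed[key] += int(was_completed)
    return {key: completed[key] / sent[key] for key in sent}


rates = response_rates(profiles, offers, events)
for (age_group, offer_type), rate in sorted(rates.items()):
    print(f"{age_group:>6} {offer_type:<9} {rate:.2f}")
```

In the project itself this join would more likely be done on dataframes with `pandas`, with the resulting features fed to the classifier. The sketch only shows the shape of the grouping.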
Language: HTML 71
NickAkincilar/Sample_Snowpark_Demos
Sample Simple Snowpark Demo Notebooks for Data Engineering & Data Science
Language: Jupyter Notebook 714
Vitols999/bcafferky
Language: Jupyter Notebook 42
zubair527/advanced-data-engineering-with-databricks
Language: Python 4 0 0323
pysparktelugu/pysparktraining
This repository stores PySpark code and documentation.
Language: Jupyter Notebook 3 2 01
Aakash-Punekar/iNeuron_FSDA
This repository contains all the files and folders taught in the class. Make use of it effectively.
10
Aakash-Punekar/SQL-Project-for-Data-Analysis-part-1-7
Complete SQL Project for data analysis with source code.
10