Pinned Repositories
airflow-tutorial
Tutorial-style code showing how to deploy Airflow using Docker and how to use the DockerOperator.
Amazon_reviews_scraper
Get all product reviews for a product on Amazon.
CalciteParsingExploration
Data-Analysis
The aim of this project was to analyse a given dataset and find out whether any trends exist. The data is produced by a tool similar to Google Analytics, and the dataset describes a website that is an online repository for books.
Data-Ingestion
The aim of this project is to automate data ingestion from flat files such as CSV and compressed GZIP files into a database such as Postgres. The entire setup is automated using Docker and is fast because it uses multiprocessing.
Ecommerce_scraper_project
Have you ever needed to go through multiple e-commerce websites simultaneously to check the price of the same product? Prices can differ but having a list side by side would be a bonus, wouldn't it? This Flask App can do exactly that.
Loans-data-visualization
This repository contains a loans dataset and a Power BI file to visualize this data.
Mining-Internet-Search
Spark-assignment
The aim of this project is to perform analysis on some (car crash) data using PySpark and make the entire process deployable using Docker.
Soumyadeep-github's Repositories
Soumyadeep-github/airflow-tutorial
Tutorial-style code showing how to deploy Airflow using Docker and how to use the DockerOperator.
Soumyadeep-github/Ecommerce_scraper_project
Have you ever needed to go through multiple e-commerce websites simultaneously to check the price of the same product? Prices can differ but having a list side by side would be a bonus, wouldn't it? This Flask App can do exactly that.
Soumyadeep-github/Data-Analysis
The aim of this project was to analyse a given dataset and find out whether any trends exist. The data is produced by a tool similar to Google Analytics, and the dataset describes a website that is an online repository for books.
Soumyadeep-github/Data-Ingestion
The aim of this project is to automate data ingestion from flat files such as CSV and compressed GZIP files into a database such as Postgres. The entire setup is automated using Docker and is fast because it uses multiprocessing.
Soumyadeep-github/Spark-assignment
The aim of this project is to perform analysis on some (car crash) data using PySpark and make the entire process deployable using Docker.
Soumyadeep-github/Amazon_reviews_scraper
Get all product reviews for a product on Amazon.
Soumyadeep-github/CalciteParsingExploration
Soumyadeep-github/Loans-data-visualization
This repository contains a loans dataset and a Power BI file to visualize this data.
Soumyadeep-github/Mining-Internet-Search
Soumyadeep-github/ADF-Spark-Notebook-pipeline
A simple pipeline to transform data within Azure Data Factory using Azure Databricks. Although it is written in Scala, the same can be replicated in Python.
Soumyadeep-github/Assignment-1
This repository contains an attempt at analyzing the Black Friday dataset from Kaggle.
Soumyadeep-github/Assignment-2
This repository contains files that were used to transform a set of .txt files so that the date column is moved to the front of each file.
Soumyadeep-github/AzureDatabricksNotebook
Soumyadeep-github/Bellings-and-Ready-Wear
This repository contains a .pbix (Power BI Desktop) file and an Excel file. The Excel file contains sales reports of Bellings and Ready Ware, two retail chains operating throughout Australia, while the Power BI file contains some analysis of those reports.
Soumyadeep-github/Calcite-exploration
Soumyadeep-github/Data_Wrangling_Analysis_and_AB_Testing_with_SQL
Data Wrangling, Analysis and AB Testing with SQL
Soumyadeep-github/DataStructures_CustomModule
Soumyadeep-github/DBT-project
Soumyadeep-github/Django-docker-demo
Basic setup for Django with Docker
Soumyadeep-github/fastai
The fastai deep learning library, plus lessons and tutorials
Soumyadeep-github/Fibonacci_dockerized
Soumyadeep-github/Flipkart_webscraper
Just a barebones program to get into Flipkart scraping.
Soumyadeep-github/Hands-On-Machine-Learning
My ML Learning
Soumyadeep-github/hazelcast-platform-training
Soumyadeep-github/hive-metastore-docker
Example for the article "Running Spark 3 with standalone Hive Metastore 3.0"
Soumyadeep-github/JavaProjectScraping
A simple Java project to fetch top 250 movies from IMDB into a CSV file.
Soumyadeep-github/LearningSpark
A repository for learning Apache Spark using Scala.
Soumyadeep-github/LinkedIn_scraper
Scraping data from LinkedIn. The project was performed solely as an experiment and has no other purpose.
Soumyadeep-github/public-finance-researcher-assignment
Assignment for the Public Finance Researcher position at CivicDataLab
Soumyadeep-github/ThinkStats2
Text and supporting code for Think Stats, 2nd Edition