aws-glue-crawler

There are 36 repositories under the aws-glue-crawler topic.

aws-samples/aws-glue-crawler-utilities
This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.
Language: Python
aws-samples/amazon-rds-export-to-s3-automation
This repository contains source code for the AWS Database Blog post "Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3".
fermat01/ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena
ETL data pipeline using AWS services
Language: Python
aws-samples/automated-datastore-discovery-with-aws-glue
Automation framework to catalog AWS data sources using Glue
Language: Python
GabrielDan92/AWS_Terraform_PySpark-ETL_Job
Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.
Language: Python
masood2iq/AWS-Athena-Glue-S3-Bucket-Deployment-Through-AWSConsole
AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through AWS GUI console.
ShubhamMohanty680/Spotify_end_to_end_data_engineering
A project built as an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS.
Language: Jupyter Notebook
Akanksha-tetwar/YouTube-Trending-video-analysis-ETL-using-AWS-Services
In this project I have used the Trending YouTube Video Statistics data from Kaggle to analyze and prepare it for usage.
dhvani-k/YouTrend_Insights_Analyzing_YouTube_Video_Landscape
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau
Language: Python
DivineSamOfficial/SmartCityProject
Smart City Realtime Data Engineering Project
Language: Python
masood2iq/AWS-Athena-Glue-S3-CloudFormation-Deployment-AWSConsole
AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through CloudFormation stack on AWS console.
SadafAsad/LinkedIn-Jobs-Analysis
Unveiling job market trends with Scrapy and AWS
Language: Python
sarah-zhan/data_pipeline_amazon_products
An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization
Language: Jupyter Notebook
Saurabhkhandebharad/BigData-SK
Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!
Language: Python
subhamay-cloudworks/0052-agapanthus-cft
Working with Glue Data Catalog and Running the Glue Crawler On Demand
subhamay-cloudworks/0090-deutzia-cft
Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Stream, Lambda, S3, Glue, Athena, and CloudFormation
Language: Python
VvEK-Hiremath/Airlines-Data-Pipeline-Project-AWS
Implementing data pipeline using AWS services for airlines data
Language: Python
desininja/Quality-Movie-Data-Pipeline
ETL pipeline using AWS services
Language: Python
h-fuzzy-logic/data-analytics-spring
Open data and cloud computing to answer the question: Are we losing our spring days?
Language: Jupyter Notebook
imverma/DataEngineering-YouTube-Analysis-Project
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau
Language: Python
Kartik-Banga/Automated-ETL-Pipeline-for-Playstore-Data
Implemented ETL pipeline on AWS for Playstore data using Lambda, Glue Crawlers, and Glue ETL Jobs. Orchestrated workflow with Step Functions and achieved seamless integration, optimal data merging, and enhanced data quality/accessibility.
KRISHNASAIRAJ/AWS-Driven-Sales-Performance-Outlook
The Project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), utilizing Kinesis streams, and finally, storing and querying the data in Amazon Athena.
Language: Python
masood2iq/AWS-Athena-Glue-CloudFormation-Deployment-on-Existing-S3-Bucket-AWSConsole
AWS Athena, Glue Database, Glue Crawler deployment through CloudFormation stack on already existing S3 buckets on AWS console.
masood2iq/Serverless-Framework-Athena-Glue-Deployment-on-Existing-S3-Bucket
AWS Athena, Glue Database, Glue Crawler deployment on existing S3 bucket through Serverless (sls) Framework.
Language: JavaScript
masood2iq/Serverless-Framework-Athena-Glue-S3-Buckets-Deployment
AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through Serverless (sls) Framework
Language: JavaScript
shahidmalik4/aws-glue-stepfunctions-etl
This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed databases, enabling efficient querying and analysis via Athena, with automated notifications through SNS.
Language: Python
Shilpaar90/AWS-Capturing-Schema-Changes-In-S3
A pipeline within AWS to capture schema changes in S3 files and to update them in a DB.
ShreyasLengade/serverless_etl_pipeline
Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.
Language: Python
Tyriek-cloud/NYC-Mobility-Survey-Analysis
An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.
Language: Python
AirtonLira/aws-bigdata-glue-athena
This project aims to collect, catalog, govern, process, and visualize data.
jibbs1703/Tickit-Data-Lake
This repository demonstrates the creation of a robust, 3-tier data lake using AWS resources.
mihirkudale/Stock-Market-Real-Time-Data-Engineering-Project
In this project, you will execute an End-To-End Data Engineering Project on Real-Time Stock Market Data using Kafka. We are going to use different technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.
Language: Jupyter Notebook
productiveAnalytics/aws-cdk-constructs-sandbox
Cloud Development Kit (AWS CDK) using TypeScript, Python and Java
Language: Java
subhamay-cloudworks/0053-bluebonnets-cft
Working with Glue Data Catalog, running the Glue Crawler using S3 Event Notification, and creating the entire stack using AWS CloudFormation
sumanthmalipeddi/spotify_trending_telugu
Collecting song, album, and artist details from the Spotify Music Application at specific intervals using the spotipy API and performing ETL operations using Amazon Cloud Services
Language: Jupyter Notebook
TravelXML/KAFKA-PYTHON-AWS-CRAWLER-AMAZON-ATHENA
Comprehensive tutorials, steps, and scripts for setting up Apache Kafka on an Amazon EC2 instance, streaming logs to S3, and querying data with AWS Glue and Amazon Athena. Includes Zookeeper configuration, producer and consumer setup, and automated data catalog creation.
Language: Jupyter Notebook