Pinned Repositories
apache-airflow-study
Python code that implements a simple ETL on Apache Airflow
Hive_emr_python
Hive streaming using Python and the Hive TRANSFORM function
kafka-stream-poc
POC showing how to use Kafka Streams to read Avro from a Kafka topic, filter for only the desired values, and print them to the console.
kafka_connect_rds_to_s3_json
Retrieves data from PostgreSQL on RDS (non-CDC) and ingests it into AWS S3 as JSON strings.
kafka_flink_deduplicate1
This project consumes messages from a Kafka topic using Flink and deduplicates the incoming messages.
kafka_publisher_json_to_azure_event_hub
This demo tests a Kafka publisher that publishes JSON strings to Azure Event Hubs (with Kafka support enabled).
kstreams
martingale_ea_improvement
Improvements to a forex robot that uses the martingale strategy
poc_streaming_twitter_to_kafka_to_spark_to_hdfs
I try to build a data pipeline that reads the Twitter stream and stores tweet data in HDFS
pyspark_read_write_to_hive
The correct way to read a JSON file on AWS S3 with PySpark
jitkasempin's Repositories
jitkasempin/ai-platform-samples
Official Repo for Google Cloud AI Platform
jitkasempin/anthos-workshop
jitkasempin/awslambda
AWS Lambda
jitkasempin/awslambda-psycopg2
jitkasempin/beam-samples
jitkasempin/cloud-builders-community
Community-contributed images for Google Cloud Build
jitkasempin/criteo-python-marketing-sdk
A Python SDK to access the Criteo Marketing API
jitkasempin/dbt
dbt (data build tool) enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
jitkasempin/dlp-dataflow-deidentification
Data Tokenization PoC Using Dataflow/Beam and DLP API
jitkasempin/facets
Visualizations for machine learning datasets
jitkasempin/faust
Python Stream Processing
jitkasempin/featuretools
An open source python framework for automated feature engineering
jitkasempin/gae_backend_bootstrap
Boilerplate template for building a robust and scalable REST API on Google App Engine using Python 3.7 and Cloud Datastore.
jitkasempin/github-issue-slack-notification
Notifies Slack when a GitHub issue is created
jitkasempin/google-cloud-python
Google Cloud Client Library for Python
jitkasempin/kafka-connect-bigquery
A Kafka Connect BigQuery sink connector
jitkasempin/kafka-streams-query
Library offering http based query on top of Kafka Streams Interactive Queries
jitkasempin/ludwig
Ludwig is a toolbox built on top of TensorFlow that allows you to train and test deep learning models without the need to write code.
jitkasempin/ora2pg
Ora2Pg is a free tool used to migrate an Oracle database to a PostgreSQL-compatible schema. It connects to your Oracle database, scans it automatically, and extracts its structure or data; it then generates SQL scripts that you can load into PostgreSQL.
jitkasempin/professional-services
Common solutions and tools developed by Google Cloud's Professional Services team
jitkasempin/python-docs-samples
Code samples used on cloud.google.com
jitkasempin/python-fire
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
jitkasempin/serverless_data_pipeline_gcp
Schedule a data pipeline in Google Cloud using Cloud Functions, BigQuery, Cloud Storage, Cloud Scheduler, and Pub/Sub
jitkasempin/storage-sdrs
Data retention tool for Google Cloud Storage
jitkasempin/streamalert
StreamAlert is a serverless, realtime data analysis framework which empowers you to ingest, analyze, and alert on data from any environment, using datasources and alerting logic you define.
jitkasempin/TeachApacheCamel-Spring-Boot
jitkasempin/ticketco-pipedrive-zoho-export
Google Cloud Function to export from PipeDrive to Zoho according to a schedule
jitkasempin/tink
Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
jitkasempin/xchange-stream
XChange-stream is a Java library providing a simple and consistent streaming API for interacting with Bitcoin and other cryptocurrency exchanges via the WebSocket protocol. It is built on top of the XChange library, providing new interfaces for the streaming API. Users can subscribe to live updates via reactive streams from the RxJava library.
jitkasempin/yq
Command-line YAML and XML processor - jq wrapper for YAML/XML documents