Pinned Repositories
ApacheSpark_Projects
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
CassandraDataModeling
This project builds an ETL pipeline and creates tables for three queries using Apache Cassandra. In Apache Cassandra, the data model is designed around the queries it must serve, so that data retrieval is fast. This use case shows how a music startup like Spotify can model its data using NoSQL.
Geodjango_test
Testing GeoDjango
Machine-Learning
Includes data munging, model building, model evaluation, and fine-tuning.
NLP
A repository for NLP based experiments
portfolio-project
Django 2.0 practice
PostgresDataModeling
The purpose of this database is to let the users of **Sparkify (a music streaming startup)** easily query data to meet their analytic goals. The data currently resides in a bundle of JSON files, which is not well organized for fetching information for analytical purposes. Organizing the data from these JSON files as a star schema makes it much easier to query. ***This project therefore involves designing a star schema, defining fact and dimension tables, and building an ETL pipeline to automate the table loading process from source to target. The database is built in PostgreSQL and uses Python for the ETL pipeline.*** The database will be tested against some SQL queries provided by the analytics team.
Python_Automation
Script to pull SharePoint data and automate batch refresh in Tableau
TextTranslatorDjango
Top Coder Challenge
vxg7583's Repositories
vxg7583/Python_Automation
Script to pull SharePoint data and automate batch refresh in Tableau
vxg7583/TextTranslatorDjango
Top Coder Challenge
vxg7583/NLP
A repository for NLP based experiments
vxg7583/PostgresDataModeling
The purpose of this database is to let the users of **Sparkify (a music streaming startup)** easily query data to meet their analytic goals. The data currently resides in a bundle of JSON files, which is not well organized for fetching information for analytical purposes. Organizing the data from these JSON files as a star schema makes it much easier to query. ***This project therefore involves designing a star schema, defining fact and dimension tables, and building an ETL pipeline to automate the table loading process from source to target. The database is built in PostgreSQL and uses Python for the ETL pipeline.*** The database will be tested against some SQL queries provided by the analytics team.
vxg7583/ApacheSpark_Projects
vxg7583/awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
vxg7583/CassandraDataModeling
This project builds an ETL pipeline and creates tables for three queries using Apache Cassandra. In Apache Cassandra, the data model is designed around the queries it must serve, so that data retrieval is fast. This use case shows how a music startup like Spotify can model its data using NoSQL.
vxg7583/CodeSnip
Just a way to share some neat code snippets
vxg7583/Geodjango_test
Testing GeoDjango
vxg7583/Machine-Learning
Includes data munging, model building, model evaluation, and fine-tuning.
vxg7583/portfolio-project
Django 2.0 practice
vxg7583/os-sample-python
Sample Python Flask application for testing OpenShift 3 deployment using OpenShift default Python S2I builder and gunicorn.
vxg7583/Tableau-Cloud-Migration
Migration plan from VMs in AWS to Tableau Cloud
vxg7583/TC24_images
Images for conference 2024