HuertaRogelio's Stars
Wittline/MachineLearning
This repository contains basic experiments with machine learning algorithms in Python
Wittline/Contextual-Data-Transforms
This repository contains the most important contextual data transformation algorithms, which help improve the compression ratio achieved by statistical encoders. Ramses Alexander Coraspe Valdez
Wittline/Huffman-decoding
A New Approach for Efficient Sequential Decoding of Static Huffman Codes
Wittline/Computer-Vision-and-Deep-Learning
This repository contains information on the basic techniques and algorithms used in computer image processing, in addition to some projects related to pattern recognition using deep learning.
Wittline/SparkSQL-with-Python
This repository has some examples of using Spark and SparkSQL with Python through PySpark
Wittline/Data-Analytics-with-R
Repository for data analytics course using R
Wittline/dataengineering-assignment
Prescreening Tasks for Data Engineer
Wittline/Moving-Average-Spark
How to Compute Moving Average with Spark
Wittline/tf-idf
Term Frequency-Inverse Document Frequency from Scratch
Wittline/data-engineering-challenge-th
Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)
Wittline/burrows-wheeler-transform
Implementation of the Burrows-Wheeler transform algorithm in Python for data compression
Wittline/pyspark-on-aws-emr
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing PySpark code.
Wittline/data-engineer-challenge
Challenge Data Engineer
Wittline/text-analysis-speeches-amlo
Text analysis of the speeches, conferences and interviews of the current president of Mexico
Wittline/recommendation-system
Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)
Wittline/distance-metrics
Distance metrics are among the most important parts of some machine learning algorithms, both supervised and unsupervised. They help us calculate and measure similarities between numerical values expressed as data points.
Wittline/docker-livy
Dockerizing and Consuming an Apache Livy environment
Wittline/csv-shuffler
A tool to automatically shuffle lines in .csv files
Wittline/csv-estimate-rows
Wittline/csv-splitter
csv-splitter
Wittline/csv-columnar
Wittline/model-catalog-grpc
A gRPC service to consume any machine learning model stored in a model catalog through a single endpoint.
Wittline/pyDag
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Wittline/wbz
A parallel implementation of the bzip2 data compressor in Python. This data compression pipeline uses algorithms like the Burrows–Wheeler transform (BWT) and move-to-front (MTF) to improve the Huffman compression. For now, this tool focuses only on compressing .csv files and other files in tabular format.
Wittline/livyc
Apache Spark as a Service with Apache Livy Client
Wittline/Dropout-Students-Prediction
The goal of this project is to identify students at risk of dropping out of school
Wittline/csv-schema-inference
A tool to automatically infer column data types in .csv files
Wittline/uber-expenses-tracking
The goal of this project is to track Uber Rides and Uber Eats expenses through data engineering processes, using technologies such as Apache Airflow, AWS Redshift and Power BI.