HuertaRogelio

HuertaRogelio's Stars

Wittline/MachineLearning
This repository contains basic machine learning experiments in Python
Language:HTML31
Wittline/Contextual-Data-Transforms
This repository contains the most important contextual data transformation algorithms, which help improve the compression rate achieved by statistical encoders. Ramses Alexander Coraspe Valdez
Language:HTML31
Wittline/Huffman-decoding
A New Approach for Efficient Sequential Decoding of Static Huffman Codes
Language:HTML51
Wittline/Computer-Vision-and-Deep-Learning
This repository contains information on the basic techniques and algorithms used in computer image processing, in addition to some projects related to pattern recognition using deep learning.
Language:Python21
Wittline/SparkSQL-with-Python
This repository has some examples of using Spark and SparkSQL with Python through PySpark
Language:HTML22
Wittline/Data-Analytics-with-R
Repository for data analytics course using R
Language:HTML2
Wittline/dataengineering-assignment
Prescreening Tasks for Data Engineer
Language:Jupyter Notebook61
Wittline/Moving-Average-Spark
How to Compute Moving Average with Spark
51
Wittline/tf-idf
Term Frequency-Inverse Document Frequency from Scratch
Language:Python75
Wittline/data-engineering-challenge-th
Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)
Language:Python132
Wittline/burrows-wheeler-transform
Implementation of the Burrows–Wheeler transform algorithm in Python for data compression
Language:Python1
Wittline/pyspark-on-aws-emr
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing PySpark code.
Language:Python2513
Wittline/data-engineer-challenge
Challenge Data Engineer
Language:Python258
Wittline/text-analysis-speeches-amlo
Text analysis of the speeches, conferences and interviews of the current president of Mexico
Language:Jupyter Notebook83
Wittline/recommendation-system
Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)
Language:Python122
Wittline/distance-metrics
Distance metrics are one of the most important parts of some machine learning algorithms, both supervised and unsupervised; they help us calculate and measure similarities between numerical values expressed as data points
Language:Jupyter Notebook42
Wittline/docker-livy
Dockerizing and Consuming an Apache Livy environment
Language:HTML118
Wittline/csv-shuffler
A tool to automatically shuffle lines in .csv files
Language:Python4
Wittline/csv-estimate-rows
Language:Python4
Wittline/csv-splitter
csv-splitter
Language:Python1
Wittline/csv-columnar
Language:Python2
Wittline/model-catalog-grpc
A gRPC service to consume any machine learning model stored in a model catalog through a single endpoint.
1
Wittline/pyDag
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Language:Python243
Wittline/wbz
A parallel implementation of the bzip2 data compressor in Python. The compression pipeline uses algorithms such as the Burrows–Wheeler transform (BWT) and move-to-front (MTF) to improve Huffman compression. For now, the tool focuses only on compressing .csv files and other files in tabular format.
Language:Python133
Wittline/livyc
Apache Spark as a Service with Apache Livy Client
Language:Python31
Wittline/Dropout-Students-Prediction
The goal of this project is to identify students at risk of dropping out of school
Language:HTML2017
Wittline/csv-schema-inference
A tool to automatically infer column data types in .csv files
Language:Jupyter Notebook334
Wittline/uber-expenses-tracking
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes using technologies such as Apache Airflow, AWS Redshift and Power BI.
Language:Jupyter Notebook10635