RAPIDS-NVTabular-Horovod-Spark-Databricks

This repo contains the work from my Nvidia internship as a Data Science Intern.

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

⚡️ Worked on the data engineering side of a multi-step time series forecasting problem, using a GPU-accelerated stack on Nvidia GPU Cloud. Check out the work: Distributed Data Science using NVTabular on Spark & Dask

⚡️ Improved distributed model training speed by up to 10% by migrating the data loading pipeline from Petastorm to KerasSequenceLoader. Check out the work: RAPIDS-NVTabular-Horovod-Spark-Databricks
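
For readers new to the multi-step forecasting setup mentioned above, here is a minimal sketch of how a series can be framed into (input window, multi-step target) pairs before it reaches a data loader. This is not the code from the notebooks in this repo: the function name `make_windows` and the parameters `n_in` and `n_out` are hypothetical, and it uses plain Python lists rather than the GPU stack, purely to illustrate the idea.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], n_in: int, n_out: int
) -> List[Tuple[List[float], List[float]]]:
    """Split a series into (input window, multi-step target) pairs.

    Hypothetical helper for illustration only; not part of NVTabular.
    """
    if n_in <= 0 or n_out <= 0:
        raise ValueError("n_in and n_out must be positive")
    pairs = []
    # The last valid start index leaves room for n_in inputs plus n_out targets.
    for start in range(len(series) - n_in - n_out + 1):
        inputs = list(series[start : start + n_in])
        targets = list(series[start + n_in : start + n_in + n_out])
        pairs.append((inputs, targets))
    return pairs


if __name__ == "__main__":
    # Use 3 past values to predict the next 2.
    for x, y in make_windows([1, 2, 3, 4, 5, 6], n_in=3, n_out=2):
        print(x, "->", y)
```

Each pair holds `n_in` past values and the `n_out` values that follow them, so one training example asks the model to predict several future steps at once. A series shorter than `n_in + n_out` yields no pairs.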

Docker Hub

hub.docker.com/r/tauhait/databricks_nvtabular
