Pinned Repositories
alcf-notes
Personal notes about building/running codes on ALCF resources
cudf
cuDF - GPU DataFrame Library
dask
Parallel computing with task scheduling
distributed
A distributed task scheduler for Dask
notebooks
pynvml
Provide Python access to the NVML library for GPU diagnostics
rjzamora's Repositories
rjzamora/pynvml
Provide Python access to the NVML library for GPU diagnostics
rjzamora/cudf
cuDF - GPU DataFrame Library
rjzamora/dask
Parallel computing with task scheduling
rjzamora/distributed
A distributed task scheduler for Dask
rjzamora/NVTabular
A library that sits on top of the RAPIDS cuDF library, providing a range of benefits for processing extremely large tabular datasets, particularly those that do not fit in GPU or CPU memory. NVTabular has many capabilities, including fast terabyte-scale data preparation and accelerated tabular data loading, all on GPU, which streamline the first step of both training and inference for any deep recommender system pipeline.
rjzamora/arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
rjzamora/coiled-benchmarks
rjzamora/core
Core Utilities for NVIDIA Merlin
rjzamora/cugraph
cuGraph - RAPIDS Graph Analytics Library
rjzamora/cuml
cuML - RAPIDS Machine Learning Library
rjzamora/cuxfilter
GPU-accelerated cross-filtering with cuDF.
rjzamora/dask-blog
Dask development blog
rjzamora/dask-cuda
Utilities for Dask and CUDA interactions
rjzamora/dask-expr-rapids
rjzamora/dask-match
rjzamora/dask-sql
Distributed SQL Engine in Python using Dask
rjzamora/design-docs
Experimental repo for proposals of future work
rjzamora/fastparquet
Python implementation of the Parquet columnar file format.
rjzamora/filesystem_spec
A specification that Python filesystems should adhere to.
rjzamora/Morpheus
Morpheus SDK
rjzamora/NeMo-Curator
Scalable toolkit for data curation
rjzamora/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
rjzamora/pynvml-feedstock
A conda-smithy repository for pynvml.
rjzamora/rapids-dask-dependency
rjzamora/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
rjzamora/rjzamora.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
rjzamora/s3fs
S3 Filesystem
rjzamora/systems
Merlin Systems provides tools for combining recommendation models with other elements of production recommender systems (like feature stores, nearest neighbor search, and exploration strategies) into end-to-end recommendation pipelines that can be served with Triton Inference Server.
rjzamora/ucx-py
Python bindings for UCX
rjzamora/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow