Pinned Repositories
spark
Apache Spark - A unified analytics engine for large-scale data processing
databricks-cli
(Legacy) Command Line Interface for Databricks
sbt-spark-package
sbt plugin for Spark packages
spark-deep-learning
Deep Learning Pipelines for Apache Spark
tensorframes
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
graphframes
linalg-test
Tests for linear algebra packages
spark-als
Another, hopefully better, implementation of ALS on Spark
spark-ml
Proposal for the new interfaces
spark-vl-bfgs
Vector-free L-BFGS implementation on Spark
mengxr's Repositories
mengxr/spark-corenlp
A Stanford CoreNLP wrapper for the Spark ML Pipeline API
mengxr/pyspark-xgboost
This feature was merged into XGBoost master. See https://github.com/dmlc/xgboost/pull/8020. If you want to try out this feature, please build from XGBoost master and report issues at https://github.com/dmlc/xgboost/issues.
mengxr/actions
mengxr/bazel-toolchain
LLVM toolchain for bazel
mengxr/cloudpickle
Extended pickling support for Python objects
mengxr/conda
OS-agnostic, system-level binary package manager and ecosystem
mengxr/containers
Sample base images for Databricks Container Services
mengxr/databricks-cli
Command Line Interface for Databricks
mengxr/ecosystem
Integration of TensorFlow with other open-source frameworks
mengxr/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
mengxr/graphframes
mengxr/horovod
Distributed training framework for TensorFlow, Keras, and PyTorch.
mengxr/hyperopt
Distributed Asynchronous Hyperparameter Optimization in Python
mengxr/joblib-spark
Joblib Spark backend
mengxr/keras
Deep Learning for humans
mengxr/langchain
⚡ Building applications with LLMs through composability ⚡
mengxr/mleap
MLeap: Deploy Spark Pipelines to Production
mengxr/mlflow
Open source platform for the machine learning lifecycle
mengxr/morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
mengxr/nbdev_test
mengxr/petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
mengxr/python3statement.github.io
mengxr/spark
Mirror of Apache Spark
mengxr/spark-deep-learning
Deep Learning Pipelines for Apache Spark
mengxr/spark-website
Mirror of Apache Spark Website
mengxr/tensorflow
An Open Source Machine Learning Framework for Everyone
mengxr/tensorflow_recipes
TensorFlow conda recipes
mengxr/tensorframes
TensorFlow wrapper for DataFrames on Apache Spark
mengxr/training-1
Reference implementations of training benchmarks
mengxr/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.