Pinned Repositories
spark
Apache Spark - A unified analytics engine for large-scale data processing
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow
scikit-learn
scikit-learn: machine learning in Python
aexpy
AexPy /eɪkspaɪ/ is Api EXplorer in PYthon for detecting API breaking changes in Python packages. (ISSRE'22)
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
arrow-datafusion
Apache Arrow DataFusion and Ballista query engines
breeze
Breeze is a numerical processing library for Scala.
spark
Mirror of Apache Spark
spark-libFM
An implementation of Factorization Machines (LibFM)
SparkGBM
Spark-based GBM
zhengruifeng's Repositories
zhengruifeng/spark-libFM
An implementation of Factorization Machines (LibFM)
zhengruifeng/SparkGBM
Spark-based GBM
zhengruifeng/spark
Mirror of Apache Spark
zhengruifeng/aexpy
AexPy /eɪkspaɪ/ is Api EXplorer in PYthon for detecting API breaking changes in Python packages. (ISSRE'22)
zhengruifeng/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
zhengruifeng/arrow-datafusion
Apache Arrow DataFusion and Ballista query engines
zhengruifeng/breeze
Breeze is a numerical processing library for Scala.
zhengruifeng/dbt-databricks
A dbt adapter for Databricks.
zhengruifeng/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
zhengruifeng/flink
Apache Flink
zhengruifeng/hive
Apache Hive
zhengruifeng/langchain
⚡ Building applications with LLMs through composability ⚡
zhengruifeng/LightGBM
A fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the DMTK (http://github.com/microsoft/dmtk) project of Microsoft.
zhengruifeng/modin
Modin: Scale your Pandas workflows by changing a single line of code
zhengruifeng/numpy
The fundamental package for scientific computing with Python.
zhengruifeng/py4j
Py4J enables Python programs to dynamically access arbitrary Java objects
zhengruifeng/ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
zhengruifeng/scikit-learn
scikit-learn: machine learning in Python
zhengruifeng/scipy
SciPy library main repository
zhengruifeng/spark-connect-go
Apache Spark Connect Client for Golang
zhengruifeng/spark-website
Apache Spark Website
zhengruifeng/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow