Distributed Quantile Regression by Pilot Sampling and One-Step Updating

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

dqr: Distributed Quantile Regression by Pilot Sampling and One-Step Updating

Spark implementation

System Requirements

Run the code on a Spark platform

  • Zip the code into a portable package (this places the archive dqr.zip in the projects folder):
make zip
  • Submit the project to a Spark platform, adjusting PYSPARK_PYTHON to the Python 3 interpreter available on your cluster:
PYSPARK_PYTHON=/usr/local/bin/python3.7 \
        spark-submit --py-files projects/dqr.zip \
        projects/dqr_spark.py
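
If make is not available, the packaging step can be approximated in Python. The following is a minimal sketch that assumes the zip target simply archives the dqr package directory into projects/dqr.zip; the Makefile itself is not shown here, so the function name build_zip and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def build_zip(repo_root, package="dqr", out_dir="projects"):
    """Archive <repo_root>/<package> into <repo_root>/<out_dir>/<package>.zip.

    Assumption: this mirrors what `make zip` does; the real Makefile
    target may include or exclude other files.
    """
    repo_root = Path(repo_root)
    out_base = repo_root / out_dir / package  # make_archive appends ".zip"
    out_base.parent.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        str(out_base),
        "zip",
        root_dir=str(repo_root),  # paths inside the archive are relative to this
        base_dir=package,         # only the package directory is archived
    )
    return Path(archive)


if __name__ == "__main__":
    print(build_zip("."))
```

Keeping the package directory as the top-level entry of the archive is what lets `import dqr` resolve on the executors once the file is passed through --py-files.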

Build a Python module

You can also build the code into a standard Python module and deploy it to Spark clusters.

python setup.py bdist

Conceptual demo in R

  • Contributed by @edwardguo61

  • Required R version: 3.5.1

  • Files:

    • dqr/R/estimator.R: one-shot estimation and one-step estimation for distributed quantile regression
    • dqr/R/simulator.R: simulation functions that generate randomly or non-randomly distributed data
    • dqr/R/uilts.R: other helper functions
    • projects/dqr_demo.R: generates data, runs the estimation, and produces a plot. Run dqr_demo.R to see how the functions are used.
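
For readers who prefer Python, the contrast between the two estimators named above can be sketched in the simplest special case, an intercept-only model, where quantile regression reduces to estimating a single quantile. This is an illustrative sketch and not a port of the R or Spark code: the function names, the Gaussian kernel, and the rule-of-thumb bandwidth are all assumptions. The one-shot estimator averages the local quantiles computed on each worker; the one-step estimator starts from a quantile computed on a small pilot sample and applies a single Newton-type update using two summary statistics from the full data.

```python
import math
import random
import statistics


def sample_quantile(values, tau):
    """Empirical tau-quantile (inverse-CDF definition)."""
    s = sorted(values)
    k = max(0, math.ceil(tau * len(s)) - 1)
    return s[k]


def one_shot(shards, tau):
    """Average the local quantile estimates, one per worker."""
    return sum(sample_quantile(sh, tau) for sh in shards) / len(shards)


def one_step(shards, tau, pilot_size, seed=0):
    """Pilot estimate followed by one Newton-type update.

    Assumptions: Bernoulli pilot sampling, Gaussian kernel, and a
    Silverman-style bandwidth; none of these come from the repository.
    """
    rng = random.Random(seed)
    n_total = sum(len(sh) for sh in shards)

    # Pilot sample: every worker keeps each observation with the same
    # probability, so the pilot is a random sample of the whole data.
    keep = min(1.0, pilot_size / n_total)
    pilot = [y for sh in shards for y in sh if rng.random() < keep]
    q0 = sample_quantile(pilot, tau)
    h = 1.06 * statistics.stdev(pilot) * n_total ** (-0.2)

    # Each worker only needs to send back two numbers.
    below = 0
    kernel_sum = 0.0
    for sh in shards:
        below += sum(1 for y in sh if y <= q0)
        kernel_sum += sum(math.exp(-0.5 * ((y - q0) / h) ** 2) for y in sh)

    cdf_hat = below / n_total                                   # F(q0)
    pdf_hat = kernel_sum / (n_total * h * math.sqrt(2 * math.pi))  # f(q0)
    # Newton step for solving F(q) = tau, started at q0.
    return q0 + (tau - cdf_hat) / pdf_hat


if __name__ == "__main__":
    rng = random.Random(1)
    # Non-random partition: sorted data cut into contiguous shards.
    data = sorted(rng.gauss(0.0, 1.0) for _ in range(20000))
    shards = [data[i:i + 2000] for i in range(0, len(data), 2000)]
    print("one-shot:", one_shot(shards, 0.9))
    print("one-step:", one_step(shards, 0.9, pilot_size=1000))
    print("target  :", statistics.NormalDist().inv_cdf(0.9))
```

The demo deliberately partitions the data non-randomly. Under that layout the local quantiles are not estimates of the global quantile, so their average is far from the target, while the pilot sample remains representative because it is drawn across all workers.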

References