/pyspark-csv

An external PySpark module that works like R's read.csv or Panda's read_csv, with automatic type inference and null value handling. Parses csv data into SchemaRDD. No installation required, simply include pyspark_csv.py via SparkContext.

Primary LanguagePythonMIT LicenseMIT

This repository is not active