/java-dataframes

A quick test of a couple of data frame libraries for Java

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Java dataframes test

This is the companion repository to the following Medium post: "Doing cool data science in Java: how 3 DataFrame libraries stack up".

Data

The data was extracted from Eurostat at the beginning of September 2018. I opened the extracted CSV in LibreOffice and saved it again because the Eurostat output contained some invalid UTF-8 characters that some CSV importers couldn't handle directly.
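For anyone who would rather script the cleanup than round-trip through LibreOffice, here is a minimal sketch of one possible alternative. It is not what was done for this repository: the class name Utf8Sanitizer and its methods are hypothetical, and it assumes the file is mostly valid UTF-8 with a few stray bytes. It decodes the raw bytes leniently, replacing each malformed sequence with U+FFFD, and writes the result back as valid UTF-8.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of this repository.
public class Utf8Sanitizer {

    /** Decodes bytes as UTF-8, replacing malformed sequences with U+FFFD. */
    public static String decodeLenient(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            // Not expected when both error actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    /** Reads a possibly dirty CSV and writes a copy that is valid UTF-8. */
    public static void sanitizeFile(Path in, Path out) throws IOException {
        byte[] raw = Files.readAllBytes(in);
        Files.write(out, decodeLenient(raw).getBytes(StandardCharsets.UTF_8));
    }
}
```

The replaced characters are lost, so it is worth searching the output for U+FFFD afterwards to see which city names were affected.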

Code

The code for the three libraries is present in the Test{libraryname}.java files. They all use CheckResult.java to do a basic correctness check for the top-growing cities.
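To illustrate the idea of a shared correctness check, here is a rough sketch of what such a helper could look like. It is not the actual content of CheckResult.java: the class name TopCitiesCheck, its method names, and the placeholder city names in the usage note are all assumptions made for illustration. Each library test would pass in its own ordered list of top-growing cities and compare it against one expected list.

```java
import java.util.List;

// Hypothetical sketch, not the repository's CheckResult.java.
public class TopCitiesCheck {

    /** Returns true when both lists contain the same cities in the same order. */
    public static boolean matches(List<String> expected, List<String> actual) {
        return expected.equals(actual);
    }

    /** Throws if the result computed by the given library differs from the expected one. */
    public static void check(String library, List<String> expected, List<String> actual) {
        if (!matches(expected, actual)) {
            throw new IllegalStateException(
                    library + ": expected " + expected + " but got " + actual);
        }
    }
}
```

A test class could then call something like TopCitiesCheck.check("somelibrary", Arrays.asList("CityA", "CityB"), result) after computing its result, so every library is held to the same expected answer.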

The libraries tested fully are:

As described in the Medium post, I couldn't find a good way to do the pivot step in DataVec, but I included the code I wrote up to that point.