DavyLandman/csvtools

Sort huge CSV by 7th column

baerbock opened this issue · 1 comments

I would like to sort a 300 MB CSV with a header (https://data.open-power-system-data.org/renewable_power_plants/2018-03-08/renewable_power_plants_DE.csv) by its 7th column, electrical_capacity, which contains numerical values:

0.075
0.02937
0.4
0.303

How could I do this with csvtools? Thank you very much for any guidance.

There is no built-in tool for sorting yet, primarily because sorting requires keeping all of the input somewhere in memory (or on disk), since the last line of the input might be the first line of the output.

I guess I would say: import it into a database (SQLite comes to mind) and have fun there?
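
As a rough illustration of the database route, here is a minimal Python sketch using only the standard library's csv and sqlite3 modules. The function name sort_csv_via_sqlite, the table name rows, and the generic column names c0, c1, ... are made up for this example. It assumes every row has the same number of fields as the header and that the sort column parses as a number. For a 300 MB file you would probably pass a real file path as db_path instead of the default in-memory database.

```python
import csv
import sqlite3


def sort_csv_via_sqlite(src_path, dst_path, column, db_path=":memory:"):
    """Sort a CSV with a header row numerically by one named column.

    Hypothetical helper for illustration; assumes rectangular input.
    """
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        key_index = header.index(column)

        conn = sqlite3.connect(db_path)
        # Generic names c0, c1, ... avoid quoting problems with real header names.
        cols = ", ".join(f"c{i}" for i in range(len(header)))
        marks = ", ".join("?" for _ in header)
        conn.execute(f"CREATE TABLE rows ({cols})")
        # Stream the remaining rows straight from the CSV reader into SQLite.
        conn.executemany(f"INSERT INTO rows VALUES ({marks})", reader)
        conn.commit()

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        # Values are stored as text, so cast the key to REAL for a numeric sort.
        query = f"SELECT {cols} FROM rows ORDER BY CAST(c{key_index} AS REAL)"
        writer.writerows(conn.execute(query))
    conn.close()
```

With the file from the question, the call would look something like sort_csv_via_sqlite("renewable_power_plants_DE.csv", "sorted.csv", "electrical_capacity", db_path="plants.db").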

Otherwise, you could use csvawk to reorder the columns so that the 7th column comes first, and then pipe that through the regular GNU sort. If you want, you could then pipe the result through csvawk again to restore the original column order.
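
To make the reshuffle idea concrete, here is a small Python sketch of the same three steps: move the key column to the front, sort on the first field, then move it back. It does not use csvawk or GNU sort, and it sorts in memory, so it only illustrates the logic rather than replacing the pipeline. The helper names move_to_front, move_back, and sort_by_column are invented for this example. Note that the header row is kept out of the sort, which the shell pipeline would also need to take care of.

```python
import csv
import io


def move_to_front(row, index):
    """Return the row with the field at `index` moved to position 0."""
    return [row[index]] + row[:index] + row[index + 1:]


def move_back(row, index):
    """Undo move_to_front: put field 0 back at position `index`."""
    return row[1:index + 1] + [row[0]] + row[index + 1:]


def sort_by_column(csv_text, index):
    """Sort CSV text (with a header) numerically by the zero-based column `index`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # keep the header out of the sort
    keyed = [move_to_front(row, index) for row in reader]
    keyed.sort(key=lambda row: float(row[0]))  # numeric sort on the first field
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(move_back(row, index) for row in keyed)
    return out.getvalue()
```

For the 7th column from the question, the zero-based index would be 6.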