This is a tool that I use to iterate through the rows of a CSV file.
- macOS (tested on Ventura 13.0.1)
- OCaml (tested on version 4.13.1)
- Apple Numbers
I assume you have a CSV file generated from an Apple Numbers spreadsheet. I mention Apple Numbers because CSV is a surprisingly unstandardised format, and my tool is quite specialised for the dialect of CSV that Apple Numbers exports. Other CSV files might or might not work. In particular:
- values are comma-separated (as you would expect),
- values that do (or could) contain commas or newlines are wrapped in double-quotes,
- but empty values are not wrapped in double-quotes,
- double-quotes that appear inside values are replaced with two consecutive double-quotes (so `"this is an ""example"" of a valid value"`).
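
As a rough illustration of these quoting rules, here is a small shell sketch. `numbers_quote` is a hypothetical helper written for this example, not part of the tool, and it assumes that values with no commas, newlines, or double-quotes are left bare (Numbers itself may quote more eagerly than this).

```shell
# Hypothetical helper (not part of csv_iterator): encode one value
# following the quoting rules listed above.
numbers_quote() {
  value=$1
  newline='
'
  case $value in
    '')
      # Empty values are left unquoted.
      ;;
    *,*|*'"'*|*"$newline"*)
      # Double each embedded double-quote, then wrap the value in double-quotes.
      escaped=$(printf '%s\n' "$value" | sed 's/"/""/g')
      printf '"%s"' "$escaped"
      ;;
    *)
      # Assumption: values with no special characters are emitted bare.
      printf '%s' "$value"
      ;;
  esac
}

numbers_quote 'this is an "example" of a valid value'; echo
# prints: "this is an ""example"" of a valid value"
```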
Run `make` to build the `csv_iterator` executable.
This repo includes a sample CSV file. To use it to see how the tool works, run the following command:
./csv_iterator -csv database.csv -cmd "echo \$firstname got \$percent%."
You can also run `make install` to copy the executable into `~/bin`. Then, if `~/bin` is in your `$PATH`, you can run `csv_iterator` from any directory.
- The tool creates a new file called `database.csv.tmp` in which `""` has been globally replaced with `”` (a single curly double-quote). This makes a Numbers-generated CSV file easier to parse (see note above).
- For each non-header row of `database.csv.tmp`, the tool runs `field1=v1 ... fieldN=vN eval 'command'`, where `field1`, ..., `fieldN` are the column names of the CSV file and `v1`, ..., `vN` are the values taken by those fields in the current row. In other words, the command `command` is run in a shell where the current row's values have been assigned to environment variables of the same name.
- You can set the `-dryrun` flag so that the commands to be run are printed to the terminal but not actually executed.
- If you set the `-onlyfirstrow` flag, the tool will stop after the first (non-header) row. This can be useful when testing.
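
To make the per-row step more concrete, here is a minimal shell sketch of what one generated command line might look like. The values `Ada` and `92` are made up, and the exact quoting the tool applies to values is an assumption. Only the `field=value ... eval 'command'` shape comes from the description above.

```shell
# The command template, as the tool would receive it via -cmd.
cmd='echo $firstname got $percent%.'

# Assumed shape of the line generated for one row
# (made-up values: firstname=Ada, percent=92).
line="firstname='Ada' percent='92' eval '$cmd'"

# Dry run: only show the generated line.
printf '%s\n' "$line"
# prints: firstname='Ada' percent='92' eval 'echo $firstname got $percent%.'

# Real run: hand the generated line to a shell.
sh -c "$line"
# prints: Ada got 92%.
```

The single quotes around the command keep `$firstname` and `$percent` unexpanded until `eval` runs, at which point the row's assignments are already in place.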
Notes:
- I wrote `\$firstname` rather than `$firstname` above in order to prevent the `firstname` variable from being expanded when calling `csv_iterator`. It should only be expanded when the generated commands are executed.
- It is best to avoid backticks in the CSV file, as Bash might treat them as commands to be executed.
- Column names in the CSV file mustn't begin with a digit (because environment variables can't begin with a digit).
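
The first and last notes can be tried out with a short shell sketch. The value `CALLER` and the helper `is_valid_column_name` are hypothetical, introduced only for this illustration. The name check follows the usual shell rule that variable names consist of letters, digits, and underscores and do not start with a digit.

```shell
# Made-up value in the calling shell, to make early expansion visible.
firstname='CALLER'

escaped="echo \$firstname"    # the string passed on is: echo $firstname
unescaped="echo $firstname"   # the string passed on is: echo CALLER

# The row's value (Ada, made up) is only used when expansion was deferred.
firstname='Ada' sh -c "$escaped"     # prints: Ada
firstname='Ada' sh -c "$unescaped"   # prints: CALLER

# Hypothetical check for whether a column name can be used as a variable name.
is_valid_column_name() {
  case $1 in
    ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

is_valid_column_name 'firstname' && echo 'firstname: ok'
is_valid_column_name '1st' || echo '1st: not usable as a variable name'
```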