CSV
Almost RFC 4180 compliant CSV parsing and encoding for Elixir. It also lets you specify other separators, so it could just as well be named TSV. It isn't, probably because of the defaults.
Why do we want it?
It parses files which contain rows (in utf-8) separated by either commas or other separators.
If that's not enough reason to absolutely ❤️ 💚 💕 ❤️ 💞 💖 it, it also parses a CSV file, keeping the rows in order, about 2x as fast as an unparallelized stream implementation 🚀
When do we want it?
Now.
How do I get it?
Add {:csv, "~> 1.2.4"} to your deps in mix.exs like so:
defp deps do
[
{:csv, "~> 1.2.4"}
]
end
Note: Elixir 1.1.0 is required for all versions above 1.1.5.
Great! How do I use it right now?
Do this to decode:
File.stream!("data.csv") |> CSV.decode
And you'll get a stream of rows. So, this is upcasing the text in each cell of a tab-separated file because someone is angry:
File.stream!("data.csv")
|> CSV.decode(separator: ?\t)
|> Enum.map(fn row ->
  # Enum.map (not Enum.each) so the upcased cells are actually returned
  Enum.map(row, &String.upcase/1)
end)
Do this to encode a table (two-dimensional list):
table_data |> CSV.encode
And you'll get a stream of lines ready to be written to an IO. So, this is writing to a file:
file = File.open!("test.csv", [:write, :utf8])
table_data |> CSV.encode |> Enum.each(&IO.write(file, &1))
I have this file, but it's tab-separated ⁉️
Pass in another separator to the decoder:
File.stream!("data.csv") |> CSV.decode(separator: ?\t)
If you want to take revenge on whoever did this to you, encode with semicolons like this:
your_data |> CSV.encode(separator: ?;)
Polymorphic encoding
Make sure your data gets encoded the way you want - implement the CSV.Encode protocol for whatever strange data you wish to encode:
defimpl CSV.Encode, for: MyData do
def encode(%MyData{has: fun}, env \\ []) do
"so much #{fun}" |> CSV.Encode.encode(env)
end
end
Or similar.
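If protocol dispatch is new to you, the following standalone sketch may help. It does not use this library at all: CellEncode and Temperature are made-up names for illustration, and the real CSV.Encode protocol may differ in its details. It only shows how a struct-specific implementation sits next to a catch-all implementation that relies on to_string.

```elixir
# Hypothetical protocol for illustration only; this is NOT the library's CSV.Encode.
defprotocol CellEncode do
  # Types without their own implementation are dispatched to the Any implementation.
  @fallback_to_any true
  def encode(data)
end

# Hypothetical struct standing in for "whatever strange data you wish to encode".
defmodule Temperature do
  defstruct [:degrees]
end

# Explicit implementation: this struct decides how it is rendered.
defimpl CellEncode, for: Temperature do
  def encode(%Temperature{degrees: degrees}), do: "#{degrees}°C"
end

# Catch-all implementation: anything else is rendered with to_string/1.
defimpl CellEncode, for: Any do
  def encode(data), do: to_string(data)
end

row = [1, :ok, "text", %Temperature{degrees: 21}]
encoded = Enum.map(row, &CellEncode.encode/1)
IO.inspect(encoded)
```

Integers, atoms and strings take the catch-all path, while the struct uses its own implementation.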
Ensure performant encoding
The encoding protocol implements a fallback to Any for types where a simple call to to_string will provide unambiguous results. Protocol dispatch for the fallback to Any is very slow when protocols are not consolidated, so make sure you have consolidate_protocols: true in your mix.exs, or consolidate protocols manually for production, in order to get good performance.
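As a rough sketch, the setting goes into the keyword list returned by project in mix.exs. The module name MyApp.Mixfile, the app name :my_app and the version are placeholders, not something this library prescribes:

```elixir
# mix.exs (sketch; MyApp.Mixfile, :my_app and the version are placeholder values)
defmodule MyApp.Mixfile do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      deps: deps(),
      # Consolidate protocols at compile time so the Any fallback dispatches quickly
      consolidate_protocols: true
    ]
  end

  defp deps do
    [
      {:csv, "~> 1.2.4"}
    ]
  end
end
```

Depending on your Elixir version this may already be the default, so check the Mix documentation before adding it.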
There is more to know about everything ™️ - Check the doc
License
MIT