beatrichartz/csv

Option to return atoms instead of strings for keys

drernie opened this issue · 5 comments

When using the headers option, it would be convenient if I could get atom-keys instead of string-keys.
Right now I do this:

  def map_strings_to_atoms(string_key_map) do
    # https://stackoverflow.com/questions/31990134/how-to-convert-map-keys-from-strings-to-atoms-in-elixir
    map = for {key, val} <- string_key_map, into: %{}, do: {String.to_atom(key), val}
    filtered = :maps.filter(fn _, v -> v != "" end, map)
    Factory.create_child_node!(filtered)
  end

Such behaviour would be insecure when using with unsantized input as atoms aren't GC-able.

Hi @drernie apologies for the late reply - could you outline what the use case for this is? I agree with @hauleth that we have to be careful with casting (user) input to atoms without sanitizing it, so this wouldn't be the default option, but there might be a legitimate use case I'm not seeing.

Thanks. :-). I am used to this from Ruby, where having atoms is more concise and efficient when accessing dictionaries. I agree it shouldn't be the default, but it would be a nice option for working with data structures that expect to be passed atoms (so I can use the same key on both sides of the assignment).

I realize there is a security risk, though; is there a way implementing in the library could help enforce sanitization?

Prepared the PR for it: #100
Implemented 2 approaches with String.to_existing_atom as safe and String.to_atom as an unsafe one.
Please review.

Right now this won't make it in I'm afraid. The "safe" option means the user needs to know about internals of the library and the file headers and prepare the atoms before parsing. At that point passing the headers in via a list is already an option. The unsafe option is unsafe since it means that this library could be used to attack services that take CSV as their input by uploading CSVs with large headers.

I do not currently see a way to implement this safely and with a good abstraction, feel free to reopen if there is one.