I'm learning how to parse big CSV files in Haskell. This is my second attempt. I'll be loosely following the Frames tutorial. No, not the one formatted for the web. I'm slowly realising that one can learn Haskell by following code: it is readable, and authors are generally good at commenting it. I'm also realising that one cannot do so without reading the source code of the libraries used, because tutorials and plain-English documentation are almost nonexistent.
- Reading from a stream.
- Doing some basic data analysis, like counting records.
- Being able to inspect the stream using something like `take` or `show` with indexing. I assume I would be doing it in GHCi.
- Extracting relevant info from unstructured text, such as addresses. That's a big part of what I do for work, and the main motivation for looking beyond Python. I want to move away from regular expressions and do it fast.
- `GROUP BY`
- Encoding results back into an output file.
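
To make the first three goals concrete, here is a minimal sketch that uses only `base` and lazy I/O. It is not how Frames does it, and it rests on loud assumptions: the input has a header row, fields contain no quotes or embedded commas, and lines end in `\n`. The names `splitOnComma` and `parseRows` are made up for illustration.

```haskell
module Main where

-- Naive field splitter, for illustration only.
-- Assumption: no quoted fields and no escaped commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

-- Turn raw CSV text into rows of fields, dropping the header line.
-- `lines` and `map` are lazy, so rows are only produced on demand.
parseRows :: String -> [[String]]
parseRows = map splitOnComma . drop 1 . lines

-- Read CSV text from stdin and print the number of data records.
-- Only `length` consumes the rows here, so each row can be
-- garbage-collected once it has been counted.
main :: IO ()
main = do
  contents <- getContents
  print (length (parseRows contents))
```

Compiled to an executable called, say, `count`, this would run as `./count < big.csv`. To peek at the data in GHCi instead, something like `take 3 . parseRows <$> readFile "big.csv"` should show the first three records (the file name is a placeholder). Keeping both a peek and a count of the same list in one program would hold every row in memory, which is one reason real streaming libraries exist.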
Finally, I eagerly welcome help to move this forward. Get in touch!