Reading CSV

All reads start with reader (), which returns a Config with sensible defaults: comma delimiter, no header row skipped. Customize it with the builders (skip_header, with_separator), then call parse or parse_keyed.

Positional rows with `parse`

parse returns List (List String) — one inner list per row, in column order:

let rows =
  reader ()
  |> parse "a,b,c\nd,e,f\n"
# [["a", "b", "c"], ["d", "e", "f"]]

A trailing newline is optional. Empty input gives [].

Skipping a header row

When the first row is column names you do not need in the data:

let rows =
  reader ()
  |> skip_header
  |> parse "name,age\nAlice,30\nBob,25\n"
# [["Alice", "30"], ["Bob", "25"]]

skip_header drops exactly one row. If you have multiple lines of preamble, this is not the tool — slice the input before parsing.
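
Slicing off a multi-line preamble could look like the following. This is a minimal sketch in Python rather than saga, since this section does not show saga's string functions. The helper name drop_leading_lines is hypothetical, and the sketch assumes the preamble lines contain no quoted newlines.

```python
def drop_leading_lines(text: str, count: int) -> str:
    """Return text with its first `count` lines removed (hypothetical helper)."""
    # Split at most `count` times so the remainder stays in one piece.
    parts = text.split("\n", count)
    # Fewer than `count` line breaks means nothing is left after the preamble.
    return parts[count] if len(parts) > count else ""


raw = "report: users\nsource: billing\nname,age\nAlice,30\n"
body = drop_leading_lines(raw, 2)
print(repr(body))
# 'name,age\nAlice,30\n'
```

The remaining string starts at the real header row, so it can be handed to parse with skip_header or to parse_keyed as usual.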

Custom delimiters

Tab, semicolon, pipe: any separator that fits in one byte, which in practice means a single ASCII character. Pass it as a one-character string:

reader () |> with_separator "\t" |> parse "a\tb\tc\n"
reader () |> with_separator "|"  |> parse "x|y|z\n"

The delimiter is read as the first byte of the string, so multi-byte UTF-8 separators will not work. ASCII only.
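
To see why a multi-byte separator cannot work, it helps to look at the bytes. The sketch below is Python, not saga, and is_single_byte_separator is a hypothetical pre-flight check, not part of the library. It is deliberately stricter than the library, which by the description above would silently use the first byte of a longer string.

```python
def is_single_byte_separator(sep: str) -> bool:
    """True when `sep` is exactly one character that encodes to one UTF-8 byte."""
    return len(sep) == 1 and len(sep.encode("utf-8")) == 1


for sep in ["\t", "|", ";", "§", "::"]:
    encoded = sep.encode("utf-8")
    # Show the separator, its UTF-8 byte length, and whether it is safe to use.
    print(repr(sep), len(encoded), is_single_byte_separator(sep))

# "§" encodes to two bytes, so taking only the first byte yields 0xC2,
# which is not a complete character on its own.
first_byte = "§".encode("utf-8")[0]
print(hex(first_byte))
```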

Keyed rows with `parse_keyed`

When you want each row as a Dict String String keyed by the header names:

let rows =
  reader ()
  |> parse_keyed "name,age\nAlice,30\nBob,25\n"
# [Dict{"name": "Alice", "age": "30"}, Dict{"name": "Bob", "age": "25"}]

The first row is always consumed as the header. Do not combine parse_keyed with skip_header unless you actually want to drop the first data row: skip_header runs after the header is captured, so it removes the row that follows the header, not the header itself.

If the input has no rows, parse_keyed panics. Guard with a length check or use parse and convert yourself if empty input is expected.

Extra columns in a data row past the header length are dropped. Missing columns are simply absent from the dict — they do not become empty strings. If you need a stable shape, look up with Dict.get key |> Maybe.unwrap_or "".
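
The "use parse and convert yourself" route, together with the extra-column and missing-column rules, can be modelled as follows. This is a Python sketch of the behaviour described above, not the library's implementation. to_keyed is a hypothetical helper that takes the positional rows parse would return.

```python
from typing import Dict, List


def to_keyed(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Convert positional rows into dicts keyed by the first row (hypothetical helper)."""
    if not rows:
        # Empty input: return no rows instead of failing.
        return []
    header, data = rows[0], rows[1:]
    # zip stops at the shorter sequence, so extra columns are dropped and
    # missing columns are simply absent from the resulting dict.
    return [dict(zip(header, row)) for row in data]


rows = [["name", "age"], ["Alice", "30", "extra"], ["Bob"]]
keyed = to_keyed(rows)
print(keyed)
# [{'name': 'Alice', 'age': '30'}, {'name': 'Bob'}]

# A default on lookup gives every row a stable shape.
ages = [row.get("age", "") for row in keyed]
print(ages)
# ['30', '']
```

The lookup with a default plays the same role as Dict.get followed by Maybe.unwrap_or "": the dict stays sparse, and the default is applied only at the point of use.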

Quoting and line endings

You do not have to think about these — csv-core handles them per RFC 4180:

Fields wrapped in "..." have the surrounding quotes stripped.
"" inside a quoted field becomes a literal ".
A delimiter or newline inside a quoted field is part of the value, not a separator.
CRLF (\r\n) line endings parse identically to LF.

reader () |> parse "\"say \"\"hi\"\"\",ok\n"
# [["say \"hi\"", "ok"]]
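
For readers who want to cross-check these rules outside saga, Python's standard csv module follows the same quoting conventions. The sketch below is only a comparison aid. parse_rfc4180 is a hypothetical wrapper, and nothing here claims that csv-core and Python's csv module agree on every edge case.

```python
import csv
import io


def parse_rfc4180(text: str) -> list:
    """Parse CSV text with Python's csv module (stand-in for the library's parser)."""
    # newline="" keeps \r\n intact so the csv module handles line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


# Doubled quotes inside a quoted field collapse to one literal quote.
print(parse_rfc4180('"say ""hi""",ok\n'))
# [['say "hi"', 'ok']]

# A delimiter or newline inside quotes is part of the value; CRLF ends the row.
print(parse_rfc4180('"a,b","line1\nline2"\r\nx,y\r\n'))
# [['a,b', 'line1\nline2'], ['x', 'y']]
```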

A note on performance

Each parsed field is a freshly allocated BEAM binary. For very large inputs you may want to stream in chunks at the Erlang layer rather than slurping the whole file into a string — but the current saga API parses the whole input in one call, so plan accordingly.