Parsing CSVs to inverted tables using J's Mealy machines (;:).
rcsv =: [: pcsv 1!:1@<@jpath
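pcsv =: 3 : 0
NB. header: run the machine over the first line (function code 0 = boxed words), then clean each field
hd=. cln &.> (0;mm;ma) ;: (j=. y i. rsep){.y =. stripbom y
NB. body: function code 2 gives (start,length) per field; key (/.) groups the fields by column number
hd,:((#fs)$i.#hd) ([:<(cln;.0)&y)/.,."1 fs=.(2;mm;ma);:y=.(1+j)}.y
)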
NB. TODO: how to handle unicode (non-byte) input? It seems to give a domain error.
'qchr csep rsep'=: '"';',';LF NB. char classes
ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep NB. alphabet -> char class
NB. state table: 4 states x 4 char classes (qchr, csep, rsep, other) x 2 (next state, output code)
mm =: 4 4 2 $ , ". ;. _2 ] 0 : 0
2 1 0 2 0 3 1 1 NB. limbo
0 6 0 2 0 3 1 0 NB. field
3 0 2 0 2 0 2 0 NB. quoted field, quote escapes self
2 0 0 2 0 3 2 0 NB. escaped quote or end of quoted field
)
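The class map above can be checked in isolation. The lines below are a minimal sketch, not part of the library: they repeat the two definitions of the character classes so the snippet runs on its own in a scratch session, then look up the class of a quote, a comma, an ordinary letter and LF. The sample string is made up for illustration.

```j
NB. same definitions as in the source, repeated so this runs standalone
'qchr csep rsep' =: '"';',';LF
ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep

#ma                       NB. one class per byte of a.  -> 256
ma {~ a. i. '",x',LF      NB. quote, comma, other, LF   -> 0 1 3 2
```

Class 3 is the catch-all: i. returns the length of the class list (3) for any byte that is not the quote, the column separator or the row separator, which is why the state table has four columns.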
unq =: ((#~ [: -. (2#qchr)&E.)@}.@}:) ^: ((2#qchr) -: 0 _1&{)
cln =: unq`(0&{.)@.(-:&(,csep))
NB. constructor: optional y is csep;qchr (two boxed single characters) overriding the defaults
create =: 3 : 0
if. #y do.
assert. (1 1 -: #&>y) *. 2=#y
'csep qchr' =: y
NB. recalculate alphabet -> char class
ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep
end.
)
Populate its locale with columns pointing to their data (wip)
import =: 3 : 0
table =: pcsv y
NB. iterate over the parsed header row (row 0 of table), not the raw input y
for_c. {. table do.
d =. (1,c_index) {:: table
". (>c),' =: d'
end.
)
rcsv_z_ =: rcsv_jsv_ NB. read from file
pcsv_z_ =: pcsv_jsv_ NB. read from bytes
coclass 'jsv'
<<bom>>
<<mealy>>
<<unquot>>
<<create>>
<<db>>
<<read>>
<<zdefs>>