queryverse/CSVFiles.jl

The streaming load/save example does not work


julia> using DataFrames, CSVFiles, FileIO

julia> df = DataFrame(a = [1,2,3], b = [4,5,6]);

julia> stream = IOBuffer();

julia> fileiostream = Stream(format"CSV", stream);

julia> save(fileiostream, df)

julia> load(fileiostream)
0x0 CSV file
Error showing value of type CSVFiles.CSVStream:
ERROR: MethodError: no method matching zero(::Type{Any})
Closest candidates are:
  zero(::Type{Union{Missing, T}}) where T at missing.jl:87
  zero(::Type{LibGit2.GitHash}) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LibGit2/src/oid.jl:220
  zero(::Type{Pkg.Resolve.VersionWeights.VersionWeight}) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/Pkg/src/resolve/VersionWeights.jl:19
  ...
Stacktrace:
 [1] zero(::Type{Any}) at ./missing.jl:87
 [2] reduce_empty(::typeof(+), ::Type) at ./reduce.jl:227
 [3] reduce_empty(::typeof(Base.add_sum), ::Type) at ./reduce.jl:234
 [4] mapreduce_empty(::typeof(identity), ::Function, ::Type) at ./reduce.jl:251
 [5] _mapreduce(::typeof(identity), ::typeof(Base.add_sum), ::IndexLinear, ::Array{Any,1}) at ./reduce.jl:305
 [6] _mapreduce_dim at ./reducedim.jl:308 [inlined]
 [7] #mapreduce#548 at ./reducedim.jl:304 [inlined]
 [8] mapreduce at ./reducedim.jl:304 [inlined]
 [9] _sum at ./reducedim.jl:653 [inlined]
 [10] _sum at ./reducedim.jl:652 [inlined]
 [11] #sum#550 at ./reducedim.jl:648 [inlined]
 [12] sum(::Array{Any,1}) at ./reducedim.jl:648
 [13] #printtable#1(::Bool, ::Function, ::IOContext{REPL.Terminals.TTYTerminal}, ::TableTraitsUtils.TableIterator{NamedTuple{(),Tuple{}},Tuple{}}, ::String) at /Users/harry/.julia/packages/TableShowUtils/ImkA9/src/TableShowUtils.jl:43
 [14] printtable(::IOContext{REPL.Terminals.TTYTerminal}, ::TableTraitsUtils.TableIterator{NamedTuple{(),Tuple{}},Tuple{}}, ::String) at /Users/harry/.julia/packages/TableShowUtils/ImkA9/src/TableShowUtils.jl:7
 [15] show(::IOContext{REPL.Terminals.TTYTerminal}, ::CSVFiles.CSVStream) at /Users/harry/.julia/packages/CSVFiles/KysmQ/src/CSVFiles.jl:38
 [16] show(::IOContext{REPL.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::CSVFiles.CSVStream) at ./sysimg.jl:194
 [17] display(::REPL.REPLDisplay, ::MIME{Symbol("text/plain")}, ::Any) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:131
 [18] display(::REPL.REPLDisplay, ::Any) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:135
 [19] display(::Any) at ./multimedia.jl:287
 [20] #invokelatest#1 at ./essentials.jl:742 [inlined]
 [21] invokelatest at ./essentials.jl:741 [inlined]
 [22] print_response(::IO, ::Any, ::Any, ::Bool, ::Bool, ::Any) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:155
 [23] print_response(::REPL.AbstractREPL, ::Any, ::Any, ::Bool, ::Bool) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:140
 [24] (::getfield(REPL, Symbol("#do_respond#38")){Bool,getfield(REPL, Symbol("##48#57")){REPL.LineEditREPL,REPL.REPLHistoryProvider},REPL.LineEditREPL,REPL.LineEdit.Prompt})(::Any, ::Any, ::Any) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:714
 [25] #invokelatest#1 at ./essentials.jl:742 [inlined]
 [26] invokelatest at ./essentials.jl:741 [inlined]
 [27] run_interface(::REPL.Terminals.TextTerminal, ::REPL.LineEdit.ModalInterface, ::REPL.LineEdit.MIState) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/LineEdit.jl:2273
 [28] run_frontend(::REPL.LineEditREPL, ::REPL.REPLBackendRef) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:1035
 [29] run_repl(::REPL.AbstractREPL, ::Any) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:192
 [30] (::getfield(Base, Symbol("##734#736")){Bool,Bool,Bool,Bool})(::Module) at ./client.jl:362
 [31] #invokelatest#1 at ./essentials.jl:742 [inlined]
 [32] invokelatest at ./essentials.jl:741 [inlined]
 [33] run_main_repl(::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at ./client.jl:346
 [34] exec_options(::Base.JLOptions) at ./client.jl:284
 [35] _start() at ./client.jl:436

I took a look at the tests, which include mark and reset statements. When I add these statements, it works:

julia> stream = IOBuffer();

julia> fileiostream = Stream(format"CSV", stream);

julia> mark(stream)
0

julia> save(fileiostream, df)

julia> reset(stream)
0

julia> mark(stream)
0

julia> load(fileiostream)
3x2 CSV file
a │ b
──┼──
1 │ 4
2 │ 5
3 │ 6

Weirdly, that works in the REPL, but not in Atom, where I get the same stacktrace as above.

I think the reset makes sense, otherwise the load would try to read from the end of the stream, right? We should probably make that clearer in the README, though!
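To illustrate the point about the stream position, here is a minimal sketch that uses only Base functions on a plain IOBuffer, with no CSVFiles involved. The CSV text is made up for illustration. It also shows that reset clears the mark, which would explain why the working example above calls mark a second time before load.

```julia
io = IOBuffer()

mark(io)                                  # remember position 0
write(io, "a,b\n1,4\n2,5\n3,6\n")

# After writing, the position sits at the end of the buffer,
# so a read at this point finds nothing.
@assert eof(io)
@assert read(io, String) == ""

reset(io)                                 # jump back to the marked position
@assert position(io) == 0
@assert !ismarked(io)                     # reset also removes the mark

# Now the written data can be read back.
@assert read(io, String) == "a,b\n1,4\n2,5\n3,6\n"
```

For an IOBuffer, seekstart(io) would rewind just as well; mark and reset are the more general choice when the data does not begin at position 0.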

In Atom, the problem is probably that it triggers a show method call, which reads from the stream and so moves the stream position. Not sure what to do about that...
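As a rough sketch of that effect, again with Base only: the two functions below are hypothetical stand-ins for a display routine and are not taken from CSVFiles or Atom. The first reads from the stream and leaves the position moved; the second saves and restores the position, which is one possible direction and not a tested fix.

```julia
# Hypothetical stand-in for a show method that reads from the stream it displays.
function preview(io::IO)
    return readline(io)                   # advances the stream position
end

# Hypothetical variant that puts the position back afterwards.
# It uses position/seek rather than mark/reset so that a mark
# set by the caller is left untouched.
function preview_restoring(io::IO)
    pos = position(io)
    line = readline(io)
    seek(io, pos)
    return line
end

io = IOBuffer("a,b\n1,4\n2,5\n")

@assert preview(io) == "a,b"
@assert position(io) == 4                 # the header line has been consumed

seekstart(io)
@assert preview_restoring(io) == "a,b"
@assert position(io) == 0                 # position is unchanged after the preview
```

This only works for seekable streams such as IOBuffer; a non-seekable stream would need a different approach.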

Ahh true. How about something like this to the README:

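using CSVFiles, FileIO

# `it` stands for any iterable table, e.g. a DataFrame
stream = Stream(format"CSV", IOBuffer())
mark(stream.io)      # remember the start of the buffer
save(stream, it)
reset(stream.io)     # rewind to the mark before reading
load(stream)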

I think the streaming examples in the README should also include a using FileIO line for clarity.

Shame about Atom, but it's not the end of the world!

Yes, I think making this clearer in the README would be great. Maybe the easiest would be to move all the stream examples into a final section at the end? They are a bit more specialized, and it really helps if the load and save cases are closer together.