domitry/nyaplot

Use daru as default DataFrame library

Opened this issue · 3 comments

v0dro commented

While writing tutorials on using daru as a visualization tool with nyaplot for my GSoC project, I ran into quite a few cases where it would be much more convenient if users could pass daru objects to nyaplot directly, instead of first converting them to nyaplot data types.

For example, scatter plots automatically update themselves when the corresponding data frame is updated, but this only works when the data frame is a Nyaplot::DataFrame (or now a Mikon::DataFrame).

I think it would be in the best interests of both daru and nyaplot if nyaplot used daru as its default DataFrame library.

Here are more reasons for my conviction:

  • Daru will be the primary data library for all the statsample gems (time-series, glm, etc.). Making it simple to pass data between a stats library and nyaplot would make life much easier for users of both our libraries. I'm almost done with statsample integration and have also written detailed tutorials on using daru (more coming soon).
  • Daru will soon support time series. Plotting this data properly is very important, and nyaplot has the right capabilities for it.
  • Daru supports single and hierarchical indexing and many table operations like join, merge, etc. It also supports pivot tables and sorting that preserves the index.
  • Daru is quite capable of handling 'wild' data. It supports statistical analysis with missing data present and has a bunch of methods for dealing with missing data. This will only get better with time.
  • Time and memory overhead will be greatly reduced if conversion back and forth between daru and nyaplot data is avoided.
  • Daru integrates very well with IRuby notebook.
  • There will be a clear focus for Ruby scientific libraries (DataFrame and interactive plotting in this case).
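To make the conversion point concrete, here is a minimal sketch in plain Ruby. It does not use the daru or nyaplot APIs: `SourceFrame`, `PlotFrame`, and `convert` are hypothetical stand-ins that only illustrate the pattern of copying data from one data frame type into another before plotting.

```ruby
# Hypothetical stand-ins; neither struct is part of daru or nyaplot.
SourceFrame = Struct.new(:columns) # plays the role of the analysis-side data frame
PlotFrame   = Struct.new(:columns) # plays the role of the plotting-side data frame

# Converting copies every column, so time and memory grow with the data size.
def convert(source)
  PlotFrame.new(source.columns.transform_values(&:dup))
end

source    = SourceFrame.new({ x: [1, 2, 3], y: [2, 4, 6] })
plot_data = convert(source)

# A later update to the source never reaches the converted copy,
# so anything built from plot_data would keep showing the old values.
source.columns[:y] << 8
stale = plot_data.columns[:y] != source.columns[:y]
```

In this sketch `stale` ends up true: the copy costs extra time and memory, and it also breaks the link between the data and whatever was built from the copy. Sharing one data frame type would avoid both.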

Just go for it. Fork and live :). Naoki may catch up later.

I would love to see a fork of Nyaplot with daru as the dataframe. I spend a lot of time going back and forth between the two.