Importer API: DataFrame.read_csv and refactoring of importers for programmatic import
atbenmurray opened this issue · 2 comments
We should have DataFrame.read_csv for programmatic importing of data. The importers should be refactored so that they can be trivially passed to the read_csv function in a dictionary mapping column names to datatypes. Note that importers can map from one input name to several fields.
I would like to clarify a few things:
(1) What should the dictionary mapping look like? Something like the following?
{'foo_field': 'categorical', 'bar_field': 'fixed string'}
(2) Different importers have their own parameters: for example, fixedStringImporter needs to know the fixed length, and categoricalImporter needs to know the list of categories. Where should these be defined?
(1) DataFrame.read_csv(file, {'a': CategoricalImporter(...), 'b': NumericImporter(int32)})
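To make the shape of that call concrete, here is a rough, self-contained sketch. It is only an illustration, not the project's implementation: the `fields` method, the `_valid` suffix, the constructor arguments, and the plain-dict return value are all assumptions made up for this example. It also shows one possible answer to (2), with each importer taking its own parameters in its constructor, and one input column producing several output fields.

```python
import csv
import io


class CategoricalImporter:
    """Hypothetical importer: maps string labels to integer codes."""

    def __init__(self, categories):
        # Importer-specific parameter, supplied at construction time.
        self.categories = dict(categories)

    def fields(self, name, raw_values):
        # One input column -> one output field.
        return {name: [self.categories[v] for v in raw_values]}


class NumericImporter:
    """Hypothetical importer: parses numbers and records which rows parsed."""

    def __init__(self, cast=int):
        self.cast = cast

    def fields(self, name, raw_values):
        values, valid = [], []
        for raw in raw_values:
            try:
                values.append(self.cast(raw))
                valid.append(True)
            except ValueError:
                values.append(self.cast(0))  # placeholder for unparseable input
                valid.append(False)
        # One input column -> two output fields.
        return {name: values, name + "_valid": valid}


def read_csv(source, importers):
    """Read only the columns named in `importers` and apply each importer."""
    columns = {name: [] for name in importers}
    for row in csv.DictReader(source):
        for name in importers:
            columns[name].append(row[name])

    result = {}
    for name, importer in importers.items():
        result.update(importer.fields(name, columns[name]))
    return result


data = io.StringIO("a,b,c\nyes,1,x\nno,oops,y\n")
result = read_csv(
    data,
    {
        "a": CategoricalImporter({"no": 0, "yes": 1}),
        "b": NumericImporter(int),
    },
)
print(result)
```

In this sketch, column `c` is skipped because it has no entry in the dictionary, and column `b` yields both `b` and `b_valid`. Whether the real API should return a DataFrame, write fields in place, or name the extra fields differently is left open here.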