d99kris/rapidcsv

How to skip a few rows before loading

vagrawal-aptina opened this issue · 1 comments

Hi,

I have a CSV file whose actual data starts at a row other than 1 (say 9; it varies).
So I was looking for a way to skip a few rows at the start before loading, so that the library gets the correct header and column count.
Do we have that feature?

Thanks

Hi @vagrawal-aptina - yes, this should be supported by rapidcsv.

Let's say the CSV file content is:

# Line 0 with no CSV data
# Line 1 with no CSV data
# Line 2 with no CSV data
Date,Open,High,Low,Close,Volume,Adj Close
2017-02-24,64.529999,64.800003,64.139999,64.620003,21705200,64.620003
2017-02-23,64.419998,64.730003,64.190002,64.620003,20235200,64.620003
2017-02-22,64.330002,64.389999,64.050003,64.360001,19259700,64.360001
2017-02-21,64.610001,64.949997,64.449997,64.489998,19384900,64.489998
2017-02-17,64.470001,64.690002,64.300003,64.620003,21234600,64.620003
2017-02-16,64.739998,65.239998,64.440002,64.519997,20524700,64.519997

Then we can tell rapidcsv that the column headers (Date,Open,High,Low,...) are at line 3 (counting from zero) using the LabelParams argument. The second argument, -1, indicates that there is no column of row labels:

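#include <iostream>
#include "rapidcsv.h"

int main()
{
  // Column headers are on line 3 (zero-based); -1 means no row-label column.
  rapidcsv::Document doc("examples/rowoffset.csv", rapidcsv::LabelParams(3, -1));
  std::cout << "First data row date: " << doc.GetColumn<std::string>("Date").at(0) << "\n";
}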

The expected output of this program will be:

First data row date: 2017-02-24

Hope this illustrates how the row offset works. Feel free to re-open the issue if you have follow-up questions!