Able to read only 1 row from file
theRealNG opened this issue · 22 comments
Hi,
I have written the following code:
SmarterCSV.process('test.csv', { headers_in_file: true}) do |arr| puts arr end
The following output is printed:
=> {:someid=>"39981", :somename=>"FoodWorks Inc", :somenumber=>"71821", :somedate=>"07/01/2022 3:14"}
But it prints only the first row after the headers. I'm not sure what I am doing wrong here.
I'm using Ruby 2.7.0 and smarter_csv version is 1.7.0
Seems to be problem with v1.7.0. Tried v1.6.1 and it is working fine.
@theRealNG Can you provide a sample file or a test?
have you tried different row_sep settings?
Example file content:
a,b,c,d
1,2,3,4
5,6,7,8
Running
SmarterCSV.process(file)
Should return
[{:a=>1, :b=>2, :c=>3, :d=>4}, {:a=>5, :b=>6, :c=>7, :d=>8}]
Instead it returns
{:a=>"1", :b=>"2", :c=>"3", :d=>"4"}
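The symptom above (a single Hash instead of an Array with one record per data line) can be checked mechanically. The following is a minimal sketch, not part of smarter_csv: `parse_looks_complete?` is a hypothetical helper name, and it assumes every non-empty line after the header yields exactly one record (no quoted fields containing newlines, no chunking).

```ruby
# Compare a parse result against the number of data lines in the CSV text.
# `parse_looks_complete?` is a hypothetical helper, not part of smarter_csv.
# Assumption: one record per non-empty line after the header.
def parse_looks_complete?(csv_text, result)
  data_lines = csv_text.lines.map(&:strip).reject(&:empty?).size - 1 # minus header
  result.is_a?(Array) && result.size == data_lines
end

csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

good = [{ a: 1, b: 2, c: 3, d: 4 }, { a: 5, b: 6, c: 7, d: 8 }]
bad  = { a: "1", b: "2", c: "3", d: "4" } # the single Hash reported with 1.7.0

puts parse_looks_complete?(csv_text, good)
puts parse_looks_complete?(csv_text, bad)
```

With the gem installed, the same check could presumably be applied to a real file as `parse_looks_complete?(File.read(path), SmarterCSV.process(path))`.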
can not reproduce - this sounds like you have either weird row_sep
characters in your CSV file, or other special characters
> data = SmarterCSV.process('/tmp/test.csv')
=> [{:a=>1, :b=>2, :c=>3, :d=>4}, {:a=>5, :b=>6, :c=>7, :d=>8}]
@mo-rubikal @theRealNG Have you tried to look at your CSV file with hexdump -C filename
or od -X
?
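If `hexdump` or `od` is not at hand, roughly the same information can be gathered from Ruby. This is a small sketch using only the standard library; `line_ending_report` is a hypothetical helper name, and it only counts line endings and checks for a UTF-8 byte order mark, which are two common causes of row_sep trouble.

```ruby
# Summarise the line endings and a possible UTF-8 BOM in raw CSV bytes.
# `line_ending_report` is a hypothetical helper, not part of smarter_csv.
def line_ending_report(bytes)
  data = bytes.dup.force_encoding(Encoding::BINARY)
  crlf = data.scan("\r\n".b).size
  {
    bom: data.start_with?("\xEF\xBB\xBF".b), # UTF-8 byte order mark
    crlf: crlf,                              # Windows-style "\r\n"
    lf: data.count("\n") - crlf,             # bare "\n"
    cr: data.count("\r") - crlf              # bare "\r"
  }
end

sample = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"
p line_ending_report(sample)
# For a real file: line_ending_report(File.binread('/tmp/test.csv'))
```

A file that mixes `crlf` and `lf` counts, or that reports `bom: true`, would be worth attaching to the issue.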
@tilo the version that has the problem is 1.7.0; when I downgraded to the previous version it worked. At the beginning I thought it was a separator or file format issue, but I was able to reproduce it with the simple file from my example above.
@mo-rubikal yes, I understand that rolling-back worked for you.
I need a specific CSV file to reproduce the issue. When I cut+pasted your sample, I was not able to reproduce it, as shown in the snippet above.
Could you either share an exact file, or add a test that shows what is broken?
running with verbose: true
could also shed some light on the issue
$ hexdump -C /tmp/test.csv
00000000 61 2c 62 2c 63 2c 64 0a 31 2c 32 2c 33 2c 34 0a |a,b,c,d.1,2,3,4.|
00000010 35 2c 36 2c 37 2c 38 0a |5,6,7,8.|
00000018
even without the last 0A
it reads both rows correctly
I can confirm the bug.
It's the same for me, downgrading to 1.6 fixed.
Here's the file that I've used:
very bizarre ... which Ruby version are you using?
> md5sum ~/Downloads/booksellers2.csv
98089c12c4487ca7cadf2fd3f92d477d /Users/tilo/Downloads/booksellers2.csv
> wc -l ~/Downloads/booksellers2.csv
279 /Users/tilo/Downloads/booksellers2.csv
so one header and 278 rows of data
Version 1.6.1
> RUBY_VERSION
=> "2.7.5"
> require 'smarter_csv'
=> true
> SmarterCSV::VERSION
=> "1.6.1"
> data_1_6_1 = SmarterCSV.process('/Users/tilo/Downloads/booksellers2.csv')
=> [{:id=>117}, {:id=>7}, {:id=>8}, {:id=>290, :origi=>"Oui", :ma=>"Oui", :kd=>"Ouii"}, {:id=>61, :origi=>"Oui", :ma=>"Oui", :kd=>...
> data_1_6_1.first
=> {:id=>117}
> data_1_6_1.last
=> {:id=>49, :origi=>"Oui", :ma=>"Oui", :kd=>"Non"}
> data_1_6_1.size
=> 278
> File.open('/tmp/data_1_6_1', 'w') { |f| f.puts data_1_6_1.inspect }
Version 1.7.0
> RUBY_VERSION
=> "2.7.5"
> require 'smarter_csv'
=> true
> data_1_7_0 = SmarterCSV.process('/Users/tilo/Downloads/booksellers2.csv')
=> [{:id=>117}, {:id=>7}, {:id=>8}, {:id=>290, :origi=>"Oui", :ma=>"Oui", :kd=>"Ouii"}, {:id=>61, :origi=>"Oui", :ma=>"Oui", :kd=>...
> data_1_7_0.first
=> {:id=>117}
> data_1_7_0.last
=> {:id=>49, :origi=>"Oui", :ma=>"Oui", :kd=>"Non"}
> data_1_7_0.size
=> 278
> File.open('/tmp/data_1_7_0', 'w') { |f| f.puts data_1_7_0.inspect }
=> nil
> SmarterCSV.has_acceleration?
=> true
and the output is identical for me:
$ md5sum /tmp/data_1_6_1 /tmp/data_1_7_0
0e9bce174967c97c092b99bb6cb9b9fc /tmp/data_1_6_1
0e9bce174967c97c092b99bb6cb9b9fc /tmp/data_1_7_0
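The same comparison can be done without leaving Ruby. A minimal sketch, assuming the two results were produced by `inspect`-able objects as in the sessions above; `same_output?` is a hypothetical helper name, and only the standard `digest` library is used.

```ruby
require 'digest'

# Return true when two parsed results serialise to the same MD5 digest.
# `same_output?` is a hypothetical helper, not part of smarter_csv.
def same_output?(result_a, result_b)
  Digest::MD5.hexdigest(result_a.inspect) == Digest::MD5.hexdigest(result_b.inspect)
end

rows_a = [{ a: 1, b: 2 }, { a: 5, b: 6 }]
rows_b = [{ a: 1, b: 2 }, { a: 5, b: 6 }]
puts same_output?(rows_a, rows_b)
```

Within one session a plain `==` would do; a digest is mainly handy when the two results come from separate sessions running different gem versions and only the checksum is exchanged.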
@alextakitani what Ruby version are you using, and what does uname -a
say?
I get the same results for Ruby 3.0.0
@alextakitani @mo-rubikal @theRealNG Still can not reproduce this.
What OS are you guys using? can you send me the uname -a
output, the Ruby version, and maybe another sample file?
I am getting the same results as #197 (comment).
Ruby 3.1.2.
smarter_csv 1.6.1 works great
1.7 only reads the first row and returns that as a hash, instead of an array of records
Linux computername 5.10.0-15-amd64 #1 SMP Debian 5.10.120-1 (2022-06-09) x86_64 GNU/Linux
I'm having the same problem here.
Changing to v 1.6.1 also fixed it for me.
Ruby 3.0.3
I'm getting the same thing... a csv library that parses a single row. Super useful...
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-darwin19]
Rails 7.0.3.1
I'm using OSX 10.15.7 (19H1922)
SmarterCSV.process(file, {chunk_size: 1_000, headers_in_file: true, remove_empty_values: true}) do |chunk|
  chunk.each do |row|
    ...
  end
end
The sample CSV doesn't matter; it doesn't work with any comma-delimited CSV with headers.
Darwin BigWeiner.local 19.6.0 Darwin Kernel Version 19.6.0: Mon Apr 18 21:50:40 PDT 2022; root:xnu-6153.141.62~1/RELEASE_X86_64 x86_6
I am having the same problem.
ruby 3.0.3
rails 7.03.
mac os: Big Sur 11.6.
This config fixed issue for me
remove_empty_values: false,
I am having the same problem.
ruby 2.7.3
rails 6.0.3.2
20.04.1-Ubuntu
Same issue here.
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
Adding remove_empty_values: false
fixes it.
Bugfix release 1.7.1 was just published
Please re-evaluate and update this issue with whether it fixes the problem or not.
TL;DR: the issue only showed up when smarter_csv
was used in a Rails project.
the issue is fixed in 1.7.1
@tilo thanks for the fix. Next time, if possible, please don't remove a gem version because of a bug. I would have to remove all my work if I had to remove buggy versions
LOL - good point @matiasalbarello
In this instance, the 1.7.0 version was broken for everybody who is using it from Rails, and I just wanted to make sure people don't run into this issue anymore, hopefully saving them time & frustration