CSVs having "" removed
xswirelab opened this issue · 6 comments
Hi there,
I'm using .csv files with enclosed headers and values, like "Header","Header2", etc.,
but for some reason all the double quotes get stripped off. Any idea why this happens?
Hey, could you provide a simple example of a CSV file and the result you expect after reading it, so I can look into this?
Example of inputs:
"Company name","Contact owner","First Name","Last Name","Email","Phone Number"
"Some Inc","Dave Young","Dave","Yung","integer@example.com","(312) 768-3103"
"Company name","Contact owner","First Name","Last Name","Email","Phone Number"
"Some Inc","Dave Young",",",",","integer@example.com","(312) 768-3103"
Example of outputs:
Company name,Contact owner,First Name,Last Name,Email,Phone Number
Some Inc,Dave Young,,,,,integer@example.com,(312) 768-3103
Company name,Contact owner,First Name,Last Name,Email,Phone Number
Some Inc,Dave Young,,,,,integer@example.com,,
This happens all the time, even with the most minimal flows.
(new Flow())
    ->read(CSV::from(Stream::local_file($input)))
    ->rows(Transform::array_unpack('row'))
    ->write(CSV::to(Stream::local_file($output)))
    ->run();
This is pretty much how CSV works.
When Flow loads a CSV file into memory, it uses the PHP function fgetcsv, and it is that function that removes the ". When you dump the output to a regular file, only values/columns containing a space (or another special character, such as the delimiter or the enclosure itself) are surrounded by ", which is the default enclosure.
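The round trip can be reproduced without Flow, using only PHP's built-in functions. The following is a minimal sketch, assuming an in-memory stream and made-up sample values; Flow's own reader and writer may pass different arguments to these functions.

```php
<?php
// Parse one CSV line with fgetcsv(), then write it back with fputcsv().
// Delimiter, enclosure and escape are passed explicitly (PHP's classic defaults).
$in = fopen('php://memory', 'w+');
fwrite($in, '"Some Inc","Dave","integer@example.com"' . "\n");
rewind($in);

// fgetcsv() returns bare values; the enclosing quotes are consumed by the parser.
$row = fgetcsv($in, 0, ',', '"', '\\');
fclose($in);

$out = fopen('php://memory', 'w+');
// fputcsv() adds the enclosure only where a value needs it (here: the space in "Some Inc").
fputcsv($out, $row, ',', '"', '\\');
rewind($out);
$written = stream_get_contents($out);
fclose($out);

echo $written; // "Some Inc",Dave,integer@example.com
```

Once the file has been parsed, the information about which values were originally quoted is gone, so the writer can only decide per value whether an enclosure is needed.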
Input
"Company name","Contact owner","First Name","Last Name","Email","Phone Number"
"Some Inc","Dave Young","Dave","Yung","integer@example.com","(312) 768-3103"
Code
<?php

use Flow\ETL\DSL\CSV;
use Flow\ETL\DSL\Stream;
use Flow\ETL\DSL\Transform;
use Flow\ETL\Flow;

require __DIR__ . '/../vendor/autoload.php';

(new Flow())
    ->read(CSV::from(Stream::local_file(__DIR__ . '/issue289.csv')))
    ->rows(Transform::array_unpack('row'))
    ->drop("row")
    // The last argument switches the enclosure from " to ' so it stands out in the output.
    ->write(CSV::to(Stream::local_file(__DIR__ . '/issue289_new.csv'), true, false, ',', "'"))
    ->run();
Output
'Company name','Contact owner','First Name','Last Name',Email,'Phone Number'
'Some Inc','Dave Young',Dave,Yung,integer@example.com,'(312) 768-3103'
For better contrast, I changed " into '. As you can see, single-word values/columns are not surrounded by ', but this is expected behavior driven by fputcsv.
Could you elaborate on why you want to keep the enclosure around single-word string values?
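If enclosing every value really is required, fputcsv itself has, as far as I know, no switch to force it. One possible workaround in plain PHP is to build each line by hand. The sketch below assumes a hypothetical helper named csv_line_all_enclosed, which is not part of Flow or PHP, and it escapes the enclosure by doubling it.

```php
<?php
// Hypothetical helper (not part of Flow or PHP): encloses every field,
// doubling any enclosure character found inside a value.
function csv_line_all_enclosed(array $fields, string $delimiter = ',', string $enclosure = '"'): string
{
    $quoted = array_map(
        static function ($field) use ($enclosure): string {
            $escaped = str_replace($enclosure, $enclosure . $enclosure, (string) $field);

            return $enclosure . $escaped . $enclosure;
        },
        $fields
    );

    return implode($delimiter, $quoted) . "\n";
}

$line = csv_line_all_enclosed(['Some Inc', 'Dave', 'integer@example.com']);
echo $line; // "Some Inc","Dave","integer@example.com"
```

This keeps the output readable by any CSV parser, at the cost of bypassing fputcsv and whatever writer the flow would normally use.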
Also, if you just want to move a file, line by line, from one place to another, you can use the Text adapter that was just released. It reads the file line by line and writes it line by line, by default without changing or removing anything.
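To illustrate why a line-based copy preserves the quotes, here is a minimal plain-PHP sketch. It is not the Flow Text adapter API; the copy_lines function and the sample data are assumptions made for the example.

```php
<?php
// Plain-PHP illustration only; this is not the Flow Text adapter API.
// Copies a file line by line without parsing it, so enclosures survive untouched.
function copy_lines(string $source, string $target): int
{
    $in = fopen($source, 'r');
    $out = fopen($target, 'w');
    $count = 0;

    while (($line = fgets($in)) !== false) {
        fwrite($out, $line); // the raw line, quotes included
        $count++;
    }

    fclose($in);
    fclose($out);

    return $count;
}

// Usage with temporary files and made-up sample data.
$source = tempnam(sys_get_temp_dir(), 'csv_in_');
$target = tempnam(sys_get_temp_dir(), 'csv_out_');
file_put_contents($source, "\"Company name\",\"Email\"\n\"Some Inc\",\"integer@example.com\"\n");

$copied = copy_lines($source, $target);
$result = file_get_contents($target);

unlink($source);
unlink($target);
```

Because the lines are never split into columns, nothing decides which values need an enclosure, and the target file is byte-for-byte identical to the source.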
I'm closing this issue for now. If my answer did not explain the reported behavior and you are looking for something else, please let me know.