FriendsOfCake/cakephp-csvview

Export field values with html and special signs (like in product descriptions) cause problems

Closed this issue · 10 comments

Hey josegonzalez,

I tried your plugin out and it's very good, thank you for that! I run into a problem with the generated CSV when field values contain HTML and special characters like ',' or ';' (for example in a product description).

Do you have an idea for fixing this?

What are the problems?

When you have a $value['0'] which contains \t or \r, the CSV file breaks. In my case there are product HTML descriptions created with a WYSIWYG editor, and between the HTML element tags there are \r characters, which causes a problem.
Right now we don't handle cases like these in the plugin.

How would you handle it? How would this be handled in a csv?


As far as I know (and from tests with OpenOffice and other tools that export CSV), all you need to do is wrap non-alphanumeric content in "" (or any other sane enclosure). Parsing tools should then be able to treat newlines and other problematic chars as normal content and shouldn't interpret them.
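To illustrate the enclosure point, here is a minimal sketch (not part of the plugin) that writes one row with PHP's built-in fputcsv() and reads it back with fgetcsv(). The sample values are borrowed from the test further down in this thread; the php://memory stream and the variable names are assumptions made for the example. It only shows that an enclosure-aware parser can round-trip such fields. A consumer that splits the file on line breaks will still trip over the embedded newline.

```php
<?php
// One row with a comma, a double quote, a newline and a tab in its fields.
$row = array('Including,Comma', 'Containing"char', "A\nNewline", "A\tTab");

// Write the row to an in-memory stream with the usual delimiter and enclosure.
$fp = fopen('php://memory', 'w+');
fputcsv($fp, $row, ',', '"', '\\');

// Keep the raw CSV text around for inspection.
rewind($fp);
$csv = stream_get_contents($fp);

// Parse the same bytes again; the enclosed newline stays inside its field.
rewind($fp);
$parsed = fgetcsv($fp, 0, ',', '"', '\\');
fclose($fp);

var_dump($parsed === $row);
```

The raw text in $csv still spans two physical lines, which is exactly what breaks tools that read the file line by line.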

So how would you recommend we modify csvview? I am using the php csv built-ins, so maybe my defaults are incorrect?

In fact all my exported values use '"' as enclosure, even those that break. I see the issue more in the whitespace and return handling. Sure, we can trim everything away, but the \t and \r are a real problem when somebody else uses it and can't handle the problem pragmatically, like we do.

Well, he is right, the current functionality is broken with newline characters (\n or \r\n) and this seems to be a limitation of fputcsv() itself.

The following test proves this:

/**
 * CsvViewTest::testRenderWithSpecialCharacters()
 *
 * @return void
 */
public function testRenderWithSpecialCharacters() {
    App::build(array(
        'View' => realpath(dirname(__FILE__) . DS . '..' . DS . '..' . DS . 'test_app' . DS . 'View' . DS) . DS,
    ));
    $Request = new CakeRequest();
    $Response = new CakeResponse();
    $Controller = new Controller($Request, $Response);
    $Controller->name = $Controller->viewPath = 'Posts';

    $data = array(
        array(
            'User' => array(
                'username' => 'José'
            ),
            'Item' => array(
                'type' => 'äöü',
            )
        ),
        array(
            'User' => array(
                'username' => 'Including,Comma'
            ),
            'Item' => array(
                'name' => 'Containing"char',
                'type' => 'Containing\'char'
            )
        ),
        array(
            'User' => array(
                'username' => 'Some Space'
            ),
            'Item' => array(
                'name' => "A\nNewline",
                'type' => "A\tTab"
            )
        )
    );
    $_extract = array('User.username', 'Item.name', 'Item.type');
    $Controller->set(array('user' => $data, '_extract' => $_extract));
    $Controller->set(array('_serialize' => 'user'));
    $View = new CsvView($Controller);
    $output = $View->render(false);

    $expected = <<<CSV
José,NULL,äöü
"Including,Comma","Containing""char",Containing'char
"Some Space","A\\nNewline","A\tTab"

CSV;
    $this->assertTextEquals($expected, $output);
    $this->assertSame('text/csv', $Response->type());
}

Everything is fine except for the newline element (truncates the rest).
I only got it to work with the following modification:

    $row = str_replace("\n", '\n', $row); // Convert all newlines to a string representation of it
    ...
    if (fputcsv($fp, $row, $delimiter, $enclosure) === false) {
        return false;
    }

Note: Ideas from here

Thanks Mark, I fixed it the same way: I use a custom array() of all the special chars that cause problems and replace them with an empty string "", but that can't be a cakeish solution.

It's not about cakeish or not when the PHP function itself cannot handle it properly :)

Does someone want to make a PR for this? Preferably in such a way that the replacements are configurable.
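For whoever picks this up, a rough sketch of what configurable replacements could look like. The function name replaceInRow() and the shape of the $replacements map are hypothetical and not part of the plugin; only str_replace() and fputcsv() are real PHP built-ins. The idea is the same as Mark's str_replace() line above, with the search/replace pairs passed in instead of hard-coded.

```php
<?php
/**
 * Hypothetical helper (not part of CsvView): apply a map of
 * search => replacement pairs to every field of a row.
 */
function replaceInRow(array $row, array $replacements) {
    // str_replace() accepts arrays for search, replace and subject.
    // Pairs are applied in order, so "\r\n" has to come before "\n" and "\r".
    return str_replace(array_keys($replacements), array_values($replacements), $row);
}

// Assumed default: turn every kind of line break into the two characters \n.
$replacements = array("\r\n" => '\n', "\n" => '\n', "\r" => '\n');

$row = replaceInRow(array("A\nNewline", "B\r\nC", 'plain'), $replacements);

// The cleaned row can then be written as before.
$fp = fopen('php://memory', 'w+');
fputcsv($fp, $row, ',', '"', '\\');
rewind($fp);
$line = stream_get_contents($fp);
fclose($fp);

echo $line;
```

An empty map leaves the row untouched, so the existing behaviour could stay the default and the replacements would be opt-in.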