wizacode/php-etl

boolean values update issues

Closed this issue · 10 comments

With a PostgreSQL database, updates to boolean columns are not handled correctly in some cases.

Working :

  • old value : false => new value : true
  • old value : true => new value : NULL
  • old value : NULL => new value : true
  • old value : NULL => new value : false

Not working :

  • old value : true => new value : false
  • old value : false => new value : NULL

After some searching, I found that the issue is in the update(array $row, array $current) function of the src/Loaders/InsertUpdate.php file.
This section uses a loose comparison between the array of old values and the array of new values:
if ($row == array_intersect_key($current, $row)) { return; }

The $current array returns booleans from the database as PHP booleans, while the $row array takes strings ('true', 't', 'false', 'f') or NULL to define booleans.

This prevents the database update from being triggered when it should be.

Thank you @tristanbsn we are going to look at it.

@tristanbsn could you clarify this point please? I am not sure I understand 🙇

... while the $row array takes strings ('true', 't', 'false', 'f') or NULL to define booleans.

@ecourtial In my case, I use a transformer with a callback that takes a date as its source and transforms it into the string 'false' if date < current_date, and 'true' otherwise:
return (date($row->get($column)) < date('Y-m-d')) ? 'false' : 'true';

This value gets stored in a boolean column in my PostgreSQL database.

To clarify, the issue is mainly caused by the loose comparison in the update(array $row, array $current) function, which causes unwanted behavior in these cases:

  • true == "false" returns true
  • true == "f" returns true
  • false == "false" returns false
  • false == "f" returns false
  • false == NULL returns true

In my case, if the value is true in my PostgreSQL database, it cannot be updated to false because this line prevents the update from being triggered:
if ($row == array_intersect_key($current, $row)) { return; }
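The comparisons listed above can be checked directly. This is a minimal sketch, not code from php-etl; the $checks array is only an illustrative name. When one operand of == is a boolean, PHP converts the other operand to a boolean first, so any non-empty string other than "0" counts as true.

```php
<?php
// Each entry pairs a label with the result of the loose comparison it describes.
// "false" and "f" are non-empty strings, so they convert to true before comparing.
$checks = [
    'true == "false"'  => (true == "false"),
    'true == "f"'      => (true == "f"),
    'false == "false"' => (false == "false"),
    'false == "f"'     => (false == "f"),
    'false == null'    => (false == null),
];

foreach ($checks as $label => $result) {
    // var_export() prints booleans as the words true/false.
    echo $label, ' => ', var_export($result, true), PHP_EOL;
}
```

The two "Not working" transitions (true to false, and false to NULL) are exactly the ones whose comparison evaluates to true, so the row is treated as unchanged.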

Thank you. I will look at it today.

Thank you. I made a quick patch in case it helps:

    /**
     * Execute the update statement.
     */
    protected function update(array $row, array $current): void
    {
        if (false === $this->doUpdates) {
            return;
        }

        if (null === $this->update) {
            $this->prepareUpdate($row);
        }

        // CURRENT VERSION
        /*
            if ($row == array_intersect_key($current, $row)) {
                return;
            }
        */

        // CORRECTED VERSION
        $correctedRow = $row;

        foreach ($correctedRow as $key => $value) {
            // Convert boolean-like strings only when the stored value is a real boolean.
            if (isset($current[$key]) && is_bool($current[$key]) && is_string($value) && in_array($value, ["true", "t", "false", "f"], true)) {
                $correctedRow[$key] = in_array($value, ["true", "t"], true);
            }
        }

        if ($correctedRow == array_intersect_key($current, $row)) {
            return;
        }
        // END CORRECTED VERSION

        if ($this->timestamps) {
            $row['updated_at'] = $this->time;
        }

        $this->update->execute($row);
    }
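Note that the patch above converts boolean-like strings but still compares loosely afterwards, so the false to NULL case would remain unresolved (false == NULL is true). A possible alternative, sketched here under stated assumptions, compares column by column and uses strict comparison whenever a boolean or NULL is involved. The function name isUnchanged is hypothetical and is not part of php-etl, and this is not necessarily what the eventual PR does.

```php
<?php
/**
 * Hypothetical helper (not part of php-etl): returns true when $row
 * holds no change relative to the matching columns of $current.
 */
function isUnchanged(array $row, array $current): bool
{
    // PostgreSQL-style boolean strings and the PHP booleans they stand for.
    $booleanStrings = ['true' => true, 't' => true, 'false' => false, 'f' => false];

    foreach ($row as $key => $new) {
        if (!array_key_exists($key, $current)) {
            return false; // column absent from the current record: treat as changed
        }
        $old = $current[$key];

        // Map boolean-like strings to real booleans when the stored value is a boolean.
        if (is_bool($old) && is_string($new) && isset($booleanStrings[$new])) {
            $new = $booleanStrings[$new];
        }

        // Booleans and NULL are compared strictly so false, NULL and "false" stay distinct.
        if (null === $old || null === $new || is_bool($old) || is_bool($new)) {
            if ($old !== $new) {
                return false;
            }
            continue;
        }

        // Other values keep the loose comparison (for example 1 == "1").
        if ($old != $new) {
            return false;
        }
    }

    return true;
}

var_dump(isUnchanged(['actif' => 'false'], ['actif' => true])); // change detected
var_dump(isUnchanged(['actif' => null], ['actif' => false]));   // change detected
var_dump(isUnchanged(['actif' => 't'], ['actif' => true]));     // no change
```

One trade-off of this sketch: a boolean column fed with a value outside the four recognised strings (such as "1") is always reported as changed, which costs an extra UPDATE but never skips a real one.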

After a quick look, I am still not sure I understand 😅

This section uses a loose comparison between the array of old values and the array of new values:
if ($row == array_intersect_key($current, $row)) { return; }

So here the comparison is between the keys of the arrays.
Hence, although we should have === instead of == at line 188 (and we are going to fix that for the sake of rigour), it won't change anything because, as the PHP documentation states:

The two keys from the key => value pairs are considered equal only if (string) $key1 === (string) $key2 . In other words a strict type check is executed so the string representation must be the same.

So if we really wanted to compare keys strictly, we could use a callback. But I am not sure why we should do that, because the keys of the array are supposed to represent column names, not values.

Let me know if I am missing something, I would be very glad to help you.

if ($row == array_intersect_key($current, $row)) { return; } compares not only the keys but also the values of the arrays.
In my situation,

$row = ["date_naissance" => "1962-06-18", "actif" => "false"]
$current = ["date_naissance" => "1962-06-18", "actif" => true]

the comparison is true because "false" == true is true, and this causes the update not to be triggered.
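The situation above can be reproduced in isolation. In this sketch the extra 'id' column in $current is an assumption added only to show what array_intersect_key() filters out; $subset is an illustrative variable name.

```php
<?php
$row     = ['date_naissance' => '1962-06-18', 'actif' => 'false'];
$current = ['date_naissance' => '1962-06-18', 'actif' => true, 'id' => 42];

// array_intersect_key() only looks at keys: it keeps the entries of $current
// whose keys also exist in $row, with the values taken from $current.
$subset = array_intersect_key($current, $row);

// == on arrays requires the same key/value pairs, with values compared loosely.
// 'false' == true is true, so the arrays are considered equal and the update is skipped.
var_dump($row == $subset);

// === additionally requires identical value types (and the same key order).
var_dump($row === $subset);
```

The first var_dump() shows bool(true) and the second bool(false). Switching to === would detect this change, though it may also report a difference whenever the source provides a string for a column the database returns as an integer.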

My bad indeed. Easily reproduced... after your explanations.
So I created a PR. Could you check that it is OK for you? Then I will merge and tag a new version.

It works for me, thank you !