cobwebch/external_import

Duplicates when using different storages (enforcePid = false not helping)

Closed this issue · 9 comments

I set the pid based on a value in my XML:

$GLOBALS['TCA']['my_ext']['columns']['pid']['external'] = [
    'my_config' => [
        'field' => 'FACHBEREICH',
        'transformations' => [
            10 => [
                'mapping' => [
                    'valueMap' => [
                        'Bereich A' => 10,
                        'Bereich B' => 11,
                        'Bereich C' => 12
                    ],
                ]
            ]
        ],
    ]
];

But then, every time I run an import, all items are created as new records instead of being updated.
If I store everything in one folder and specify that folder in 'pid', existing records are updated and no new ones are created.

I thought setting enforcePid to false would make the import check all records independently of their storage, but that did not work.

How can I achieve this?

Hi. enforcePid actually defaults to false if not set. I have a project with a similar setup and it works fine. The difference I see is that I needed to define a minimal TCA for the pid field, which normally has none. So my setup looks like:

        'pid' => [
            'config' => [
                'type' => 'passthrough',
            ],
            'external' => [
                'crm' => [
                    'field' => 'new_statut2',
                    'transformations' => [
                        10 => [
                            'mapping' => [
                                'valueMap' => [
                                    '100000001' => 160,
                                    '100000000' => 296
                                ],
                                'default' => 160
                            ]
                        ]
                    ]
                ]
            ]
        ],

Does this work for you?

No, that does not work for me.

My TCA:

$GLOBALS['TCA']['my_ext']['external']['general'] = [
    'my_config' => [
        'connector' => 'feed',
        'parameters' => [
            'uri' => 'https://example.de/website.xml'
        ],
        'data' => 'xml',
        'nodetype' => 'JOB',
        'referenceUid' => 'tx_myext_externalid',
        'enforcePid' => false,
        'clearCache' => '19,15',
        'priority' => 100,
        'group' => 'test',
        'disabledOperations' => '',
        'description' => 'Import test',
        'updateSlugs' => true
    ],
];

$GLOBALS['TCA']['my_ext']['columns']['tx_myext_externalid']['external'] = [
    'my_config' => [
        'field' => 'ID'
    ]
];

[...]

$GLOBALS['TCA']['my_ext']['columns']['pid'] = [
    'config' => [
        'type' => 'passthrough',
    ],
    'external' => [
        'my_config' => [
            'field' => 'FACHBEREICH',
            'transformations' => [
                10 => [
                    'mapping' => [
                        'valueMap' => [
                            'Bereich A' => 10,
                            'Bereich B' => 11,
                            'Bereich C' => 12,
                        ],
                        'default' => 10
                    ]
                ]
            ],
        ]
    ]
];

I can't spot anything obvious based on the code you shared. Have you used the preview mode of the backend module to check the result of the StoreDataStep?

The result of the StoreDataStep is NEW records to 'add'.

It seems to check only the pid defined in the Extension Configuration setting "basic.storagePID".
I also tried to set the pid in my config:

$GLOBALS['TCA']['my_ext']['external']['general'] = [
    'my_config' => [
        'connector' => 'feed',
        'parameters' => [
            'uri' => 'https://example.de/website.xml'
        ],
        'data' => 'xml',
        'nodetype' => 'JOB',
        'referenceUid' => 'tx_myext_externalid',
        'enforcePid' => false,
        'pid' => '10,11,12',
        'clearCache' => '19,15',
        'priority' => 100,
        'group' => 'test',
        'disabledOperations' => '',
        'description' => 'Import test',
        'updateSlugs' => true
    ],
];

But it checks only the first ID, '10'.
For storage '10' it works, but in the others ('11', '12') it always creates new records (= duplicates).
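
One plausible explanation for only '10' being considered, offered as an assumption rather than a confirmed diagnosis: the repository code quoted further down in this thread casts the storage pid with (int), and PHP's integer cast of a comma-separated string keeps only the leading digits. The variable names below are illustrative and not taken from the extension.

```php
<?php
// Assumption: the 'pid' setting reaches the (int) cast as the raw string '10,11,12'.
$configuredPid = '10,11,12';

// PHP reads the leading numeric part and silently drops the rest.
$storagePid = (int)$configuredPid;

var_dump($storagePid); // int(10)
```

If that is what happens, a list of page IDs in 'pid' cannot widen the lookup; it behaves like a single storage page.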

OK. Are you at ease with debugging code? Because that would be the next step.

The first thing to check is what happens in \Cobweb\ExternalImport\Domain\Repository\UidRepository::retrieveExistingUids(). Are all the uids retrieved as expected? That's where enforcePid is used.

Then check what happens in \Cobweb\ExternalImport\Step\StoreDataStep around lines 167-172, which is where the pid information is evaluated for each record to import (BTW what does the preview of the TransformDataStep show for the pid field? Does the mapping work at that point?). NOTE: before those lines, the process has been through \Cobweb\ExternalImport\Step\StoreDataStep::prepareDataToStore which is where a record is checked for existence or not, this is what happens on line 769 and following.

HTH

In \Cobweb\ExternalImport\Domain\Repository\UidRepository::retrieveExistingUids():

$generalConfiguration contains pid => 10 (only the first entry is used).
if (array_key_exists('enforcePid', $generalConfiguration)) evaluates to TRUE, so
$constraints[] = $queryBuilder->expr()->eq('pid', (int)$this->configuration->getStoragePid()); is executed.
I don't see any check whether enforcePid is TRUE or FALSE.

If I add this check, so that the 'pid' constraint is only added when 'enforcePid' is set to TRUE, it works:

// Only restrict the lookup to the storage pid when enforcePid is set AND truthy.
if (array_key_exists('enforcePid', $generalConfiguration) && $generalConfiguration['enforcePid']) {
    $constraints[] = $queryBuilder->expr()->eq('pid', (int)$this->configuration->getStoragePid());
}
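
The difference between testing for the key and testing its value can be reproduced in isolation. The following is a minimal sketch; shouldRestrictToStoragePid() is a hypothetical helper written for illustration and is not part of external_import.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of external_import): decides whether the
 * lookup of existing records should be limited to the storage pid.
 */
function shouldRestrictToStoragePid(array $generalConfiguration): bool
{
    // array_key_exists() alone is true even for 'enforcePid' => false,
    // so the value itself has to be evaluated.
    return !empty($generalConfiguration['enforcePid']);
}

var_dump(array_key_exists('enforcePid', ['enforcePid' => false])); // bool(true)
var_dump(shouldRestrictToStoragePid(['enforcePid' => false]));     // bool(false)
var_dump(shouldRestrictToStoragePid(['enforcePid' => true]));      // bool(true)
var_dump(shouldRestrictToStoragePid([]));                          // bool(false)
```

The first line shows why a key-only check adds the pid constraint even when enforcePid is explicitly false.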

Thanks for digging. I could find that there was indeed a bug in that part of the code, which was fixed on August 14, 2022. It is part of the public version of external_import since version 6.0.4. So try updating and it should solve the issue.

Now it is working, thank you.

I don't know why Composer did not install the latest version with the "*" constraint.
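
To see which version Composer actually resolved, the installed version can be compared against 6.0.4, the release named above as containing the fix. This is a sketch under stated assumptions: containsEnforcePidFix() is a hypothetical helper, and the lookup relies on the \Composer\InstalledVersions class shipped with Composer 2, which is only available after vendor/autoload.php has been loaded.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true if the given version is 6.0.4 or newer.
function containsEnforcePidFix(string $installedVersion): bool
{
    return version_compare($installedVersion, '6.0.4', '>=');
}

// Guarded so the snippet also runs where Composer's autoloader is not loaded.
if (class_exists(\Composer\InstalledVersions::class)
    && \Composer\InstalledVersions::isInstalled('cobweb/external_import')
) {
    // getVersion() returns the normalized version, e.g. "6.0.4.0", or null.
    $version = \Composer\InstalledVersions::getVersion('cobweb/external_import');
    if ($version !== null) {
        echo $version, ' => ', containsEnforcePidFix($version) ? 'fix included' : 'too old', PHP_EOL;
    }
}
```

Development branches report versions such as "dev-master", which version_compare() cannot rank meaningfully, so the check is only reliable for tagged releases.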

Great that it is solved.