tomasnorre/crawler

Bug and resulting Php Warnings with not translated mount points

hannasophieungar opened this issue · 4 comments

Bug Report

Current Behavior
We get with every run of crawler task "buildQueue" a lot of errors releating to mound points.
E.g.
Core: Error handler (BE): PHP Warning: Trying to access array offset on value of type null in tomasnorre/crawler/Classes/Controller/CrawlerController.php line 725
Core: Error handler (BE): PHP Warning: Undefined array key 0 in tomasnorre/crawler/Classes/Controller/CrawlerController.php line 725
We get this errors for line 729 and 740 too. I think this error also concers line 741.
I think mount pages are not indexed correctly with this bug in method getPageTreeAndUrls.

Expected behavior/output
No undefined array key error and correct indexing of mount pages

Steps to reproduce
Start task "buildQueue" in scheduler for root page which including not translated mount pages.

Environment

  • Crawler version(s): 12.0.0-dev
  • TYPO3 version(s): 12.4.14
  • Is your TYPO3 installation set up with Composer (Composer Mode): yes

Possible Solution
In method getPageTreeAndUrls for affected lines it needs to be checked if variable $mountpage has array key 0. I think the reason is that mount pages have no language overlays and with this no array key 0. In our case $mountpage are not translated and variable $mountpage is array with all necessary data like mount_pid etc.. Therefore $mountpage at key 0 not exists and resulting value is null. With e.g. $mountpage['mount_pid'] I think we will get expected result for getPageTreeAndUrls. This have to be fixed / checked for other lines too.

Hi there, thank you for taking your time to create your first issue. Please give us a bit of time to review it.

Thanks for your bug report, if you could provide a test that proves that the code is currently wrong that would be great, and even better if a fix could be provided too.

Thank you for your fast response. I'm sorry to write a unit test on method "getPageTreeAndUrls" for this special constellation with mocks for pages is difficult for me at the moment.

You could test this in backend if you "mount" any page (if the source page has translation doesn't matter in my tests). When you then start scheduler task with command "crawler:buildQueue: Create entries in the queue that can be processed at once" on the root page (see also attached configurations in screenshot) we get the errors above.
In config.yaml we use fallbackType: strict

Here is a fix for method "getPageTreeAndUrls" for concerning code lines starting with comment "recognize mount points":

 // recognize mount points
            if ($data['row']['doktype'] === PageRepository::DOKTYPE_MOUNTPOINT) {
                $mountpage = $this->pageRepository->getPage($data['row']['uid']);

                // fetch mounted pages
                $this->MP = $mountpage[0]['mount_pid'] . '-' . $data['row']['uid'];

                $mountTree = GeneralUtility::makeInstance(PageTreeView::class);
                $mountTree->init(empty($perms_clause) ? '' : ('AND ' . $perms_clause));
                $mountTree->getTree($mountpage[0]['mount_pid'], $depth);

                foreach ($mountTree->tree as $mountData) {
                    $queueRows = array_merge($queueRows, $this->drawURLs_addRowsForPage(
                        $mountData['row'],
                        BackendUtility::getRecordTitle('pages', $mountData['row'], true),
                        (string) $data['HTML']
                    ));
                }

                // replace page when mount_pid_ol is enabled
                if ($mountpage[0]['mount_pid_ol']) {
                    $data['row']['uid'] = $mountpage[0]['mount_pid'];
                } else {
                    // if the mount_pid_ol is not set the MP must not be used for the mountpoint page
                    $this->MP = null;
                }
            }
Bildschirmfoto 2024-05-16 um 12 50 04

I hope you could use this information to reproduce our problem and maybe you could use this fix, if you think that makes sense.

Thanks a lot, will look into it.