jdeathe/vagrant-fess

Crawler prunes anchor content from "content" results

Closed · 1 comment

Since a[rel="nofollow"] was added to the crawler.document.html.pruned.tags setting in fess_config.properties, anchor content is being pruned from the indexed content.

Content example:

<p>Contact us using our <a href="https://www.domain.com/contact">contact form</a> or if you're visiting, you can <a href="//www.domain.com/location">get directions</a> to our head office.</p>

Is getting indexed as:

Contact us using our or if you're visiting, you can get directions to our head office.

Using the setting from the test case, which leaves the attribute value unquoted, works:

crawler.document.html.pruned.tags=noscript,script,style,header,footer,nav,a[rel=nofollow]
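For reference, here is a rough sketch of the behaviour I would expect from that setting: only anchors carrying rel="nofollow" are dropped, and ordinary anchor text stays in the content. This is not how Fess implements pruning. It is a standalone Python illustration using the standard library's html.parser, and the names PRUNED_TAGS, PrunedTextExtractor and extract_text are made up for the example.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the tag names listed in the pruned.tags setting.
PRUNED_TAGS = {"noscript", "script", "style", "header", "footer", "nav"}


class PrunedTextExtractor(HTMLParser):
    """Collects text, skipping pruned elements and a[rel=nofollow] anchors."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_tag = None  # name of the element currently being pruned
        self._skip_depth = 0   # nesting depth of same-named elements inside it

    def _is_pruned(self, tag, attrs):
        if tag in PRUNED_TAGS:
            return True
        if tag == "a":
            # rel is a space-separated token list; the value may be missing.
            rel = dict(attrs).get("rel") or ""
            return "nofollow" in rel.lower().split()
        return False

    def handle_starttag(self, tag, attrs):
        if self._skip_tag is not None:
            # Already inside a pruned element: only track same-named nesting.
            if tag == self._skip_tag:
                self._skip_depth += 1
        elif self._is_pruned(tag, attrs):
            self._skip_tag = tag
            self._skip_depth = 1

    def handle_endtag(self, tag):
        if self._skip_tag is not None and tag == self._skip_tag:
            self._skip_depth -= 1
            if self._skip_depth == 0:
                self._skip_tag = None

    def handle_data(self, data):
        if self._skip_tag is None:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left behind by removed elements.
        return " ".join("".join(self._chunks).split())


def extract_text(html):
    parser = PrunedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        '<p>Contact us using our <a href="https://www.domain.com/contact">'
        "contact form</a> or if you're visiting, you can "
        '<a href="//www.domain.com/location">get directions</a> '
        "to our head office.</p>"
    )
    print(extract_text(sample))
```

Run against the content example above, this keeps both "contact form" and "get directions" in the output, which is what I would expect the crawler to index.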