pressbooks/pressbooks

'Discourage search engines from indexing...' setting merely sets a book to private

Opened this issue · 1 comment

tw77 commented

At a book URL followed by wp-admin/options-reading.php there is a page with a Search Engine Visibility setting:

[Image: Screenshot 2023-12-04 at 7:58:16 PM]

Expected behavior: This setting adds robots.txt or equivalent to a book without affecting the book's global privacy setting. This allows a book to be public while search engines are discouraged from indexing it.

Actual behavior: If checked, this setting has the effect of setting a book's global privacy setting to private. If unchecked, it has the effect of setting the book to public.

The setting appears to be identical to a book's global privacy setting. When a book is public, it is automatically unchecked. When a book is private, it is automatically checked.

This dates back to 2013 or so:

<?php if (get_option('blog_public') == '1' || (get_option('blog_public') == '0' && current_user_can_for_blog($blog_id, 'read'))): ?>

The reading options page is not shown to Pressbooks users (you can navigate to it if you know the URL but it isn't linked anywhere in the dashboard navigation). Instead, the blog_public option was repurposed on the Sharing and Privacy options page and used to determine whether the book is public or private (see implementation in the modern McLuhan/book theme here: https://github.com/search?q=repo%3Apressbooks%2Fpressbooks-book%20is_book_public&type=code).

If you want to add another option to toggle search engine visibility, the right approach would probably be to add a new option to each book to control whether the book is public or private, defaulting to the value of the blog_public option, and restore the core blog_public option to its original function. That original function is working as intended: when blog_public is set to false, it adds the following meta tag to all pages of the webbook:

<meta name="robots" content="noindex, nofollow">

(See here for an example: https://integrations.pressbooks.network/sasstest/)
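A rough sketch of the split suggested above, in plain PHP so it runs without WordPress loaded. The option name pb_book_public and both function names are hypothetical (nothing in Pressbooks or WordPress defines them), and the $options array stands in for the values that get_option() would return in a real book:

```php
<?php
// Hypothetical option name for the new per-book privacy flag.
// It is NOT an existing Pressbooks or WordPress option.
const BOOK_VISIBILITY_OPTION = 'pb_book_public';

/**
 * Decide whether the book is publicly readable.
 *
 * $options models stored option values (what get_option() would return).
 * If the new option has never been saved, fall back to blog_public so
 * existing books keep their current privacy state.
 */
function sketch_is_book_public(array $options): bool
{
    $value = $options[BOOK_VISIBILITY_OPTION] ?? ($options['blog_public'] ?? '1');
    return (string) $value === '1';
}

/**
 * Return the robots meta tag that search engine visibility calls for.
 *
 * With privacy moved to its own option, blog_public goes back to
 * controlling indexing only: '0' discourages indexing, '1' allows it.
 */
function sketch_robots_meta(array $options): string
{
    $indexable = (string) ($options['blog_public'] ?? '1') === '1';
    return $indexable ? '' : '<meta name="robots" content="noindex, nofollow">';
}

// A public book whose owner discourages indexing: the case the issue asks for.
$book = [BOOK_VISIBILITY_OPTION => '1', 'blog_public' => '0'];
var_dump(sketch_is_book_public($book));
echo sketch_robots_meta($book), PHP_EOL;
```

With this arrangement a book that has only blog_public stored behaves exactly as it does today, which is the "defaulting to the value of blog_public" part of the suggestion. In an actual plugin the theme's is_book_public check would also need to read the new option, which this sketch does not attempt.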