qTranslate-Team/qtranslate-x

Consistency on showing untranslated posts

mastershadow opened this issue · 9 comments

Hello!
Google is reporting 404 errors for my website.
After a bit of digging, I found that this happens because the crawler follows alternate links to untranslated posts.

qtranxf_wp_head contains the offending code:

foreach($q_config['enabled_languages'] as $lang) {
    if(!empty($q_config['locale_html'][$lang])){
        $hreflang = $q_config['locale_html'][$lang];
    }else{
        $hreflang = $lang;
    }
    //if($language != qtranxf_getLanguage())//standard requires them all
    echo '<link hreflang="'.$hreflang.'" href="'.qtranxf_convertURL('',$lang,false,true).'" rel="alternate" />'.PHP_EOL;
}

Shouldn't this be filtered by translation availability? I tried to fix this myself, but I could not find an easy way to check whether a post is translated into a given language. Is there a built-in function to do this?

Yes, we should indeed filter that if the option "Hide Content which is not available for the selected language" is on. Thanks a lot for the tip.

But I do not understand why you get a 404. It should still show the page with a message listing which languages are available. Doesn't it do that for you? Is there a way for us to reproduce the problem?

On second thought, it still seems useful to have links to other languages even if a translation is not available. This lets people find the page in their language, open it, and still read it in another language if they choose to. They may also translate one of the available languages with their own favorite tool, if desired. There seems to be no harm in listing those links in any case?

With "Hide Content which is not available for the selected language" enabled, I get 404 as the status code. This seems like good behaviour to me, as the page is not available in that language.
My config is really simple: that flag and the language URL prefix /LANG/permalink

About your second thought, you are right, but I think it's covered by the x-default link:

// https://support.google.com/webmasters/answer/189077
echo '<link hreflang="x-default" href="'.qtranxf_convertURL('',$q_config['default_language']).'" rel="alternate" />'.PHP_EOL;

AFAIK a few 404s from broken links are not seen as bad for SEO, but I think they should be avoided. Also, serving untranslated content for a language (so visitors see another language instead) is bad behaviour that could lead search engines to penalize the website for duplicated content, so it should be avoided too.

Also, I don't like having about two hundred 404 entries in my search console :P

I see, it returns 404 on posts but shows pages. I thought it showed the post or page on a single request. The option "Hide Content .." is designed to hide untranslated posts in lists and searches, but when a page is hit directly, it should be shown with a message listing which languages are available.

Your solution adds a lot of complexity. First, it is hard to even know which languages are available at the front end, since everything has already been translated into one language by then. We would need either to make another database hit (performance is already at the edge, and another database hit will only make it worse), or to cache and sync this information somewhere, which means considerably more coding. Besides, the same change would then also be needed in sitemap generation, which is almost impossible to do with 3rd-party plugins.

In short, it is a lot of work, and following the logic I described before, this work is actually unnecessary. I would much rather fix the 404 error. Once it is fixed, you would be fine too, and even better off than with 404 plus sitemap and header adjustments, don't you think? Also, when a translation of a post appears later, Google will not need to rebuild its database; the links in searches will start showing the translated content right away, instead of a message about available languages. This all sounds good to me, don't you agree?

I fixed the 404 error in the latest version at GitHub. Could you test it?

I assume this worked for you. I am closing this issue for now to save future clicks; we can still write in a closed issue, or re-open it if needed.

I managed to fix my problem with the following code.
In qtranslate_utils.php I've added a qtranxf_availableIn function:


function qtranxf_availableIn($post_id) {
    global $wpdb;
    $post_content = $wpdb->get_var( $wpdb->prepare( "SELECT post_content FROM $wpdb->posts WHERE ID = %d", $post_id ) );
    if(empty($post_content)) {
        return false;
    }
    $languages = qtranxf_getAvailableLanguages($post_content);
    return $languages;
}
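
For readers unfamiliar with why the function above reads the raw post_content: the language markers only exist in the stored text, before the plugin filters it down to one language. The following is a minimal, standalone sketch of that detection idea, not the plugin's own implementation. The name example_languages_in_raw_content is hypothetical, and the sketch assumes blocks are marked as [:xx]...[:] or <!--:xx-->...<!--:--> with two-letter codes.

```php
<?php
// Hypothetical helper, not part of qTranslate-X.
// Returns the language codes that have a non-empty block in raw content.
// Assumption: blocks look like [:en]...[:] or <!--:en-->...<!--:-->.
function example_languages_in_raw_content(string $raw): array {
    // An opening tag, then lazily everything up to the next tag or end of text.
    $pattern = '/(?:\[:([a-z]{2})\]|<!--:([a-z]{2})-->)(.*?)'
             . '(?=\[:[a-z]{2}\]|\[:\]|<!--:[a-z]{2}-->|<!--:-->|\z)/s';
    preg_match_all($pattern, $raw, $matches, PREG_SET_ORDER);

    $found = array();
    foreach ($matches as $m) {
        // Exactly one of the two capture groups holds the language code.
        $lang = $m[1] !== '' ? $m[1] : $m[2];
        // Ignore blocks that contain only whitespace.
        if (trim($m[3]) !== '') {
            $found[$lang] = true;
        }
    }
    return array_keys($found);
}
```

An empty result only means that no language tags were found; how to treat untagged content (for example, as available in every language) is a decision left to the caller.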

In qtranslate_frontend.php I've changed qtranxf_wp_head with:

function qtranxf_wp_head(){
    global $q_config;
    global $post;

    if( $q_config['header_css_on'] ){
        echo '<style type="text/css">' . PHP_EOL .$q_config['header_css'].'</style>'. PHP_EOL;
    }
    do_action('qtranslate_head_add_css');//not really needed?

    // skip the rest if 404
    if(is_404()) return;

    $availableLanguages = qtranxf_availableIn($post->ID);
    if(is_array($availableLanguages)) {
        $availableLanguages = array_intersect($availableLanguages, $q_config['enabled_languages']);
    } else {
        $availableLanguages = $q_config['enabled_languages'];
    }
    // set links to translations of current page
    foreach($availableLanguages as $lang) {
        if(!empty($q_config['locale_html'][$lang])){
            $hreflang = $q_config['locale_html'][$lang];
        }else{
            $hreflang = $lang;
        }

        //if($language != qtranxf_getLanguage())//standard requires them all
        echo '<link hreflang="'.$hreflang.'" href="'.qtranxf_convertURL('',$lang,false,true).'" rel="alternate" />'.PHP_EOL;
    }
    //https://support.google.com/webmasters/answer/189077
    echo '<link hreflang="x-default" href="'.qtranxf_convertURL('',$q_config['default_language']).'" rel="alternate" />'.PHP_EOL;

    //qtranxf_add_css();// since 3.2.5 no longer needed
}

I think it should be added alongside your fix :)

As I explained before, this code is inefficient and hurts performance. It is also not encapsulated; for example, $post->ID will generate an error on non-post pages.

Besides, I still do not see a reason for doing it; I think it is better to show a page with a note on which languages are available. But you can do whatever you think is best for your site, that is exactly why we keep this code "free". Do not forget to adjust sitemap generation as well if you do it this way. Good luck with your site.
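
The $post->ID concern raised above applies to any request where the global $post is not set. One way to guard against that, sketched here under assumptions, is to fall back to all enabled languages whenever no usable post object or availability list exists. The function name example_alternate_languages and the $availableIn callback (standing in for something like qtranxf_availableIn) are hypothetical and not part of qTranslate-X.

```php
<?php
// Hypothetical helper, not part of qTranslate-X.
// Chooses which languages to advertise without assuming a post exists.
// $availableIn is an assumed callback: post ID => array of codes, or false.
function example_alternate_languages($post, array $enabled, callable $availableIn): array {
    // No usable post object: advertise every enabled language.
    if (!is_object($post) || empty($post->ID)) {
        return $enabled;
    }
    $available = $availableIn((int) $post->ID);
    // Lookup failed or returned nothing usable: same fallback.
    if (!is_array($available)) {
        return $enabled;
    }
    // Keep the order of the enabled languages and reindex the result.
    return array_values(array_intersect($enabled, $available));
}
```

Passing the post and the lookup in as arguments keeps the function free of globals, so it can be exercised outside WordPress.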

I know about the performance cost (my code adds a query, but that's why I use caching to serve all the content), but that's the only way I've found that fits the requirements (sometimes you have constraints that do not depend on you). AFAIK $post->ID works perfectly well even on pages.

Thank you anyway for your suggestions. Sitemap generation is ok as I've restricted it to pages only for untranslated languages.

I have Varnish acting as a caching proxy, so I have no plugins right now.
Anyway, I'll look into it out of curiosity 👍