[Bug] Broken URL Attributes when Minify HTML is enabled
FaySmash opened this issue · 68 comments
That issue might be fixed in the next release branch https://github.com/tbela99/gzip/tree/v.next
It is almost ready to be released. I was still testing it. Can you give it a shot?
Thanks
@tbela99 I installed the testing branch and can confirm, that my initial problem is resolved, the urls are now handled properly. (e.g.: //csn-alpha.hosting134897.a2e31.netcup.net/images/banners/Gamesroom.webp
)
BUT
Everything else is broken now.
<div id="jde-q2150115467813748jk" class="jdb-element jde-q2150115467813748jk jdb-element-default jdb-button jdb-button-primary jdb-animation" jdb-animation="type:none"><a class="jdb-button-link" href="/tickets" title="zu den Tickets">
zu den Tickets</a></div>
<div id="jde-q2150115467813748jk"class=jdb-element" jde-q2150115467813748jk="" jdb-element-default="" jdb-button="" jdb-button-primary"="" jdb-animation="type:none" class="jdb-animation"><a class="jdb-button-link" href="/home/tickets" title="zu den Tickets"> zu den Tickets</a></div>
I will work on that.
Thank you
I am not able to reproduce that issue. maybe you have an unclosed quote in your content
I finally found that the Joomla print button has unescaped < and > that can cause this issue
title="Print article < Your Modules >"
This invalid HTML is in the Joomla print article button
Hi @tbela99 thanks for your work! Just tried out the newest 2.7.3 Build an I'm afraid to tell you that it's just as broken as before.
Here is the front page if my website without HTML minifying enabled:
without minifying.html.txt
And with HTML minifying enabled:
with minifying.html.txt
(I replaced my domain with example.com)
Can you disable the plugin and provide that html?
I usually use tools like the w3 validator to catch errors in HTML
Here is the HTML with gzip completely disabled:
gzip disabled.html.txt
And for the record: none of the HTML source code was typed by me, it's all automatically generated by astroid framework and jdbuilder from joomdev
The problem comes from your HTML page. It has too many unclosed quotes ("). Here is a sample
<script src="/misc/custom JS/set global variables.js"
"></script>
<script src="/misc/custom JS/get display mode.js"
"></script>
<script src="/misc/custom JS/set display mode.js"
"></script>
<script src="/misc/custom JS/set Menu Toggle.js"
"></script>
<script src="/misc/custom JS/hide header on scroll.js"
"></script>
<script src="/misc/custom JS/anchor smooth scroll.js"
"></script>
<script src="/misc/custom JS/JDBuilder/set Accordion Mode.js"
"></script>
<script src="/misc/custom JS/JDBuilder/toggle Tab.js"
"></script>
<script src="/misc/custom JS/astroid.js""></script>
You can validate you HTML markup using the w3 html validator https://validator.w3.org/
Hi, I finally got around fixing the unclosed quotes of my specified scripts and also removed all spaces in their URI, but the issue persists.
This is the output of the W3 Validator:
Every error but the spaces in the images filenames are out of my hands because all those code is generated by astroid framework and JDBuilder.
What does gzips minify option actually do? Wouldn't it be enough to remove all line breaks and comments?
It produce a valid HTML5 document as specified here http://www.w3.org/TR/html5/syntax.html#syntax-tag-omission
the code can be seen here https://github.com/tbela99/gzip/blob/master/helpers/HTML.php#L81
Can you provide the HTML code without the plugin active?
Thank you
Interesting, didn't know you could omit those, til.
Here is the HTML with gzip disabled:
gzip disabled.html.txt
After having a look at the code, I think this line causes the breakage:
//remove quotes from HTML attributes that does not contain spaces; keep quotes around URLs!
'~([\r\n\t ])?([a-zA-Z0-9:]+)=(["\'])([^\s\3]+)\3([\r\n\t ])?~' => '$1$2=$4$5'
or at least it acts strangely at the jdb-animation
attribute
I don't see any problem with that. I quickly checked the page and there should be at least one space character between two attributes
id="jdc-fi00013605490449oz"class="jdb-column
I don't know if this can cause the issue you are talking about
it should be id="jdc-fi00013605490449oz" class="jdb-column
I tested the HTML you provide, the issue happens because of the problem I previously mentioned. I will fix it.
Thank you! I've also opened an issue about this: joomdev/JD-Builder#21
Maybe we wait what the folks at joomdev have to say or you rework your code for the sake of compatibility.
You might not be the only one having that problem. The plugin should not produce invalid output anyway
Can you verify that the issue is fixed on this branch https://github.com/tbela99/gzip/tree/v.next ?
Thanks
This worked wonders, the front page is working now! But sub pages are still broken :[
But this time it's because HTML minifying removes leading /
in front of URIs in attributes.
<link href="/cache/z/ssl/cn/example.com/css/iLok7.css" rel="stylesheet">
becomes
<link href="cache/z/ssl/cn/example.com/css/iLok7.css" rel="stylesheet">
I know, this is intended, but it results in this request
turning into this request
(note the added /angebot
in front of /cache
which makes the URI invalid, resulting in 404)
That is your problem. The second link returns 404 error, which means that path cannot be resolved by the plugin. Links have a specific structure. modifying it manually can lead you to a 404 error.
link and meta without the leading / are perfectly valid. I will close this issue soon, since the problem is fixed
The problem comes from the minifying HTML option, no doubt. I haven't manually modified the link structure, gzip removed the leading /.
and NO, Links with a missing / are not valid, read here: https://webmasters.stackexchange.com/questions/56840/what-is-the-purpose-of-leading-slash-in-html-urls
I see what you mean. Links without leading / are perfectly valid but they are resolved using the current page url
so referring "image/something" form /page/news will give
/page/news/image/something
Can you open a different issue with the steps, configuration settings, .... to reproduce the problem? I will close this issue as it is already fixed. You are talking about a different issue
I see what you mean. Links without leading / are perfectly valid but they are resolved using the current page url
yes, and this is causing the problem and is unintended behavior, they should be resolved from the webroot as they are with HTML minifiying disabled.
Can you open a different issue with the steps, configuration settings, .... to reproduce the problem? I will close this issue as it is already fixed. You are talking about a different issue
please have a look at my opening Post which stated exactly the issue with the leading /
missing in #2, so I wouldn't say "this issue [...] is already fixed". To reproduce the problem you could just visit a sub-site of you webroot, e.g. /page/news
with an image in it which residents in image/something
and enable HTML minifiying.
I can see thee image name contains space, can you remove the space in the name and try again?
I removed them but this doesn't change a thing, the problem is still the missing leading slash as mentioned here...
Because the image was just an example to show that the leading slash matters. When HTML minifying is enabled the image actually loads just fine, what doesn't load are all javascript and css files referenced in the <head> section, like the css file shown in here
Can you check this please:
You must have a file called .htaccess in the root of your web folder, otherwise files may not load when using the cache
You can rename the file called htaccess.txt to .htaccess
after that go to your joomla admin -> global configuration
Enable URL rewriting -> On
URL-Rewrite has been enabled on my joomla instance even before I used gzip. The htaccess.txt
has also been renamed to .htaccess
since then.
And that the files won't load has nothing to do with the cache but the URI being made invalid by removing the leading /
. Couldn't you just please make a commit to the v.next branch which disables the part of the minifying Code which removes the leading /
or URIs and I'll report back if sub-sites then work as intended?
I will check again but the minify code is not supposed to change anything in the attributes.
I must first be able to reproduce your problem, otherwise it is hard to narrow down the faulty code.
If I am right, you said your html and css files are in the <head> section, which means you may have the javascript and css settings set to OFF. Is that the case?
I will check again but the minify code is not supposed to change anything in the attributes.
Oh, I haven't found anything and though I was looking at the wrong spot, guess this code isn't the cause then.
If I am right, you said your html and css files are in the section, which means you may have the javascript and css settings set to OFF. Is that the case?
I admit I should have written "normally referenced in the section" because, in fact, the problematic links were referenced at the end of the <body> section
Because Process Javascript and Process CSS were indeed turned ON. So I checked what happend when I toggle Process Javascript/CSS...
If I set Process Javascript and Process CSS to OFF the leading /
is present and the error disappears.
BUT
If I set Process Javascript and Process CSS to ON and minify HTML to OFF, the the leading /
is also present and the error also disappears.
CONCLUSION:
The error only appears if both HTML minifying and Process Javascript/CSS are turned ON. If either Process/CSS or minifying HTML is turned OFF, the error disappears.
Voilà! :)
I will investigate this ASAP
if possible you should also check the error_log file
In your screencap, the URI of a image inside the normal code body is shown. Note, that my problem doesn't affect images, but only referenced scripts/stylesheets (which would in the <head> section by default).
What do you mean by "top directory"? Website root of my site is just example.com/
, not example.com/site/
if you mean this.
PHP is 7.3.15, joomla is 3.9.16, astroid framework is 2.3.0, JDBuilder is 1.5.0.
I also tried disabling all extensions but gzip, astroid and JDBuilder to no avail.
Maybe I could send you a DB dump + webroot backup of my site for further debugging?
And my error log file only contains logs of the failed web requests (ever /angebot
should just be /
)
Log
Access 200 GET /angebot/maid-cafe ApacheError 404 GET /angebot/media/jdbuilder/js/jquery-3.4.1.min.js Apache
Error 404 GET /angebot/media/system/js/mootools-core.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/css/astroid-596c9670e297083b43197ae168a111b4.css Apache
Error 404 GET /angebot/misc/CSS/outdatedbrowser.min.css Apache
Error 404 GET /angebot/media/jdbuilder/css/jdb-8b3fb93e3bf37c1d6cf0713865dbad8a.min.css Apache
Error 404 GET /angebot/media/astroid/assets/css/animate.min.css Apache
Error 404 GET /angebot/media/system/js/core.js Apache
Error 404 GET /angebot/templates/system/css/system.css Apache
Error 404 GET /angebot/misc/CSS/astroid.css Apache
Error 404 GET /angebot/templates/astroid_template_zero/css/style-450c2e70cb0e3f095476390c96488c2e.css Apache
Error 404 GET /angebot/media/system/js/mootools-more.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/popper.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/bootstrap.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/jquery.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.astroidmobilemenu.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.easing.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/script.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.jdmegamenu.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.offcanvas.js Apache
Error 404 GET /angebot/misc/JS/get_display_mode.js Apache
Error 404 GET /angebot/misc/JS/set_global_variables.js Apache
Error 404 GET /angebot/misc/JS/outdatedbrowser.min.js Apache
Error 404 GET /angebot/misc/JS/set_Menu_Toggle.js Apache
Error 404 GET /angebot/misc/JS/set_display_mode.js Apache
Error 404 GET /angebot/misc/JS/anchor_smooth_scroll.js Apache
Error 404 GET /angebot/misc/JS/hide_header_on_scroll.js Apache
Error 404 GET /angebot/misc/JS/JDBuilder/set_Accordion_Mode.js Apache
Error 404 GET /angebot/misc/JS/JDBuilder/toggle_Tab.js Apache
Error 404 GET /angebot/misc/JS/astroid.js Apache
Error 404 GET /angebot/media/jdbuilder/js/jdb.noconflict.js Apache
Error 404 GET /angebot/media/jdbuilder/js/jdb.min.js Apache
Access 200 GET /images/sponsors/Inucon.webp Apache
Access 200 GET /images/sponsors/koneko.webp Apache
Access 200 GET /images/sponsors/Cafe%20Into%20Wonderland.webp Apache
Access 200 GET /images/sponsors/Carlsen%20Manga.webp Apache
Access 200 GET /images/sponsors/Franco.webp Apache
Access 200 GET /images/sponsors/Xmas%20Con.webp Apache
Access 200 GET /images/sponsors/Tanzen%20bei%20Pelzer.webp Apache
Access 200 GET /images/sponsors/KAZE.webp Apache
Access 200 GET /images/sponsors/Ticon.webp Apache
Error 404 GET /angebot/templates/astroid_template_zero/css/astroid-596c9670e297083b43197ae168a111b4.css Apache
Error 404 GET /angebot/misc/CSS/outdatedbrowser.min.css Apache
Error 404 GET /angebot/media/system/js/mootools-core.js Apache
Error 404 GET /angebot/media/system/js/core.js Apache
Access 200 GET /images/sponsors/Inucon.webp nginx
Access 200 GET /images/sponsors/Cafe%20Into%20Wonderland.webp nginx
Access 200 GET /images/sponsors/koneko.webp nginx
Access 200 GET /images/sponsors/Carlsen%20Manga.webp nginx
Access 200 GET /images/sponsors/Franco.webp nginx
Access 200 GET /images/sponsors/Xmas%20Con.webp nginx
Access 200 GET /images/sponsors/Tanzen%20bei%20Pelzer.webp nginx
Access 200 GET /images/sponsors/KAZE.webp nginx
Access 200 GET /images/sponsors/Ticon.webp nginx
Error 404 GET /angebot/media/system/js/mootools-more.js Apache
Error 404 GET /angebot/media/jdbuilder/js/jquery-3.4.1.min.js Apache
Error 404 GET /angebot/media/jdbuilder/js/jdb.noconflict.js Apache
Error 404 GET /angebot/media/jdbuilder/js/jdb.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/jquery.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.easing.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/popper.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/bootstrap/bootstrap.min.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.astroidmobilemenu.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.jdmegamenu.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/vendor/jquery.offcanvas.js Apache
Error 404 GET /angebot/templates/astroid_template_zero/js/script.js Apache
Error 404 GET /angebot/misc/JS/outdatedbrowser.min.js Apache
Error 404 GET /angebot/misc/JS/set_global_variables.js Apache
Error 404 GET /angebot/misc/JS/get_display_mode.js Apache
Error 404 GET /angebot/misc/JS/set_display_mode.js Apache
Error 404 GET /angebot/misc/JS/set_Menu_Toggle.js Apache
Error 404 GET /angebot/misc/JS/hide_header_on_scroll.js Apache
Error 404 GET /angebot/misc/JS/anchor_smooth_scroll.js Apache
Error 404 GET /angebot/misc/JS/JDBuilder/set_Accordion_Mode.js Apache
Error 404 GET /angebot/misc/JS/JDBuilder/toggle_Tab.js Apache
Error 404 GET /angebot/misc/JS/astroid.js Apache
any updates on this matter?
Can you share those files with me please? you can build a docker image or shared a link where I can grab them
I got those files, I will investigate your issues
thanks
From want I can see there is a problem with whitespace in the file names like this
https://domain/images/mascots/Chi%20und%20Sana.webp
after verification I see absolutely no problem. I was using http and the browser was redirecting the links to https. after I used https I saw no 404
my guess is you might have turn "Upgrade Insecure Request" ON under the CSP tab. This will tell the browser to load urls only with https eventhough you use http
Can you check that please?
my guess is you might have turn "Upgrade Insecure Request" ON under the CSP tab. This will tell the browser to load urls only with https eventhough you use http
Can you check that please?
All Settings are exactly the same as in my instance, because it's a direct export of it (I deleted every but two sites and removed unnecessary addons tho).
Upgrade-Insecure-Requests is set to Yes, never cause Problems tho because all my content is served locally via https.
I also noticed that you had http cache setting set to OFF
That's because the site is still WiP and I dind't want to run into caching related problems while debugging.
What sites have you tested? There should be two:
- Home
- home2
Home is served directly at root level (/
) while home2 is served as subpage (/home2
).
- Home has no problems whatsoever
- home2 has the described problems of missing
/
Those Sites are exactly the same, the only difference is where they're published.
Weren't you able to access /home/home2 at all before you renamed htaccess.txt?
Have you checked if leading /
were missing or present?
You need to have a .htaccess in your web root folder if you enable url rewrite, otherwise the home2 will not be available
I know and I have. home2 is avaliable
this site hast issues (look into the console output inside the developer console)
this is 1:1 the website I exported the sql dump and webroot zip from.
I checked the file url. There is a /home/ in the path that causes the 500 error.
I am not sure this is related to the plugin but i will check ASAP
it isn't a copy, it's the same site, just published two times. Check the Menu Tab in joomla to verify this.
I checked the file url. There is a /home/ in the path that causes the 500 error.
that's what I'm trying to say since a month...
the /home is there because the leading /
of the linked sources is missing. the problem disappears when I disabled gzip, so the plugin has to cause this.
The problem doesn't appear on "Home" because it's on the website root, it appears on "home2" because it's a sub-page.
The bug is fixed here d0ffb09,
Because the minify process strips quotes around tags and you were not using other settings (js|css), the joomla SEF plugin failed to rewrite those files path. I have fixed it.
Also, you should set dimensions of your images. They are very large. I see a 3780 pixels width image displayed as a 350px. When you set dimensions, the plugin can automatically resize them
the fix is in this branch https://github.com/tbela99/gzip/tree/v.next
Thank you for your patience :)`
Thanks for your hard work!
Because the minify process strips quotes around tags and you were not using other settings
(js|css), the joomla SEF plugin failed to rewrite those files path. I have fixed it.
sounds logical!
Also, you should set dimensions of your images.
I'll do it just before the site goes live.
the fix is in this branch https://github.com/tbela99/gzip/tree/v.next
fix confirmed working!
this resolves issue #2 stated in OP.
but the issue #1:
all
/
in the url path get replaced by=""
still remains :(
maybe it's just because of bad html and spaces in file paths, I could check this myself. I've inserted banners into the home and home2 pages on my demo site anyway if you want to have a look (the banners should be displayed on the blank section at the top, inspect it with the inspector to see the broken image path).
That issue has been fixed
Yeah, it is and it's working for me too. But this fixes only my 2nd initial issue (missing leading /
) but not my 1st initial issue (/
getting replaced by ""=
) which happens to the header image (Search the page HTML for "Gamesroom" to find the broken url)
Both issues are probably unrelated and only one of them is fixed now.
I will check that, but remember that fix was applied only when minify setting was set to ON
You remember we fixed that issue. The fix only work when you enable minification. Because it adds an extra step, it may be more interesting to implement it as a setting that can be enabled on its own. What do you think about it?
that is a different issue. I will take a look at it
can you provide another dump? I am not able to reproduce that issue on my local copy.
Thanks
Like I said. the bug is not in the plugin itself but probably in astroid framework. When I check its code, i can see many references to background-image
grep -r "background-image" webroot/libraries/ | less
a fix could be preserve quotes around style. I will explore that option
fixed! finally! :)
The fix is in this branch https://github.com/tbela99/gzip/tree/v.next
Like I said. the bug is not in the plugin itself but probably in astroid framework.
Not unlikely, the framework is still young and isn't free of bugs.
fixed! finally! :)
🎉
This issue is now Closed for good, after 2 months of hard work :)
Please keep my Website dump a little longer tho, because I already discovered two other bugs unrelated to HTML minifiyng, I'll open new issues for them.