gildas-lormeau/SingleFile

Add option for space saving without deleting newlines

runxel opened this issue · 3 comments

I use SingleFile not only for long-term storage, but also to feed Beautiful Soup and to inspect and manipulate the code. VSCode has trouble with this, though, and stops tokenization when lines get too long. Since the minimization option also gets rid of all newlines, the saved HTML file ends up as a single line. I've experienced degraded performance because of that on several occasions.

We need an option that minimizes the HTML while preserving newlines instead of removing them.

SingleFile should not remove newlines in HTML. For example, if you save https://example.com/ with SingleFile, the resulting content still contains the original newlines; see the source code below.

<!DOCTYPE html> <html><!--
 Page saved with SingleFile 
 url: https://example.com/ 
 saved date: Tue Apr 30 2024 01:11:30 GMT+0200 (Central European Summer Time)
--><meta charset=utf-8>
<title>Example Domain</title>
<meta name=viewport content="width=device-width, initial-scale=1">
<style>body{background-color:#f0f0f2;margin:0;padding:0;font-family:-apple-system,system-ui,BlinkMacSystemFont,"Segoe UI","Open Sans","Helvetica Neue",Helvetica,Arial,sans-serif}div{width:600px;margin:5em auto;padding:2em;background-color:#fdfdff;border-radius:0.5em;box-shadow:2px 3px 7px 2px rgba(0,0,0,0.02)}a:link,a:visited{color:#38488f;text-decoration:none}@media (max-width:700px){div{margin:0 auto;width:auto}}</style>
<meta name=referrer content=no-referrer><link rel=canonical href=https://example.com/><meta http-equiv=content-security-policy content="default-src 'none'; font-src 'self' data:; img-src 'self' data:; style-src 'unsafe-inline'; media-src 'self' data:; script-src 'unsafe-inline' data:; object-src 'self' data:; frame-src 'self' data:;"><style>img[src="data:,"],source[src="data:,"]{display:none!important}</style></head>
<body>
<div>
 <h1>Example Domain</h1>
 <p>This domain is for use in illustrative examples in documents. You may use this
 domain in literature without prior coordination or asking for permission.</p>
 <p><a href=https://www.iana.org/domains/example>More information...</a></p>
</div>

However, the problem does arise for the content of inline stylesheets, and unfortunately I can't fix it (easily). It's an issue in the library SingleFile uses to parse CSS (see csstree/csstree#237).
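To check whether the long lines in a saved page really come from inline stylesheets, a small diagnostic along these lines may help. It is only a sketch: the function name `long_lines` is hypothetical, the default `limit` of 20000 is an assumption meant to approximate VS Code's tokenization cutoff, and the "looks like CSS" test is a crude substring check rather than real parsing.

```python
def long_lines(text: str, limit: int = 20000) -> list[tuple[int, int, bool]]:
    """Return (line number, length, contains_style_tag) for each overly long line.

    `limit` is an assumed threshold; adjust it to whatever your editor uses.
    """
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > limit:
            # Crude heuristic: flag lines that contain an opening <style> tag.
            report.append((number, len(line), "<style" in line.lower()))
    return report


if __name__ == "__main__":
    import sys

    # Usage: python long_lines.py saved-page.html
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, length, is_css in long_lines(handle.read()):
            kind = "inline CSS" if is_css else "other"
            print(f"line {number}: {length} characters ({kind})")
```

If every reported line is flagged as inline CSS, that would support the idea that the stylesheet content is what triggers the editor problem.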

I'm closing the issue because it's actually a duplicate of #1220.

Note that if you confirm the issue is mainly related to CSS content, I could add an option to add newlines after each CSS rule to circumvent it.
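Until such an option exists, the same effect can be approximated by post-processing the saved file. The sketch below is not part of SingleFile; the helper names `break_css_rules` and `break_style_blocks` are hypothetical. It assumes the inline CSS is already minified (no comments containing quote characters) and that `<style>` elements can be located with a simple regular expression, which holds for typical saved pages but is not a general HTML parser.

```python
import re

# Matches an inline <style> element: opening tag, CSS body, closing tag.
STYLE_RE = re.compile(r"(<style\b[^>]*>)(.*?)(</style>)", re.IGNORECASE | re.DOTALL)


def break_css_rules(css: str) -> str:
    """Insert a newline after each '}' that is outside a quoted string."""
    out = []
    quote = None  # active string delimiter, or None when outside a string
    i = 0
    while i < len(css):
        ch = css[i]
        out.append(ch)
        if quote:
            if ch == "\\" and i + 1 < len(css):
                # Copy the escaped character verbatim so it cannot end the string.
                i += 1
                out.append(css[i])
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "}":
            # Skip the final brace and braces already followed by a newline.
            if i + 1 < len(css) and css[i + 1] != "\n":
                out.append("\n")
        i += 1
    return "".join(out)


def break_style_blocks(html: str) -> str:
    """Apply break_css_rules to the body of every inline <style> element."""
    return STYLE_RE.sub(
        lambda m: m.group(1) + break_css_rules(m.group(2)) + m.group(3), html
    )
```

Calling `break_style_blocks` on the text of a saved page and writing the result back should leave rendering unchanged, since whitespace between CSS rules is not significant, while keeping each rule on its own line.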