fb55/htmlparser2

When bundled, source maps are inlined rather than put in a separate file

apendua opened this issue · 5 comments

To Reproduce

Please have a look at the example repository, where I was able to reproduce the behavior: https://github.com/apendua/htmlparser2-sourcemaps

Expected behavior

When the bundle is generated, the source maps from the htmlparser2 package should be emitted in a separate *.js.map file.

Describe the bug

The source maps are being inlined into the bundle instead, which noticeably increases the bundle size.
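
One way to confirm this is to scan the generated bundle for `sourceMappingURL` comments that carry a `data:` URI rather than a file name. Below is a minimal sketch, assuming a plain JavaScript bundle; the function name `inlineSourceMapSizes` is made up for illustration and is not part of webpack or htmlparser2.

```typescript
// Hypothetical helper: returns the length (in characters) of every inlined
// source map found in `code`. Only the JavaScript line-comment form is handled.
function inlineSourceMapSizes(code: string): number[] {
  // "//# sourceMappingURL=data:..." (or the legacy "//@" prefix).
  // An external map would reference a file name instead of a data: URI.
  const pattern = /\/\/[#@]\s*sourceMappingURL=(data:\S+)/g;
  const sizes: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(code)) !== null) {
    sizes.push(match[1].length);
  }
  return sizes;
}
```

Running it over the contents of the built bundle should return an empty array when all maps are external, and one entry per inlined map otherwise.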

Details

Version of Node.js:
Node v18.12.0

Operating System:
MacOS 12.5

Additional context:
I am using react-scripts, which relies on webpack@5. I haven't tried other bundlers yet, so it may just be specific to webpack.

I suspect that this is related to the sourceRoot parameter that is passed to tsc here:

5aa9b84#diff-7ae45ad102eab3b6d7e7896acd08c427a9b25b346470d7bc6507b6481575d519R53

This behavior wasn't present in versions prior to that patch, e.g. 7.2.0 seems to be completely fine.

Screenshots

Screenshot 2023-01-25 at 14.11.51

fb55 commented

Hi @apendua, this is an interesting issue. I wouldn't know how to tackle this and it might be a good idea to raise this with the webpack team.

@fb55

"wouldn't know how to tackle this"

Out of interest, why does the cjs build need to set sourceRoot to the repository path? I think this is what directly contributes to the problem: seeing this non-standard value, webpack probably decides that the only way to properly integrate the source maps with the codebase is to inline them.

fb55 commented

The source root points at the source files, which it is supposed to do. I know that webpack didn't support URLs for this initially and would show a warning, but it's great that they seem to be working on it. If something isn't working well, it's probably best to flag it to them.
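
For context, here is a rough sketch of how a consumer typically combines `sourceRoot` with the `sources` entries, following the source map v3 convention that the root is prepended to each entry. The `RawSourceMap` interface and the `resolveSources` function are hypothetical names, and real tools also normalize URLs and handle absolute entries, which this sketch does not.

```typescript
// Minimal subset of the source map v3 fields needed for this illustration.
interface RawSourceMap {
  version: number;
  sources: string[];
  sourceRoot?: string;
  mappings: string;
}

// Hypothetical helper: prepends sourceRoot (if any) to every entry in sources.
function resolveSources(map: RawSourceMap): string[] {
  const root = map.sourceRoot ?? "";
  // Add a separating slash only when a non-empty root does not already end in one.
  const prefix = root === "" || root.slice(-1) === "/" ? root : root + "/";
  return map.sources.map((source) => prefix + source);
}
```

With a URL as `sourceRoot`, every resolved source becomes a remote URL rather than a path inside `node_modules`, which is the case a bundler then has to decide how to handle.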

@fb55 Thanks for the suggestion. It sounds reasonable to look into potential issues with webpack as well.

My only concern is that I have quite a library-heavy project, and htmlparser2 is the only package that exposes this issue. So I am wondering whether other packages simply don't bother with sourceRoot, or whether they generate source maps quite differently. I think I will spend a bit more time next week trying to understand where this discrepancy comes from.

fb55 commented

I think a lot of packages don't include source maps because it isn't particularly straightforward. All of my packages use this pattern, and webpack's source-map-loader intends to support the pattern: webpack-contrib/source-map-loader#186. Reading that issue, it also seems like webpack will just forward the source map URL and not actually load files.

Looking at the file sizes, they don't seem particularly large; see https://unpkg.com/browse/entities@4.4.0/lib/generated/ for the sizes as part of the entities module. Unfortunately, an HTML parser needs a listing of all HTML entities, and this data takes about 47 kB (which is already compressed a lot; see https://github.com/fb55/entities/blob/master/scripts/trie/README.md for some context).

Closing this here as there is nothing that can be done in this repo.