Shopify/packwerk

[Bug Report] Crash on ERB containing multiple CDATA blocks

remvee opened this issue · 7 comments

Description
Having ERB files in a project with multiple CDATA blocks inside causes packwerk check to crash.

To Reproduce
Add an ERB file to a project containing multiple CDATA blocks. For instance:

<![CDATA[foo]]><![CDATA[bar]]>

Run packwerk, this result in an exception:

$ packwerk check
📦 Packwerk is inspecting 3141 files
..................................../code/backend/vendor/gems/better_html-1.0.16/lib/better_html/parser.rb:79:in `build_document_node': Unhandled token cdata_end line 1 column 27, [s(:cdata, "foo"), s(:text, "<![CDATA[bar")] (RuntimeError)
	from /code/backend/vendor/gems/better_html-1.0.16/lib/better_html/parser.rb:45:in `ast'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/parsers/erb.rb:27:in `parse_buffer'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/parsers/erb.rb:22:in `call'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/file_processor.rb:76:in `block in parse_into_ast'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/file_processor.rb:75:in `open'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/file_processor.rb:75:in `parse_into_ast'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation_2_6.rb:764:in `call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation_2_6.rb:764:in `block in create_validator_method_medium2'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/file_processor.rb:47:in `block in call'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/cache.rb:74:in `with_cache'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:161:in `call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:161:in `validate_call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:90:in `block in create_validator_slow'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/file_processor.rb:46:in `call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation_2_6.rb:106:in `call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation_2_6.rb:106:in `block in create_validator_method_fast1'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/run_context.rb:82:in `process_file'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:161:in `call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:161:in `validate_call'
	from /code/backend/vendor/gems/sorbet-runtime-0.5.10154/lib/types/private/methods/call_validation.rb:90:in `block in create_validator_slow'
	from /code/backend/vendor/gems/packwerk-2.2.0/lib/packwerk/parse_run.rb:82:in `block in find_offenses'

Expected Behaviour
No exception raised.

Version Information

  • Packwerk: 2.2.0
  • Ruby 2.6.10

Additional Context
Issue reported upstream Shopify/better-html#86

Thank you for this bug report and tracing it upstream to better_html!

@remvee

Thanks for the bug report. Since parsing ERB files with CDATA blocks is currently unsupported, I wonder if an initial stopgap for us for now would be to catch this particular error and ignore ERB files with CDATA blocks. Later, as needed, we can decide to investigate other approaches (different parser, changes to better_html, etc.). Does this sound reasonable to you? I figure this would be an improvement since it unblocks you to use packwerk (and anyways packwerk already has a lot of areas where it doesn't capture all references between packages).

I think a temporary workaround may make sense as it seems that the underlying bug is in html_tokenizer (https://github.com/EiNSTeiN-/html_tokenizer/pull/7) and that repo doesn't look like it's maintained.

@remvee I've attempted to fix this here: #214

Let me know what ya think!

I think it makes sense to keep this open to make it clear we don't yet support parsing ERB with CDATA tags.

It should be fixed now in the version 2.0 of better_html.