simonw/strip-tags

Use YAML for test cases

Closed · 3 comments

simonw commented

I think this is more readable (rewritten by GPT-4):

- comment: |
    Block level elements should have a newline
  input: |
    <div><h1>H1</h1><p>Para</p><pre>pre</pre>
  args: ["h1", "pre"]
  expected: |
    H1
    pre
- comment: |
    Various ways whitespace should be stripped
  input: |
    Hello
    This
    Is
    Newlines
  args: ["--minify"]
  expected: |
    Hello
    This
    Is
    Newlines
- input: |
    Hello
    This
    Is
    Newlines
  args: ["-m"]
  expected: |
    Hello
    This
    
    Is
    Newlines
- input: |
    Hello
    This
    
    
    Is
    Newlines
  args: ["--minify"]
  expected: |
    Hello
    This
    
    Is
    Newlines
- input: |
    Hello  this   has     spaces
  args: ["--minify"]
  expected: |
    Hello this has spaces
- comment: |
    Should remove script and style
  input: |
    <script>alert('hello');</script><style>body { color: red; }</style>Hello
  args: []
  expected: |
    Hello
- comment: |
    Test alt text replacement
  input: |
    <img src="foo.jpg" alt="Foo"><img src="bar.jpg" alt="Bar">
  args: []
  expected: |
    FooBar
- comment: |
    Even with --minify <pre> tag content should be unaffected
  input: |
    <pre>this
      has
        spaces</pre>
  args: ["--minify"]
  expected: |
    this
      has
        spaces
- comment: |
    Test --first
  input: |
    Ignore start<p>First paragraph</p><p>Second paragraph</p>Ignore end
  args: ["p", "--first"]
  expected: |
    First paragraph
- comment: |
    Keep tags
  input: |
    Outside of section
    <section>
    Keep these:
    <h1>an h1</h1>
    <h2 class="c" id="i" onclick="f">h2 with class and id</h2>
    <p>This p will have its tag ignored</p>
    </section>
    Outside again
  args: ["section", "-t", "hs", "-m"]
  expected: |
    Keep these:
    <h1>an h1</h1> <h2 class="c" id="i">h2 with class and id</h2> This p will have its tag ignored

simonw commented

Something went wrong with this - it's only 30 tests now and it was 54 before.

I'm going to ditch the GPT-4 generated YAML and make this myself.
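
A check like the following could catch a dropped-test problem like this early. It is only a sketch: `count_cases` and `sample_yaml` are hypothetical names, the sample is abbreviated from the YAML above, and it assumes every test case starts with `- ` in column 0, as in that YAML.

```python
def count_cases(yaml_text):
    """Count top-level list items: lines that start with '- ' in column 0."""
    return sum(1 for line in yaml_text.splitlines() if line.startswith("- "))


# Abbreviated sample in the same layout as the YAML in this thread
sample_yaml = """\
- comment: |
    Test --first
  input: |
    Ignore start<p>First paragraph</p>
  args: ["p", "--first"]
  expected: |
    First paragraph
- input: |
    Hello  this   has     spaces
  args: ["--minify"]
  expected: |
    Hello this has spaces
"""

expected_count = 2  # for the real suite this would be the original 54
actual = count_cases(sample_yaml)
assert actual == expected_count, f"expected {expected_count} cases, found {actual}"
```

Block scalar content is indented, so it never starts in column 0 and is not miscounted as a new case.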

simonw commented

I wrote this conversion code to turn the old TEST_PARAMETERS list into YAML:

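import json

# TEST_PARAMETERS is the existing list of (input, args, expected) triples
examples = [{"input": t[0], "args": t[1], "expected": t[2]} for t in TEST_PARAMETERS]

for example in examples:
    # Emit each case as a YAML list item, using literal block scalars (|)
    print("- input: |")
    # Indent every line of the text by four spaces
    print("    " + "\n    ".join(example["input"].split("\n")))
    # A JSON list of strings is also valid YAML, so json.dumps works here
    print("  args: " + json.dumps(example["args"]))
    print("  expected: |")
    print("    " + "\n    ".join(example["expected"].split("\n")))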

Then I manually cleaned up some of the whitespace in the resulting YAML.
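
Part of that whitespace cleanup could probably be automated. The sketch below is one possible variant, not the code that was actually used: `to_yaml` and `sample` are hypothetical, and the sample data stands in for the real TEST_PARAMETERS. It blanks out lines that would otherwise contain only indentation, and it makes it easy to confirm that no cases were lost.

```python
import json


def to_yaml(test_parameters):
    """Render (input, args, expected) triples as YAML list items."""
    lines = []
    for input_, args, expected in test_parameters:
        lines.append("- input: |")
        lines.extend("    " + line for line in input_.split("\n"))
        lines.append("  args: " + json.dumps(args))
        lines.append("  expected: |")
        lines.extend("    " + line for line in expected.split("\n"))
    # Replace whitespace-only lines with empty ones, so blank lines inside
    # a block scalar do not carry four stray spaces
    cleaned = [line if line.strip() else "" for line in lines]
    return "\n".join(cleaned) + "\n"


# Hypothetical sample data, not the real TEST_PARAMETERS
sample = [
    ("Hello  this   has     spaces", ["--minify"], "Hello this has spaces"),
    ("Hello\n\nWorld", [], "Hello\n\nWorld"),
]
rendered = to_yaml(sample)

# The number of emitted cases should match the number of source triples
assert rendered.count("- input: |") == len(sample)
print(rendered)
```

Two caveats apply. A plain `|` block scalar loads with a single trailing newline, so the test runner may need to account for that when comparing output. Text whose first line starts with spaces is not handled by this sketch.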