newsdev/archieml.org

Test suite


Would you be able to publicly share a Google Doc (or several) that can act as a common test suite for everything in ArchieML? This will be especially useful as the spec is updated. It should be publicly accessible for viewing and copying, but not editing.

Yeah definitely; there's already this document that starts to test converting Google Doc-specific markup into text (like native links, making sure headers come through as plain text, bullets come back as * bullet, etc.).

We could do something similar in order to test all of the different components of the spec. We could also explore making a more formal test suite out of a series of text files and the objects they should be converted into. Splitting them out into multiple tests would make it easier to maintain, and possibly easier to track down bugs in individual parsers. But that makes the scaffolding around the test more complex.

Or maybe we could split the difference, and automatically combine the list of txt/json pairs into one big Google document, plus what the output should look like, to let you do an all-or-nothing spot check as well?

What would be the most useful?

A single test document would be most useful in my case, but it may make
sense to have two versions: one as a text file, and one as a Google Doc.
The Google Doc version might include the formatting variations
(bold/italic, links, headings, bullets, etc.) that a Google Doc --> export
/ convert --> parser pipeline should be robust to. One test to run would
be checking that the parsed output from both documents is identical.

On Mon, Mar 30, 2015 at 11:43 AM Michael Strickland <notifications@github.com> wrote:

Yeah definitely; there's already this document
https://docs.google.com/a/nytimes.com/document/d/1JjYD90DyoaBuRYNxa4_nqrHKkgZf1HrUj30i3rTWX1s
that starts to test converting Google Doc-specific markup into text (like
native links, making sure headers come through as plain text, bullets come
back as * bullet, etc.).


Hey @noamross,

Over the weekend I ported the specs from archieml-js into a folder in this repo, as individual ArchieML files paired with the JSON-encoded output they're supposed to translate into.

https://github.com/newsdev/archieml.org/tree/gh-pages/test/1.0

For example, the file below gives a description of the test under the key test and the expected output under the key result; the rest of the document is the ArchieML input being tested.

test: A key and a value should be created.
result: {"key": "value"}

key: value
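
To make the file layout concrete, here is a minimal Python sketch of how a test runner might read one of these files. It is not the code used in this repo: split_case and run_case are hypothetical names, parse stands for whatever ArchieML parser is under test, and the sketch assumes the test: and result: lines come first, in that order, as in the example above.

```python
import json


def split_case(text):
    """Split a test file into (description, expected, body).

    Assumes a 'test:' line followed by a 'result:' line at the top of
    the file; everything after the 'result:' line is the input body.
    """
    lines = text.splitlines()
    description = None
    expected = None
    body_start = 0
    for i, line in enumerate(lines):
        if description is None and line.startswith("test:"):
            description = line[len("test:"):].strip()
        elif expected is None and line.startswith("result:"):
            # The expected output is JSON-encoded on a single line.
            expected = json.loads(line[len("result:"):])
            body_start = i + 1
            break
    if description is None or expected is None:
        raise ValueError("missing test: or result: header")
    body = "\n".join(lines[body_start:]).strip("\n")
    return description, expected, body


def run_case(text, parse):
    """Run one test file against `parse`, a callable mapping text to a dict.

    Returns (description, passed).
    """
    description, expected, body = split_case(text)
    return description, parse(body) == expected
```

A parser library would be plugged in as the parse argument, so the same runner can be reused across implementations.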

The first file in that directory, all.0.aml, is a combined test that takes all of the individual tests and merges them into one document. It does this by:

  1. Prefixing every top-level key with a unique integer to prevent keys from different tests overwriting each other
  2. Not including the :ignore tests, since I can't think of a good way to test that multiple times in a single document
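
The two steps above can be sketched on the expected-output side as follows. This is an illustration only, not the script that generated all.0.aml: combine_expected is a hypothetical name, the "index_key" prefix format is an assumption, and detecting :ignore tests with a plain substring check is a simplification. The combined ArchieML input would need the same renaming applied to its top-level keys.

```python
def combine_expected(cases):
    """Merge per-test expected outputs into one dict.

    `cases` is an iterable of (body, expected) pairs, where `body` is the
    ArchieML input text and `expected` is the decoded JSON result.
    """
    combined = {}
    for index, (body, expected) in enumerate(cases):
        # Step 2: leave out tests that use :ignore (simplified check).
        if ":ignore" in body:
            continue
        # Step 1: prefix each top-level key with the test's integer so
        # keys from different tests cannot overwrite each other.
        for key, value in expected.items():
            combined[f"{index}_{key}"] = value
    return combined
```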

The idea is that you could use either the individual tests, or the all test for an all-or-nothing spot check that the parser's working.


The Google Docs-specific tests are a little outside the scope for the moment, since the spec doesn't take a position on what HTML markup Google Doc formatting should generate. But maybe a second, simpler document that just tests formatting (and not parser logic) could be used for that?