jazzband/docopt-ng

Allow dumping of parse tree, to make it easier to check what docopt will actually parse

dwt opened this issue · 1 comments

dwt commented

As mentioned in #54 by @NickCrews and @Sylvain303, I'm splitting off this issue to ensure it doesn't get lost.

Original issue

I have only used the original docopt, so excuse me if you already support this.

I would really love a special argument that just dumps a structured (json / yaml) representation of the argument parser, so that I can easily check what is parsed and whether changes I made might have changed the way args will be parsed.

This could also allow diffing between the way stuff is parsed in versions (as a kind of acceptance tests).
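The diffing idea could be sketched roughly as follows, assuming the parse result (or any future dump) can be reduced to plain dicts and lists. The `snapshot` and `diff_snapshots` helpers and the sample data are hypothetical and not part of docopt-ng; only the standard library is used.

```python
import difflib
import json


def snapshot(data):
    """Serialize a parse result deterministically so that diffs are stable."""
    return json.dumps(data, indent=2, sort_keys=True)


def diff_snapshots(old, new):
    """Return a unified diff between two snapshots; empty string if equal."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(lines)


# Hand-written stand-ins for what two versions of a parser might return.
before = {"ship": True, "new": True, "<name>": ["Guardian"], "--speed": "10"}
after = {"ship": True, "new": True, "<name>": ["Guardian"], "--speed": 10}

print(diff_snapshots(snapshot(before), snapshot(after)))
```

Storing the `snapshot` output as a fixture file would make regenerating the expected data after an intentional change a one-step operation.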

That is something I always missed when using the original docopt: I used it only rarely, so I had a lot of trouble remembering the intricacies of the docopt configuration language.

@NickCrews answer

@dwt that feature makes sense, and should theoretically be possible.

If you want to discuss this further, please make a new feature request issue, so this thread doesn't go off topic. But in short:
I am not going to implement that (don't have the time or interest), but I would consider merging a PR if

  • I didn't have to do huge amounts of edits/review on it
  • no additional dependencies, or very lightweight ones
  • didn't look like a huge maintenance burden. I'm skeptical this would be possible
  • didn't break existing users

I don't know of any common json schema that would work for this. I would love to not reinvent the wheel here, so would be worth it to explore what other. It also seems to me like "checking for a change in parsing behavior" would not be that useful on its own. If the behavior started off incorrect, you aren't actually catching that. It seems more useful to actually test assert my_parse("myprogram new --name foo") == , even if it might be a bit more verbose. Also would serve as documentation.
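The explicit-assertion approach could look roughly like this. `check_cases` and `toy_parse` are hypothetical names for illustration; in a real test suite the `parse` argument would wrap the actual call into docopt, and the expected dictionaries would be written by hand.

```python
def check_cases(parse, cases):
    """Run parse on each command line and collect mismatches.

    `parse` is any callable taking an argv list and returning a dict.
    `cases` maps a command line string to the expected dict.
    """
    failures = []
    for command_line, expected in cases.items():
        argv = command_line.split()[1:]  # drop the program name
        actual = parse(argv)
        if actual != expected:
            failures.append((command_line, expected, actual))
    return failures


def toy_parse(argv):
    """Stand-in parser for illustration only; NOT how docopt parses."""
    return {"new": "new" in argv, "--name": argv[argv.index("--name") + 1]}


cases = {
    "myprogram new --name foo": {"new": True, "--name": "foo"},
}
assert check_cases(toy_parse, cases) == []
```

Because each expected value is spelled out, the table of cases also documents the intended behavior, which a pure "did anything change" snapshot does not.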

@Sylvain303 answer

Hello, I'm not very active yet on docopts coding (the golang version for bash; watch out for the extra 'S'), but I am trying to implement such a thing: dumping what was actually parsed by the docopt parser.

Could you create a new issue with some samples of what you would love to see as output? That would be very useful. 🤩

Some ideas about the expected output

To be honest I do not have a precise idea what the correct output format would be. My initial idea was, that all implementations should be able to dump that exact same parse tree, to then be able to compare them to each other and have them act as a regression test suite for each other.

However that use case would be mostly about machines being able to compare this, not necessarily humans. I guess some json/yaml, that just documents all the options would be mostly fine here? The format should be as simple as possible to ease interoperability.

The other use case I had for myself is that it can be hard for a casual user to actually write the correct docopt syntax, as it has its intricacies that can be hard to get right.

I'm not quite sure what would help best with that. To me, generating the source for an equivalent argparse-based parser would be quite helpful, but having something declarative gives me a better gut feeling.

To be honest, the first thing I would do if this was my project is probably to just dump the tree of the internal data structure that docopt-ng uses to then parse command lines, then take a look at that and see what can be achieved from there. This would already give you internal comparability between different versions of the parser and would enable regression testing with easy regeneration of the fixtures. Then tweak from there to make it useful for users without big knowledge of the internals and propose that to other implementations to make them comparable.
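A minimal sketch of such a tree dump, assuming the internal structure is a tree of grouping nodes with children and leaf nodes with names. The classes below are simplified stand-ins written for this example; they are not docopt-ng's actual internal classes, and `dump_tree` is a hypothetical helper.

```python
import json


class Leaf:
    """Simplified stand-in for a leaf node such as an argument or command."""

    def __init__(self, name):
        self.name = name


class Branch:
    """Simplified stand-in for a grouping node that holds child nodes."""

    def __init__(self, *children):
        self.children = list(children)


class Required(Branch):
    pass


class Either(Branch):
    pass


class Command(Leaf):
    pass


class Argument(Leaf):
    pass


def dump_tree(node):
    """Convert a node tree into nested dicts keyed by node type."""
    entry = {"type": type(node).__name__}
    if hasattr(node, "children"):
        entry["children"] = [dump_tree(child) for child in node.children]
    else:
        entry["name"] = node.name
    return entry


tree = Required(
    Command("ship"),
    Either(
        Required(Command("new"), Argument("<name>")),
        Required(Argument("<name>"), Command("move"), Argument("<x>"), Argument("<y>")),
    ),
)
print(json.dumps(dump_tree(tree), indent=2))
```

A dump like this exposes internals directly, so it would mainly be useful for comparing versions of one implementation before any user-facing format is designed.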

Does that make sense?

Some thoughts about what this dump format should contain

docopt_format: "Naval Fate …" # what was parsed
program_name: naval_fate
alternatives:
- marker: ship new
  positional:
  - name: name
    repeats: infinite
- marker: ship
  positional:
  - name: name
  rest_alternatives:
  - marker: move
    positional:
    - name: x
    - name: y
    named:
    - flag_long: speed
      name: kn

Obviously this format idea is very incomplete and probably unusable for several reasons, but that's what is floating in my mind.
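To make the sketch above a bit more concrete, here is one way the same structure might be modelled and serialized with only the standard library. The class and field names simply mirror the keys in the draft format; they are hypothetical and not an existing docopt-ng API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Positional:
    name: str
    repeats: Optional[str] = None  # e.g. "infinite", as in the draft above


@dataclass
class Named:
    flag_long: str
    name: Optional[str] = None


@dataclass
class Alternative:
    marker: str
    positional: List[Positional] = field(default_factory=list)
    named: List[Named] = field(default_factory=list)
    rest_alternatives: List["Alternative"] = field(default_factory=list)


move = Alternative(
    marker="move",
    positional=[Positional("x"), Positional("y")],
    named=[Named(flag_long="speed", name="kn")],
)
spec = {
    "program_name": "naval_fate",
    "alternatives": [
        asdict(Alternative("ship new", [Positional("name", repeats="infinite")])),
        asdict(Alternative("ship", [Positional("name")], rest_alternatives=[move])),
    ],
}
print(json.dumps(spec, indent=2))
```

Keeping the model to plain strings, lists, and nested records means the same dump could be produced by implementations in other languages.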

In its current form I'm going to close this as out-of-scope. I don't think enough people are going to want it to warrant the extra code this would require.

But this could be implemented externally if docopt had a way to dump its internal guts midway through parsing, before binding the argv that was actually passed. I am open to lightly adjusting the internal implementation of docopt to make this possible, but I make no guarantees as to the stability of this. For example, you could then do:

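# docopt._parse_spec is the proposed private hook; it does not exist yet.
if docopt.__version__ == "...":
    spec = docopt._parse_spec(docstring)
elif docopt.__version__ == "...":
    spec = docopt._parse_spec(docstring)
    spec = fix_spec(spec)  # user-written adapter for that version's format
....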

If this sounds agreeable, then if you take a stab at it I will review.