75lb/command-line-args

Support for dense -Xfoo options

rgov opened this issue · 6 comments

rgov commented

I'm trying to do a halfway decent job parsing curl's command line arguments.

curl supports, say, the -X option plus an argument smushed together, e.g., -XPOST is short for --request POST. I've also seen this format used by Clang and others. Python's argparse handles it the right way.

However, this package interprets it as a single option called XPOST.

75lb commented

option names beginning with a single dash (e.g. -X) are interpreted as getopt style short options (from the POSIX standard). A multiple character option beginning with a single dash (-XYZ) is interpreted as a getopt "switch cluster" (e.g. equivalent to -X -Y -Z). This is pretty standard behaviour, enabling you to write commands like ls -al and rm -rvf.

So, with -XPOST we would have an ambiguity issue. Did the user mean -X -P -O -S -T, -X POST or -X -P OST etc? How would you prefer we avoid ambiguity?

rgov commented

If -X takes a parameter, then it makes sense to interpret the following characters as its parameter. Either there is whitespace immediately following -X, in which case the next argument should be taken as its parameter, or there is no whitespace, in which case everything after the -X in the same argument should be taken as the parameter.

Otherwise it would always be a parse error to interpret -XPOST as -X -P -O -S -T, because there is no obvious choice for the parameter to -X (e.g., considered the argument following -T wouldn't make any sense).

It should use the longest match first, so if I actually have an option called, for whatever reason, -XP, that should be used first.

I think this removes all ambiguities.

rgov commented

Also, in a quick test with getopt on macOS 10.13.5 (based on BSD), an option specifier of a: works when I write -axyz (as in, option a with argument xyz), so this appears to be somewhat standard.

75lb commented

You're right. With getopt, whitespace after a single-character-option-expecting-an-arg is optional so we should support this.

I've created a branch with failing tests. Are you able to fork and make them pass? If so, we'll be well on our way. 👍

rgov commented

You might have a better idea about how you want to fix it.

Right now you are doing a simple canonicalization step in the constructor for ArgvParser, which splits up a combined short option -abc into -a -b -c. However, we now know from this thread that this cannot be done correctly without looking up the definition for -a to see if it takes an argument...

I attempted to write a canonicalizer that does this. It splits up --option=value into --option value, converts -o into the longer --option, and correctly splits -abc into --longa --longb --longc or --longa bc as needed.

It passes most tests, but I think it needs to be more tightly folded into the generator method to be able to generate unknown_value events properly. And that will be a fairly substantial refactor to your code.

  _canonicalizeArgv() {
    var newArgv = []
  
    for (let arg of this.argv) {
      /* split `--option=value` into `--option value` */
      if (argvTools.isOptionEqualsNotation(arg)) {
        const matches = arg.match(argvTools.re.optEquals)
        let def = this.definitions.get(matches[1])
        if (def) {
          newArgv.push({ origArg: arg, arg: `--${def.name}` })
          if (!def.isBoolean()) {
            newArgv.push({ origArg: arg, arg: matches[2], value: true })
          } else {
            // note: just dropping the value here, hm
          }
        } else {
          newArgv.push(arg)  // unknown option, put it back
        }
      }
      
      /* split `-abc` into `-a -b -c` or `-a bc`, etc */
      else if (argvTools.re.combinedShort.test(arg)) {
        for (var i = 1; i < arg.length; i++) {
          let flag = arg.charAt(i)
          
          let def = this.definitions.get('-' + flag)
          if (def) {
            newArgv.push({ origArg: arg, arg: `--${def.name}` })
            
            // The rest of this arg might be the option value
            if (!def.isBoolean() && i < arg.length - 1) {
              newArgv.push({ origArg: arg, arg: arg.slice(i + 1), value: true })
              break
            }
          } else {
            // We don't know this short option, but split it up anyway
            newArgv.push({ origArg: arg, arg: `-${flag}` })
          }
        }
      }
        
      /* catch all for everything else: keep as is */
      else {
        newArgv.push(arg)
      }
    }
  
    this.argv = newArgv
  }
75lb commented

This is implemented on the next branch and released with the label preview. To install the preview:

npm install command-line-args@preview