Escaped space not handled
Closed this issue · 8 comments
STEPS TO REPRODUCE
$ node
> var stringArgv = require('string-argv');
undefined
> stringArgv('one\ one two')
EXPECTED RESULTS
[ 'one one', 'two' ]
ACTUAL RESULTS
[ 'one', 'one', 'two' ]
I'm not sure how to achieve this since 'one\ one two' === 'one one two'
I can think of two ways off the top of my head.
-
Giving '\ ' special treatment in your regex. i.e. a space is a space UNLESS it's preceded by a backslash.
-
A pre-processing step that places quotes around strings separated with '\ '.
In javascript there is a difference between one\ one
and one\\ one
the first one outputs one one
the second outputs one\ one
.
\
is a single character in a string, not 2.
It is not possible to differentiate the character ' '
and the character '\ '
in a regex.
If I do "\ ".indexOf("\\")
it returns -1, therefore, I cannot search for backslash as it is not present in the string.
If what you're asking is to support stringArgv('one\\ one two')
I'm afraid it falls out of scope for this project. This would be a specific use case.
Might I ask what is the exact use case for this request so I can better understand where it comes from.
I've written a library that uses child_process.spawn
which requires command line arguments to be presented as a list. This library also uses ssh2.exec
which requires a command line string.
I'd like the interface to the my functions that use these library routines to match and I've settled on a command line string as opposed to a list because it's simpler. This means I need to convert the command line string to a list before I call child_process.spawn
. This is what I'm using string-argv
to accomplish.
This isn't a deal-breaker for me. Your library works well as is (thanks!). My library simply doesn't support '\ '
. I filed this ticket because for completeness sake, I felt thatstring-argv
might want to support this.
https://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options
There is a String.raw() function in JS and in Node. Could this help?
let str = String.raw`one\ one`
let str2 = 'one\ one'
console.log(str) // one\ one
console.log(str2) // one one
@juergba unfortunately String.raw
doesn't support \
character, it simply didn't escape the \
.
So in the this case it would be equivalent of doing const str = "one\\ one"
Unfortunately, the backslash space is exactly the same as a regular space in the string.
" ".charCodeAt(0) // 32
"\ ".charCodeAt(0) // 32
So most likely the orginal demand wanted to support "\\ "
which is actually 2 characters.
The only thing I can see to support this would be to change the regex to
([^\s'"]([^\s'"]*(['"])([^\3]*?)\3)+[^\s'"]*)|([^\s'"]+(\\ )[^\s'"]+)+|[^\s'"]+|(['"])([^\5]*?)\5
(https://regexr.com/4oht2)
But I wouldn't add explicit support for these 2 characters in the library by default as it would be a breaking change.
I can try to see if I can add options for people to opt-in
Actually, my solution doesn't work because it doesn't match one\ two\ three
(strings with multiple \
).
I would need a negative set [^\s'"(\\ )]
but that's not a thing.
Or find a way to detect word\ word
with respect to "quoted words"
"quoted nested with\ and 'nested"
. All these edge cases make this much more complicated
The only alternative I can thing of is to implement a state machine with language parsing with a lookahead of 2 for token matching.
I don't see the value in implementing that whole thing in the library.
I will close the issue for now because after a year of thinking about this, I can't see a way to easily support it without either breaking compatibility or changing the implementation significantly.
If someone can come up with a reasonable way to implement it using regex, I'll reconsider it