output errors do not lead to a non-zero exit code
yahesh opened this issue · 9 comments
When providing trurl
with a URL-encoded component that it is unhappy with (e.g. using URL-encoded characters smaller than %20
) then the tool will not decode the component as requested but output a "trurl note", swallowing all output and only returning a single line break while the exit code is still 0
. This is even the case with the --verify
option set.
This could severely break pipelines that use trurl
on untrusted URLs. Currently, tools using trurl
cannot distinguish between successful parsing and URL-decoding errors based on the exit codes.
# this is okay
% trurl "?" --get "{host}" --verify ; echo "Exitcode: $?"
trurl error: No host part in the URL [?]
trurl error: Try trurl -h for help
Exitcode: 9
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl ".#fragment%00" --get "{fragment}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (fragment)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "./path%00" --get "{path}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (path)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl ".?query%00" --get "{query}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (query)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "user%00:password%00@." --get "{user}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (user)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "user%00:password%00@." --get "{password}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (password)
Exitcode: 0
It returns 0 when it can parse the URL, which it can in your examples. You might want --no-guess-scheme
if you want it to not guess the scheme, as that makes it rather lenient.
It returns 0 when it can parse the URL, which it can in your examples. You might want
--no-guess-scheme
if you want it to not guess the scheme, as that makes it rather lenient.
I'm not sure we're talking about the same thing here, because even with --no-guess-scheme
set the tool will swallow the output due to the decoding error but still provide an exit code of 0
.
# this is okay
% trurl "https://?" --get "{host}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl error: No host part in the URL [https://?]
trurl error: Try trurl -h for help
Exitcode: 9
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://.#fragment%00" --get "{fragment}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (fragment)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://./path%00" --get "{path}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (path)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://.?query%00" --get "{query}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (query)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://user%00:password%00@." --get "{user}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (user)
Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://user%00:password%00@." --get "{password}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (password)
Exitcode: 0
Yes, because it returns 0 when it can parse the URL and it only adds a "note" about the problem to output what was asked.
We should probably consider making the output problems return non-zero as well.
It struck me we don't return error for those because you might pass in many URLs, and if you get a problem with --get
on just one of them how is trurl supposed to behave then? Maybe we need another option... ?
For example , this is a legit command line:
$ trurl curl.se curl.se/?one%00 curl.se/?two -g{query}
@bagder Curl has support for a --fail
option. --fail
in trurl could be described as being more strict than --verify
.
It's not about "more strict" though. --verify
is for being strict about (verifying) the input, this discussion is about using parts of the otherwise fine input when creating output. Basically problems with --get
.
The new option would probably be documented as "return error if failing to produce output" or something. I think --fail
is an odd name for that. How about --strict-get
or --werror
(the latter being "treat warnings as errors" as that's the name for the corresponding gcc option). Or something else.
At the end, the decision is up to you, but I don't understand why you say that --fail
is an odd name for the option when the word is even part of how you would document it:
"return error if failing to produce output"