the function is_legal_header_name should not allow any control character (octets 0 - 31) and DEL (127)
openalmeida opened this issue · 7 comments
Describe this issue
The function is_legal_header_name @ against to the RFC2616 (RFC7230 ?)
about CTLs (octets 0 - 31 and DEL 127) and the 19 seprartors (\x2f, \x22)
or I am missing any udates of the RFCs.
ref:
message-header = field-name ":" [ field-value ]
field-name = token
token = 1*<any CHAR except CTLs or separators>
CTL = <any US-ASCII control character
(octets 0 - 31) and DEL (127)>
separators = "(" | ")" | "<" | ">" | "@"
| "," | ";" | ":" | "\" | <">
| "/" | "[" | "]" | "?" | "="
| "{" | "}" | SP | HT
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
<"> = <US-ASCII double-quote mark (34)>
To Reproduce
Steps to reproduce the behavior:
- Using raw headers
Expected behavior
# this should passed
assert is_legal_header_name('\x00') is False # NUL
assert is_legal_header_name('\x07') is False # BEL
assert is_legal_header_name('invalid"') is False # \x22
assert is_legal_header_name('invalid/') is False # \x2f
Stacktrace
If applicable, add stacktrace to help explain your problem.
Additional context
maybe a patch here, if this issue for the `is_legal_header_name` confirmed
file https://github.com/Ousret/kiss-headers/blob/83775798/kiss_headers/utils.py#L385
--- a/kiss_headers/utils.py
+++ b/kiss_headers/utils.py
- and search(r"[^\x00-\x7F]|[:;(),<>=@?\[\]\r\n\t &{}\\]", name) is None
+ and search(r"[^\x21-\x7e]|[()<>@,;:\x5c\x22\x2f\[\]?={}]", name) is None
+ # ^^ ^^ ^^^^^^^^ ^ ^ ^ ^ ^^^^^
+ # -- -- -- ordered 17 separators ---
Hi,
Thank you for the report.
I've tested your entries :
from requests import get
entries = [
'\x00',
'\x07',
'invalid"',
'invalid/'
]
for entry in entries:
r = get(
"https://httpbin.org/headers",
headers={
entry: "test"
}
)
print(
entry,
"KO" if r.status_code == 400 else "OK"
)
KO = does not work and the remote server responds with 400/INVALID REQUEST
OK = Fine
- 'invalid"' is KO
- 'invalid/' is OK
- '\x07' is KO
- '\x00' is KO
Further tests on the '/' indicate that this character is allowed anywhere, beginning, ending, multiple times.
More:
from kiss_headers import parse_it
from kiss_headers.utils import is_legal_header_name
from requests import get, post
if __name__ == "__main__":
entries = [
'invalid/',
'/invalid',
'/',
'//invalid/'
]
for entry in entries:
r = get(
"https://httpbin.org/headers",
headers={
entry: "test"
}
)
print(
entry,
"KO" if r.status_code == 400 else "OK",
f"is_legal_header_name({is_legal_header_name(entry)})"
)
Remote server of httpgin.org is "gunicorn" which not always follow the original source (RFCs),
even more widely used server such as the nginx, is a followling of the original source.
One more thing, the "\x7f" maybe risky for commandline environment (terminal) such as logging to some kind of console.
Fine, we can say the meaning of the 7F unicode codepoint has been unclear, it all depends.
Anyway, what I cause this issue is only because of the RFC defined below:
RFC2616 https://tools.ietf.org/html/rfc2616#page-17
message-header = field-name ":" [ field-value ]
field-name = token
token = 1*<any CHAR except CTLs or separators>
separators = "(" | ")" | "<" | ">" | "@"
| "," | ";" | ":" | "\" | <">
| "/" | "[" | "]" | "?" | "="
| "{" | "}" | SP | HT
RFC7230 https://tools.ietf.org/html/rfc7230#page-27
... Delimiters are chosen from the set of US-ASCII
visual characters not allowed in a token
(DQUOTE and "(),/:;<=>?@[\]{}").
header-field = field-name ":" OWS field-value OWS
field-name = token
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
; any VCHAR, except delimiters
which said separators or delimiters are excepted (the "/" is not allowed).
We have to be flexible regarding the RFC. I did not say that httpbin was RFC compliant.
Flexible but not too much.
For ref, look at encode/httpx#1363 + all related topics/issues on httpx deps.