Oldes/Rebol3

read/system/input skips the last character when used in CGI context

fvanzeveren opened this issue · 22 comments

I have been following the Creating and Processing Web Forms with CGI (Tutorial) for Rebol, rewriting the code where necessary to make it work with your Rebol3 fork.
I came across an issue when using read/system/input to read the data of a POST request: the last character is skipped.
For example, the expected content is Submit=submit&firstname=Jon&lastname=Doom, but the string returned by read/system/input is Submit=submit&firstname=Jon&lastname=Doo, without the last character ('m').

The workaround I found is to add a hidden field with a one-character value as the last field of my form.

Regards

cgi-helpers.txt
greetings.txt
simple-webform.txt

Oldes commented

Hi @fvanzeveren, the tutorial is for Rebol2... you may want to look at https://stackoverflow.com/a/14184638/494472
But if a byte is missing after read system/ports/input, then it may be a bug. Which OS are you on?

Oldes commented

Trying on linux with file test.cgi:

#!/usr/local/bin/rebol3 -cs
REBOL []
print "Input:"
probe read/string system/ports/input

and then from command line:

echo "Submit=submit&test=1" | rebol3 -cs test.cgi

I can see correct result:

Input:
"Submit=submit&test=1"

Oldes commented

Also... your decode-cgi is weird. You should not modify the input string. What is your Rebol version?

Oldes commented

Btw... you may want to check this: https://github.com/Oldes/Rebol-HTTPd

Hello Oldes
The decode-cgi is Carl's; I got it with source decode-cgi.
I am using REBOL/Bulk 3.10.5
I have tried the code under the following systems:

  • Debian 11 x86 with Apache 2.4,
  • Debian 11 WSL (with Windows 11) with Apache 2.4

and the last byte gets lost on both.

I will now try on Haiku with lighttpd and let you know!

Thank you.

Oldes commented

I don't have Apache installed on any device. What do you see when you use the above-mentioned command in the terminal?

Hello Oldes
In a CLI environment it works fine, but I confirm the bug in a real server/CGI environment.
Same issue under Haiku OS with lighttpd.
You can also check the problem on my Debian 11 server:

Oldes commented

Ok... I will try to set up a CGI server to figure it out.

Oldes commented

@fvanzeveren could you please, meanwhile, try it with one of these old official R3-alpha builds? http://www.rebol.com/r3/downloads.html Or these old community builds? https://rebolsource.net/

Oldes commented

It looks like CGI was never really implemented in Rebol3... at least the --cgi flag is not included in the original sources, and although it is still collected on the C side, I don't see it used anywhere in the code.

Oldes commented

Btw... I managed to reproduce it on Windows with lighttpd. I can see 2 missing chars (which is logical, because on Windows the line break is CRLF and not just LF). But unfortunately I am not familiar with how the data are passed to the app using CGI :/ And one needs not just the raw data, but also info from the request's header.
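
The one-versus-two missing characters can be illustrated with a small model. This is only a sketch under an assumption inferred from the observation above: that the reader always drops a line terminator from the end of the input, even when the POST body does not end with one. `drop-terminator` is a hypothetical helper written for this illustration, not a Rebol function and not the actual port code.

```rebol
REBOL []

body: "Submit=submit&firstname=Jon&lastname=Doom"

; Hypothetical helper (illustration only): returns a copy of `data`
; without its last `n` characters, standing in for a reader that
; assumes a line terminator of `n` bytes is always present.
drop-terminator: func [data [string!] n [integer!]] [
    copy/part data (length? data) - n
]

probe drop-terminator body 1  ; LF assumed   -> "Submit=submit&firstname=Jon&lastname=Doo"
probe drop-terminator body 2  ; CRLF assumed -> "Submit=submit&firstname=Jon&lastname=Do"
```

If this model is right, it would also explain why the echo test from the command line looked correct: echo appends a newline, so the dropped byte is the newline rather than part of the data. That is an inference, not something confirmed in this thread.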

Oldes commented

Ok.. it looks like the header info is available in the environment variables... so this works:

probe get-env "HTTP_USER_AGENT"

Full list is available here: https://www.tcl.tk/man/aolserver3.0/cgi-ch4.html
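
Putting the two halves together, a minimal CGI script might look like the sketch below: request metadata is read from environment variables and the request body from standard input. It uses only calls that appear in this thread (`get-env`, `modify`, `read`); the variable names `method`, `declared` and `raw` are illustrative, and the `Content-Type` line is the usual CGI response header rather than anything specific to Rebol.

```rebol
REBOL []

; A CGI response starts with a header block followed by an empty line.
print "Content-Type: text/plain^/"

; Request metadata arrives as environment variables.
method:   get-env "REQUEST_METHOD"
declared: get-env "CONTENT_LENGTH"   ; may be missing for requests without a body

; The request body arrives on standard input.
; Line mode is switched off first, as suggested elsewhere in this thread.
attempt [modify system/ports/input 'line false]
raw: read system/ports/input         ; binary data

print ["Method:" method]
print ["Declared length:" declared]
print ["Received length:" length? raw]
print ["Body:" to string! raw]
```

Comparing the declared and received lengths is a quick way to spot a truncated body.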

Oldes commented

@fvanzeveren I am able to get the correct data, when I turn off the line mode manually.
Try such a code as your CGI script:

print list-env
attempt [
	;- disable line input...              
	modify system/ports/input 'line false
	;- read raw data...                   
	probe raw: read system/ports/input
	probe to string! raw
]

(tested only on Windows)

Do we want to have the CGI object filled automatically like in Rebol2?

Note: one can use read/string directly (eliminate the probe to string! raw line)
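
As a sketch, the shorter variant described in the note would be the following; like the script above, it is assumed to run as a CGI script with the POST body on standard input.

```rebol
attempt [
	;- disable line input...
	modify system/ports/input 'line false
	;- read the data directly as a string...
	probe read/string system/ports/input
]
```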

Oldes commented

It is possible to test the POST method directly from Rebol... for example:

print to-string write http://localhost:8082/test.cgi [POST [My-var: "foo"] "a=2&b=2"]

will output:

"CONTENT_LENGTH" "7"
"DOCUMENT_ROOT" "C:/Dev/UTILS/lighttpd/htdocs"
"GATEWAY_INTERFACE" "CGI/1.1"
"HTTP_ACCEPT" "*/*"
"HTTP_ACCEPT_CHARSET" "utf-8"
"HTTP_ACCEPT_ENCODING" "gzip,deflate"
"HTTP_CONTENT_LENGTH" "7"
"HTTP_HOST" "localhost:8082"
"HTTP_MY_VAR" "foo"
"HTTP_USER_AGENT" "rebol/3.10.5 (Windows; x64)"
"PWD" "C:\Dev\UTILS\lighttpd\htdocs\"
"REDIRECT_STATUS" "200"
"REMOTE_ADDR" "127.0.0.1"
"REMOTE_PORT" "64446"
"REQUEST_METHOD" "POST"
"REQUEST_SCHEME" "http"
"REQUEST_URI" "/test.cgi"
"SCRIPT_FILENAME" "C:/Dev/UTILS/lighttpd/htdocs/test.cgi"
"SCRIPT_NAME" "/test.cgi"
"SERVER_ADDR" "127.0.0.1"
"SERVER_NAME" "localhost"
"SERVER_PORT" "8082"
"SERVER_PROTOCOL" "HTTP/1.1"
"SERVER_SOFTWARE" "lighttpd/1.4.49"
"SYSTEMROOT" "C:\WINDOWS"
"WINDIR" "C:\WINDOWS"
"PATH" "c:\Dev\UTILS\lighttpd"
#{613D3226623D32}
"a=2&b=2"

(Note the HTTP_MY_VAR value included)

Oldes commented

I found a problematic commit: 5eb7fb2
It looks like the above code will not fix the issue on Posix platforms. I will push a fix.

Hello Oldes
Sorry I could not answer earlier.
I confirm the above code did not solve the issue on a Posix system (Linux).
Do we want Rebol3 to populate the CGI object automatically? Why not, but it is mostly sugar, as accessing env variables is so simple with 'get-env. This is what my %cgi-helpers.r3 does.
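
For illustration, filling such an object by hand could look like the sketch below. It is not the content of %cgi-helpers.r3, and the field names are hypothetical choices for this example; only `get-env` and the variable names from the listing above are taken from this thread.

```rebol
REBOL []

; Hypothetical stand-in for an automatically filled CGI object.
; Each field simply mirrors one environment variable set by the server.
cgi: make object! [
    request-method: get-env "REQUEST_METHOD"
    content-length: get-env "CONTENT_LENGTH"
    script-name:    get-env "SCRIPT_NAME"
    remote-addr:    get-env "REMOTE_ADDR"
]

probe cgi/request-method
```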

Thanks!

@Oldes
I see you have made some commits.
I cloned the master branch, thinking these commits were applied to it, and rebuilt rebol3. But nothing seems to have changed. I still have the same issue on a POSIX system.
How can I get those commits for my build?
Sorry, I am not really used to GitHub.

Thanks.

Oldes commented

Ok.. I will set up a CGI server on my Linux machine to try it myself. But it should work with the latest commits.

@Oldes
Did you change the version number? After cloning the master branch just now and building rebol3, it is still 3.10.5.
Regards

Oldes commented

No.. not yet. Give me a few more days:)

Oldes commented

@fvanzeveren the version number was updated, and I can confirm that when I put the line modify system/ports/input 'line false before reading the input in the CGI script, the data are not truncated on my Linux machine with a lighttpd server.

@Oldes

I tested on Debian x32 and x64... and it works fine!
I still don't understand the purpose of modify system/ports/input 'line false, but it works... so it's fine.

Thank you!