roadrunner-server/roadrunner

[🐛 BUG]: Wrong query parse for application/x-www-form-urlencoded request

Kaspiman opened this issue · 2 comments

No duplicates 🥲.

  • I have searched for a similar issue in our bug tracker and didn't find any solutions.

What happened?

A bug happened!

Version (rr --version)

rr version 2024.1.5 (build time: 2024-06-20T19:10:34+0000, go1.22.4), OS: linux, arch: amd64

How to reproduce the issue?

A typical PHP worker that returns the raw request body and the parsed body:

while (($request = $worker->waitRequest())) {
    $stream = new Stream('php://memory', 'rw');

    $resp = [
        'getContents' => $request->getBody()->getContents(),
        'getParsedBody' => $request->getParsedBody(),
    ];
    
    $stream->write(print_r($resp, true));

    $worker->respond(
        (new ResponseFactory())->createResponse()->withBody($stream),
    );
}

Typical RR config:

http:
    address: :8090
    raw_body: false
    pool:
        num_workers: 2

Request:

curl 'http://127.0.0.1:8090' -H 'accept: application/json, text/javascript, */*; q=0.01' -H 'accept-language: ru,ru-RU;q=0.9,en-US;q=0.8,en;q=0.7'  -H 'content-type: application/x-www-form-urlencoded; charset=UTF-8' --data-raw 'columns%5B%5Drow1=fullname&columns%5B%5Drow2=phone&columns%5B%5Drow3=0&columns%5B%5Drow4=0&columns%5B%5Drow5=0&columns%5B%5Drow6=0&columns%5B%5Drow7=0&columns%5B%5Drow8=0&columns%5B%5Drow9=0&columns%5B%5Drow10=0&columns%5B%5Drow11=0' -X POST


Expected output:

Array
(
    [getContents] => {"columns":{"row1":"fullname","row10":"0","row11":"0","row2":"phone","row3":"0","row4":"0","row5":"0","row6":"0","row7":"0","row8":"0","row9":"0"}}
    [getParsedBody] => Array
        (
            [columns] => Array
                (
                    [row1] => fullname
                    [row10] => 0
                    [row11] => 0
                    [row2] => phone
                    [row3] => 0
                    [row4] => 0
                    [row5] => 0
                    [row6] => 0
                    [row7] => 0
                    [row8] => 0
                    [row9] => 0

                )
        )
)

Relevant log output

Array
(
    [getContents] => {"columns":{"":{"row1":"fullname","row10":"0","row11":"0","row2":"phone","row3":"0","row4":"0","row5":"0","row6":"0","row7":"0","row8":"0","row9":"0"}}}       
    [getParsedBody] => Array
        (
            [columns] => Array
                (
                    [] => Array
                        (
                            [row1] => fullname
                            [row10] => 0
                            [row11] => 0
                            [row2] => phone
                            [row3] => 0
                            [row4] => 0
                            [row5] => 0
                            [row6] => 0
                            [row7] => 0
                            [row8] => 0
                            [row9] => 0
                        )

                )

        )
)

Hey @Kaspiman 👋
Thank you for the detailed explanation. I double-checked your case and I think this is not a bug. Your application/x-www-form-urlencoded request contains an empty [] key, and RR treats that as an empty string "". That is why you get the additional empty key in {"columns":{"": ...}}.

Explanation:

Let's parse a few strings containing query params or application/x-www-form-urlencoded data in PHP:

<?php

$str1 = "a[]=1&a[]=2&a[]=3";

$str2 = "a[]1=1&a[]2=2&a[]3=3";

$str3 = "a[1]1=1&a[2]2=2&a[3]3=3";

parse_str($str1, $result1);
parse_str($str2, $result2);
parse_str($str3, $result3);

print_r($result1);
print_r($result2);
print_r($result3);

Then we get similar results with the same structure:

Array
(
    [a] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )
)
Array
(
    [a] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )
)
Array
(
    [a] => Array
        (
            [1] => 1
            [2] => 2
            [3] => 3
        )
)

Great, all PHP frameworks do this every day!

However, in Go and other languages this is not the same thing:

package main

import (
	"fmt"
	"net/url"
)

func main() {
	s1 := "a[]=1&a[]=2&a[]=3"
	fmt.Println(url.ParseQuery(s1))

	s2 := "a[]1=1&a[]2=2&a[]3=3"
	fmt.Println(url.ParseQuery(s2))

	s3 := "a[1]1=1&a[2]2=2&a[3]3=3"
	fmt.Println(url.ParseQuery(s3))
}

Output:

map[a[]:[1 2 3]] <nil>

map[a[]1:[1] a[]2:[2] a[]3:[3]] <nil>

map[a[1]1:[1] a[2]2:[2] a[3]3:[3]] <nil>

See how the structure has changed compared to the previous output from PHP.

The problem here is that the specifications (RFC 3986 and the urlencoded form format) only define the basics, such as & as the separator. Nothing specifies how to parse a key like foo[bar]baz. The behavior differs not only between Go and PHP, but between PHP and most of the world.