omarryhan/aiogoogle

Bug in URL path parameter quoting

pyryjook opened this issue · 10 comments

Hi,

First of all thanks for a great library you have built! I'm using aiogoogle to fetch Google Search Console API and noticed that url path parameter quoting does not work as expected. Please find detailed explanation from below.

Bug description:
The API docs show that for Search Analytics the endpoint with url parameters follows this template:

POST  /sites/{siteUrl}/searchAnalytics/query

so with siteUrl being https://example.com/ the final URL should look like this:

POST /sites/https%3A%2F%2Fwww.example.com%2F/searchAnalytics/query

Currently there is a bug in the library that produces an URL that looks like this

POST /sites/https%3A//example.com//searchAnalytics/query

As the resulted URL contains extra slashes the API returns 404 for requests that have URL type values as their path parameters.

Proposed fix:
I believe this can be fixed by overriding the default safe parameter with a value '' in urllib.parse.quote(string,(url) safe='/', encoding=None, errors=None). quote function is called here:

sorted_required_path_params[k] = quote(str(v))

Hey, thank you so much for the clear bug report and for the PR!!
Making a new release now.

Released in v4.2.0

Good stuff! Thanks for a quick merge and release.

Hey @pyryjook

I just reverted the fix you made for this issue and drafter a new breaking release.
The problem is that some APIs expect forward slashes to be escaped and some don't. I thought it would be better if by default we don't escape it, and for the APIs that require escaping we do outside of the lib before passing it to the lib, because the opposite will be a lot harder i.e. decode the specific parameter after it has been added to the URL.

Please check #104 & #113 for more context. Sorry for the inconvenience.

Hey @omarryhan ,

First of all thanks for pinging me.

Yeah, the default behavior as you described makes totally sense.

When I update the version on my side I'll do the proposed changes there.

Cheers!

Awesome, thanks for understanding and for your work here :)

Cheers!

Hi, @omarryhan
we've been facing an encoding issue due to the latest release v5.0.0.
Suppose, There is a directoryFolder1 inside a bucketbucket_name, and an API call is made to fetch the objectFolder1/File1.txt's data.
the URL should look like this:

GET https://storage.googleapis.com/storage/v1/b/bucket_name/o/Folder1%2FFile1.txt?alt=media

but with the latest release, the URL looks like this:

GET https://storage.googleapis.com/storage/v1/b/bucket_name/o/Folder1/File1.txt?alt=media

I tried your suggestion of encoding the / before sending it to the lib, the URL looks like this:

GET https://storage.googleapis.com/storage/v1/b/bucket_name/o/Folder1%252FFile1.txt?alt=media

I suppose it's due to double encoding,

  1. The user encodes / => %2F then,
  2. The lib encodes % => %25 resulting in / => %252F

Ughh makes sense. Do you happen to know how other Google libs handle this? Maybe we can use a similar approach. Or we can skip encoding entirely for URL path parameters and make it entirely the client's responsibility.

Hey @omarryhan, I looked into the client library provided by google, and I found that the safe characters are applied conditionally. You can refer :

  1. safe-chars=b"/~"
  2. safe-chars=b"/~"
  3. safe-chars=b"~"

we found a way by which we can avoid conditionals by bringing up the safe_chars parameter so that users can configure it as their need when calling any resource method.

@jignesh-crest Thanks for digging into this! I am happy to accept a PR that adds this parameter. Not sure though how it should be added though, one parameter that's used for all path params? or one safe char variable per path parameter?