Decode compressed response
nomasprime opened this issue · 5 comments
I have the following in my conftest.py:
@pytest.fixture(scope="module")
def vcr_config():
    return {
        "cassette_library_dir": f"{TESTS_DIR}/fixtures/cassettes",
        "decode_compressed_response": True,
        "filter_headers": ["authorization"]
    }
But decoding is happening intermittently.
Hi @nomasprime!
But decoding is happening intermittently.
Could you please elaborate? Are there specific request / response pairs where it doesn't happen? Or how can I reproduce this behavior?
Hi @Stranger6667 🙂
Not decoding:
class TestBaseSpider:
    get_headers = {}

    @pytest.mark.vcr
    def test_parse(self, faker):
        response = requests.get('https://www.google.com/search?q=test', headers=self.get_headers)
Decoding:
class TestBaseSpider:
    get_headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0'
    }

    @pytest.mark.vcr
    def test_parse(self, faker):
        response = requests.get('https://www.google.com/search?q=test', headers=self.get_headers)
Not a problem if it's dependent on user agent but would be nice to know why.
The behavior is not really documented in VCRpy, but it happens because of the payload encoding: the decoded one is UTF-8, and the base64 one is ISO-8859-1.
So, the first one raises an exception here and that function returns bytes. Then the whole cassette data is passed to yaml.dump, which serializes bytes to a base64-encoded string and adds the !!binary tag to it. For a UTF-8 response the body is converted to a string, which yaml serializes to a string as well, without extra tags.
However, this behavior is indeed confusing. The option actually means "decode compressed response if possible" :)
I.e. www.google.com returns responses in different encodings for different user agents :) And then they are stored differently.
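To make the str-versus-bytes split described above concrete, here is a minimal standard-library sketch. It is not VCRpy's actual code: `to_cassette_body` and `describe` are hypothetical helpers, and `describe` only mimics the way a YAML dumper writes bytes as a base64 string with a `!!binary` tag.

```python
import base64


def to_cassette_body(raw: bytes):
    """Hypothetical helper: return str if raw is valid UTF-8, else the raw bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 (e.g. an ISO-8859-1 payload): keep the bytes as they are.
        return raw


def describe(body) -> str:
    """Simplified stand-in for how a YAML dumper would store the value."""
    if isinstance(body, bytes):
        # Bytes end up base64-encoded and tagged as binary.
        return "!!binary " + base64.b64encode(body).decode("ascii")
    # Strings are stored as plain text, without extra tags.
    return body


utf8_body = "café".encode("utf-8")
latin1_body = "café".encode("iso-8859-1")

print(describe(to_cassette_body(utf8_body)))    # readable text
print(describe(to_cassette_body(latin1_body)))  # base64 with a !!binary tag
```

The same text therefore lands in the cassette either as readable text or as a base64 blob, depending only on which encoding the server chose for the response.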
I see, very good explanation. Thanks @Stranger6667.