kiwicom/pytest-recording

Decode compressed response

nomasprime opened this issue · 5 comments

I have the following in my conftest.py:

import pytest

# TESTS_DIR is defined elsewhere in this conftest.py
@pytest.fixture(scope="module")
def vcr_config():
    return {
        "cassette_library_dir": f"{TESTS_DIR}/fixtures/cassettes",
        "decode_compressed_response": True,
        "filter_headers": ["authorization"],
    }

But decoding is happening intermittently.

Hi @nomasprime!

But decoding is happening intermittently.

Could you please elaborate? Are there specific request/response pairs where it doesn't happen? Or how can I reproduce this behavior?

Hi @Stranger6667 🙂

Not decoding:

import pytest
import requests


class TestBaseSpider:
    get_headers = {}  # no User-Agent header set

    @pytest.mark.vcr
    def test_parse(self, faker):
        response = requests.get('https://www.google.com/search?q=test', headers=self.get_headers)

Decoding:

class TestBaseSpider:
    get_headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0'
    }

    @pytest.mark.vcr
    def test_parse(self, faker):
        response = requests.get('https://www.google.com/search?q=test', headers=self.get_headers)

It's not a problem if this depends on the user agent, but it would be nice to know why.

This behavior is not really documented in VCR.py, but it comes down to the payload encoding: the body that ends up decoded is UTF-8, and the one stored as base64 is ISO-8859-1.
So the ISO-8859-1 body raises an exception when VCR.py tries to decode it as UTF-8, and the conversion function falls back to returning bytes. The whole cassette is then passed to yaml.dump, which serializes bytes as a base64-encoded string and adds the !!binary tag to it. For a UTF-8 response, the body is converted to a string, which YAML serializes as a plain string, without extra tags.

This behavior is confusing, though. The option actually means "decode the compressed response if possible" :)

In other words, www.google.com returns responses in different encodings for different user agents :) And they are then stored differently.
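To make the fallback described above more concrete, here is a minimal sketch using only the standard library. It is not VCR.py's actual code: `decode_if_possible` is a hypothetical helper that mimics the "try UTF-8, otherwise keep bytes" behavior, and the base64 step only imitates what PyYAML does when it dumps a bytes value under the !!binary tag.

```python
import base64


def decode_if_possible(body: bytes):
    """Hypothetical helper: return str for valid UTF-8, else the raw bytes."""
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 (e.g. an ISO-8859-1 payload), so give up and keep bytes.
        return body


# The same text encoded two ways, standing in for the two Google responses.
utf8_body = "café".encode("utf-8")
latin1_body = "café".encode("iso-8859-1")

as_text = decode_if_possible(utf8_body)     # str: would be written as a plain YAML string
as_bytes = decode_if_possible(latin1_body)  # bytes: would be written as !!binary

# Roughly what ends up in the cassette for the bytes case: a base64 string.
stored = base64.b64encode(as_bytes).decode("ascii")

print(type(as_text).__name__, type(as_bytes).__name__)
```

A body stored this way can still be recovered by base64-decoding it and then decoding the bytes with the response's real charset, assuming that charset is known.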

I see, very good explanation. Thanks @Stranger6667.