meltano/sdk

bug: Inconsistent pytest results when running test_target_sqlite.py::test_hostile_to_sqlite

Singer SDK Version

0.36.0

Is this a regression?

  • Yes

Python Version

3.9

Bug scope

Other

Operating System

Windows

Description

While working on PR #1784, I found some tests involving SQLite that give inconsistent results, so I marked them as xfail in the PR. The PR was recently reviewed, and I was asked to open an issue. The test passes when run on its own but fails when the whole test file is run:

PS C:\development\MeltanoContribute\sdk> poetry run pytest -vv tests/samples/test_target_sqlite.py::test_hostile_to_sqlite
========================================================================================================================== test session starts ==========================================================================================================================
platform win32 -- Python 3.9.12, pytest-8.0.2, pluggy-1.4.0 -- C:\Users\dan.norman\AppData\Local\pypoetry\Cache\virtualenvs\singer-sdk-3f75caa4-py3.9\Scripts\python.exe
codspeed: 2.2.0 (callgraph: not supported)
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
sqlalchemy: 2.0.28
rootdir: C:\development\MeltanoContribute\sdk
configfile: pyproject.toml
plugins: benchmark-4.0.0, codspeed-2.2.0, snapshot-0.9.0, requests-mock-1.11.0, singer-sdk-0.0.0, time-machine-2.13.0, xdoctest-1.1.3
collected 1 item

tests/samples/test_target_sqlite.py::test_hostile_to_sqlite PASSED                                                                              
PS C:\development\MeltanoContribute\sdk> poetry run pytest -vv tests/samples/test_target_sqlite.py                        
========================================================================================================================== test session starts ==========================================================================================================================
platform win32 -- Python 3.9.12, pytest-8.0.2, pluggy-1.4.0 -- C:\Users\dan.norman\AppData\Local\pypoetry\Cache\virtualenvs\singer-sdk-3f75caa4-py3.9\Scripts\python.exe
codspeed: 2.2.0 (callgraph: not supported)
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
sqlalchemy: 2.0.28
rootdir: C:\development\MeltanoContribute\sdk
configfile: pyproject.toml
plugins: benchmark-4.0.0, codspeed-2.2.0, snapshot-0.9.0, requests-mock-1.11.0, singer-sdk-0.0.0, time-machine-2.13.0, xdoctest-1.1.3
collected 12 items

tests/samples/test_target_sqlite.py::test_sync_sqlite_to_sqlite PASSED                                                                                                                                                                                             [  8%]
tests/samples/test_target_sqlite.py::test_sqlite_schema_addition PASSED                                                                                                                                                                                            [ 16%] 
tests/samples/test_target_sqlite.py::test_sqlite_column_addition PASSED                                                                                                                                                                                            [ 25%]
tests/samples/test_target_sqlite.py::test_sqlite_activate_version PASSED                                                                                                                                                                                           [ 33%]
tests/samples/test_target_sqlite.py::test_sqlite_column_morph PASSED                                                                                                                                                                                               [ 41%]
tests/samples/test_target_sqlite.py::test_sqlite_process_batch_message PASSED                                                                                                                                                                                      [ 50%]
tests/samples/test_target_sqlite.py::test_sqlite_process_batch_parquet PASSED                                                                                                                                                                                      [ 58%]
tests/samples/test_target_sqlite.py::test_sqlite_column_no_morph PASSED                                                                                                                                                                                            [ 66%]
tests/samples/test_target_sqlite.py::test_record_with_missing_properties PASSED                                                                                                                                                                                    [ 75%]
tests/samples/test_target_sqlite.py::test_sqlite_generate_insert_statement[no_key_properties] PASSED                                                                                                                                                               [ 83%] 
tests/samples/test_target_sqlite.py::test_hostile_to_sqlite FAILED                                                                                                                                                                                                 [ 91%]
tests/samples/test_target_sqlite.py::test_overwrite_load_method PASSED                                          

I see similar behavior with these other tests:

  • test_target_sqlite.py::test_sync_sqlite_to_sqlite
  • test_tap_sqlite.py::test_sqlite_state

Code

def test_hostile_to_sqlite(
    sqlite_sample_target: SQLTarget,
    sqlite_target_test_config: dict,
):
    tap = SampleTapHostile()
    tap_to_target_sync_test(tap, sqlite_sample_target)
    # check if stream table was created
    db = sqlite3.connect(sqlite_target_test_config["path_to_db"])
    cursor = db.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
    tables = [res[0] for res in cursor.fetchall()]
    assert "hostile_property_names_stream" in tables
    # check if columns were conformed
    cursor.execute(
        dedent(
            """
            SELECT
                p.name as columnName
            FROM sqlite_master m
            left outer join pragma_table_info((m.name)) p
                on m.name <> p.name
            where m.name = 'hostile_property_names_stream'
            ;
            """,
        ),
    )
    columns = {res[0] for res in cursor.fetchall()}
    assert columns == {
        "name_with_spaces",
        "nameiscamelcase",
        "name_with_dashes",
        "name_with_dashes_and_mixed_cases",
        "gname_starts_with_number",
        "fname_starts_with_number",
        "hname_starts_with_number",
        "name_with_emoji_",
    }


sqlite_sample_target = <samples.sample_target_sqlite.SQLiteTarget object at 0x0000010FC4A67D30>
sqlite_target_test_config = {'hard_delete': False, 'load_method': <TargetLoadMethods.APPEND_ONLY: 'append-only'>, 'path_to_db': 'C:\\Users\\dan.no...ppData\\Local\\Temp\\pytest-of-unknown\\pytest-219\\test_hostile_to_sqlite0\\target_test.db', 'validate_records': True}

    def test_hostile_to_sqlite(
        sqlite_sample_target: SQLTarget,
        sqlite_target_test_config: dict,
    ):
        tap = SampleTapHostile()
>       tap_to_target_sync_test(tap, sqlite_sample_target)

tests\samples\test_target_sqlite.py:539:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
singer_sdk\testing\legacy.py:227: in tap_to_target_sync_test
    target_stdout, target_stderr = target_sync_test(target, tap_stdout, finalize=True)
singer_sdk\testing\legacy.py:199: in target_sync_test
    target._process_lines(input)  # noqa: SLF001
singer_sdk\target_base.py:307: in _process_lines
    counter = super()._process_lines(file_input)
singer_sdk\io_base.py:117: in _process_lines
    line_dict = self.deserialize_json(line)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <samples.sample_target_sqlite.SQLiteTarget object at 0x0000010FC4A67D30>, line = '\n'

    def deserialize_json(self, line: bytes) -> dict:
        """Deserialize a line of json.

        Args:
            line: A single line of json.

        Returns:
            A dictionary of the deserialized json.

        Raises:
            msgspec.DecodeError: raised if any lines are not valid json
        """
        try:
>           return decoder.decode(  # type: ignore[no-any-return]
                line,
            )
E           msgspec.DecodeError: Input data was truncated

singer_sdk\io_base.py:99: DecodeError
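
For reference, the traceback shows that the line reaching deserialize_json is just '\n'. Here is a minimal sketch of that failure mode. It uses the standard-library json module as a stand-in for msgspec, and decode_lines and skip_blank are hypothetical names for illustration, not SDK code. It only shows that a newline-only line is not valid JSON and one possible way a reader loop could guard against it. It does not explain why the blank line only shows up when the whole file is run.

```python
import json
from typing import Iterable, Iterator


def decode_lines(lines: Iterable[str], skip_blank: bool = False) -> Iterator[dict]:
    """Decode one JSON object per line (hypothetical helper, not SDK code)."""
    for line in lines:
        if skip_blank and not line.strip():
            # Whitespace-only input such as "\n" carries no message; ignore it.
            continue
        yield json.loads(line)


# One valid message followed by a newline-only line, as seen in the traceback.
messages = ['{"type": "STATE", "value": {}}\n', "\n"]

# Without the guard, the newline-only line fails to decode.
try:
    list(decode_lines(messages))
    failed = False
except json.JSONDecodeError:
    failed = True

# With the guard, only the real message is returned.
decoded = list(decode_lines(messages, skip_blank=True))
```

Whether skipping blank lines is the right fix or just hides a problem upstream (for example in how the tap output is captured on Windows) is an open question.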