ImpulsoGov/etl

Error capturing RAAS-PS - ES 10/2016

bcbernardo opened this issue · 0 comments

Error capturing the dissemination files of the Registros de Ações Ambulatoriais em Saúde - Psicossociais (RAAS-PS, psychosocial outpatient care records) for the 10/2016 reference period in the state of Espírito Santo:

2022-09-04 09:29:36.091 | INFO     | impulsoetl.scripts.saude_mental:raas_disseminacao:122 - Capturando RAAS Psicossociais do SIASUS.
2022-09-04 09:29:38.250 | INFO     | impulsoetl.siasus.raas_ps:obter_raas_ps:364 - Iniciando captura de RAAS-Psicossociais para Unidade Federativa Federativa 'ES' na competencia de 10/2016.
2022-09-04 09:29:38.251 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:145 - Conectando-se ao servidor FTP `ftp.datasus.gov.br`...
2022-09-04 09:29:39.073 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:148 - Conexão estabelecida com sucesso!
2022-09-04 09:29:39.074 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:152 - Buscando diretório `/dissemin/publicos/SIASUS/200801_/Dados`...
2022-09-04 09:29:39.205 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:154 - OK!
2022-09-04 09:29:39.206 | INFO     | impulsoetl.utilitarios.datasus_ftp:_listar_arquivos:82 - Listando arquivos compatíveis...
2022-09-04 09:29:40.722 | INFO     | impulsoetl.utilitarios.datasus_ftp:_listar_arquivos:100 - Encontrados 1 arquivos.
2022-09-04 09:29:40.723 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:161 - Preparando ambiente para o download...
2022-09-04 09:29:40.724 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:167 - Tudo pronto para o download.
2022-09-04 09:29:40.725 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:177 - Iniciando download do arquivo `PSES1610.dbc`...
2022-09-04 09:29:42.578 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:184 - Download concluído.
2022-09-04 09:29:42.710 | INFO     | impulsoetl.utilitarios.datasus_ftp:_checar_arquivo_corrompido:34 - Checando integridade do arquivo baixado...
2022-09-04 09:29:42.711 | DEBUG    | impulsoetl.utilitarios.datasus_ftp:_checar_arquivo_corrompido:35 - Tamanho declarado do arquivo no FTP: 110000 bytes
2022-09-04 09:29:42.711 | DEBUG    | impulsoetl.utilitarios.datasus_ftp:_checar_arquivo_corrompido:39 - Tamanho do arquivo baixado: 110000 bytes
2022-09-04 09:29:42.711 | INFO     | impulsoetl.utilitarios.datasus_ftp:_checar_arquivo_corrompido:54 - OK!
2022-09-04 09:29:42.712 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:201 - Descompactando arquivo DBC...
2022-09-04 09:29:42.718 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:204 - Lendo arquivo DBF...
2022-09-04 09:29:42.720 | INFO     | impulsoetl.utilitarios.datasus_ftp:extrair_dbc_lotes:214 - Lendo trecho do arquivo DBF disponibilizado pelo DataSUS e convertendo em DataFrame (linhas 0 a 400000)...
2022-09-04 09:29:43.136 | INFO     | impulsoetl.siasus.raas_ps:transformar_raas_ps:231 - Transformando DataFrame com 4706 registros de RAAS.
2022-09-04 09:29:45.431 | ERROR    | __main__:<module>:2 - An error has been caught in function '<module>', process 'MainProcess' (154171), thread 'MainThread' (139960620660544):
Traceback (most recent call last):

  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2211, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
                        │          │                      │    └ <method 'ravel' of 'numpy.ndarray' objects>
                        │          │                      └ array([nan, '  YY10DD'], dtype=object)
                        │          └ <built-in function datetime_to_datetime64>
                        └ <module 'pandas._libs.tslibs.conversion' from '/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/_libs/tslibs/conve...
  File "pandas/_libs/tslibs/conversion.pyx", line 360, in pandas._libs.tslibs.conversion.datetime_to_datetime64
    raise TypeError(f'Unrecognized value type: {type(val)}')

TypeError: Unrecognized value type: <class 'str'>


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

> File "<stdin>", line 2, in <module>

  File "/home/bernardo/etl/src/impulsoetl/scripts/saude_mental.py", line 134, in raas_disseminacao
    obter_raas_ps(
    └ <function obter_raas_ps at 0x7f4b03fb4dc0>

  File "/home/bernardo/etl/src/impulsoetl/siasus/raas_ps.py", line 382, in obter_raas_ps
    raas_ps_transformada = transformar_raas_ps(
                           └ <function transformar_raas_ps at 0x7f4b03fb4d30>

  File "/home/bernardo/etl/src/impulsoetl/siasus/raas_ps.py", line 245, in transformar_raas_ps
    raas_ps  # noqa: WPS221  # ignorar linha complexa no pipeline
    └      CNES_EXEC  GESTAO CONDIC   UFMUN TPUPS TIPPRE   MN_IND         CNPJCPF  ... TP_DROGA LOC_REALIZ    INICIO       FIM PERM...

  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/generic.py", line 5898, in astype
    res_col = col.astype(dtype=cdt, copy=copy, errors=errors)
              │   │            │         │            └ 'raise'
              │   │            │         └ True
              │   │            └ 'datetime64[ns]'
              │   └ <function NDFrame.astype at 0x7f4b10c39e50>
              └ 0       NaN
                1       NaN
                2       NaN
                3       NaN
                4       NaN
                       ... 
                4701    NaN
                4702    NaN
                4703    NaN
                4704    NaN
                4705 ...
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/generic.py", line 5912, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
               │    │    │            │           │            └ 'raise'
               │    │    │            │           └ True
               │    │    │            └ 'datetime64[ns]'
               │    │    └ <function BaseBlockManager.astype at 0x7f4b10d89940>
               │    └ SingleBlockManager
               │      Items: RangeIndex(start=0, stop=4706, step=1)
               │      ObjectBlock: 4706 dtype: object
               └ 0       NaN
                 1       NaN
                 2       NaN
                 3       NaN
                 4       NaN
                        ... 
                 4701    NaN
                 4702    NaN
                 4703    NaN
                 4704    NaN
                 4705 ...
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 419, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
           │    │                     │           │            └ 'raise'
           │    │                     │           └ True
           │    │                     └ 'datetime64[ns]'
           │    └ <function BaseBlockManager.apply at 0x7f4b10d894c0>
           └ SingleBlockManager
             Items: RangeIndex(start=0, stop=4706, step=1)
             ObjectBlock: 4706 dtype: object
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 304, in apply
    applied = getattr(b, f)(**kwargs)
                      │  │    └ {'dtype': 'datetime64[ns]', 'copy': True, 'errors': 'raise'}
                      │  └ 'astype'
                      └ ObjectBlock: 4706 dtype: object
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 580, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
                 │                 │       │           │            └ 'raise'
                 │                 │       │           └ True
                 │                 │       └ 'datetime64[ns]'
                 │                 └ array([nan, nan, nan, ..., nan, nan, nan], dtype=object)
                 └ <function astype_array_safe at 0x7f4b11944700>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1292, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
                 │            │       │           └ True
                 │            │       └ dtype('<M8[ns]')
                 │            └ array([nan, nan, nan, ..., nan, nan, nan], dtype=object)
                 └ <function astype_array at 0x7f4b11944670>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1237, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)
             │              │       │           └ True
             │              │       └ dtype('<M8[ns]')
             │              └ array([nan, nan, nan, ..., nan, nan, nan], dtype=object)
             └ <function astype_nansafe at 0x7f4b11944550>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1163, in astype_nansafe
    to_datetime(arr).values,
    │           └ array([nan, nan, nan, ..., nan, nan, nan], dtype=object)
    └ <function to_datetime at 0x7f4b10bf3ee0>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 1063, in to_datetime
    cache_array = _maybe_cache(arg, format, cache, convert_listlike)
                  │            │    │       │      └ functools.partial(<function _convert_listlike_datetimes at 0x7f4b10bf3c10>, tz=None, unit=None, dayfirst=False, yearfirst=Fal...
                  │            │    │       └ True
                  │            │    └ None
                  │            └ array([nan, nan, nan, ..., nan, nan, nan], dtype=object)
                  └ <function _maybe_cache at 0x7f4b10bf39d0>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 197, in _maybe_cache
    cache_dates = convert_listlike(unique_dates, format)
                  │                │             └ None
                  │                └ array([nan, '  YY10DD'], dtype=object)
                  └ functools.partial(<function _convert_listlike_datetimes at 0x7f4b10bf3c10>, tz=None, unit=None, dayfirst=False, yearfirst=Fal...
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 402, in _convert_listlike_datetimes
    result, tz_parsed = objects_to_datetime64ns(
                        └ <function objects_to_datetime64ns at 0x7f4b117145e0>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2217, in objects_to_datetime64ns
    raise err
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2199, in objects_to_datetime64ns
    result, tz_parsed = tslib.array_to_datetime(
                        │     └ <built-in function array_to_datetime>
                        └ <module 'pandas._libs.tslib' from '/home/bernardo/etl/.venv/lib/python3.9/site-packages/pandas/_libs/tslib.cpython-39-x86_64-...
  File "pandas/_libs/tslib.pyx", line 381, in pandas._libs.tslib.array_to_datetime
    cpdef array_to_datetime(
          └ <built-in function array_to_datetime>
  File "pandas/_libs/tslib.pyx", line 613, in pandas._libs.tslib.array_to_datetime
    return _array_to_datetime_object(values, errors, dayfirst, yearfirst)
  File "pandas/_libs/tslib.pyx", line 751, in pandas._libs.tslib._array_to_datetime_object
    raise
  File "pandas/_libs/tslib.pyx", line 742, in pandas._libs.tslib._array_to_datetime_object
    oresult[i] = parse_datetime_string(val, dayfirst=dayfirst,
                 └ <built-in function parse_datetime_string>
  File "pandas/_libs/tslibs/parsing.pyx", line 281, in pandas._libs.tslibs.parsing.parse_datetime_string
    dt = du_parse(date_string, default=_DEFAULT_DATETIME,
         │                             └ datetime.datetime(1, 1, 1, 0, 0)
         └ <function parse at 0x7f4b15a45b80>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/dateutil/parser/_parser.py", line 1368, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
           │             │     │          └ {'default': datetime.datetime(1, 1, 1, 0, 0), 'dayfirst': False, 'yearfirst': False}
           │             │     └ '  YY10DD'
           │             └ <function parser.parse at 0x7f4b158e1f70>
           └ <dateutil.parser._parser.parser object at 0x7f4b158dc760>
  File "/home/bernardo/etl/.venv/lib/python3.9/site-packages/dateutil/parser/_parser.py", line 643, in parse
    raise ParserError("Unknown string format: %s", timestr)
          │                                        └ '  YY10DD'
          └ <class 'dateutil.parser._parser.ParserError'>

dateutil.parser._parser.ParserError: Unknown string format:   YY10DD
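
The traceback shows `astype` being called with `'datetime64[ns]'` and `errors='raise'` on a column whose only non-null value is the placeholder string `'  YY10DD'`. One possible workaround, sketched below, is to parse such columns with `pandas.to_datetime` and `errors="coerce"`, so that unparseable placeholders become `NaT` instead of aborting the whole transformation. This is only a sketch: the helper name `parse_dates_leniently` is hypothetical, and the `%Y%m%d` format is an assumption about how valid dates are encoded in the RAAS-PS files, not something confirmed by the log.

```python
import pandas as pd


def parse_dates_leniently(column: pd.Series, fmt: str = "%Y%m%d") -> pd.Series:
    """Convert a column of date strings to datetime64[ns].

    Hypothetical helper: values that do not match `fmt` (assumed to be
    YYYYMMDD) are turned into NaT instead of raising an exception.
    """
    return pd.to_datetime(column, format=fmt, errors="coerce")


# Sample values modeled on the array shown in the traceback, plus one
# well-formed date (assumed format) for comparison.
sample = pd.Series([float("nan"), "  YY10DD", "20161015"], dtype=object)
parsed = parse_dates_leniently(sample)
print(parsed)
```

Coercing silently hides malformed values, so it may be worth logging how many entries were turned into `NaT` before loading the data.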