danieldotnl/ha-multiscrape

Max length is 255 characters

ostracizado opened this issue · 2 comments

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1175, in _async_write_ha_state
    hass.states.async_set(
  File "/usr/src/homeassistant/homeassistant/core.py", line 1951, in async_set
    state = State(
            ^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 1468, in __init__
    validate_state(state)
  File "/usr/src/homeassistant/homeassistant/core.py", line 209, in validate_state
    raise InvalidStateError(
homeassistant.exceptions.InvalidStateError: Invalid state with length 314. State max length is 255 characters.

Is there a way to bypass that limit?

I'm trying to scrape a table:

  - name: Football Data
    log_response: true
    resource: https://tudonumclick.com/futebol/jogos-na-tv/
    scan_interval: 60
    sensor:
      - unique_id: football_dates
        name: Football Dates
        select: "section#page-content h3:nth-child(3)"
      - unique_id: football_games
        name: Football Games
        select_list: "section#page-content table:nth-child(4)"

The output should be:

[ "18:30 Leverkusen  VfL Wolfsbur…", "19:00 Sampaio Corrêa-RJ  Botaf…", "19:00 Sp. Braga B  Lourosa CAN…", "19:45 Fiorentina  Roma SPORT T…", "19:45 Marseille  Nantes ELEVEN…", "20:00 Bétis  Villarreal ELEVEN…", "20:30 Benfica   Estoril BENFIC…", "21:30 Vasco  Nova Iguaçu PFC  …" ]

This is a Home Assistant limitation, not a Multiscrape one. It does not apply to attributes, though, so you could store your results in the attributes of a sensor.
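To illustrate the difference, here is a minimal Python sketch of the idea. It is not Home Assistant's actual code: `MAX_STATE_LENGTH`, `fits_in_state` and `split_state_and_attributes` are hypothetical names, and the 255 figure is simply taken from the error message above.

```python
# Illustrative only: mimics the length check reported in the traceback.
MAX_STATE_LENGTH = 255  # limit quoted in the InvalidStateError message


def fits_in_state(value: str) -> bool:
    """Return True if the value is short enough to be stored as an entity state."""
    return len(value) <= MAX_STATE_LENGTH


def split_state_and_attributes(rows: list[str]) -> tuple[str, dict[str, list[str]]]:
    """Keep a short summary as the state and move the full list into attributes."""
    state = str(len(rows))  # a short value, e.g. the number of scraped rows
    attributes = {"list": rows}  # the long data lives in an attribute instead
    return state, attributes
```

A 314-character string fails `fits_in_state`, while the same data stored under an attribute key leaves the state itself short.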

Yeah, that's the idea I had too, but I can't seem to find the right way to add the string as an attribute.

I get multiple errors:

Football Data # Football Dates # football_games # Unable to extract data from HTML
ValueError: Selector error: either select, select_list or a value_template should be provided.

I must be creating the sensor the wrong way. Do you have any ideas?

  - name: Football Data
    log_response: true
    resource: https://tudonumclick.com/futebol/jogos-na-tv/
    scan_interval: 600
    sensor:
      - unique_id: football_dates
        name: Football Dates
        select: "section#page-content h3:nth-child(3)"
      - unique_id: football_games
        name: Football Games
        attributes:
          - name: Football Games attr
            select_list: "section#page-content table:nth-child(4)"

Edit:

  - name: Football Data
    log_response: true
    resource: https://tudonumclick.com/futebol/jogos-na-tv/
    scan_interval: 600
    sensor:
      - unique_id: football_date
        name: Football Date
        select: "section#page-content h3:nth-child(3)"
      - unique_id: football_game
        name: Football Game
        value_template: "{{ value_json[0] }}"
        attributes:
          - name: list
            select_list: "section#page-content table:nth-child(4)"
            value_template: |
              {%- set value = value.split("  ") -%}
              {% for x in value %}
              - {{ x }}
              {%- endfor -%}

That did the trick.
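For anyone unfamiliar with Jinja, the attribute's `value_template` roughly does the following. This is a plain-Python sketch, assuming `value` arrives as a single string with the cells separated by two spaces; `render_list` is a hypothetical name, the sample input is made up, and the exact whitespace Jinja produces may differ slightly.

```python
def render_list(value: str) -> str:
    """Split the scraped text on double spaces and emit one '- item' line per part."""
    parts = value.split("  ")  # same separator as in the template
    return "\n".join(f"- {part}" for part in parts)


# Made-up sample input, not taken from the scraped page.
print(render_list("18:30 Team A  Team B"))
```

Because the result is stored in an attribute rather than in the state, its length is not restricted by the 255-character limit.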