tomaae/homeassistant-portainer

[Bug] Creating new entities

IAmTheMilkManMyMilkIsDelicious opened this issue · 17 comments

Describe the issue

Sometimes (usually after a restart of homeassistant) the list of entities will have duplicates. So I'll have two containers:

  • sensor.portainer_myserver_watchtower_2
  • sensor.portainer_myserver_watchtower_3

When I click into sensor.portainer_myserver_watchtower_2 it says the entity is no longer in use by the integration:

Screenshot from 2023-06-30 12-10-15

This means my list of entities keeps growing until I manually remove old entities.

How to reproduce the issue

Hard to reproduce, seems to happen randomly when I restart homeassistant or the machine itself.

Expected behavior

Container entities should always be the same and new ones should not be created:
There should only ever be one entity:

  • sensor.portainer_myserver_watchtower
tomaae commented

hmm, that could happen if container Id in docker is changing. Not sure why that would happen

I don't really understand how entities work in homeassistant but can the integration automatically delete entities no longer managed by portainer?

tomaae commented

its how entities work in docker, not HA.
Integration does not delete them automatically.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

This issue was closed because it has been stalled for 5 days with no activity.

dd310 commented

hmm, that could happen if container Id in docker is changing. Not sure why that would happen

The same thing happened to me: when a stack is stopped/restarted, the containers inside it are re-deployed with a new id, apparently, and a new entity is created every time.

Would it be possible to reference the containers using their names instead of their ids?

tomaae commented

not sure if their name is constant, specially with compose if name is not specified

dd310 commented

If container_name is not passed in the compose config, portainer will deploy the container using stackname-servicename-1.
As long as neither the name of the stack or the name of the service(s) in the docker_compose change, the container name should stay the same.

tomaae commented

Are you able to test it and confirm that it wont change? For me it looks ok, but its better if more people can perform this test.

dd310 commented

I performed this test:

Screenshot_20230802_103404_Chrome.jpg

Screenshot_20230802_105112_Chrome.jpg

After stopping the stack and re-deploying, container names stayed the same.

tomaae commented

thanks

would love to see this implemented as well!

in docker compose you can use:

version: '3'
services:
  homeassistant:
    container_name: homeassistant

This will give the container the same name everytime, i have this for all my containers, cause many of my containers connect via API and use the container name as connection point (DNS).

The name_2 / name_3 etc is spamming my homeassistant with new entitiy's on every image pull upgrade / recreate.

im on the waiting list for when this gets implemented, would love to see a quick overview in homeassistant on running containers, or crashed containers (not that this ever happens tho).

I have the same issue as described. I guess it is related to containers being recreated/updated/restarted with updated container images - either by re-pulling the image manually or automatically e.g. by watchtower. I guess those container image updates result in a new container id (which doesn't seem to change when simply restarting a container).

I just looked really really quickly through the code, but is maybe that the location that uses the container id as unique id and then result in new entity:
https://github.com/tomaae/homeassistant-portainer/blob/master/custom_components/portainer/apiparser.py#L124

I wanted to see if it was a quick-fix that I could make a quick PR for because it's been bugging me, but I realise now that it's pretty fundamental change in how the integration is configured & currently-working, but wanted to continue the discussion.

I took some time tonight to try and track this back; hope my findings are helpful in diagnosing the source & proposing a solution.

Please let me know if anything below is unclear!


For context, here's a cut-down excerpt from my core.entity_registry for the container diun:

      {
            "aliases": [],
            "area_id": null,
            "categories": {},
            "capabilities": null,
            "config_entry_id": "8264ac4f59383c2af498c70f2320ddc6",
            "device_class": null,
            "device_id": "48d3d1e54469fa68a1c952ddb0fa7a03",
            "disabled_by": null,
            "entity_category": null,
            "entity_id": "sensor.portainer_local_diun_3",
            "hidden_by": null,
            "icon": null,
            "id": "5922f825c44547e9864c38d2ce0b585a",
            "has_entity_name": true,
            "labels": [],
            "name": null,
            "options": {
                  "conversation": {
                        "should_expose": false
                  }
            },
            "original_device_class": null,
            "original_icon": null,
            "original_name": "diun",
            "platform": "portainer",
            "supported_features": 0,
            "translation_key": null,
            "unique_id": "portainer-containers-11da9f5caeb19dcc3ec655b14f8dc51727084f5a3b9e51645457a0bff7caa9be",
            "previous_unique_id": null,
            "unit_of_measurement": null
      },
      {
            "config_entry_id": "8264ac4f59383c2af498c70f2320ddc6",
            "entity_id": "sensor.portainer_local_diun_2",
            "id": "86e55c48b07445c1cd206ccee48d80d2",
            "orphaned_timestamp": null,
            "platform": "portainer",
            "unique_id": "portainer-containers-415a5405cc3af21aec9360d24a4cfe72e455d3fae0459d23fb881107bd3f672c"
      },
      {
            "config_entry_id": "8264ac4f59383c2af498c70f2320ddc6",
            "entity_id": "sensor.portainer_local_diun",
            "id": "4acbf7b0aa23cc3bddb45b364e962d7b",
            "orphaned_timestamp": null,
            "platform": "portainer",
            "unique_id": "portainer-containers-2c1eb51417e5274fbe1a48cfdc2448a459faad902620c5cb7272aec8184de65c"
      },

Here's the (REDACTED) matching/latest context for the entry that gets pulled from the API call on L132 of apiparser.py:

{
    "Command": "diun serve",
    "Created": 1712409417,
    "HostConfig": {
        "(REDACTED)"
    },
    "Id": "11da9f5caeb19dcc3ec655b14f8dc51727084f5a3b9e51645457a0bff7caa9be",
    "Image": "crazymax/diun:latest",
    "ImageID": "sha256:ecac071c00b8af8887c851a2fadf16054dcae0ec4876de19cd6acc5133fcae2f",
    "Labels": {
        "(REDACTED)"
    },
    "Mounts": [
        "(REDACTED)"
    ],
    "Names": [
        "/diun"
    ],
    "NetworkSettings": {
        "(REDACTED)"
    },
    "Ports": [],
    "State": "running",
    "Status": "Up 26 hours"
}

Here's the (REDACTED) version of my docker-compose for that Portainer stack:

version: "3.5"

services:
  diun:
    image: crazymax/diun:latest
    container_name: diun
    hostname: diun-portainer
    command: serve
    volumes:
      - data:/data
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DIUN_NOTIF_SLACK_WEBHOOKURL=${DIUN_NOTIF_SLACK_WEBHOOKURL}
      - DIUN_PROVIDERS_DOCKER=${DIUN_PROVIDERS_DOCKER}
      - DIUN_PROVIDERS_DOCKER_WATCHBYDEFAULT=${DIUN_PROVIDERS_DOCKER_WATCHBYDEFAULT}
      - DIUN_PROVIDERS_DOCKER_WATCHSTOPPED=${DIUN_PROVIDERS_DOCKER_WATCHSTOPPED}
      - DIUN_WATCH_FIRSTCHECKNOTIF=${DIUN_WATCH_FIRSTCHECKNOTIF}
      - DIUN_WATCH_JITTER=30s
      - DIUN_WATCH_SCHEDULE=0 */6 * * *
      - DIUN_WATCH_WORKERS=20
      - TZ=${TZ}
    restart: unless-stopped

volumes:
  data:
    (REDACTED)

We can see from the above that the Portainer entry's Id value and the Integration entity's unique_id value line up, and that id value from Portainer is the running SHA of the current container, but it's being used as an immutable unique identifier for HomeAssistant for a long-running sensor.

FYI - I got a little lost in the middle figuring out where Names as an array was converted to name as a property inside the integration, but that's done by forcing the value on L178 of coordinator.py for anyone else trying to track this back.

This mapping from API response to HomeAssistant entity is mainly done as part of the PortainerSensorEntityDescription configuration, where we tell the code to use the Id value as the data_reference for entities keyed as "containers" at L83 of sensor_types.py which is then used when checking if the entity exists in the HomeAssistant entity registry at L52 of entity.py


Understandably, for people using this integration to monitor long-running single, specific container instances, this is helpful, so we shouldn't change this functionality without understanding the impact to that subset of users, but ... I would suggest that for most people that use containers as they were designed (immutable, short-lived, easily-replaced images, leveraging external sources for maintaining state) then this is not ideal.

I would suggest that we should leverage the Name value (similar to the current Portainer Environment Entity configuration), or fall-back to leveraging Id when Names is not provided by Portainer (in the case that someone hasn't leveraged container_name as a parameter in their docker-compose or stack-definition)

Alternatively, we could leverage the in-built migration functionality of HomeAssistant's Entity Registry (similar to other integrations) to specify the previous_unique_id value & update the new unique_id value so that the actual entity_id value doesn't have to be updated. This is not very well documented, but is visible in the core code on line 1073 of the current entity_registry.py implementation.

To my (current) understanding, this should only require a small change in one of the following places:

  • the logic of the sensor-type for containers should be changed to leverage id as the uid and name as the data_reference on Lines 82 & 83 in sensor_types.py
  • the composition of the "unique-id" value for HomeAssistant's Entity Registry should be updated to leverage the name instead of id on Lines 51 to 54 in entity.py
  • updating the integration to pass through the previous_unique_id value and call update_entity on the registry rather than add_entity on the platform -- but this would be a signficant refactor of the current entity.py implementation.

Please let me know what you think & how we should continue - I'd love to get this resolved ASAP as some of my MANY containers are up to _54 and _97 (as an example) and it's really cluttering my entity registry in Home Assistant!

In doing some follow-up research around the usage of that previous_unique_id value - I found a similar problem in the HomeKit integration, where the entities need to be maintained, but the identifiers are hashed (similar to a replaced container id for the same "instance" of container)

Their implementation is visible here, but as I mentioned above - it's going to need a pretty significant refactoring of the current entity.py and persisting the mapping inside this integration before handing it off to HA - just like how this PR made the change.

Let me know what you think?