openhab/org.openhab.binding.zigbee

After a few devices added, discovery is not working any more

Closed this issue · 12 comments

Hi,
My system runs on a Raspberry Pi 3A+ with an XBee3 PRO attached to the Raspberry's hardware serial port (not a software-managed port). I am running openHAB 2.5.6-2 with the ZigBee binding installed manually; it is the same version as the official one, except that I modified discovery.txt.

I have about 10 ZigBee devices, mostly end devices plus a couple of routers (light bulbs). When the system is clean, with no devices paired, discovery works fine for any device; it doesn't matter which device I pair first, the first ones are always discovered quickly. But after I have added 3-4 devices, discovering further devices is a real pain: transactions to the controller/coordinator itself start failing, so the coordinator does not get the command to enable joining, and I have to retry the search many times until the controller finally allows joining. Then there is the discovery part: my 6th device is a router, and it just doesn't get discovered. If I clear everything and start over with that router first, it works immediately.

By the way, if I stop openHAB, connect manually to the XBee controller over the serial port and tell it to allow joining (with AT commands), it works immediately, so I doubt that the controller is at fault.
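For readers who want to repeat this check from a program rather than a terminal, here is a minimal sketch of building an XBee API frame for a local AT command. The post does not say which AT command was used; this example assumes NJ (Node Join Time) with the value 0xFF, and assumes API mode without escaping. The class and method names are hypothetical and are not part of the binding.

```java
// Hypothetical helper, not part of the ZigBee binding.
// Builds an unescaped XBee API frame of type 0x08 (local AT command).
class XBeeAtFrame {
    static byte[] localAtCommand(int frameId, String command, int... parameter) {
        // Frame data = frame type + frame ID + 2 command characters + parameter bytes
        int dataLength = 4 + parameter.length;
        byte[] frame = new byte[dataLength + 4];
        frame[0] = 0x7E;                       // start delimiter
        frame[1] = (byte) (dataLength >> 8);   // length, high byte
        frame[2] = (byte) dataLength;          // length, low byte
        frame[3] = 0x08;                       // frame type: AT command
        frame[4] = (byte) frameId;
        frame[5] = (byte) command.charAt(0);
        frame[6] = (byte) command.charAt(1);
        for (int i = 0; i < parameter.length; i++) {
            frame[7 + i] = (byte) parameter[i];
        }
        // Checksum: 0xFF minus the low byte of the sum of all frame data bytes
        int sum = 0;
        for (int i = 3; i < frame.length - 1; i++) {
            sum += frame[i] & 0xFF;
        }
        frame[frame.length - 1] = (byte) (0xFF - (sum & 0xFF));
        return frame;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed command: set NJ to 0xFF (treated here as "always allow joining")
        System.out.println(toHex(localAtCommand(0x01, "NJ", 0xFF)));
    }
}
```

Writing the resulting bytes to the serial port, and reading the AT command response, is left out; that part depends on the serial library in use.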

If this is not an issue and just my problem, could you please at least suggest which files to look at and modify? I want to disable discovery of devices that are already added as Things, and run discovery only for devices that are in the Inbox but not yet added as Things.

Thank you.

I found that in an older version (2.5.0.M1), in ZigBeeDiscoveryService.java -> private void nodeDiscovered(ZigBeeCoordinatorHandler coordinator, final ZigBeeNode node), there was an "if" that prevented rediscovery when the node already existed as a Thing. This "if" is missing in the latest version, and it is what I need. I will try to follow the changes to understand why it was deleted, then try to put it back in my local version.

As far as I can see from looking at the history, there have been no changes to this file for 12 months. If you spot issues, then please provide a reference rather than just a vague statement that an "if" was removed.

Hi, I apologize for being vague. I am a newbie and I thought that the "if" was straightforward.
So, in 2.5.0.M1 I see this in ZigBeeDiscoveryService.java, in the method private void nodeDiscovered(ZigBeeCoordinatorHandler coordinator, final ZigBeeNode node):
...
// If this already exists as a thing, then no need to rediscover
if (discoveryServiceCallback != null && discoveryServiceCallback.getExistingThing(defaultThingUID) != null) {
    return;
}
...
This seems to have disappeared in 2.5.0.M2, and it is what I need (I think; as I said, I am a newbie). I will try to put it back in my local version, but I was wondering why it disappeared; maybe there is a good reason that I am missing.
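To make the effect of that check concrete, here is a minimal, self-contained sketch of the same idea: skip discovery work for any node that already exists as a Thing. It does not use the openHAB API; the registry is modelled as a plain set of UID strings, and the class, method names and UIDs are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the discovery service; not the binding's code.
class ExistingThingGuard {
    // Models the lookup the quoted snippet performs via getExistingThing(...)
    private final Set<String> existingThingUids = new HashSet<>();
    private int discoveriesStarted = 0;

    void addExistingThing(String thingUid) {
        existingThingUids.add(thingUid);
    }

    // Returns true if discovery was started, false if the node was skipped.
    boolean nodeDiscovered(String thingUid) {
        // If this already exists as a thing, then no need to rediscover
        if (existingThingUids.contains(thingUid)) {
            return false;
        }
        discoveriesStarted++; // placeholder for the real discovery work
        return true;
    }

    int getDiscoveriesStarted() {
        return discoveriesStarted;
    }

    public static void main(String[] args) {
        ExistingThingGuard guard = new ExistingThingGuard();
        guard.addExistingThing("zigbee:device:example:node1"); // made-up UID
        System.out.println(guard.nodeDiscovered("zigbee:device:example:node1"));
        System.out.println(guard.nodeDiscovered("zigbee:device:example:node2"));
    }
}
```

With a guard like this, a scan only spends transactions on nodes that are not yet Things. Whether that is safe in the real binding depends on why the original check was removed, which this sketch does not address.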

For me it's an issue because I cannot add more than 5-6 devices, which makes my system unusable.

OK, thank you very much for your reply; that's the line I was talking about. So I should find some other way to achieve the same result.
I had the system working fine with version 2.4.0, and I have been testing 2.5.x for a few months, which is how I got to this point.
Do you consider it a valid issue, or not an issue at all?

I don't actually know what your problem is - sure - I know what you are describing, but you've not really provided any information that allows an assessment of the issue, so I can't really help a lot.

What do the logs show - are you willing to share them, or some other information that allows me to work out what the issue is that you might be having?

I am attaching a piece of the log that might help. In that log, only router 000D6F00106571CB is not added to Things, because it has not completed discovery and never does.
LOG.txt

I am experiencing a similar issue.
The discovery appears to stop working after some time, and it works again once I restart OH.
Not sure if it is binding related or framework, but I didn't experience it anywhere else.

I am using a cc2538 chip and it used to work without problems.
Attached a mighty log that includes multiple restarts of OH, I hope it will be useful.

Hi all,
I have done many tests, modifying the binding and looking around the code. In case it helps, here is what I observed:

  • When the binding starts, it finds the coordinator and tries to rediscover the nodes (rediscoverNode) of all the ZigBee Things. However, I have many battery-powered Things that sleep almost all the time and do not respond to any request from the coordinator (neither address requests nor attribute reads, nothing). It seems there is a queue somewhere that fills up with commands and retries waiting for responses from devices that won't respond. To pair a Thing I have to keep it awake by pushing its button, so when the binding starts I would have to push the button on all 10 devices to keep them awake so that they respond to the coordinator, which is impossible.
  • When I scan for new Things, the same happens: all devices get scanned and discovered, but the battery-powered devices leave the coordinator without a response, and it seems that ZigBee commands get stuck in the queue and prevent other commands from reaching the coordinator. For example, with 1 or 2 devices paired, the coordinator permits joining as soon as I start a scan. With many devices (5-6) already paired, when I push "scan for new devices" the coordinator permits joining only after many minutes, as if the "allow join" command does not reach the coordinator because the queue is filled with commands for battery-powered devices. I have a ZigBee USB sniffer to investigate.
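The blocking described in these two observations can be illustrated with a toy model. This is not how the binding or the underlying ZigBee library schedules transactions; it is a sketch under the assumption that there is a fixed number of in-flight transaction slots, and that a command to a sleeping device holds its slot until it times out. It also shows one possible mitigation, capping how many slots sleepy devices may occupy. All names and numbers are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy model only: none of these names are the binding's classes.
class SleepyAwareQueue {
    static final class Command {
        final String name;
        final boolean toSleepyDevice;

        Command(String name, boolean toSleepyDevice) {
            this.name = name;
            this.toSleepyDevice = toSleepyDevice;
        }
    }

    private final Deque<Command> pending = new ArrayDeque<>();
    private final int totalSlots;   // maximum commands in flight overall
    private final int sleepySlots;  // maximum in-flight commands to sleepy devices
    private int inFlight = 0;
    private int sleepyInFlight = 0;

    SleepyAwareQueue(int totalSlots, int sleepySlots) {
        this.totalSlots = totalSlots;
        this.sleepySlots = sleepySlots;
    }

    void submit(String name, boolean toSleepyDevice) {
        pending.add(new Command(name, toSleepyDevice));
    }

    // Sends every queued command that fits and returns the names sent in this pass.
    List<String> dispatch() {
        List<String> sent = new ArrayList<>();
        Iterator<Command> it = pending.iterator();
        while (it.hasNext() && inFlight < totalSlots) {
            Command c = it.next();
            if (c.toSleepyDevice && sleepyInFlight >= sleepySlots) {
                continue; // stays queued, but later commands may overtake it
            }
            it.remove();
            inFlight++;
            if (c.toSleepyDevice) {
                sleepyInFlight++;
            }
            sent.add(c.name);
        }
        return sent;
    }

    // Frees a slot when a command completes or times out.
    void release(boolean wasSleepy) {
        inFlight--;
        if (wasSleepy) {
            sleepyInFlight--;
        }
    }
}
```

In this model, with 4 slots and no separate cap (sleepySlots equal to totalSlots), six queued reads to sleeping devices followed by a permit-join command leave permit-join waiting until a read times out. With the cap set to 2, the same sequence sends two reads and then permit-join in the first pass.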

Look, if I am causing more confusion than helping, just tell me. Chris, if you want me to run a particular test, just tell me. I will continue to investigate because I really want this to work; I have thought many times about abandoning OH, but I believe in OH more than in the others.

What seems strange to me is that only a few of us have these problems. Does anyone have more than 5 battery-powered devices working fine? And 5 is nothing; I expect the system to work with 20-30 devices.

I am experiencing a similar issue.

In general, please provide information on your configuration - it's hard to know what the issue might be if I don't know what you are running.

Please ensure that you are using the latest version. There will likely also be further updates coming so I don't really intend to spend time looking at this right now.

@cdjackson I appreciate your time on this, and I provided the log just for reference.

I am using the latest snapshot. The chip I am using (cc2538) is different from the cc2531, but AFAIK it uses the same API, so there should be no differences unless hardware-specific software is implemented in the driver.

Closing this as it's quite old, and there have been a lot of updates since. If the problem persists, then please provide a debug log showing the issue.