eclipse-cyclonedds/cyclonedds-cxx

Maximum number of keyed instances

ProfCodeGen opened this issue · 6 comments

I'm using Cyclone as part of a distributed systems class. For the last assignment I was writing a publisher for air traffic state data from the OpenSky project. I restricted the data to 1 hr of historical data in an area around Toronto Pearson Airport. I added the @key annotation to the callsign attribute, and left the history depth at 1 to keep the last report for each aircraft.

However, it throws a segmentation fault a little over 2 minutes in, at about 452 instances, on an Intel Mac i9 running Ventura with 32GB of memory. It also happens on the class server (2x Intel Xeon Gold, total 63 cores, with 256GB memory). If I don't explicitly register the handles, then the subscriber crashes. If I explicitly register the instances (storing the handles in a map), then the publisher hits the segmentation fault. I added some code to query the default resource limits, and the ones that I queried all came back -1, which I believe means unlimited. Even if there were a limit, I would expect an error rather than a fault. Not sure where I should upload code / data to reproduce.

452 instances should not be a problem at all. Running out of memory and crashing is definitely a bug. The resource limits wouldn't be of any help here anyway, because you're doing nothing weird.

Any chance you are using C++ or Python and don't have #471 or eclipse-cyclonedds/cyclonedds-python#233 in your build? It sounds like you're running into the bug that those PRs fix.

Otherwise, for a reproducer you probably would only have to make a list of key values that you're trying to insert (in the order in which you're trying to insert them), and then add the source for a trivial publisher/subscriber along the lines of a "hello world" example. The easiest way to upload it would be to zip it and drop it in a GitHub comment.
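For the key list suggested above, a minimal sketch of how one might collect the distinct key values in the order they were first inserted. The helper name `first_seen_keys` and the use of plain strings for callsigns are assumptions for illustration only; this is standard C++ and does not touch the Cyclone DDS API.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of Cyclone DDS): returns each distinct
// key value once, in the order in which it first appears in the input.
std::vector<std::string> first_seen_keys(const std::vector<std::string>& callsigns) {
    std::unordered_set<std::string> seen;   // keys already emitted
    std::vector<std::string> ordered;       // distinct keys, first-seen order
    for (const auto& cs : callsigns) {
        // insert() reports whether the key was new; only new keys are kept.
        if (seen.insert(cs).second) {
            ordered.push_back(cs);
        }
    }
    assert(ordered.size() == seen.size());  // one output entry per distinct key
    return ordered;
}
```

Writing the returned vector to a text file, one key per line, would give a compact input for a "hello world"-style publisher to replay.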

just my 2 cts:

  • without setting resource-limits explicitly, they are indeed all set to 'INFINITE', so you would run out of memory before reaching any such limit ;)
  • when you mention '452 instances', that's indeed not a lot (esp. when your history is set to KEEP_LAST with depth=1 and the samples are not gigabytes in size). Yet there is a distinction between 'active' instances (for which there are still 'alive' writers that registered them) and 'non-active' instances that are not updated anymore but were never explicitly unregistered, so they still occupy memory. So perhaps you're creating new instances at a very high rate?
  • finally, you mention a history (depth) of 1, which means that for each instance you'll maintain exactly 1 sample. In combination with a KEEP_LAST history policy, that sample will either be overwritten by a newer one or removed from the reader cache when you use the 'take()' operation (rather than the 'read()' operation). Note that a 'take()' operation also typically 'gets rid of' the instance administration when there's nothing left for that instance.
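To make the last two points concrete, here is a toy model of a keyed cache with KEEP_LAST, depth 1. It is only a sketch in plain C++17: `AircraftState`, `KeepLastOneCache` and their members are hypothetical names, not Cyclone DDS types, and the model ignores writer liveliness by assuming an instance disappears as soon as its only sample is taken.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical sample type; callsign plays the role of the @key field.
struct AircraftState {
    std::string callsign;
    double altitude_m;
};

// Toy stand-in for a reader cache with KEEP_LAST history, depth 1.
class KeepLastOneCache {
public:
    // A new sample replaces the previous one for the same key.
    void write(const AircraftState& s) {
        assert(!s.callsign.empty());  // the key field must be set
        latest_[s.callsign] = s;
    }

    // read(): returns the sample and leaves it (and its instance) in place.
    std::optional<AircraftState> read(const std::string& key) const {
        auto it = latest_.find(key);
        if (it == latest_.end()) return std::nullopt;
        return it->second;
    }

    // take(): returns the sample and removes it; in this simplified model
    // the now-empty instance is dropped as well.
    std::optional<AircraftState> take(const std::string& key) {
        auto it = latest_.find(key);
        if (it == latest_.end()) return std::nullopt;
        AircraftState s = it->second;
        latest_.erase(it);
        return s;
    }

    // Number of instances (distinct keys) currently held.
    std::size_t instance_count() const { return latest_.size(); }

private:
    std::map<std::string, AircraftState> latest_;
};
```

In this model, repeated writes for one callsign never grow the cache, reading leaves the instance count unchanged, and taking shrinks it. Memory therefore scales with the number of distinct keys that are never taken or otherwise cleaned up, not with the number of samples written.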

I haven't updated since late last fall, as that was when we built and tested the student server for this term and we normally try to keep that stable. I just did a pull for the C++ on a virtual machine, and it seems to work. Do I have to update the C version as well?
I was searching through the issues for max instances, but didn't find 471.
Thanks.

Also only one writer and one subscriber. The students have to add three more subscribers and publish a different topic.

Spotting the relationship between that PR and the crash is indeed hard if you don't know the details. I'm happy my educated guess worked out 🙂

I think there's no need to update the C version; the bug is exclusively in the non-C bindings. I do generally recommend tracking the C repository because it mostly improves things and it almost never regresses 🤞

Thanks. We are in week 11 of a 12-week term and the assignment is due on the 8th, so I would like to make only the minimal change, to avoid disrupting the assignment at this point. We will update both after the end of term so we can use an up-to-date version for planning next year's class over the summer.