HubSpot/Singularity

Singularity getPortByIndex returns incorrect values sometimes.

mikebell90 opened this issue · 2 comments

Singularity: .16, .20
Mesos: 1.31, 1.60

The service in question allocates 3 ports. PORT0 is the HTTP port, PORT1 is the JMX port, and PORT2 is a JGroups port.

Recently we noticed a storm of errors ("Corrupted Stream", etc.), which we traced to JGroups hitting the JMX port. In other words, it was hitting PORT1, not PORT2.

Here's the part relevant to Singularity, however:

  • The destination server has PORT0, PORT1, and PORT2 allocated, but the port numbers
    are not in ascending order. In the Singularity UI the ports were listed as

31929, 31930, 31605

And the Docker startup confirmed the ports were injected in that order:

PORT0=31929
PORT1=31930
PORT2=31605

However, the originating server went to the wrong port. This is because:

It used Singularity's API to get the active tasks and filtered for the correct task.
It then got the port from

  • SingularityTask.getPortByIndex(portIndex).get()

This returned 31930 for index 2, because

getPortByIndex SORTS the port list (so effectively it indexed into a list of [ 31605, 31929, 31930 ]).
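
To make the mismatch concrete, here is a minimal sketch using the port numbers reported above. It is not Singularity's actual code: the class PortLookupSketch and both method names are hypothetical, and java.util.Optional is assumed only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the reported behavior; not Singularity's implementation.
final class PortLookupSketch {

    // Mimics the behavior described in this issue: sort a copy of the ports, then index into it.
    static Optional<Long> sortedLookup(List<Long> ports, int index) {
        List<Long> copy = new ArrayList<>(ports);
        Collections.sort(copy);
        return inRange(copy, index) ? Optional.of(copy.get(index)) : Optional.empty();
    }

    // Indexes into the ports in the order they were allocated (the order PORT0..PORTn were injected).
    static Optional<Long> allocationOrderLookup(List<Long> ports, int index) {
        return inRange(ports, index) ? Optional.of(ports.get(index)) : Optional.empty();
    }

    private static boolean inRange(List<Long> ports, int index) {
        return index >= 0 && index < ports.size();
    }
}
```

With the allocation [31929, 31930, 31605], sortedLookup(ports, 2) yields 31930 (the JMX port), while allocationOrderLookup(ports, 2) yields 31605 (the JGroups port).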

We've worked around this for now and would be happy to submit a PR. However, since the
Collections.sort was clearly added deliberately at some point, I thought we'd bring it
up for discussion before removing it.

Thanks!

Was looking through the commit history trying to remind myself why it was added. I could see removing it, but would definitely want to add a unit test around the ports staying in order.
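
One possible shape for such a check, sketched without a test framework: for every index i, an index-based lookup should return the same value that was injected as PORT<i>. The class PortOrderCheck and its methods are hypothetical helpers, not part of Singularity, and the lookup is passed in as a function so the check does not assume any particular API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical regression check; names are illustrative and not taken from Singularity.
final class PortOrderCheck {

    // Builds the PORT0..PORTn variables in allocation order, as seen in the Docker startup.
    static Map<String, Long> portEnv(List<Long> ports) {
        Map<String, Long> env = new LinkedHashMap<>();
        for (int i = 0; i < ports.size(); i++) {
            env.put("PORT" + i, ports.get(i));
        }
        return env;
    }

    // Returns true only if lookup.apply(i) equals the value of PORT<i> for every index.
    static boolean lookupMatchesEnv(List<Long> ports, IntFunction<Long> lookup) {
        Map<String, Long> env = portEnv(ports);
        for (int i = 0; i < ports.size(); i++) {
            if (!env.get("PORT" + i).equals(lookup.apply(i))) {
                return false;
            }
        }
        return true;
    }
}
```

For the ports in this issue, a lookup that preserves allocation order (for example ports::get) passes the check, whereas a lookup over a sorted copy fails it.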

Right, the question is whether the natural order of the Mesos offers is the correct one. I would hope and expect that it mirrors the order of port injection into Docker.

The distinction between the two is the issue after all.