RobThree/IdGen

GeneratorId for Azure Function(s)

lwalker-kforce opened this issue · 5 comments

Hi, I would like to use the IdGen lib to generate unique IDs via an Azure Function. As I understand the docs, the GeneratorId has to be unique as the Function scales out, to avoid collisions. I am having a real challenge identifying a unique GeneratorId between 0 and 1023 dynamically. I thought of using the Process.ID which should be unique enough. I thought of converting a GUID to an int (via a byte[]) but both are too large. Any thoughts or assistance would be appreciated.

I thought of using the Process.ID which should be unique enough

I'm quite sure it's not unique at all; process IDs are very predictable (and usually a multiple of 4, so you'd waste 75% of the range unless you divide by 4; not sure if this is true for Azure Functions though).

I thought of converting a GUID to an int (via a byte[]) but both are too large

You could take 10 bits (any 10) of a GUID and use those. But the chances of a collision are quite high. You really, really want to make sure no two generators use the same generator ID, because that will cause collisions, and if you have that then you might as well just generate random IDs (which will also result in collisions).
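To put a rough number on "quite high": picking 10 random bits per generator is the classic birthday problem over 1024 slots. The sketch below is only an illustration (the function name is made up, and it assumes every generator picks its ID uniformly at random and independently); under that assumption, even odds of a clash are reached at a few dozen generators.

```python
def collision_probability(generators: int, id_space: int = 1024) -> float:
    """Chance that at least two of `generators` random IDs coincide.

    Assumes each generator picks uniformly and independently from
    `id_space` values (here 2**10, matching the 0-1023 range).
    """
    if generators > id_space:
        return 1.0  # pigeonhole: more generators than IDs
    p_all_unique = 1.0
    for already_taken in range(generators):
        p_all_unique *= (id_space - already_taken) / id_space
    return 1.0 - p_all_unique


for n in (2, 10, 38, 100):
    print(f"{n:>3} generators: {collision_probability(n):.1%} chance of a shared ID")
```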

What you want is to coordinate the handing out of generator IDs. If you can't use an instance ID (I don't know enough about Azure Functions to know if that's a thing) then you'll have to create a service or something that 'hands out' generator IDs to the generators, ensuring you never hand out an ID that's already in use.

(Also: what's up with the empty image below your post?)

Thanks for the response @RobThree. I'm not quite sure about the image, it was supposed to be a clip from the exception; not really needed.
The instance id (InvocationId) of the Az Funcs is a GUID, so not directly usable. I thought about managing the generator ids via some kind of factory but wanted to check in with you before heading in that direction. It'll be tricky though in a high volume distributed service where I'm trying to keep the points of failure to a minimum. Sequence is not a requisite for us, uniqueness is. Any experience you have with a factory pattern for IdGen would be useful.

Any experience you have with a factory pattern for IdGen would be useful.

I usually have generators that are long(er) lived; spun up instances of services. I take/hand-out a generator-id whenever a new generator is "spun up"; it being 'costly' doesn't matter so much since they're around for a long(er) time. In essence you can keep track of the generators in a 'coordinator service' that binds a generator id to, say, a hostname or whatever.
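A minimal sketch of such a coordinator follows, purely for illustration. The class and method names are hypothetical, the state lives in memory in a single process, and a real deployment would need a shared store with atomic operations plus some way to reclaim IDs from instances that crash.

```python
import threading


class GeneratorIdAllocator:
    """Hands out generator IDs in [0, max_ids) and binds each to an owner.

    Hypothetical in-memory sketch: an ID is never issued twice while
    its current owner still holds it.
    """

    def __init__(self, max_ids: int = 1024) -> None:
        self._free = set(range(max_ids))
        self._owners = {}  # owner name (e.g. hostname) -> generator ID
        self._lock = threading.Lock()

    def acquire(self, owner: str) -> int:
        with self._lock:
            if owner in self._owners:
                # Same owner asking again gets the same ID back.
                return self._owners[owner]
            if not self._free:
                raise RuntimeError("no free generator IDs left")
            generator_id = min(self._free)
            self._free.remove(generator_id)
            self._owners[owner] = generator_id
            return generator_id

    def release(self, owner: str) -> None:
        with self._lock:
            generator_id = self._owners.pop(owner, None)
            if generator_id is not None:
                self._free.add(generator_id)


allocator = GeneratorIdAllocator()
print(allocator.acquire("host-a"))  # 0
print(allocator.acquire("host-b"))  # 1
print(allocator.acquire("host-a"))  # 0 (already bound to host-a)
allocator.release("host-a")
print(allocator.acquire("host-c"))  # 0 (reused after release)
```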

Even better (and actually the most common use case for me): I put the generator ID in the configuration of a service. That way you don't need a coordinator (you could consider manually configuring generator IDs as coordination though). But that isn't very 'dynamic'.
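As a sketch of the configuration approach, the snippet below reads the generator ID from an environment variable and rejects anything outside 0-1023. The variable name `IDGEN_GENERATOR_ID` and the function are assumptions made for this example, not part of IdGen.

```python
import os

MAX_GENERATOR_ID = 1023  # 10 bits, the range discussed in this thread


def generator_id_from_env(var_name: str = "IDGEN_GENERATOR_ID", environ=None) -> int:
    """Read and validate a manually configured generator ID.

    `var_name` is a hypothetical setting name; `environ` defaults to
    the process environment and can be replaced for testing.
    """
    env = os.environ if environ is None else environ
    raw = env.get(var_name)
    if raw is None:
        raise RuntimeError(f"{var_name} is not set")
    value = int(raw)  # raises ValueError for non-numeric input
    if not 0 <= value <= MAX_GENERATOR_ID:
        raise ValueError(f"{var_name}={value} is outside 0..{MAX_GENERATOR_ID}")
    return value


print(generator_id_from_env(environ={"IDGEN_GENERATOR_ID": "7"}))  # 7
```

Failing fast on a missing or out-of-range value is deliberate here: silently falling back to a default ID would reintroduce the collision risk.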

Hi all, I am facing the same issue in my Kubernetes cluster. I need to generate an integer between 0 and 1023. My pods spin up automatically based on a configuration file in Argo CD, and we can increase or decrease the number of running pods whenever we want. First I thought to use the last part of the Kubernetes pod hostname together with GetHashCode, which generates an integer, but that is not in the range of the 10 available bits. I am now thinking of using the last part of the IP address, as I want to avoid adding a sidecar container to coordinate the generation of IDs. That's my last resort :-)

I thought to use the last part of the Kubernetes pod hostname together with GetHashCode, which generates an integer, but that is not in the range of the 10 available bits

To bring the value into range you can also do hashcode % 1024, but that is very likely to give collisions.
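The reduction itself is a one-liner; the sketch below shows it with a stable hash (SHA-256 is swapped in here for GetHashCode, and the pod names are invented), then counts how many names end up sharing an ID. In languages where the hash can be negative, the value also has to be made non-negative before taking the remainder.

```python
import hashlib


def hashed_generator_id(name: str, bits: int = 10) -> int:
    """Map a name to [0, 2**bits) using a stable hash.

    SHA-256 is an arbitrary choice for this sketch; any stable hash
    would illustrate the same point.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << bits)


# Hypothetical pod names, only to show how quickly IDs start to overlap.
names = [f"my-service-{i}" for i in range(100)]
ids = [hashed_generator_id(n) for n in names]
print(f"{len(names)} names -> {len(set(ids))} distinct generator IDs")
```

The result is always in range and repeatable, but nothing stops two names from landing on the same ID, which is the collision risk described above.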

the last part of the IP address

You can take the last 10 bits. As long as all IPs are in the same subnet, that should work, I suppose.
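Taking the last 10 bits can be sketched as follows (the function name is made up for this example). For IPv4 the result is only unique if all instances sit inside one block of at most 1024 addresses, i.e. a /22 or smaller; addresses that differ only in higher bits map to the same ID.

```python
import ipaddress


def generator_id_from_ip(ip: str, bits: int = 10) -> int:
    """Return the lowest `bits` bits of an IP address as an integer."""
    return int(ipaddress.ip_address(ip)) & ((1 << bits) - 1)


print(generator_id_from_ip("10.0.0.7"))    # 7
print(generator_id_from_ip("10.0.1.5"))    # 261
print(generator_id_from_ip("10.0.3.255"))  # 1023
print(generator_id_from_ip("10.0.5.5"))    # 261 again: outside the same /22, so it collides
```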