Opinion: _checkServicePresence should not shuffle resultant array
sjmcdowall opened this issue · 8 comments
https://github.com/flywheelsports/hydra/blob/master/index.js#L985
I don't believe (obviously my opinion) that this routine should go ahead and randomize the results of the services. That's not really part of it's contract. Also -- this is a "lower" level routine that gets called "often" and any sort of "Shuffle" routine is O(N) at a minimum .. i.e. it takes at least SOME processing time and probably increases with size of array.
Also since this Shuffled array is cached, that means it's really not random upon multiple uses anyway. The code I see appears to use array[0] to get a "random" service, but if its cached its the same array each time?
Much better is to store just the pure array returned and cache that. Then in the FEW cases where we need a "random" element from it, just have a Util that does something like:
var randomInstance = instances[Math.floor(Math.random() * instances.length)]
@sjmcdowall First some history ;-) . The reason for the shuffle is to offer load balancing through randomization of available instances. The arrays in question should be small... since they're per microservice type. So if you have a file-processing microservice and you have 100 running instances then the array would only be 100 elements long.
Instance data is cached for up to three seconds for performance reasons. The time length was chosen because it's also the same amount of time that services have to update their presence information.
So -- if I am reading this right -- if I have some "high-speed" stream of data from say a Kafka source .. and "route" it to a service.. say .. oh .. 100 msgs/sec .. 300 messages will all be routed to the exact same service -- which doesn't seem too "random" to me. :) Where as in my suggested implementation each message is guaranteed to go to a random available service.. I still may just implement how I think it should go and let you see ..
That's a valid point. A potential fix is simply to shuffle the cached array here: https://github.com/flywheelsports/hydra/blob/master/index.js#L957
I'm open to seeing a PR... you know the saying... "Running code speaks volumes" Go for the PR ;-)
@sjmcdowall I did a bit more research. I need to point out that the reason shuffling the instance array is necessary is because the same array is used by both the _makeAPIRequest
call here: https://github.com/flywheelsports/hydra/blob/master/index.js#L1325 and the _sendMessage
call here https://github.com/flywheelsports/hydra/blob/master/index.js#L1377
The important bit is that the makeAPIRequest call serves HTTP based requests and uses the returned instance array to work through service instances to try. So it starts by sending a request to the first service in the list and if that fails it tries the next instance in the list. This is why randomly indexing into the instance list wouldn't work.
In the case of the sendMessage usage checking whether a message to a specific instance is valid (i.e present) makes sense. So this check: https://github.com/flywheelsports/hydra/blob/master/index.js#L1385 needs to ensure that instance
is present in the list of instances
.
@sjmcdowall at the very least... this line Utils.shuffeArray(instanceList);
https://github.com/flywheelsports/hydra/blob/master/index.js#L987 should be copied up to L:959 before the resolve. That would solve a bug where the presence map isn't random for the duration of the cache - which is currently 3 seconds.
let cacheKey = `checkServicePresence:${name}`;
let cachedValue = this.internalCache.get(cacheKey);
if (cachedValue) {
>>> Utils.shuffeArray(cachedValue);
resolve(cachedValue);
return;
}
Thanks. Actually, the docs should point out the algorithm name ;-)
https://github.com/flywheelsports/hydra/blob/master/lib/utils.js#L104