Random Element vs Predetermined

Question

Random Element vs Predetermined

Closed this issue 10 years ago · 16 comments

Some strategy planners have decided that rather than choose at random, they would use the steamid%4 to decide on which element a user should choose. They don't auto upgrade, but they lock just that element.

parseInt(g_steamID) % 4
Where:
0 Fire
1 Water
2 Earth
3 Air

If this interests us, I'd be happy to make a patch and pull request.

Answer 1 · 2015-06-15T17:42:32.000Z

I actually had thought this was already in, but apparently not... Should probably sync up for things like those as best as possible

Answer 2 · 2015-06-15T17:49:05.000Z

While we have evidence and know that javascript's Math.random is evenly distributed, there is no way of knowing that steam_ID is, and even if it is, its still essentially just random (other than not changing elements between days).

This could more easily lead to unbalanced elemental distribution if it is indeed the case that steam_ID is not uniformly distributed (edit, I checked, steam_ID is just an autoIncrement, so steam_ID%4 will be evenly distributed).

tl;dr: There are 2 cases, steam_ID is uniformly distributed or not. If it is, there is no change. If not, making this change has a NEGATIVE effect on elemental distribution.

Let me know if you believe I'm wrong.

EDIT: It may be worth linking anyone using steamID in their script to this issue, in order to either fix things on their end or clarify if I'm wrong somewhere

Answer 3 · 2015-06-15T18:10:51.000Z

Javascripts Math.random may be distributed evenly on a specific machine, but with 1500 users in a room, that's as much room for unbalanced as with steam_ID....

The important part about the steam community ID for our purpose (last digit) is determined by the users account ID which is just an auto-incremented value from when a user has joined. But once again, with the range of users and join dates of those users, making that truly evenly would be impossible using this as well.

Either way it could stack an element in a room... steam_ID has the higher probability of being more uniform however

Answer 4 · 2015-06-15T18:17:15.000Z

"Javascripts Math.random may be distributed evenly on a specific machine, but with 1500 users in a room, that's as much room for unbalanced as with steam_ID"

Unless I'm mistaken, evenly distributed is evenly distributed, there's no "it follows a different distribution pattern when used as a group". Do you have evidence to support this?

Answer 5 · 2015-06-15T18:37:47.000Z

If it was another language I would agree, but Javascript's implementation of Math.random is dependent on the browser the client is using and thus not evenly distributed. I wasn't making a point on scaling of mentality, I was making the point of the flaw of the language. Bad english on my part, I apologize

Answer 6 · 2015-06-15T18:49:45.000Z

Your 'not evenly distributed' argument is based on the assumption that the browser being used has implemented js.random() incorrectly. Unless you're using a small scale custom browser that isn't updated much, JS's math.rand should always be evenly distributed. In any decently popular browser, if that error existed it would be widely known, as it'd break a lot of stuff.

As for Steam_ID being evenly distributed because it's just an autoIncrement value, I looked into that and you're correct.

The only case it'd have any effect is on a client with an improperly implemented rand function.

I'll merge the pull request if one of you really wants to add it, but I don't see this edge case worth spending time on when there's other stuff to be done. Reopening Issue until a pull request is issued or people decide it isn't worth it.

Answer 7 · 2015-06-15T19:09:58.000Z

Also, as a side note, to whoever implements this (if anyone), keep in mind that the number of elements to choose from may not be 4. If the user has already upgraded 1 or more elements, it should only select from the top 'upgraded' set as to not waste any upgrade levels.

Answer 8 · 2015-06-15T19:38:20.000Z

I'd also like to add, here's what wchill and SteamDatabase do:

var hashCode=function(str) {
    var t=0, i, char;
    if (0 === str.length) {
        return t;
    }

    for (i=0; i<str.length; i++) {
        char=str.charCodeAt(i);
        t=(t<<5)-t+char;
        t&=t;
    }

    return t;
};

var elem = Math.abs(hashCode(w.g_steamID) % 4);

It's pretty clear even this isn't equivalent to parseInt(g_steamID) % 4 so why even bother?

> Math.abs(hashCode(g_steamID) % 4)
< 2
> parseInt(g_steamID) % 4
< 0

And to clarify what was said earlier, random() is defined by the ECMAScript specification to be

chosen randomly or pseudo randomly with approximately uniform distribution over that range

Given that Math.random() is uniformly distributed and g_steamID is uniformly distributed, we might as well just flip a coin, metaphorically speaking.

As an aside, I actually wrote this already. I didn't like it because it added clutter and made the code even messier, so I scrapped it because there's no point.

Answer 9 · 2015-06-15T20:10:19.000Z

@meishuu That code isn't equivalent logically, but it doesn't need to be. It's still a perfectly uniform distribution (just different values result in different results). Not sure why they didn't just do parseInt(g_steamID) % 4 since it has the same perfect distribution and is a LOT simpler / less resource intensive.

That said, that code doesn't check for existing values, so if you already started leveling one element, then start using that script, you're SOL and any levels you spent on an element that doesn't match your hash are wasted.

There IS something to be said about making it a function of g_steamID % number_of_elements_tied_for_first, as this is perfectly even distribution always (Except for when people manually level, as people are probably more likely to click the 'first' elements in the list on their own), whereas random has variance by design. Still not sure it's worth the effort however.

EDIT: I'm wrong about the always perfectly even distribution. The only difference here is our script uses js.rand as the PRNG, and theirs uses steamID as one.

Answer 10 · 2015-06-15T20:36:27.000Z

parseInt is apparently unreliable as "parseInt(76561198036572057) = 76561198036572060" (In Chrome x86 at least), hence their hashCoding

Answer 11 · 2015-06-15T20:45:35.000Z

Yes. The best precision we can get with a Number is 53 bits as per spec.

Preliminary tests with hashCode suggests it's not uniform either.

Answer 12 · 2015-06-15T20:48:14.000Z

var sampleSize = 4000; // This should be divisible by 4

var hashCode=function(str) {
    var t=0, i, char;
    if (0 === str.length) {
        return t;
    }

    for (i=0; i<str.length; i++) {
        char=str.charCodeAt(i);
        t=(t<<5)-t+char;
        t&=t;
    }

    return t;
};

var getElement = function(id) {
    return Math.abs(hashCode(id) % 4);
}

function checkUniformity(n, fn){
    var elemCount = [0,0,0,0];

    for(var i=0; i<n; i++) 
        elemCount[getElement(i.toString())]++;

    return elemCount;
}

// Check with some n
var result = checkUniformity(sampleSize, hashCode);

if(result[0] == result[1] && result[1] == result[2] && result[2] == result[3])
    console.log('function is perfectly uniform for sampleSize '+sampleSize, result);
else
    console.log('function is NOT perfectly uniform for sampleSize '+sampleSize, result);

It's uniform.

Answer 13 · 2015-06-15T21:00:26.000Z

Try again with sampleSize = 4 * Math.pow(2, 20). I get [1048578, 1048581, 1048574, 1048571].

Answer 14 · 2015-06-15T21:04:21.000Z

@meishuu looks like you're right.

Even if you weren't, I'm wrong about the always perfectly even distribution in their code. Given sequential steamIDs, theirs would perfectly uniform (though we now know it isn't). That'll never be the case though, and you can think of the set of steamIDs as its' own PRNG.

Essentially the only difference here is our script uses js.rand as the PRNG, and theirs uses steamID as one, though the last post by @meishuu suggests a logical error in their code, making me wary to switch to it.

Answer 15 · 2015-06-15T21:05:57.000Z

My test script:

var uint64 = require('./uint64'); // https://github.com/pierrec/js-cuint

var hashCode=function(str) {
    var t=0, i, char;
    if (0 === str.length) {
        return t;
    }

    for (i=0; i<str.length; i++) {
        char=str.charCodeAt(i);
        t=(t<<5)-t+char;
        t&=t;
    }

    return t;
};

// https://developer.valvesoftware.com/wiki/SteamID
// universe: 1, account type: 1, instance: 1
var hi = (1 << (56 - 32)) | (1 << (52 - 32)) | (1 << (32 - 32));

var bins = [0, 0, 0, 0];
var i, j;
var n = uint64(1, hi), u = uint64(1, 0);
for (i = 0; i < 128; i++) {
  console.log(i + ' / 128 (' + (i * 100 / 128).toFixed(2) + '%)');
  for (j = 0; j < 65536; j++) {
    var str = n.add(u).toString(10);
    var elem = Math.abs(hashCode(str) % 4);
    bins[elem]++;
  }
}
console.log(bins);

Result: [ 2097151, 2097146, 2097153, 2097158 ]

It's close, but not quite. I think that's enough justification to just not bother with cluttering the code with hashCode.

My attempt at explaining this is that hashCode operates on the string representation of a number. This means the input for each iteration is in the range [48, 57] for ASCII '0' - '9'. Even with the bit-twiddling it does, we won't end up at a nice, even distribution over the ring Z_4.

Answer 16 · 2015-06-15T21:07:54.000Z

Agreed. Closing this Issue once again.