Too many friends and too many colleagues
ambitious-octopus opened this issue · 5 comments
Not "too large", just larger than the random forest expects. It might be a bug, or just high variance in the simulation. In the second case, what does the variance depend on?
The problem is related to the size of the jobs. To create jobs during setup, scalar values representing the size of each company are drawn. Possible values range from 1 to 248, and the probability of drawing a given value depends on how often it appears in the input file: /simulator/inputs/palermo/data/employer_sizes.csv
PyPROTON-OC/protonoc/simulator/inputs/palermo/data/employer_sizes.csv
Lines 1 to 522 in f53721c
(522 rows, one employer size per row; summarized here as size and number of rows, in file order)
size | rows |
---|---|
1 | 19 |
2 | 305 |
3 | 74 |
4 | 78 |
10 | 17 |
12 | 13 |
20 | 2 |
34 | 8 |
66 | 2 |
139 | 1 |
248 | 2 |
22 | 1 |
The job generation algorithm produces very different outcomes across runs: some runs end up with very large companies and others with very small ones. As a result, when agents form professional connections, the pool to draw from is very large in one case and very small in the other. As the following chart shows, the number of companies (employers) is inversely related to the total number of professional connections.
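To make the run-to-run variability concrete, here is a minimal sketch that draws employer sizes weighted by their row counts in the file quoted above. The stopping rule (draw until the jobs cover `n_jobs`) and the helper names `draw_employer_sizes` and `coworker_pairs` are assumptions for illustration, not the actual setup code in PyPROTON-OC.

```python
import random

# Employer sizes and how many rows each has in employer_sizes.csv,
# counted from the file contents quoted in this issue.
SIZE_COUNTS = {1: 19, 2: 305, 3: 74, 4: 78, 10: 17, 12: 13,
               20: 2, 22: 1, 34: 8, 66: 2, 139: 1, 248: 2}


def draw_employer_sizes(n_jobs, rng):
    """Draw sizes (weighted by row count) until they cover n_jobs jobs.

    The stopping rule is an assumption made for this sketch.
    """
    values = list(SIZE_COUNTS)
    weights = list(SIZE_COUNTS.values())
    sizes = []
    total = 0
    while total < n_jobs:
        size = rng.choices(values, weights=weights, k=1)[0]
        sizes.append(size)
        total += size
    return sizes


def coworker_pairs(sizes):
    """Upper bound on coworker links: every pair inside a company is counted."""
    return sum(s * (s - 1) // 2 for s in sizes)


if __name__ == "__main__":
    # One line per seed: seed, number of employers, largest employer, pair count.
    for seed in range(5):
        sizes = draw_employer_sizes(1000, random.Random(seed))
        print(seed, len(sizes), max(sizes), coworker_pairs(sizes))
```

The pair count grows quadratically with company size: a single company of 248 contains 248 * 247 / 2 = 30628 coworker pairs, while 124 companies of size 2 contain only 124. A run that happens to draw one of the rare large sizes therefore has far fewer employers and a far larger pool of potential professional links.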
Solutions:
a. @mariopaolucci proposed to add a scaling factor during setup that modifies the size of companies based on the number of initial agents.
b. Decrease the maximum number of professional links allowed in remove_excess_professional_links.
PyPROTON-OC/protonoc/simulator/model.py
Lines 645 to 658 in f53721c
    def remove_excess_professional_links(self) -> None:
        """
        Given a max number (30) this procedure cut the excess professional links.
        Is activated at every tick.
        :return: None
        """
        for agent in self.schedule.agents:
            friends = agent.get_neighbor_list('professional')
            if len(friends) > 30:
                to_remove = self.random.choice(list(friends),
                                               int(len(friends) - 30),
                                               replace=False)
                for friend in to_remove:
                    friend.remove_professional(agent)
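For option (b), one way to experiment with a lower cap is to make the limit a parameter instead of the hard-coded 30. The sketch below is standalone and does not use the model's classes: `trim_excess_links`, the `links` dictionary and `max_links` are hypothetical names, and it assumes links are symmetric and that removing a link removes it on both sides.

```python
import random


def trim_excess_links(links, max_links, rng):
    """Randomly cut links so that no agent keeps more than max_links.

    links maps an agent id to the set of ids it is linked to (symmetric).
    """
    for agent in sorted(links):
        excess = len(links[agent]) - max_links
        if excess > 0:
            # Pick the links to drop uniformly at random, without replacement.
            for other in rng.sample(sorted(links[agent]), excess):
                links[agent].discard(other)
                links[other].discard(agent)


if __name__ == "__main__":
    # Six agents, everyone linked to everyone else (5 links each).
    links = {i: {j for j in range(6) if j != i} for i in range(6)}
    trim_excess_links(links, 3, random.Random(1))
    print({agent: len(neigh) for agent, neigh in links.items()})
```

Because cutting a link lowers the degree of both endpoints, agents processed later may end up below the cap, so the result depends on the iteration order.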
However, for a large number of agents (the relevant scale here is the biggest company, 248) the problem should disappear, that is, if we run with more than 10000 agents. For lower numbers of agents we will have to deal with the extra variability at setup.
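Option (a) could be read in several ways; the sketch below shows one possible interpretation, in which each drawn company size is shrunk in proportion to the population. The function `scale_employer_size` and the `reference_agents` parameter (set to 10000 only because that is the population mentioned above) are hypothetical and not part of the existing code.

```python
def scale_employer_size(size, n_agents, reference_agents=10_000):
    """Shrink a drawn company size in proportion to the number of agents.

    reference_agents is a hypothetical population at which sizes are used
    unchanged; larger populations are not scaled up in this sketch.
    """
    factor = min(1.0, n_agents / reference_agents)
    # Never return a company with fewer than one job.
    return max(1, round(size * factor))


if __name__ == "__main__":
    for size in (2, 34, 248):
        print(size, scale_employer_size(size, 1000), scale_employer_size(size, 3000))
```

With 1000 agents the largest company shrinks from 248 to 25 jobs, so a single draw can no longer absorb a quarter of the population. The trade-off is that small sizes all collapse to 1, which changes the shape of the size distribution.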
I evaluated the differences in the number of professional-links during setup. (number of agents = 1000) (netlogo-model: LABSS/PROTON-OC@5e7e34a)
netlogo_prof_links = [ 1416, 1736, 282, 2868, 886, 846, 3430,
1056, 2678, 692, 1026, 1738, 922, 878, 1196,
1802, 942, 450, 2644, 460, 466, 1048, 4542]
python_prof_links = [1104, 2134, 1220, 700, 1620, 676,
1500, 2660, 4960, 5420, 628, 426, 5038, 4830,
1378, 724, 2148, 4082, 524, 1198, 1784, 3068, 1678]
metric | netlogo | python |
---|---|---|
mean | 1478.43 | 2152.17 |
std | 1056.8 | 1584.02 |
min | 282 | 426 |
max | 4542 | 5420 |
(number of agents = 3000) (netlogo-model: LABSS/PROTON-OC@5e7e34a)
netlogo_n_prof_links = [9688, 5922, 13222, 2396, 11702, 1248, 5074, 2142, 9686, 13884, 3264, 7106]
python_n_prof_links = [6442, 8814, 4008, 10910, 13420, 2924, 6142, 9534, 4462, 10438, 3156, 11202]
metric | netlogo | python |
---|---|---|
mean | 7111.17 | 7621.0 |
std | 4274.2 | 3402.1 |
min | 1248 | 2924 |
max | 13884 | 13420 |
This goes against the results of the overall comparisons. Please note that this test uses 12 runs, versus 30.
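The summary tables can be recomputed from the raw lists with the standard library. The issue does not say whether the reported std is the population or the sample standard deviation, so this sketch returns both; the `summarize` helper is a hypothetical name.

```python
import statistics

netlogo_1000 = [1416, 1736, 282, 2868, 886, 846, 3430,
                1056, 2678, 692, 1026, 1738, 922, 878, 1196,
                1802, 942, 450, 2644, 460, 466, 1048, 4542]
python_1000 = [1104, 2134, 1220, 700, 1620, 676,
               1500, 2660, 4960, 5420, 628, 426, 5038, 4830,
               1378, 724, 2148, 4082, 524, 1198, 1784, 3068, 1678]
netlogo_3000 = [9688, 5922, 13222, 2396, 11702, 1248,
                5074, 2142, 9686, 13884, 3264, 7106]
python_3000 = [6442, 8814, 4008, 10910, 13420, 2924,
               6142, 9534, 4462, 10438, 3156, 11202]


def summarize(values):
    """Return the metrics used in the tables, plus the number of runs."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_population": statistics.pstdev(values),
        "std_sample": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for name, values in [("netlogo 1000", netlogo_1000),
                         ("python 1000", python_1000),
                         ("netlogo 3000", netlogo_3000),
                         ("python 3000", python_3000)]:
        print(name, summarize(values))
```

Including `n` next to each mean makes it easier to judge how much weight a difference deserves when the number of runs is small.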