Sorting/Scoring System For Instances
Closed this issue · 2 comments
Discussed in https://github.com/tgxn/lemmy-explorer/discussions/23
Originally posted by tgxn June 14, 2023
Because we need to determine whether an instance is "good", there needs to be a way to score each instance based on the data we have about it.
Currently, my thinking/implementation looks at the lists of federated sites, and scores each instance based on the number of other instances that refer to it (in the linked, allowed and blocked lists).
Scoring is applied by the following rules:
Instances
let score = 0;
if (linkedFederation[siteBaseUrl]) {
  score += linkedFederation[siteBaseUrl];
}
if (allowedFederation[siteBaseUrl]) {
  score += allowedFederation[siteBaseUrl] * 2;
}
if (blockedFederation[siteBaseUrl]) {
  score -= blockedFederation[siteBaseUrl] * 10;
}
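To make the rule above concrete, here is a minimal sketch of how the three tallies might be built and then combined. The input shape (one record per crawled instance with `linked`, `allowed` and `blocked` arrays) and the helper names `tallyFederation` and `scoreInstance` are assumptions for illustration, not the project's actual code; only the weights (1, 2, 10) come from the snippet above.

```javascript
// Hypothetical input: one record per crawled instance, listing the base URLs
// it reports as linked, allowed and blocked.
function tallyFederation(instances) {
  const linkedFederation = {};
  const allowedFederation = {};
  const blockedFederation = {};

  // Count how many instances mention each base URL in a given list.
  const bump = (tally, urls = []) => {
    for (const url of urls) {
      tally[url] = (tally[url] || 0) + 1;
    }
  };

  for (const instance of instances) {
    bump(linkedFederation, instance.linked);
    bump(allowedFederation, instance.allowed);
    bump(blockedFederation, instance.blocked);
  }
  return { linkedFederation, allowedFederation, blockedFederation };
}

// Same weights as the rule above: linked x1, allowed x2, blocked x(-10).
function scoreInstance(siteBaseUrl, tallies) {
  const linked = tallies.linkedFederation[siteBaseUrl] || 0;
  const allowed = tallies.allowedFederation[siteBaseUrl] || 0;
  const blocked = tallies.blockedFederation[siteBaseUrl] || 0;
  return linked + allowed * 2 - blocked * 10;
}

const tallies = tallyFederation([
  { baseUrl: "a.example", linked: ["b.example", "c.example"], allowed: ["b.example"], blocked: [] },
  { baseUrl: "b.example", linked: ["a.example", "c.example"], allowed: [], blocked: ["c.example"] },
  { baseUrl: "c.example", linked: ["b.example"], allowed: [], blocked: [] },
]);

console.log(scoreInstance("b.example", tallies)); // 4  (2 linked + 1 allowed * 2)
console.log(scoreInstance("c.example", tallies)); // -8 (2 linked - 1 blocked * 10)
```

With these weights a single block outweighs up to ten links, so one hostile instance can push a small instance below zero.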
Communities
Uses the same base score as instances, and then adjusts it by the community's subscriber count.
let score = 0;
if (linkedFederation[siteBaseUrl]) {
  score += linkedFederation[siteBaseUrl];
}
if (allowedFederation[siteBaseUrl]) {
  score += allowedFederation[siteBaseUrl] * 2;
}
if (blockedFederation[siteBaseUrl]) {
  score -= blockedFederation[siteBaseUrl] * 10;
}
// also scale the score by the community's subscriber count
score = score * community.counts.subscribers;
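As a rough illustration of how the community rule behaves, the sketch below wraps it in a function and runs it on made-up numbers. The function names `baseScore` and `scoreCommunity` and the sample tallies are hypothetical; `community.counts.subscribers` is the field used in the snippet above.

```javascript
// Base score of the hosting instance, using the same weights as above.
function baseScore(siteBaseUrl, tallies) {
  const linked = tallies.linkedFederation[siteBaseUrl] || 0;
  const allowed = tallies.allowedFederation[siteBaseUrl] || 0;
  const blocked = tallies.blockedFederation[siteBaseUrl] || 0;
  return linked + allowed * 2 - blocked * 10;
}

// Community score: the instance's base score multiplied by subscriber count.
function scoreCommunity(community, siteBaseUrl, tallies) {
  return baseScore(siteBaseUrl, tallies) * community.counts.subscribers;
}

// Made-up tallies for two instances.
const sampleTallies = {
  linkedFederation: { "a.example": 5, "b.example": 3 },
  allowedFederation: { "a.example": 1 },
  blockedFederation: { "b.example": 1 },
};

const popular = { name: "tech", counts: { subscribers: 120 } };
const blockedHost = { name: "memes", counts: { subscribers: 1000 } };

console.log(scoreCommunity(popular, "a.example", sampleTallies));     // 840   (7 * 120)
console.log(scoreCommunity(blockedHost, "b.example", sampleTallies)); // -7000 (-7 * 1000)
```

Two properties of the multiplication are worth keeping in mind when tuning: a negative instance score is amplified by subscriber count, and a community with zero subscribers always scores 0 regardless of its instance.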
These rules are not ideal yet: the weights (1 for linked, 2 for allowed, -10 for blocked) have not been validated, and I'd need to run some more analysis to determine whether they are tuned correctly.
I'm also thinking it might be worthwhile to log an "uptime" or "first seen" value as well, to determine whether an instance has been around/up for a while.
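One possible shape for the "first seen" idea, purely as a sketch: record a timestamp the first time the crawler sees an instance and derive its age in days from that. The in-memory `Map` store and the names `recordSighting` and `daysKnown` are assumptions; how (or whether) the age would feed into the score is left open here.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Record that an instance responded at time `now` (milliseconds since epoch).
// firstSeen is only set once; lastSeen is refreshed on every sighting.
function recordSighting(store, siteBaseUrl, now) {
  const existing = store.get(siteBaseUrl);
  if (existing) {
    existing.lastSeen = now;
    return existing;
  }
  const entry = { firstSeen: now, lastSeen: now };
  store.set(siteBaseUrl, entry);
  return entry;
}

// Whole days since the instance was first seen, or 0 if it is unknown.
function daysKnown(store, siteBaseUrl, now) {
  const entry = store.get(siteBaseUrl);
  if (!entry) return 0;
  return Math.floor((now - entry.firstSeen) / MS_PER_DAY);
}

const sightings = new Map();
recordSighting(sightings, "a.example", Date.UTC(2023, 5, 14)); // 14 June 2023
recordSighting(sightings, "a.example", Date.UTC(2023, 5, 20)); // seen again later

console.log(daysKnown(sightings, "a.example", Date.UTC(2023, 5, 24))); // 10
```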
I think active users per week would be the best default sorting. This avoids over-emphasizing communities that might be older but not as active.
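A default sort along these lines could look like the sketch below. It assumes each item carries a `counts.users_active_week` field (the name Lemmy's aggregates use, as far as I know, but treat it as an assumption here) and a precomputed `score`; falling back to `score` on ties is my own choice for illustration, not something decided in this thread.

```javascript
// Sort descending by weekly active users, breaking ties with the score.
// Returns a new array so the caller's list is left untouched.
function sortByWeeklyActiveUsers(items) {
  return [...items].sort((a, b) => {
    const diff = (b.counts.users_active_week || 0) - (a.counts.users_active_week || 0);
    return diff !== 0 ? diff : (b.score || 0) - (a.score || 0);
  });
}

// Made-up communities: one old and large but quiet, two smaller but active.
const communities = [
  { name: "oldbig", counts: { subscribers: 9000, users_active_week: 12 }, score: 50 },
  { name: "fresh", counts: { subscribers: 800, users_active_week: 340 }, score: 10 },
  { name: "steady", counts: { subscribers: 2000, users_active_week: 340 }, score: 30 },
];

const sortedNames = sortByWeeklyActiveUsers(communities).map((c) => c.name);
console.log(sortedNames); // [ 'steady', 'fresh', 'oldbig' ]
```

Here the large but quiet community drops to the bottom even though it has the most subscribers, which is the effect described above.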