dfm/osrc

Stats frozen?

Opened this issue · 14 comments

My stats seem to be pretty out of date. Could I have accidentally opted out at some point or something? It seems to be listing P01, a repo I removed sometime last year.

I came here about the same issue.

https://osrc.dfm.io/tejasmanohar/ is my new profile (I changed my username); https://osrc.dfm.io/tejas-manohar/ is the old one. Any idea how to get my new username, 'tejasmanohar', reindexed? @dfm

I am confirming that this issue has popped up again since the new year.

On the New Year's issue popping up:
It's because GitHub deprecated the Timeline API in favor of the Events API (1/1/2015). Seems like we should update for that.

dfm commented

Sorry about being slow to respond. I've been totally swamped with my PhD thesis so I don't have much time to maintain this :-\

GitHub's API changes are actually not the issue. I get the data from githubarchive.org and it is updating nightly with events. Either way, I can't confirm this issue but if y'all can give me more details of exactly what symptoms you're seeing I'll look into it!

The format of the JSON has changed, though, I think. I am currently looking into this, but I have no experience with githubarchive, so it may take a while.

dfm commented

Ah. Interesting point. I'll look into that too! Thanks!

My % rank in the JS language went up (from roughly top 10% to roughly top 20%) right after changing my username?

dfm commented

Unfortunately the stats don't transfer when you change username. Every event is tied to a username (not an ID). I should probably change this eventually but I'm not sure that I'll be able to do it anytime too soon! But we'll see :-)

ah ok

So, I was able to clone the repo and get my local version to run (damn, the initial indexing takes loooong). I can confirm that the problem persists and that it all went downhill on 01/01/2015 (for my repos, at least).

Comparing the datasets from 31 December 2014 and 1 January 2015, there are quite a few differences. Let's compare a push event:

{'actor': 'mYmNeo',
 'actor_attributes': {'blog': '',
                      'company': '',
                      'email': 'mymneo@163.com',
                      'gravatar_id': '',
                      'location': 'China',
                      'login': 'mYmNeo',
                      'name': 'mYmNeo',
                      'type': 'User'},
 'created_at': '2014-12-31T00:05:58-08:00',
 'payload': {'head': '0f1c82b6444aff88a08665fd2ba02d43bb39bf0d',
             'ref': 'refs/heads/master',
             'shas': [['0f1c82b6444aff88a08665fd2ba02d43bb39bf0d',
                       'mymneo@163.com',
                       'add num at first not last',
                       'mYmNeo',
                       True]],
             'size': 1},
 'public': True,
 'repository': {'archive_url': 'https://api.github.com/repos/mYmNeo/query-whois/{archive_format}{/ref}',
                'assignees_url': 'https://api.github.com/repos/mYmNeo/query-whois/assignees{/user}',
                'blobs_url': 'https://api.github.com/repos/mYmNeo/query-whois/git/blobs{/sha}',
                [...cut out for brevity...]
                'watchers': 0,
                'watchers_count': 0},
 'type': 'PushEvent',
 'url': 'https://github.com/mYmNeo/query-whois/compare/c62309cc61...0f1c82b644'}

That's how it used to look.

{'actor': {'avatar_url': 'https://avatars.githubusercontent.com/u/9152315?',
           'gravatar_id': '',
           'id': 9152315,
           'login': 'davidjhulse',
           'url': 'https://api.github.com/users/davidjhulse'},
 'created_at': '2015-01-01T00:00:00Z',
 'id': '2489368070',
 'payload': {'before': '86ffa724b4d70fce46e760f8cc080f5ec3d7d85f',
             'commits': [{'author': {'email': 'david.hulse@live.com',
                                     'name': 'davidjhulse'},
                          'distinct': True,
                          'message': 'Altered BingBot.jar\n'
                                     '\n'
                                     'Fixed issue with multiple account '
                                     'support',
                          'sha': 'a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81',
                          'url': 'https://api.github.com/repos/davidjhulse/davesbingrewardsbot/commits/a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81'}],
             'distinct_size': 1,
             'head': 'a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81',
             'push_id': 536740396,
             'ref': 'refs/heads/master',
             'size': 1},
 'public': True,
 'repo': {'id': 28635890,
          'name': 'davidjhulse/davesbingrewardsbot',
          'url': 'https://api.github.com/repos/davidjhulse/davesbingrewardsbot'},
 'type': 'PushEvent'}

That's how it looks now. The structure is pretty different. Just look at actor: it used to be a string, with a separate entity (actor_attributes) holding the actor's info, and is now an object that holds all the info itself. Also, repository is now repo and doesn't come with nearly as much info as it used to. I don't know anything about your indexing with osrcd, but I assume that breaks a few things?
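To make the difference concrete, here is a rough sketch of how both layouts could be mapped onto one flat shape before indexing. This is not how osrcd does it; `normalize_push_event` is a hypothetical helper, and the old-format `repository` keys `owner` and `name` are an assumption, since that part of the sample was cut out.

```python
def normalize_push_event(event):
    """Map an old- or new-layout PushEvent onto one flat dict (hypothetical helper)."""
    actor = event.get("actor")
    payload = event.get("payload") or {}

    if isinstance(actor, dict):
        # New layout: actor is an object, repo info lives under "repo".
        login = actor.get("login")
        actor_id = actor.get("id")
        repo_name = (event.get("repo") or {}).get("name")
        commits = [
            {
                "sha": c["sha"],
                "email": c["author"]["email"],
                "message": c["message"],
                "name": c["author"]["name"],
                "distinct": c["distinct"],
            }
            for c in payload.get("commits", [])
        ]
    else:
        # Old layout: actor is a plain string, repo info lives under "repository".
        login = actor
        actor_id = None  # the old sample shows no numeric actor id
        repo = event.get("repository") or {}
        # Assumption: "owner" and "name" exist here (elided in the sample).
        owner, name = repo.get("owner"), repo.get("name")
        repo_name = f"{owner}/{name}" if owner and name else None
        # Old commits are positional lists: [sha, email, message, name, distinct].
        fields = ("sha", "email", "message", "name", "distinct")
        commits = [dict(zip(fields, entry)) for entry in payload.get("shas", [])]

    return {
        "type": event.get("type"),
        "login": login,
        "actor_id": actor_id,
        "repo": repo_name,
        "created_at": event.get("created_at"),
        "commits": commits,
    }
```

With something like this in front of the indexer, the rest of the code would only ever see one shape, whichever day the archive file comes from.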

I will be doing a bit more research, just wanted to give you a short heads-up in case you haven't looked into it yet.

Also, feel free to correct me if I am somehow misguided and looking in the wrong direction here.

dfm commented

Nope this looks right! This is definitely the problem. Thanks for looking into it. I'll definitely have to fix it! Damn.

I will try to come up with a pull request in the next couple of days if I can wrap my head around your code and find the time (it will be a busy weekend for me anyway). But feel free to beat me to it if you find the time. And thanks for all the work you put into the tool!

One major drawback is that the repo's language is no longer included in push events. The rest should not be a problem. I am currently testing a few experimental changes I just concocted; this may take a while.
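One possible workaround for the missing language, sketched under assumptions: build a repo-to-language lookup from the pre-2015 events and reuse it for the newer ones. Both function names are hypothetical, and I am assuming the old `repository` dict exposes `owner`, `name`, and `language` keys (they fall in the part of the sample that was cut out). Repos that only appear after the format change would still come back without a language.

```python
def build_language_index(old_events):
    """Collect {"owner/name": language} from old-layout events (hypothetical helper)."""
    index = {}
    for event in old_events:
        repo = event.get("repository") or {}
        # Assumption: these three keys exist in the old layout.
        owner = repo.get("owner")
        name = repo.get("name")
        language = repo.get("language")
        if owner and name and language:
            index[f"{owner}/{name}"] = language
    return index


def language_for(new_event, index):
    """Look up the language for a new-layout event; None if the repo is unknown."""
    repo = new_event.get("repo") or {}
    return index.get(repo.get("name"))
```

This only papers over the gap for repos that were already indexed, so it is a stopgap rather than a fix.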