HackerSpace-PESU/Best11-Fantasycricket

Ignore retired players

roysti10 opened this issue · 4 comments

Describe the bug
Retired players no longer play, but the web crawler still includes them. They need to be filtered out.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the instructions in README.md and watch the output once the crawler starts collecting players. The retired players also show up in data_crawler/ids_names.csv.

Possible Solution
The solution:
In crawler/cricketcrawler/spiders/howstat.py, in the function parse_player:

if not retired:
    yield PlayerItem(name=url[url.find("?PlayerID=") + 10:], gametype=gametype, folder=".", longname=name, retired=retired)
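For anyone who wants to try the filter outside the spider, here is a minimal standalone sketch. It uses plain dicts instead of Scrapy items, and the helper names `player_id_from_url` and `active_players` as well as the example URLs are made up for illustration. It also reads the ID with `urllib.parse` rather than slicing, since `str.find` returns -1 when `?PlayerID=` is missing and the slice would then silently return the wrong part of the URL.

```python
from urllib.parse import urlparse, parse_qs


def player_id_from_url(url):
    """Return the PlayerID query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("PlayerID")
    return values[0] if values else None


def active_players(records):
    """Yield only the records whose 'retired' flag is falsy."""
    for record in records:
        if not record["retired"]:
            yield record


# Hypothetical sample data; the real spider builds these values while parsing.
records = [
    {"url": "http://example.com/PlayerOverview.asp?PlayerID=1001", "retired": False},
    {"url": "http://example.com/PlayerOverview.asp?PlayerID=1002", "retired": True},
]

ids = [player_id_from_url(r["url"]) for r in active_players(records)]
print(ids)
```

Whether the spider should switch to `parse_qs` is a separate question; the slicing in the proposed fix works as long as every URL ends with the `PlayerID` value.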

Screenshots
Screenshot from 2020-11-14 13-51-43

Desktop (please complete the following information):

  • Version [master]

Note from my side:
I included the retired players because I'm not sure that everyone I mark as retired actually is. Also, you might notice that a player can occur up to three times in the CSV, because there is a page for each player for Test, T20, and ODI if they played in them. A player might no longer be playing in one of those categories but still be active in another.

I included the retired players because I'm not sure that everyone I mark as retired actually is.

Could you elaborate on this? I didn't quite get you.

Also, you might notice that a player can occur up to three times in the CSV, because there is a page for each player for Test, T20, and ODI if they played in them. A player might no longer be playing in one of those categories but still be active in another.

I was actually going to change this to a single entry when I got the time, and delete the gametype column entirely. If he isn't active in the other formats, that shouldn't matter, because his records will simply not be present in the respective format's folder. That shouldn't cause any problems.
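A rough sketch of what collapsing the CSV to one entry per player could look like. The row layout is an assumption based on the `PlayerItem` fields in the proposed fix (`name` holding the player ID, plus `gametype`, `longname`, `retired`), and `one_row_per_player` is a hypothetical helper, not something that exists in the repo.

```python
def one_row_per_player(rows):
    """Keep the first row seen for each player ID and drop the gametype column."""
    seen = {}
    for row in rows:
        player_id = row["name"]  # assumption: 'name' stores the PlayerID
        if player_id not in seen:
            seen[player_id] = {k: v for k, v in row.items() if k != "gametype"}
    return list(seen.values())


# Made-up rows standing in for the crawler output.
rows = [
    {"name": "1001", "gametype": "Test", "longname": "Player A", "retired": False},
    {"name": "1001", "gametype": "ODI", "longname": "Player A", "retired": False},
    {"name": "1002", "gametype": "T20", "longname": "Player B", "retired": True},
]

unique_rows = one_row_per_player(rows)
for row in unique_rows:
    print(row)
```

Note that this keeps the `retired` value from whichever row comes first, so it is only safe if the flag agrees across formats.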

I included the retired players because I'm not sure that everyone I mark as retired actually is.

Could you elaborate on this? I didn't quite get you.

I haven't verified that the data in the retired column is correct, i.e. that every player with retired=True is actually retired, and I haven't checked whether this is the same for every gametype (meaning that when a player is retired he is marked as such in every gametype).
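The second check (same flag in every gametype) could be done with something like the sketch below. It assumes rows shaped like the `PlayerItem` fields, with `name` holding the player ID and `retired` already converted to a boolean; rows read with `csv.DictReader` would carry strings and need converting first. `inconsistent_retired_flags` is a hypothetical helper and the sample rows are made up.

```python
from collections import defaultdict


def inconsistent_retired_flags(rows):
    """Return the player IDs whose retired flag differs between gametypes."""
    flags = defaultdict(set)
    for row in rows:
        flags[row["name"]].add(row["retired"])
    # More than one distinct value means the formats disagree.
    return sorted(pid for pid, values in flags.items() if len(values) > 1)


# Made-up rows standing in for the crawler output.
rows = [
    {"name": "1001", "gametype": "Test", "retired": True},
    {"name": "1001", "gametype": "ODI", "retired": True},
    {"name": "1002", "gametype": "Test", "retired": True},
    {"name": "1002", "gametype": "T20", "retired": False},
]

mismatched = inconsistent_retired_flags(rows)
print(mismatched)
```

This only shows whether the flag is consistent, not whether it is correct; checking that a player marked retired really is retired still needs a comparison against another source.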

Ah, then this issue could be dangerous if fixed. I'll verify this ASAP.
I'll add a wontfix label for now.