Ignore retired players
roysti10 opened this issue · 4 comments
Describe the bug
Retired players no longer play, but the web crawler still collects them. They need to be filtered out.
To Reproduce
Steps to reproduce the behavior:
- Follow the instructions in README.md and watch the crawler once it starts collecting players. The retired players also show up in data_crawler/ids_names.csv.
Possible Solution
In cralwer/cricketcrawler/spiders/howstat.py, in the function parse_player, yield the item only when the player is not retired:
if not retired:
    yield PlayerItem(name=url[url.find("?PlayerID=") + 10:], gametype=gametype, folder=".", longname=name, retired=retired)
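The filter proposed above can be tried outside the spider. What follows is only a minimal sketch: the PlayerItem dataclass stands in for the project's real item class, and the helper names player_id_from_url and active_player_items, as well as the example URL, are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class PlayerItem:
    # Stand-in for the project's item class (assumption: same field names).
    name: str
    gametype: str
    folder: str
    longname: str
    retired: bool


def player_id_from_url(url: str) -> str:
    # Take everything after "?PlayerID=", like the slice in parse_player.
    marker = "?PlayerID="
    return url[url.find(marker) + len(marker):]


def active_player_items(url: str, gametype: str, name: str,
                        retired: bool) -> Iterator[PlayerItem]:
    # Hypothetical helper: yield an item only for players not marked retired.
    if not retired:
        yield PlayerItem(name=player_id_from_url(url), gametype=gametype,
                         folder=".", longname=name, retired=retired)
```

Calling list(active_player_items(url, "ODI", "Example Player", True)) gives an empty list, so a retired player never reaches the CSV. Note that this trusts the retired flag completely.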
Desktop (please complete the following information):
- Version: master
Note from my side:
I included the retired field because I'm not sure that every player I mark as retired actually is. Also, you might notice that a player can occur up to three times in the CSV, because there is a page for each player for Test, T20 and ODI if they played in them. It is possible that a player no longer plays in one of those categories but is still active in another.
I included the retired field because I'm not sure that every player I mark as retired actually is.
Could you elaborate on this? I didn't quite get you.
Also, you might notice that a player can occur up to three times in the CSV, because there is a page for each player for Test, T20 and ODI if they played in them. It is possible that a player no longer plays in one of those categories but is still active in another.
I was actually going to change this to a single entry per player when I got the time, and delete the gametype column entirely. If he isn't active in the other formats, that shouldn't matter, because his records will simply not be present in the respective format's folder. That shouldn't cause any problems.
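Collapsing the CSV to one entry per player could look roughly like the sketch below. It is not the project's code. It assumes each row has the columns name, gametype, longname and retired, with retired stored as the text "True" or "False", and it assumes a player should count as retired only if every format marks him as retired. The function name merge_player_rows and the sample rows are invented.

```python
from typing import Dict, Iterable, List


def merge_player_rows(rows: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    # Keep one entry per player id and drop the gametype column.
    merged: Dict[str, Dict[str, object]] = {}
    for row in rows:
        retired = row["retired"].strip().lower() == "true"
        entry = merged.get(row["name"])
        if entry is None:
            merged[row["name"]] = {
                "name": row["name"],
                "longname": row["longname"],
                "retired": retired,
            }
        else:
            # Assumption: retired overall only if retired in every format.
            entry["retired"] = entry["retired"] and retired
    return list(merged.values())


# Made-up sample rows, one per (player, gametype) page.
sample_rows = [
    {"name": "1001", "gametype": "Test", "longname": "Example Player A", "retired": "True"},
    {"name": "1001", "gametype": "ODI", "longname": "Example Player A", "retired": "False"},
    {"name": "1002", "gametype": "T20", "longname": "Example Player B", "retired": "True"},
]
players = merge_player_rows(sample_rows)
```

With these sample rows, player 1001 stays active because one format still lists him as not retired.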
I haven't verified that the data in the retired column is correct, that is, that every player with retired=True is actually retired. I also haven't checked whether this is the same for every gametype (meaning that when a player is retired, he is marked as such in every gametype).
Ah, then fixing this issue could be dangerous, since players who are still active might get filtered out. I'll verify this ASAP.
I'll add a wontfix label for now.
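One part of that verification, checking whether the retired flag agrees across gametypes, could start from a sketch like the one below. It assumes rows with the columns name, gametype and retired (as "True"/"False" text); the function name inconsistent_retired_flags and the sample data are hypothetical. It cannot tell whether a flag is actually correct, only whether the formats disagree.

```python
from collections import defaultdict
from typing import Dict, Iterable


def inconsistent_retired_flags(rows: Iterable[Dict[str, str]]) -> Dict[str, Dict[str, bool]]:
    # Collect the retired flag per player id and gametype.
    flags: Dict[str, Dict[str, bool]] = defaultdict(dict)
    for row in rows:
        flags[row["name"]][row["gametype"]] = row["retired"].strip().lower() == "true"
    # Keep only players whose flag differs between gametypes.
    return {
        player_id: by_type
        for player_id, by_type in flags.items()
        if len(set(by_type.values())) > 1
    }


# Made-up sample rows for illustration.
check_rows = [
    {"name": "1001", "gametype": "Test", "retired": "True"},
    {"name": "1001", "gametype": "ODI", "retired": "False"},
    {"name": "1002", "gametype": "T20", "retired": "True"},
]
mismatches = inconsistent_retired_flags(check_rows)
```

Any player id that appears in mismatches would need a manual look before the filter is switched on.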