I found a bug in this function <fb_league_stats>
Closed this issue · 7 comments
I don't think this is really a bug. This function is "experimental" and can be highly dependent on
- your internet connection
- FBref's server latency
On the backend, we use chromote and have to wait for the table to be loaded via JavaScript on the page. This is the only function like this in the package AFAIK.
For transparency, I basically copied over a lot of the rvest / promise logic for fb_league_stats() (see worldfootballr_chromote_session()) from this branch in the {rvest} package. That branch has been in draft state for months, seemingly because it has never worked completely.
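To illustrate what "waiting for the table to be loaded" means in practice, here is a minimal sketch of a polling helper in base R. It is not the package's actual implementation; `poll_until()` and `table_ready()` are hypothetical names, and the condition is a stand-in for a real check against the browser.

```r
# Poll a condition until it is TRUE or a timeout (in seconds) elapses.
# `condition` is any zero-argument function returning TRUE or FALSE.
poll_until <- function(condition, timeout = 10, interval = 0.25) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(condition())) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Stand-in for "has the table been rendered yet?".
# It becomes TRUE on the third check.
checks <- 0
table_ready <- function() {
  checks <<- checks + 1
  checks >= 3
}

poll_until(table_ready, timeout = 5, interval = 0.1)
```

With a chromote session `b`, the condition could plausibly wrap something like `b$Runtime$evaluate("document.querySelector('table') !== null")$result$value`, but the selector and the session setup are assumptions here. A slow connection or a slow server simply makes the condition take longer to become TRUE, which is why a fixed timeout can fail on one day and succeed on another.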
I can also reproduce it today, although yesterday there was no issue.
Strangely, the function works for team but doesn't work for player, even with a stat_type such as keepers, where the tables are of comparable sizes.
For team:
fb_league_stats(country = "ENG", gender = "M", season_end_year = 2024, tier = "1st", non_dom_league_url = NA,
stat_type = "keepers",
team_or_player = "team"
)
# A tibble: 20 × 23
Team_or_Opponent Squad Num_Players `MP_Playing Time` `Starts_Playing Time` `Min_Playing Time` Mins_Per_90_Playing …¹
<chr> <chr> <int> <int> <int> <dbl> <dbl>
1 team Arsen… 2 15 15 1350 15
2 team Aston… 2 15 15 1350 15
........................
For player:
fb_league_stats(country = "ENG", gender = "M", season_end_year = 2024, tier = "1st", non_dom_league_url = NA,
stat_type = "keepers",
team_or_player = "player"
)
Error: Request failed after 3 attempts.
# A tibble: 0 × 1
# ℹ 1 variable: url <chr>
I've intentionally left this issue open to give notice that we're aware that this function is unreliable. I'm not sure there's a great solution without using Selenium, which we've implicitly decided to not use, so as to reduce the scope of this package.
I think the reason why the function sometimes works but sometimes doesn't is due to server latency on FBref's end, but I'm not entirely sure. The function relies on {promises} for async behavior, which can be very dependent on Internet connection and server response time.
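Since the failure is intermittent, one user-side mitigation is to retry the call with increasing waits. The sketch below is only an illustration under assumptions: `retry_until_rows()` is a hypothetical helper, not part of worldfootballR. It treats a zero-row result as a failure, because the output above shows that a failed call returns an empty tibble rather than stopping with an error.

```r
# Call `fetch()` until it returns a data frame with rows,
# or until `max_attempts` is reached.
retry_until_rows <- function(fetch, max_attempts = 3, base_wait = 1) {
  result <- NULL
  for (attempt in seq_len(max_attempts)) {
    # Treat a thrown error like an empty result.
    result <- tryCatch(fetch(), error = function(e) NULL)
    if (is.data.frame(result) && nrow(result) > 0) {
      return(result)
    }
    if (attempt < max_attempts) {
      # Exponential backoff: 1x, 2x, 4x, ... of base_wait seconds.
      Sys.sleep(base_wait * 2^(attempt - 1))
    }
  }
  result  # last result: an empty data frame or NULL
}

# Hypothetical usage (requires worldfootballR and a network connection):
# keepers <- retry_until_rows(function() {
#   worldfootballR::fb_league_stats(
#     country = "ENG", gender = "M", season_end_year = 2024, tier = "1st",
#     non_dom_league_url = NA, stat_type = "keepers", team_or_player = "player"
#   )
# })
```

This does not fix the underlying problem. It only gives a slow page more chances to finish loading, and it adds to whatever retrying the function already does internally.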
Thanks for the answer @tonyelhabr!
One more note that may help in solving this issue. I found a workaround by using fb_big5_advanced_season_stats(season_end_year = c(2024), stat_type = "shooting", team_or_player = "player") instead and filtering by the league name afterwards. The fact that fb_big5_advanced_season_stats is working while fb_league_stats isn't might indicate that this is not a server latency issue. However, this is only my guess 😃
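The filtering step of this workaround could look roughly like the sketch below. `filter_league()` is a hypothetical helper, and the column name `Comp` is an assumption about what fb_big5_advanced_season_stats() returns; check `names()` on the real result first. A toy data frame stands in for the real data so the example runs offline.

```r
# Keep only the rows for one league.
# `comp_col` is the (assumed) name of the competition column.
filter_league <- function(stats, league, comp_col = "Comp") {
  stopifnot(is.data.frame(stats), comp_col %in% names(stats))
  # which() drops rows where the competition is NA.
  stats[which(stats[[comp_col]] == league), , drop = FALSE]
}

# Toy stand-in for the Big 5 player table (made-up values).
big5 <- data.frame(
  Player = c("A", "B", "C"),
  Comp   = c("Premier League", "La Liga", "Premier League"),
  Gls    = c(10, 7, 3)
)

filter_league(big5, "Premier League")

# With the real data (requires worldfootballR and a network connection):
# big5 <- worldfootballR::fb_big5_advanced_season_stats(
#   season_end_year = 2024, stat_type = "shooting", team_or_player = "player"
# )
# epl <- filter_league(big5, "Premier League")
```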
I'm glad that works for you! That outcome is not unexpected to me. fb_big5_advanced_season_stats() doesn't require any special backend logic (i.e. promises) because FBref loads that data on the server side. The individual league player stats are loaded on the client side (in the browser), so they can't be scraped in the typical manner (i.e. with rvest::read_html()).
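The server-side versus client-side distinction can be shown with a small offline sketch, assuming the rvest package is installed. The HTML here is invented for the demonstration and is not FBref's markup. The first table is part of the document itself; the second would only exist after a browser runs the script.

```r
library(rvest)

page <- minimal_html('
  <table id="server-side">
    <tr><th>Squad</th><th>MP</th></tr>
    <tr><td>Arsenal</td><td>15</td></tr>
  </table>
  <script>
    // A browser would run this and add a second table; an HTML parser will not.
    var t = document.createElement("table");
    t.id = "client-side";
    document.body.appendChild(t);
  </script>
')

# Only tables present in the markup are visible to the parser.
tables <- html_elements(page, "table")
html_attr(tables, "id")
html_table(tables[[1]])
```

The parser never executes the script, so the "client-side" table is not found. That is the situation a headless browser such as the one driven by chromote is meant to handle.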
Issue is being archived.