Scraping FanMatch Page
Not really an issue but a request.
I was going to use this tool to scrape the FanMatch page daily and export the data to an Excel file. I was then going to use William Hill's API to get the lines and spreads for those games and match them up.
The FanMatch page shows the projected final score and total for every game according to the KenPom model.
My thought process was to view them daily and try to identify lines where the sportsbooks diverge from the KenPom model, to gain an edge.
For example, a spread might be Kansas -7, but KenPom has them winning by 11.
The request was to include the FanMatch page (https://kenpom.com/fanmatch.php) in the utils.
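To make the comparison concrete, here is a minimal sketch of the outlier check described above. It doesn't touch kenpompy or any sportsbook API; the `Game` class, `spread_edge`, `find_outliers`, and the 3-point threshold are all hypothetical names and values picked for illustration, and the Duke row is made-up sample data alongside the Kansas example.

```python
from dataclasses import dataclass


@dataclass
class Game:
    favorite: str
    model_margin: float  # points the model expects the favorite to win by
    book_spread: float   # sportsbook spread for the favorite, e.g. -7.0


def spread_edge(game):
    # A spread of -7 means the book expects a 7-point win, so adding the
    # (negative) spread to the model margin gives the disagreement in points.
    return game.model_margin + game.book_spread


def find_outliers(games, threshold=3.0):
    # Keep games where model and book disagree by at least `threshold` points.
    # The threshold is an arbitrary illustrative value, not a recommendation.
    return [
        (g.favorite, spread_edge(g))
        for g in games
        if abs(spread_edge(g)) >= threshold
    ]


games = [
    Game("Kansas", 11.0, -7.0),  # the example from this post
    Game("Duke", 5.0, -4.5),     # made-up sample row
]
outliers = find_outliers(games)
print(outliers)
```

In practice the model margins would come from the scraped FanMatch data and the spreads from the sportsbook, matched on team name, which is likely to need some name normalization.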
Great, thank you. I appreciate the work. This thing will get a lot of use.
Hi j-andrews7, any luck with the fanmatch.php page? thanks!
I haven't forgotten about this, it's just an annoying page to parse. The future/current page and past pages are different, so accounting for all possibilities is taking me longer than I thought. And real life has been working me hard lately.
I totally understand. I'm taking a shot at parsing it now using your code. I'll just start with this season.
Alright, this is more or less done. You can get it on the v0.2.0 branch. I will make a release and get it on pypi once I have time to test it and clean up a few things. Given the extra info on the bottom of the table, this is a bit different from the other functionality in this package, as it's a class.
Still easy to use:
```python
from kenpompy.utils import login
from kenpompy.FanMatch import FanMatch

# Log in with your kenpom.com credentials.
browser = login("email", "pass")
fm = FanMatch(browser, date="2020-01-24")

# The dataframe most of you care about.
fm.fm_df
```
Full docs here, showing the other attributes of the class, which are scraped from the last few lines on the page.
I went ahead and created additional columns that might be useful, like actual margin of victory and breaking up the game results into more columns to make them easier to work with.
If you run into any bugs, please report them here. I haven't run it through the wringer all that intensively yet. Sorry for the delay, this was an annoying one.
This has been fairly well tested with a number of bugs ironed out now. I will likely push a new release to pypi this weekend.
Hey all, this has been pushed to pypi via #3. Updating via `pip install kenpompy` should contain the new `FanMatch` class now. Let me know if there are any issues - I think I've caught most edge cases (with the help of a few folks who had e-mailed me).