Scrape data from ForældreIntra and store it in MySQL. The scraper uses Selenium to avoid two problems: JavaScript not being enabled, and data missing from lazy-loaded pages on ForældreIntra. Modules like httpx and requests cannot retrieve the data properly, even with async.
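To illustrate why a real browser helps with lazy-loaded pages, here is a minimal sketch of the "wait until the content exists" idea. It is not the project's actual code: the function name `get_rendered_html` and its parameters are assumptions made for this example. It only relies on the `get`, `find_elements` and `page_source` members that Selenium WebDriver objects provide.

```python
import time

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR
CSS_SELECTOR = "css selector"


def get_rendered_html(driver, url, ready_selector, timeout=20.0, poll=0.5):
    """Load url and return the page source once ready_selector matches.

    driver is expected to behave like a Selenium WebDriver. The function
    name and parameters are hypothetical, chosen for this sketch.
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # Lazy-loaded content only appears after the page's JavaScript has run,
        # so keep asking the browser until the element is present.
        if driver.find_elements(CSS_SELECTOR, ready_selector):
            return driver.page_source
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{ready_selector!r} did not appear within {timeout} s")
        time.sleep(poll)
```

With a real browser this could be called as `get_rendered_html(webdriver.Firefox(), url, "table")`, where the URL and selector depend on the page being scraped. Selenium's own `WebDriverWait` offers the same kind of explicit wait and is usually the better choice in real code.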
- Download newest Selenium drivers on the fly or use local copies
- Default is to use a local Selenium driver and browser, so you must download these and enter the paths in settings.py
- Alternatively, change the setting selenium.local in settings.py to False
- Choose Firefox or Chrome
- Pass parameters to Selenium
- Get data from one or more children
- Ugeplan (weekly plan)
- Lektiebog (homework book)
- Implemented with MySQL
- Setup script included
- Review and change settings_example.py
- Rename settings_example.py to settings.py
- Run db.setup()
- Run update.py [db]
Optional: [db] is either "test" or "prod".
If no argument is supplied, "test" is used.
Example: python update.py prod
Change database properties in settings.py
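The handling of the optional [db] argument described above can be sketched roughly as follows. This is an illustration only: the `DATABASES` dictionary, its keys and values, and the `select_db` function are hypothetical stand-ins, and the real layout of settings.py may differ.

```python
# Hypothetical stand-in for the database section of settings.py;
# the real setting names and values in the project may differ.
DATABASES = {
    "test": {"host": "localhost", "database": "intra_test"},
    "prod": {"host": "localhost", "database": "intra_prod"},
}


def select_db(argv):
    """Return (name, config) for the optional [db] argument.

    argv has the same shape as sys.argv: script name first, then arguments.
    When no argument is given, "test" is used.
    """
    name = argv[1] if len(argv) > 1 else "test"
    if name not in DATABASES:
        raise SystemExit(f'Unknown db "{name}": expected "test" or "prod"')
    return name, DATABASES[name]
```

A script could call `select_db(sys.argv)` at startup. Defaulting to "test" means a forgotten argument never touches production data.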
- Build a family dashboard with homework and weekly plan data
- Send alerts by email or other communication platforms
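As one possible take on the email alert idea, the sketch below builds a plain-text message from homework entries using only the Python standard library. The function name `build_homework_alert` and the `(subject, text)` shape of the items are assumptions for this example, not the format the scraper actually stores.

```python
from email.message import EmailMessage


def build_homework_alert(child, items, sender, recipient):
    """Build an email listing homework items for one child.

    items is a list of (subject, text) pairs; this shape is hypothetical.
    """
    msg = EmailMessage()
    msg["Subject"] = f"Homework for {child}: {len(items)} item(s)"
    msg["From"] = sender
    msg["To"] = recipient
    # One line per homework entry keeps the message readable on a phone.
    msg.set_content("\n".join(f"- {subject}: {text}" for subject, text in items))
    return msg
```

The resulting message can be sent with `smtplib.SMTP(...).send_message(msg)`, using the host and credentials of your own mail provider.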
Create a .bat file and schedule it to update the data on a regular basis.
Example: cmd /k "cd /d c:\<path to your project>\.venv\Scripts & call .\activate.bat & cd /d c:\<path to the update file> & python update.py <db>"
This is a web scraper. If ForældreIntra changes its HTML at any point, the code will have to be changed to reflect the new structure.
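One way to make such breakage easy to notice is to fail loudly when the expected markup is missing, instead of silently storing nothing. The sketch below shows the idea with the standard-library HTML parser. The table class "lektiebog" and the row layout are invented for this example and are not the real ForældreIntra markup.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the cell text of each row in a table with a given class."""

    def __init__(self, table_class):
        super().__init__()
        self.table_class = table_class
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and self.table_class in classes:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag == "td" and self.rows:
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data


def extract_rows(html, table_class="lektiebog"):
    """Return the table rows, or raise if the expected structure is missing."""
    parser = TableRowParser(table_class)
    parser.feed(html)
    parser.close()
    rows = [[cell.strip() for cell in row] for row in parser.rows if row]
    if not rows:
        # An empty result most likely means the page layout changed.
        raise ValueError(f"No rows found in table.{table_class}; the page layout may have changed")
    return rows
```

Raising an error here means a scheduled run stops with a clear message when the site changes, rather than quietly writing empty data to the database.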