Description:
This tool scrapes and saves Educative.io courses for offline use, enabling you to learn at your own pace even without an internet connection.
Contributions:
I wholeheartedly welcome contributions of any kind that enhance this project.
Thank you for your support!
Disclaimer:
This scraper was developed solely for research purposes. I am not accountable for any inappropriate use and take no responsibility for its misuse.
Repository Version: v3.5.0 (Recommended), with multiple fixes and undetected-chromedriver support added
Master Branch: v3-master
Note:
1. If you have updated to v3.4.2+:
   - Run with the --install arg again, OR perform a manual clean install.
   - Delete the old UserDataDir.
   - Ensure no existing Chrome browser is running in the background.
   - Redownload the Chrome Binary and Chromedriver.
   - undetected-chromedriver may not work on Mac ARM, so uncheck that option to use the default webdriver.
2. To receive a mail notification of the scraper's status, set it up in /src/Main/MailNotify.py (a sketch follows this note).
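For reference, here is a minimal sketch of what such a mail notifier could look like using Python's standard smtplib. The class name, settings, and credentials below are illustrative assumptions, not the repository's actual MailNotify.py implementation.

    # Illustrative sketch only -- the real /src/Main/MailNotify.py may differ.
    import smtplib
    from email.message import EmailMessage

    class MailNotify:
        def __init__(self, smtp_host, smtp_port, sender, password, recipient):
            # All of these values are assumptions; fill in your own SMTP details.
            self.smtp_host = smtp_host
            self.smtp_port = smtp_port
            self.sender = sender
            self.password = password
            self.recipient = recipient

        def send_status(self, subject, body):
            # Build and send a plain-text status mail over an SSL connection.
            msg = EmailMessage()
            msg["From"] = self.sender
            msg["To"] = self.recipient
            msg["Subject"] = subject
            msg.set_content(body)
            with smtplib.SMTP_SSL(self.smtp_host, self.smtp_port) as server:
                server.login(self.sender, self.password)
                server.send_message(msg)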
To view the downloaded courses, you can use the Educative-Viewer repository, which provides better readability and a user-friendly interface for accessing the downloaded course content.
Requirements:
- Git
- Python 3.12 or higher
- OS: Win(x86/x64) - Mac(ARM64/x64) - Linux(ARM64/x64)
Setup:
- git clone https://github.com/anilabhadatta/educative.io_scraper.git
- cd educative.io_scraper
- python setup.py --install
- python setup.py --run

[Commands]
--install: Creates a virtual environment and installs the required dependencies.
--run: Activates the environment and starts the scraper. [Default = True]
--create: Creates a shortcut executable file linked to the scraper directory. If the git repository is moved to a different location after creating the executable, recreate the shortcut to set the new repository path.
(An illustrative sketch of how these flags could be handled follows the manual setup commands below.)
Manual setup (Windows):
- pip install virtualenv
- python -m venv env <or> virtualenv env
- env\Scripts\activate
- pip install -r requirements.txt
- python EducativeScraper.py
Manual setup (Mac/Linux):
- pip3 install virtualenv
- python3 -m venv env <or> virtualenv env
- source env/bin/activate
- pip3 install -r requirements.txt
- python3 EducativeScraper.py
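For a sense of how the --install and --run flags could be wired up, here is a minimal argparse-based sketch (the --create shortcut flag is omitted for brevity). It is an illustrative assumption about the flow described above, not the repository's actual setup.py.

    # Illustrative sketch only -- the real setup.py in this repository may differ.
    import argparse
    import os
    import subprocess
    import sys
    import venv

    def main():
        parser = argparse.ArgumentParser(description="Set up and run the scraper")
        parser.add_argument("--install", action="store_true",
                            help="Create a virtual environment and install dependencies")
        parser.add_argument("--run", action="store_true",
                            help="Activate the environment and start the scraper")
        args = parser.parse_args()

        # The virtual environment's executables live in a platform-specific folder.
        bin_dir = "Scripts" if sys.platform == "win32" else "bin"
        pip = os.path.join("env", bin_dir, "pip")
        python = os.path.join("env", bin_dir, "python")

        if args.install:
            # Create ./env and install the requirements with that environment's pip.
            venv.create("env", with_pip=True)
            subprocess.check_call([pip, "install", "-r", "requirements.txt"])

        if args.run or not args.install:
            # Start the scraper with the environment's own interpreter.
            subprocess.check_call([python, "EducativeScraper.py"])

    if __name__ == "__main__":
        main()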
Steps to use the scraper:
- Create a text file.
- Copy the URL of the first topic/lesson from any number of courses.
- Paste all the URLs into the text file and save it.
- Select a configuration if you prefer not to use the default configuration.
- If you prefer not to display the browser window, choose the headless option.
- Provide a unique User Data Directory name that the browser will use to store your current session. Ensure that each instance of the scraper has a distinct User Data Directory name.
- Select the file path of the text file containing the course URLs, as well as the directory where you would like to save the downloaded content.
- You can save/export the current configuration for later use, or opt for the default configuration.
- For the initial setup or updates, click on Download Chromedriver and Download Chrome Binary to automatically download them into the project directory.
- If you intend to use proxies, enable the proxy option and enter the proxy in the proxies box (see the browser-options sketch after these steps).
  - For an IP-authorized proxy, you can directly enter the IP:PORT of the proxy.
  - For a USER:PASS-authorized proxy, you'll need to create a localhost tunnel using the Proxy-Login-Automator repository.
  - After setting up the tunnel, enter the IP:PORT of the localhost proxy that you configured in Proxy-Login-Automator.
- Click on Login Account to log in to your Educative.io account, then click the Close Browser button to close the browser once the login is completed.
- Click on Start Scraper to begin scraping the courses.
- The scraper will automatically stop after scraping all the URLs in the selected text file.
- If you stop the scraper with the Stop Scraper button before it finishes, or if you face any errors, the most recent URL will be saved in the EducativeScraper.log file. Copy the URL from the INFO logger entry and replace the already-completed topic/lesson URL in your text file with it; this lets you resume the scraper from where you left off (see the log-parsing sketch after these steps).
- An index is NOT required in the URLs text file; simply paste the URL of the topic from which you want to start/resume scraping.
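Browser-options sketch: the headless, User Data Directory, and proxy settings above correspond to standard Chrome command-line switches. The snippet below shows how they could be passed to undetected-chromedriver; treat it as an assumption about the wiring, not the scraper's actual code, and substitute your own paths and proxy address.

    # Illustrative sketch only -- headless mode, a dedicated User Data Directory,
    # and an IP:PORT proxy passed to undetected-chromedriver.
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    options.add_argument("--headless=new")                        # run without a visible window
    options.add_argument("--user-data-dir=/path/to/UserDataDir")  # unique per scraper instance
    options.add_argument("--proxy-server=IP:PORT")                # IP-authorized or localhost tunnel proxy

    driver = uc.Chrome(options=options)
    driver.get("https://www.educative.io")
    driver.quit()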
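Log-parsing sketch: since the most recent URL is written to EducativeScraper.log by the INFO logger, a small script can pull out the last logged URL for you to paste back into the URLs text file. The log format assumed below is a guess; adjust the pattern to match your actual log lines.

    # Illustrative sketch only -- prints the last URL found on an INFO line
    # of EducativeScraper.log; the exact log format is an assumption.
    import re

    last_url = None
    with open("EducativeScraper.log", encoding="utf-8") as log:
        for line in log:
            if "INFO" in line:
                match = re.search(r"https://\S+", line)
                if match:
                    last_url = match.group(0)

    print(last_url or "No URL found in the log.")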