WordPress Article Extractor is a Python script that extracts articles from the MySQL database of a WordPress installation. It saves each article as plain text in its own file, which is useful for content analysis, data migration, or archiving blog posts.
- Connects to a MySQL database to retrieve articles stored in the `wp_posts` table.
- Extracts articles with the post type 'post' and status 'publish'.
- Converts HTML content of articles to plain text using BeautifulSoup.
- Saves each article as a separate text file named with the format `article_{post_id}.txt`.
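
The features above can be sketched roughly as follows. This is an illustrative outline, not the script's actual code: the query assumes the standard WordPress `wp_posts` columns (`ID`, `post_content`, `post_type`, `post_status`), and the names `PUBLISHED_POSTS_QUERY`, `html_to_text`, `save_article`, and `export_rows` are hypothetical.

```python
from pathlib import Path

# Assumed query: column names follow the standard WordPress wp_posts schema.
PUBLISHED_POSTS_QUERY = (
    "SELECT ID, post_content FROM wp_posts "
    "WHERE post_type = 'post' AND post_status = 'publish'"
)


def html_to_text(html):
    """Strip HTML tags and return the remaining text."""
    # Imported here so the file helpers below load even if bs4 is not installed.
    from bs4 import BeautifulSoup

    return BeautifulSoup(html, "html.parser").get_text()


def save_article(post_id, text, out_dir="."):
    """Write one article to article_{post_id}.txt and return its path."""
    path = Path(out_dir) / f"article_{post_id}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def export_rows(rows, out_dir=".", convert=html_to_text):
    """Convert and save every (post_id, post_content) row."""
    return [
        save_article(post_id, convert(content), out_dir)
        for post_id, content in rows
    ]
```

Here `rows` would be whatever a database cursor returns after executing `PUBLISHED_POSTS_QUERY`; keeping the conversion and file writing separate from the database access makes each step easy to try on its own.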
- Install Python 3.x from python.org.
- Install MySQL Connector/Python and BeautifulSoup4 using pip:
pip install mysql-connector-python beautifulsoup4
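
To confirm that both packages installed correctly, a small check like the one below may help. It is only a sketch: the helper name `missing_packages` and the mapping `REQUIRED_MODULES` are made up for illustration, and the import names (`mysql.connector`, `bs4`) are the ones these two pip packages provide.

```python
import importlib.util

# Maps each import name to the pip package that provides it.
REQUIRED_MODULES = {
    "mysql.connector": "mysql-connector-python",
    "bs4": "beautifulsoup4",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose modules cannot be found."""
    missing = []
    for module_name, pip_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All dependencies are importable.")
```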
- Run the script: `python SQL_POST_EXT.py`
- Enter your MySQL User when prompted.
- Enter your MySQL Password when prompted.
- Enter your MySQL Database_Name when prompted.
- The script will extract articles from the database and save each article as a separate text file in the current directory.
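
For readers curious how the prompts might map onto a database connection, here is a minimal sketch. It is not taken from `SQL_POST_EXT.py`: the function names, the prompt wording, and the `localhost` host are assumptions, while `mysql.connector.connect` and its `host`, `user`, `password`, and `database` arguments come from MySQL Connector/Python.

```python
import getpass


def prompt_credentials(ask=input, ask_secret=getpass.getpass):
    """Ask for the three values the script prompts for."""
    return {
        "user": ask("MySQL User: "),
        # getpass hides the password while it is typed.
        "password": ask_secret("MySQL Password: "),
        "database": ask("MySQL Database_Name: "),
    }


def connect(credentials, host="localhost"):
    """Open a connection; the host is assumed to be the local machine."""
    # Imported here so prompt_credentials works without the connector installed.
    import mysql.connector

    return mysql.connector.connect(host=host, **credentials)
```

A typical call would be `connection = connect(prompt_credentials())`, followed by `connection.close()` once the articles have been written.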
This project is licensed under the MIT License. See the LICENSE file for details.