WordPress Article Extractor

Description

WordPress Article Extractor is a Python script designed to extract articles from a MySQL database, particularly from a WordPress installation. It saves each article in plain text format to individual text files, which can be useful for tasks such as content analysis, data migration, or archiving blog posts.

Features

  • Connects to a MySQL database to retrieve articles stored in the wp_posts table.
  • Extracts articles with the post type 'post' and status 'publish'.
  • Converts HTML content of articles to plain text using BeautifulSoup.
  • Saves each article as a separate text file named with the format article_{post_id}.txt.
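The features listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the function names (`fetch_published_posts`, `html_to_text`, `save_article`) are hypothetical, the column names follow the standard WordPress `wp_posts` schema, and the default `wp_` table prefix is assumed.

```python
from pathlib import Path

# Assumes the default "wp_" table prefix; adjust if your install uses another.
QUERY = (
    "SELECT ID, post_content FROM wp_posts "
    "WHERE post_type = 'post' AND post_status = 'publish'"
)


def fetch_published_posts(connection):
    """Return (post_id, html) rows from any DB-API style connection."""
    cursor = connection.cursor()
    try:
        cursor.execute(QUERY)
        return cursor.fetchall()
    finally:
        cursor.close()


def html_to_text(html):
    """Strip markup from post content with BeautifulSoup."""
    # Imported here so the other helpers work without bs4 installed.
    from bs4 import BeautifulSoup

    return BeautifulSoup(html, "html.parser").get_text()


def save_article(post_id, text, out_dir="."):
    """Write one article to article_{post_id}.txt and return its path."""
    path = Path(out_dir) / f"article_{post_id}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

With an open connection `conn`, the pieces would combine as `for post_id, html in fetch_published_posts(conn): save_article(post_id, html_to_text(html))`.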

Requirements

  • Python 3.x
  • MySQL Connector/Python (mysql-connector-python)
  • BeautifulSoup4 (beautifulsoup4)

Installation

  1. Install Python 3.x from python.org.
  2. Install MySQL Connector/Python and BeautifulSoup4 using pip:
    pip install mysql-connector-python beautifulsoup4

Usage

  1. Run the script:

    python SQL_POST_EXT.py
  2. Enter your MySQL User when prompted.

  3. Enter your MySQL Password when prompted.

  4. Enter your MySQL Database_Name when prompted.

  5. The script will extract articles from the database and save each article as a separate text file in the current directory.
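The prompts in the steps above could be implemented along these lines. This is only a sketch: the prompt strings, the function names `prompt_credentials` and `connect`, the use of `getpass` to hide the password, and the `localhost` default host are all assumptions, since the script does not document them.

```python
import getpass


def prompt_credentials(ask=input, ask_secret=getpass.getpass):
    """Collect the three values the script asks for, in the same order."""
    user = ask("MySQL User: ")
    password = ask_secret("MySQL Password: ")  # not echoed to the terminal
    database = ask("MySQL Database_Name: ")
    return {"user": user, "password": password, "database": database}


def connect(credentials, host="localhost"):
    """Open a connection; the host is assumed, as no prompt asks for it."""
    import mysql.connector  # third-party package: mysql-connector-python

    return mysql.connector.connect(host=host, **credentials)
```

Passing the prompt functions as parameters keeps the input step easy to exercise without a terminal or a live database.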

License

This project is licensed under the MIT License. See the LICENSE file for details.