This tool converts the data exported from Over-Blog into a more readable and processable format.
It includes a converter for HTMLy, but it can also serve as a base for importing the data into any other platform.
- Export each post or page to a separate HTML file
- Convert each post or page to a separate Markdown file
- Normalize post and page URLs
- Retrieve all images included in posts and pages
This tool was developed only for a one-off personal need. It is unlikely to be fixed, improved, or otherwise maintained.
Feel free to fork it and adjust it to your needs!
- PHP 7+ with:
  - DOM
  - JSON
  - SimpleXML
- Composer
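The requirements above can be checked before going further. The snippet below is only a sketch: the `missing_extensions` helper is hypothetical and not part of this tool, and it assumes `php` and `composer` are expected on the PATH. It relies on `php -m`, which lists the loaded PHP modules.

```shell
#!/bin/sh
# Hypothetical helper (not part of this tool): prints each required PHP
# extension that is absent from a module list read on standard input.
missing_extensions() {
    # Lowercase the list so that "SimpleXML" matches "simplexml".
    modules=$(tr '[:upper:]' '[:lower:]')
    for ext in dom json simplexml; do
        printf '%s\n' "$modules" | grep -qx "$ext" || printf '%s\n' "$ext"
    done
}

# Report on the local setup only when php is actually available.
if command -v php >/dev/null 2>&1; then
    php -v | head -n 1
    missing=$(php -m | missing_extensions)
    [ -z "$missing" ] || echo "Missing PHP extensions: $missing"
else
    echo "php not found on PATH"
fi
command -v composer >/dev/null 2>&1 || echo "composer not found on PATH"
```

If nothing is reported as missing, the setup should be ready for the steps that follow. The PHP version printed on the first line still has to be compared against the 7+ requirement by eye.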
Extract the XML file from the Over-Blog archive, save it as export.xml, and place it in this folder.
Install the dependencies:
composer install
Then you may want to:
- Run the conversion:
    make run-convert
  It will create or replace an export/ folder with posts and pages subfolders, containing all the converted content (HTML + Markdown). Additionally, it will create export.json and export.clean.json files with preprocessed data in them.
- Retrieve all images (you need to have export.clean.json first):
    make run-images
  It will download any image found in the content of your posts or pages and place it in an export/images/ directory. Additionally, it will create an export/images.json file with preprocessed data in it.
- Convert the previous content to HTMLy:
    make run-tohtmly
  It will create and populate a to-htmly/ folder with all the content converted for HTMLy.
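The steps above depend on each other (the image step needs export.clean.json, and the HTMLy step needs the converted content), so it can be convenient to chain them. The following is a minimal sketch, not something shipped with this tool: `run_pipeline` is a hypothetical wrapper name, and it assumes it is called from this folder with export.xml already in place.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of this tool): runs the documented steps
# in order and stops at the first one that fails.
run_pipeline() {
    # The converter expects the extracted Over-Blog XML in this folder.
    if [ ! -f export.xml ]; then
        echo "export.xml not found in $(pwd)" >&2
        return 1
    fi
    composer install || return 1
    # Writes export/ plus export.json and export.clean.json.
    make run-convert || return 1
    # Needs export.clean.json; fills export/images/.
    make run-images || return 1
    # Writes to-htmly/.
    make run-tohtmly || return 1
}
```

Once the function is defined in a script or in the current shell, calling `run_pipeline` runs the whole sequence. Stopping at the first failure avoids downloading images or generating HTMLy content from an incomplete conversion.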
See LICENSE.