I recently converted a WordPress site to a Hugo site. One of the major challenges was converting all of the blog posts to Markdown. One option, of course, is to copy the text from the site, paste it into a Markdown file, and redo the formatting by hand. That is time-consuming and tedious, so I went looking for a smarter option.

Looking around the web, I found the following project: wordpress-export-to-markdown. The tool has two prerequisites. The first is a WordPress export file; instructions for exporting it from your WordPress installation can be found in this WordPress support article. The second is Node.js. Unfortunately I did not have it set up on my computer, but I did have Docker. Here I will walk through how I ran the tool using Docker.

I always start by pulling the image to my local machine.

docker pull node
T:\WP2MD>docker pull node
Using default tag: latest
latest: Pulling from library/node
99760bc62448: Pull complete
e3fa264a7a88: Pull complete
a222a2af289f: Pull complete
c1f89293f045: Pull complete
115b6fc5ace1: Pull complete
9eb516295c24: Pull complete
d23358c1492a: Pull complete
08d6736f797a: Pull complete
3dcecd6cc67a: Pull complete
Digest: sha256:101d1d7ba7562fcb36b23eeff46607107802f1a439a571d86cf490cf9fe2150e
Status: Downloaded newer image for node:latest
docker.io/library/node:latest

T:\WP2MD>

The next step is to run the Docker container. For this project I use the following parameters:

  • Name: node
  • Volume: T:\WP2MD:/opt, where T:\WP2MD is my project directory on the local machine and /opt is the project directory in the Docker container
  • Automatic cleanup (--rm)
  • Interactive mode (-it)

docker run --name node -v T:\WP2MD:/opt --rm -it node bash

Then navigate to the /opt directory and run npx wordpress-export-to-markdown:

cd /opt
npx wordpress-export-to-markdown

This will:

  1. Install the requirements to run the tool
  2. Start the wizard to configure the tool and export
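
The container start and the npx call can probably be collapsed into a single command. The sketch below is a variation, not what I ran: it assumes Docker's -w flag to set the working directory to /opt (so no cd is needed) and drops --name, since the container is removed on exit anyway. The helper name wp2md_command is made up for illustration, and it only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper (not part of wordpress-export-to-markdown):
# prints a one-shot docker command for a project directory instead of running it.
wp2md_command() {
  project_dir="$1"      # host directory that holds export.xml
  image="${2:-node}"    # image to use; defaults to the one pulled earlier
  # -w /opt makes /opt the working directory inside the container
  printf 'docker run --rm -it -v "%s:/opt" -w /opt %s npx wordpress-export-to-markdown\n' \
    "$project_dir" "$image"
}
```

Calling wp2md_command 'T:\WP2MD' prints the full docker run line for my project directory. A second argument such as node:lts selects a different image tag.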

Before running the wizard, make sure the export file from WordPress (export.xml in my case) has been placed in the project directory.
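
A quick check before starting the wizard can catch a missing or wrong file. This is a minimal sketch under two assumptions: the file is called export.xml unless another path is given, and a WordPress export is RSS-based XML, so it should contain an <rss element. The function name check_export is illustrative only.

```shell
# Hypothetical pre-flight check; not part of the tool itself.
check_export() {
  file="${1:-export.xml}"
  # -s is true only if the file exists and is not empty
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # Assumption: a WordPress export contains an <rss ...> root element
  if ! grep -q '<rss' "$file"; then
    echo "does not look like a WordPress export: $file" >&2
    return 1
  fi
  echo "ok: $file"
}
```

Running check_export from /opt inside the container (or from the project directory on the host) should print "ok: export.xml" when the file is in place.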

The tool will now output all posts in Markdown format and, if that option is selected, also save any images referenced in the posts.

root@1f4bdeaf93fd:/# cd /opt
root@1f4bdeaf93fd:/opt# npx wordpress-export-to-markdown
npx: installed 138 in 8.9s

Starting wizard...
? Path to WordPress export file? export.xml
? Path to output folder? output
? Create year folders? Yes
? Create month folders? No
? Create a folder for each post? No
? Prefix post folders/files with date? Yes
? Save images attached to posts? Yes
? Save images scraped from post body content? Yes

Parsing...
20 posts found.
70 attached images found.
16 images scraped from post body content.

Saving posts...
[OK] valkommen-till-var-blogg
[OK] gott-nytt-jubileumsar
[OK] yeah-ny-hemsida
Done, got them all!

Downloading and saving images...
[OK] Gamla-sidan.png
[OK] dsc_4924.jpg
[OK] dsc_4795.jpg
[OK] dsc_4924-678x1024.jpg
[OK] IMG_20180912_105425_955-300x288.jpg
[FAILED] Namnlös.png (StatusCodeError: 404)
[OK] Namnl%C3%B6s.png
Done, but with 1 failed.

All done!
Look for your output files in: /opt/output
root@1f4bdeaf93fd:/opt#
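
To sanity-check the result, the number of generated Markdown files can be compared with the "20 posts found" line in the log. The sketch below assumes the tool writes one .md file per post somewhere under the output folder; count_markdown is a made-up helper name.

```shell
# Hypothetical helper: counts .md files below a folder (default: output).
count_markdown() {
  dir="${1:-output}"
  # tr strips the padding some wc implementations add
  find "$dir" -type f -name '*.md' | wc -l | tr -d ' '
}
```

Running count_markdown output from /opt prints the number of Markdown files, which for the run above would be expected to match the post count if every post was saved.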