I was recently converting a WordPress site to a Hugo site. One of the major challenges was converting all of the blog posts to Markdown. One option, of course, is to copy the text from the site, paste it into a Markdown file, and redo the formatting by hand. That is time-consuming and tedious, so I was looking for a smarter option.
Looking around the web, I found the following project: wordpress-export-to-markdown. There are two prerequisites for using this tool. The first is a WordPress export file; instructions for extracting it from your WordPress installation can be found in this WordPress support article. The second is Node.js. Unfortunately I did not have it set up on my computer, but I did have Docker set up. Here I will walk you through how I used the tool with Docker.
I always start by pulling the image to my local machine.
docker pull node
T:\WP2MD>docker pull node
Using default tag: latest
latest: Pulling from library/node
99760bc62448: Pull complete
e3fa264a7a88: Pull complete
a222a2af289f: Pull complete
c1f89293f045: Pull complete
115b6fc5ace1: Pull complete
9eb516295c24: Pull complete
d23358c1492a: Pull complete
08d6736f797a: Pull complete
3dcecd6cc67a: Pull complete
Digest: sha256:101d1d7ba7562fcb36b23eeff46607107802f1a439a571d86cf490cf9fe2150e
Status: Downloaded newer image for node:latest
docker.io/library/node:latest
T:\WP2MD>
The next step is to run the Docker container. In this project I will use the following parameters:
- -v T:\WP2MD:/opt: volume mount, where T:\WP2MD is my project directory on the local machine and /opt is my project directory in the Docker container
- --rm: automatic cleanup of the container when it exits
- -it: interactive mode
docker run --name node -v T:\WP2MD:/opt --rm -it node bash
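The command above is written for a Windows command prompt. As a rough sketch of the same invocation for a POSIX shell (for example on Linux, macOS or WSL), the helper below mounts an arbitrary project directory at /opt. The function name run_node_container and the DRY_RUN variable are my own illustrative names, not part of Docker; the flags are exactly the ones used above.

```shell
# Sketch only: wraps the docker run command shown above.
# run_node_container and DRY_RUN are hypothetical names for illustration.
run_node_container() {
  # Host directory to mount at /opt (defaults to the current directory)
  project_dir="${1:-$PWD}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it, to check the mount path first
    echo "docker run --name node -v ${project_dir}:/opt --rm -it node bash"
  else
    # --rm removes the container on exit, -it gives an interactive terminal
    docker run --name node -v "${project_dir}:/opt" --rm -it node bash
  fi
}

# Usage (dry run): DRY_RUN=1 run_node_container "$HOME/wp2md"
```

Running it with DRY_RUN=1 first is a cheap way to confirm that the host path ends up on the left-hand side of the -v mapping before a container is actually started.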
Then navigate to the /opt directory and run the following commands:
cd /opt
npx wordpress-export-to-markdown
The npx command will:
- Install the requirements to run the tool
- Start the wizard to configure the tool and export
Before running the wizard, ensure you have placed the export.xml file exported from WordPress into the project directory.
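If you want to double-check the export file before starting the wizard, a small pre-flight check along these lines can help. This is a minimal sketch: check_export is a hypothetical helper name, and it assumes the export is a WordPress WXR file, which normally contains a <wp:wxr_version> element.

```shell
# Sketch only: check_export is a hypothetical helper, not part of the tool.
check_export() {
  # Path to the export file (defaults to export.xml in the current directory)
  export_file="${1:-export.xml}"

  # -s is true only if the file exists and is not empty
  if [ ! -s "$export_file" ]; then
    echo "missing or empty: $export_file" >&2
    return 1
  fi

  # Assumption: WordPress export (WXR) files declare a <wp:wxr_version> element
  if ! grep -q '<wp:wxr_version>' "$export_file"; then
    echo "no <wp:wxr_version> found in $export_file; is this a WordPress export?" >&2
    return 2
  fi

  echo "ok: $export_file"
}

# Usage inside the container: cd /opt && check_export export.xml
```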
The tool will now output all posts in Markdown format and also include any images referenced in the posts (if that option is selected).
root@1f4bdeaf93fd:/# cd /opt
root@1f4bdeaf93fd:/opt# npx wordpress-export-to-markdown
npx: installed 138 in 8.9s
Starting wizard...
? Path to WordPress export file? export.xml
? Path to output folder? output
? Create year folders? Yes
? Create month folders? No
? Create a folder for each post? No
? Prefix post folders/files with date? Yes
? Save images attached to posts? Yes
? Save images scraped from post body content? Yes
Parsing...
20 posts found.
70 attached images found.
16 images scraped from post body content.
Saving posts...
[OK] valkommen-till-var-blogg
[OK] gott-nytt-jubileumsar
[OK] yeah-ny-hemsida
Done, got them all!
Downloading and saving images...
[OK] Gamla-sidan.png
[OK] dsc_4924.jpg
[OK] dsc_4795.jpg
[OK] dsc_4924-678x1024.jpg
[OK] IMG_20180912_105425_955-300x288.jpg
[FAILED] Namnlös.png (StatusCodeError: 404)
[OK] Namnl%C3%B6s.png
Done, but with 1 failed.
All done! Look for your output files in: /opt/output
root@1f4bdeaf93fd:/opt#
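With many posts and images, a single [FAILED] line like the one above is easy to miss. If you paste the console output into a text file, a short helper can count the results and list the failures. This is only a sketch: summarize_log and the file name convert.log are hypothetical, and it assumes the [OK] and [FAILED] prefixes look exactly as they do in the output shown here.

```shell
# Sketch only: summarize_log is a hypothetical helper for a saved copy of the console output.
summarize_log() {
  log_file="$1"

  # Count lines starting with [OK] or [FAILED]; grep -c exits non-zero on
  # zero matches but still prints 0, so "|| true" keeps the script going
  ok_count="$(grep -c '^\[OK]' "$log_file" || true)"
  failed_count="$(grep -c '^\[FAILED]' "$log_file" || true)"
  echo "ok=$ok_count failed=$failed_count"

  # List the failed items so they can be downloaded by hand
  grep '^\[FAILED]' "$log_file" || true
}

# Usage: summarize_log convert.log
```

In the run above, the item that failed with a 404 was the file name containing an unencoded "ö", while the percent-encoded variant of the same name was saved successfully, so in this case no image appears to be missing from the output.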