HTML vs Markdown — When to Convert, Migration Strategies, and the Complete Syntax Mapping
You're staring at 200 WordPress posts exported as HTML, and your shiny new Hugo site wants Markdown files. Manually rewriting each one would take weeks. But automated conversion can do it in minutes — if you know the pitfalls to watch for.
Why the World Is Moving From HTML to Markdown
Something interesting happened over the past decade. Developers, writers, and content creators gradually shifted from HTML to Markdown for content creation. GitHub adopted it for README files and documentation. Static site generators like Hugo, Jekyll, and Gatsby made it the default content format. Note-taking apps like Obsidian and Notion built their entire editing experience around it.
The reason is simple: Markdown is human-readable even in its raw form. When you see # Hello World, you immediately know it's a heading. When you see **bold**, the meaning is obvious. Compare that to <h1>Hello World</h1> and <strong>bold</strong> — the HTML tags obscure the content instead of enhancing readability.
This shift means that content originally created in HTML — blog posts, documentation, wiki pages, email templates — increasingly needs to be converted to Markdown. Whether you're migrating platforms, consolidating content into a knowledge base, or just cleaning up legacy documentation, HTML-to-Markdown conversion is now a routine workflow step.
The Complete HTML to Markdown Mapping
Here's every common HTML element and its Markdown equivalent. Bookmark this table — you'll reference it more than you'd expect.
| HTML | Markdown | Notes |
|---|---|---|
| <h1>...</h1> | # ... | Space after # required |
| <h2>...</h2> | ## ... | Up to h6 (######) |
| <p>...</p> | ...\n\n | Blank line = new paragraph |
| <strong>...</strong> | **...** | Also <b> tag |
| <em>...</em> | *...* | Also <i> tag |
| <del>...</del> | ~~...~~ | Strikethrough |
| <a href="url">text</a> | [text](url) | Title attr optional |
| <img src="url" alt="x"> | ![x](url) | ! prefix for images |
| <ul><li>...</li></ul> | - ... | Also * or + prefix |
| <ol><li>...</li></ol> | 1. ... | Numbers auto-increment |
| <blockquote>...</blockquote> | > ... | Each line needs > |
| <code>...</code> | `...` | Inline code |
| <pre><code>...</code></pre> | ```\n...\n``` | Fenced code block |
| <hr> | --- | Also *** or ___ |
| <br> | Two trailing spaces, or a backslash, at the end of the line | Hard line break within a paragraph |
| <table>...</table> | \| col \| col \|\n\|---\|---\| | GFM extension |
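To make the mapping concrete, here is a minimal sketch of a converter built on Python's standard-library HTMLParser. It is an illustration under stated assumptions, not how any particular converter works: the names MiniConverter and html_to_markdown are made up for this example, and only headings, paragraphs, inline emphasis, links, and flat unordered lists are handled.

```python
from html.parser import HTMLParser

# Inline tags and the Markdown delimiter that wraps their content.
INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "del": "~~", "code": "`"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class MiniConverter(HTMLParser):
    """Hypothetical helper: covers a small subset of the mapping table."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.href = ""

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag in HEADINGS:
            # <h2> becomes "## ", <h3> becomes "### ", and so on.
            self.out.append("#" * int(tag[1]) + " ")
        elif tag == "li":
            # Every list item is emitted as an unordered bullet in this sketch.
            self.out.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag == "a":
            self.out.append(f"]({self.href})")
        elif tag in HEADINGS or tag == "p":
            self.out.append("\n\n")  # blank line ends the block
        elif tag in ("li", "ul"):
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html):
    parser = MiniConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

Calling html_to_markdown("<h1>Hello World</h1>") returns "# Hello World". Real content needs far more cases (nesting, images, code blocks, whitespace handling), which is why a dedicated converter is the practical choice.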
What Doesn't Convert: The HTML Elements Markdown Can't Replace
Here's what most conversion guides don't tell you: not every HTML element has a Markdown equivalent. Understanding these gaps prevents frustration during migration.
No Markdown equivalent:
- <div>, <span>, <section>: Structural containers have no meaning in Markdown. Their text content is preserved but the container is removed.
- <iframe>: Embeds (YouTube, maps, forms) have no Markdown syntax. You can leave them as raw HTML inside Markdown files; most processors support inline HTML.
- <form>, <input>, <button>: Interactive elements don't exist in Markdown. These need to be handled separately, often with JavaScript or platform-specific shortcodes.
- <style>, <script>: CSS and JavaScript are stripped during conversion. If your content relies on inline styles for visual effects, those effects are lost.
- CSS classes and IDs: Markdown has no native concept of styling. Some extended Markdown flavors (like Kramdown) support adding attributes, but this isn't standard.
The good news: Markdown files support inline HTML. So when you encounter an element that can't be converted, you can leave the raw HTML in the Markdown file and most processors will render it correctly. This is the pragmatic solution used by most migration workflows.
Migration Playbook: WordPress to Static Site Generator
This is the most common migration scenario, and here's exactly how to do it step by step.
Step 1: Export your WordPress content. Use the built-in WordPress export (Tools > Export) which generates an XML file. Alternatively, use a plugin like "WordPress to Hugo Exporter" or "Jekyll Exporter" which directly outputs Markdown files with front matter.
Step 2: Extract HTML content. If you're working with the XML export, each post's content is stored in the <content:encoded> element as HTML. You need to extract this HTML for each post.
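As a rough sketch of this extraction step, the following uses Python's standard-library ElementTree to pull the title and <content:encoded> body out of each <item>. The function name extract_posts and the SAMPLE document are invented for illustration; the namespace URI is the one WordPress export files bind to the content prefix.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "content" prefix in WordPress export files.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

# Tiny stand-in for a real export file (illustrative only).
SAMPLE = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel><item><title>Hello</title>
<content:encoded><![CDATA[<p>First post</p>]]></content:encoded>
</item></channel></rss>"""


def extract_posts(wxr_xml):
    """Return (title, html) pairs for every <item> in an export string."""
    root = ET.fromstring(wxr_xml)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        html = item.findtext("content:encoded", default="", namespaces=NS)
        posts.append((title, html))
    return posts
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring. Note that a real export also contains pages, attachments, and drafts as <item> elements, so you will likely want to filter on post type and status before converting.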
Step 3: Convert HTML to Markdown. Use our HTML to Markdown converter to transform each post's HTML content. For bulk conversion, scripting tools like Pandoc (pandoc -f html -t markdown) can process files in batch.
Step 4: Add front matter. Static site generators require YAML or TOML front matter at the top of each Markdown file with metadata like title, date, categories, and tags. Create this from the WordPress export data.
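One way to generate YAML front matter from the exported metadata is sketched below. The function name front_matter and the chosen fields are assumptions for this example; check which keys your generator and theme actually read. Strings are quoted with json.dumps because a JSON string is also a valid YAML double-quoted scalar, which keeps titles containing colons or quotes safe.

```python
import json


def front_matter(title, date, categories=(), tags=()):
    """Build a YAML front matter block as a string (hypothetical helper)."""
    lines = [
        "---",
        f"title: {json.dumps(title)}",  # JSON quoting is valid YAML
        f"date: {date}",
        f"categories: {json.dumps(list(categories))}",
        f"tags: {json.dumps(list(tags))}",
        "---",
        "",
    ]
    return "\n".join(lines)
```

Prepend the returned block to the converted Markdown body before writing each file.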
Step 5: Handle images. WordPress stores images in /wp-content/uploads/. You need to download all images and update the paths in your Markdown files to match your new site's image directory structure.
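The path update can be a single regular-expression substitution. This sketch assumes your new site serves images from /images/ (a made-up target; adjust it to your layout) and that the year/month subfolders are kept as they were when you copy the files.

```python
import re

# Matches the WordPress uploads prefix, with or without a scheme and host.
UPLOADS = re.compile(r"(?:https?://[^/\s)]+)?/wp-content/uploads/")


def rewrite_image_paths(markdown, new_prefix="/images/"):
    """Point every uploads URL at the new image directory."""
    return UPLOADS.sub(new_prefix, markdown)
```

For example, rewrite_image_paths("![cat](https://example.com/wp-content/uploads/2020/01/cat.png)") returns "![cat](/images/2020/01/cat.png)". The pattern rewrites any host in front of the uploads path, so restrict it to your own domain if posts hotlink images from other WordPress sites.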
Step 6: Review and fix. Automated conversion gets you 90% of the way. The remaining 10% — shortcodes, custom HTML blocks, embedded content, WordPress-specific formatting — needs manual review and fixes.
Pro tip: Don't try to convert everything at once. Start with your 10 most important posts. Convert them, review the output, identify recurring issues, and build fixes into your workflow before processing the remaining content.
Real Migration Stories
🇮🇳 A Tech Blog in Bengaluru — 300 Posts From WordPress to Hugo
A developer ran a tech blog on WordPress for 5 years. Performance was degrading, hosting costs were climbing, and he wanted the speed and simplicity of a static site.
He used the WordPress Hugo Exporter plugin for the initial export, which handled front matter automatically. For 40 posts with complex HTML (tables, code blocks with syntax highlighting, nested lists), he used the converter tool to fine-tune the output. Image paths were batch-updated using a find-and-replace script.
Total migration time: 3 days. Site went from 3.2s load time on WordPress to 0.4s on Hugo hosted on Netlify. Hosting cost dropped from ₹3,000/month to ₹0 (Netlify free tier).
🇮🇳 An Education Platform in Pune — Documentation to Notion
An EdTech startup had 200 pages of teacher training documentation in HTML files. They wanted to consolidate everything in Notion for collaborative editing.
Each HTML file was converted to Markdown, then imported into Notion using the Markdown import feature. Tables, headers, and lists all transferred cleanly. Embedded videos (iframes) were manually re-added as Notion embed blocks.
Total migration time: 1.5 days for 200 pages. Teachers could now collaboratively edit documentation without learning HTML.
🇺🇸 A Documentation Team in Seattle — Confluence to GitHub Wiki
A software company's documentation was trapped in Confluence. They wanted it in their GitHub repository as Markdown files, version-controlled alongside code.
Confluence's export produced HTML. The team batch-converted 150 pages using Pandoc, then manually fixed table formatting and diagram references. Internal links between pages required a script to update from Confluence URLs to relative Markdown file paths.
Total migration time: 1 week for 150 pages. Documentation was now version-controlled, searchable on GitHub, and editable by any developer without a Confluence license.
Markdown Flavors: Why Your Output Might Look Different
Here's something that trips up a lot of people: there isn't one Markdown. There are several "flavors" with slightly different features.
CommonMark is the standardization effort — a strict specification that defines exactly how Markdown should be parsed. It's what most modern processors target.
GitHub Flavored Markdown (GFM) extends CommonMark with tables, task lists (- [x] Done), strikethrough (~~text~~), and autolinked URLs. If you're targeting GitHub, GitLab, or any platform that says "GFM," you get these extras.
Kramdown is used by Jekyll (and therefore GitHub Pages). It adds attribute lists, footnotes, and definition lists beyond standard Markdown.
MultiMarkdown adds footnotes, tables, citations, and cross-references. It's popular in academic and long-form writing circles.
Our converter outputs CommonMark-compatible Markdown with GFM table syntax. This is the safest choice — it works with virtually every Markdown processor you'll encounter. If you need flavor-specific features (like footnotes or task lists), you may need to add those manually after conversion.
Common Conversion Pitfalls and How to Avoid Them
Nested lists break formatting. HTML expresses nesting explicitly with <ul><li><ul><li> structure. In Markdown, nesting depends on indentation: under CommonMark a child item must be indented at least as far as the parent item's text (2 spaces under "- ", 3 under "1. "), and many tools emit 4 spaces. If a child item falls short of that, it is parsed as a sibling instead of a child. Always review nested lists manually after conversion.
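To show what consistent indentation looks like, here is a small sketch that renders a nested Python structure as a Markdown list. The function name render_list and the input shape (strings, or (text, children) tuples) are assumptions made for this example.

```python
def render_list(items, depth=0, indent=2):
    """Render nested items as Markdown bullet lines.

    Each item is either a string or a (text, children) tuple.
    """
    lines = []
    for item in items:
        text, children = item if isinstance(item, tuple) else (item, [])
        # Every level adds the same number of spaces before the bullet.
        lines.append(" " * (indent * depth) + "- " + text)
        lines.extend(render_list(children, depth + 1, indent))
    return lines
```

Joining the result of render_list(["a", ("b", ["b1"])]) with newlines gives "- a", "- b", and an indented "  - b1" on three lines. Comparing converter output against a known-good rendering like this is a quick way to spot lists that lost a level.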
Inline styles are lost silently. If your HTML uses style="color: red" or style="text-align: center", the conversion strips these without warning. The text content is preserved but the visual formatting disappears. If styling is important, you'll need to handle it through your Markdown processor's CSS or theme.
HTML entities need decoding. Entities like &amp;amp;, &amp;lt;, &amp;gt;, and &amp;nbsp; should be converted back to their actual characters (&, <, >, and a space) in the Markdown output. Good converters handle this automatically, but some miss edge cases.
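If a converter leaves entities behind, Python's standard library can clean them up afterwards. A minimal sketch, with decode_entities as a made-up helper name:

```python
import html


def decode_entities(text):
    """Decode HTML entities and normalize non-breaking spaces."""
    decoded = html.unescape(text)
    # The non-breaking-space entity decodes to U+00A0; swap in a plain space.
    return decoded.replace("\u00a0", " ")
```

One caveat: a decoded < followed by a letter can be read as the start of an inline HTML tag by a Markdown processor, so spot-check the output where posts discuss markup.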
WordPress shortcodes become garbage. If your WordPress posts use shortcodes (the square-bracket tags that WordPress core and plugins expand at render time), the converter treats them as plain text. You need a separate strategy for handling shortcodes: usually rendering them to HTML first and then converting to Markdown, or replacing them with Markdown equivalents.
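Before choosing a strategy, it helps to know which shortcodes your posts actually use. The sketch below counts opening shortcode tags with a regular expression. It is a rough heuristic, not a WordPress-accurate parser, and the name count_shortcodes and the sample names in the usage note are illustrative.

```python
import re
from collections import Counter

# "[" + name + optional attributes + "]". Closing tags like [/name] are
# skipped, and the lookahead skips Markdown links such as [text](url).
SHORTCODE = re.compile(r"\[([A-Za-z][\w-]*)(?:\s[^\]]*)?\](?!\()")


def count_shortcodes(text):
    """Count how often each shortcode name opens in the text."""
    return Counter(m.group(1) for m in SHORTCODE.finditer(text))
```

Running it over every exported post and summing the counters gives a ranked list, for instance a [gallery] tag used 40 times and a hypothetical [pricing_table] used twice, so you can write replacements for the frequent ones and fix the rare ones by hand.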
Table complexity exceeds Markdown's capabilities. HTML tables support rowspan, colspan, cell alignment, and nested tables. Markdown tables are simple grids — one header row, one separator row, and content rows. Complex table layouts need to be simplified or left as raw HTML.
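A pipeline can decide per table whether to convert or keep the raw HTML. This sketch flags tables that use a rowspan or colspan other than 1, or that contain a nested table. TableInspector and needs_raw_html are hypothetical names, and the rule itself is an assumption you may want to tighten or loosen.

```python
from html.parser import HTMLParser


class TableInspector(HTMLParser):
    """Sets .complex when a table cannot be a simple Markdown grid."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.complex = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
            if self.depth > 1:  # a table inside a table
                self.complex = True
        elif tag in ("td", "th"):
            for name, value in attrs:
                if name in ("rowspan", "colspan") and (value or "1").strip() != "1":
                    self.complex = True

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def needs_raw_html(table_html):
    inspector = TableInspector()
    inspector.feed(table_html)
    inspector.close()
    return inspector.complex
```

Tables where needs_raw_html returns True can be copied into the Markdown file unchanged; the rest are safe candidates for pipe-table conversion.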
When to Keep HTML Instead of Converting
Not everything should be converted to Markdown. Here's when HTML is the better choice.
Complex layouts. Multi-column layouts, grid systems, and precisely positioned elements have no Markdown equivalent. Keep these as HTML.
Interactive content. Forms, calculators, interactive widgets, and anything requiring JavaScript should remain as HTML. Markdown is for static content.
Email templates. Email clients do not render Markdown; the raw syntax shows up as literal text. Email content should stay as HTML for reliable rendering.
Precise visual control. When you need exact control over colors, spacing, fonts, and layout, HTML with CSS is the right tool. Markdown intentionally abstracts away visual presentation.
The beauty of Markdown is that it supports inline HTML. So even in a Markdown-based workflow, you can drop in HTML snippets wherever Markdown's syntax falls short. The two coexist peacefully.
Building an Automated Conversion Pipeline
If you regularly need to convert HTML to Markdown — for content migration, CMS integration, or automated workflows — here are tools to build a proper pipeline.
Pandoc is the Swiss Army knife of document conversion. It converts between dozens of formats, including HTML to Markdown. A single file converts with pandoc -f html -t gfm input.html -o output.md, and a short shell or Python loop extends that to entire directories.
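Here is one way such a directory loop might look in Python. It assumes pandoc is installed and on your PATH; the function names pandoc_command and convert_directory are invented for this sketch, and the flags are the same ones shown in the command above.

```python
import subprocess
from pathlib import Path


def pandoc_command(src, dest):
    """Build the argument list for converting one HTML file to GFM."""
    return ["pandoc", "-f", "html", "-t", "gfm", str(src), "-o", str(dest)]


def convert_directory(in_dir, out_dir, run=subprocess.run):
    """Convert every .html file in in_dir to a .md file in out_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.glob("*.html")):
        dest = out_dir / (src.stem + ".md")
        # check=True raises CalledProcessError if pandoc exits with an error.
        run(pandoc_command(src, dest), check=True)
        written.append(dest)
    return written
```

The run parameter exists so the loop can be dry-run or tested with a stand-in function. In normal use, convert_directory("exported_html", "content/posts") is enough; both directory names here are placeholders.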
Turndown is a JavaScript library purpose-built for HTML-to-Markdown conversion. It's what many online converters use under the hood. You can install it via npm and integrate it into Node.js scripts for automated processing.
Python libraries like markdownify and html2text provide similar functionality. html2text is particularly good at handling messy HTML from real-world web pages.
For a quick one-off conversion without installing anything, our web-based converter handles the job instantly in your browser.
The Bottom Line: Convert Strategically, Not Blindly
HTML-to-Markdown conversion isn't a magic wand. It's a powerful tool that handles the bulk of the work, but it requires human judgment for the edge cases. Simple content with headings, paragraphs, lists, and links converts perfectly. Complex layouts with tables, embedded media, and custom styling need manual attention.
Our recommendation? Use automated conversion for the 90% of content that's straightforward. Spend your time on the 10% that needs manual tweaking. And always review the output before publishing — a few minutes of review catches issues that would embarrass you if your readers found them first.
The shift toward Markdown is irreversible. More tools, more platforms, and more workflows adopt it every year. Learning to convert and work with Markdown isn't just a nice skill — it's becoming as fundamental as knowing basic HTML was a decade ago.
Convert Your HTML to Markdown Now
Need to transform HTML into clean Markdown? Use our converter with full tag support, nested element handling, and one-click download as .md files.
