HTML Encoder And Decoder

Encode text as HTML character references, or decode HTML entities back to readable text.



What is HTML entity encoding?

HTML entity encoding, also called HTML character reference encoding, replaces markup-significant characters with text sequences that a browser can parse as data instead of markup. For example, a less-than sign can be written as &lt;, a greater-than sign as &gt;, an ampersand as &amp;, and a quotation mark as &quot;.
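The replacement described above can be sketched in a few lines of Python. This is a minimal illustration, assuming only the four characters just mentioned need escaping; the function name `encode_minimal` is hypothetical, and in practice the standard library's `html.escape` covers the same ground (and the apostrophe as well).

```python
def encode_minimal(text: str) -> str:
    """Replace four markup-significant characters with named references."""
    # The ampersand must be handled first; otherwise the "&" introduced
    # by the later replacements would itself be encoded a second time.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text


print(encode_minimal('<a href="x">Tom & Jerry</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

The order of the replacements is the one design point worth noting: encoding the ampersand last would turn `&lt;` into `&amp;lt;`.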

This tool is useful when you need to show literal HTML examples in a page, prepare snippets for documentation, inspect encoded content, or decode text that contains named, decimal numeric, or hexadecimal numeric character references. Encoding turns reserved characters into entity references; decoding converts recognized references back to their corresponding Unicode characters.

HTML supports named character references such as &copy;, decimal numeric references such as &#169;, and hexadecimal numeric references such as &#xA9;. The encoded form is still plain text, so it can be copied into source code, templates, emails, and examples without immediately being interpreted as an HTML tag.
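As a rough illustration of the three reference forms, the sketch below decodes a named, a decimal, and a hexadecimal spelling of the copyright sign with Python's standard `html` module, then goes the other way with the built-in `xmlcharrefreplace` error handler. The sample strings are chosen for this example only.

```python
import html

# Three spellings of the copyright sign: named, decimal, hexadecimal.
forms = ["&copy;", "&#169;", "&#xA9;"]
decoded = [html.unescape(form) for form in forms]
print(decoded)  # ['©', '©', '©']

# The reverse direction: turn non-ASCII characters into decimal references.
# This handler does not touch <, >, or &; combine it with html.escape
# when the text may also contain markup-significant characters.
numeric = "© 2024".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(numeric)  # &#169; 2024
```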

Entity encoding is context-sensitive security hygiene, not encryption or sanitization. It can help display untrusted text safely in normal HTML text nodes, but attributes, JavaScript, CSS, URLs, and rich HTML sanitization each have their own rules. Decode only when you intentionally want the original characters back, because decoded markup may become executable if it is inserted into a page as HTML.
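To make the context point concrete, here is a small sketch that contrasts text-node escaping with attribute escaping using Python's `html.escape`. The `user_input` value and the surrounding `<input>` markup are invented for illustration, and the sketch assumes the attribute value is always wrapped in double quotes.

```python
import html

# Hypothetical untrusted value that tries to break out of an attribute.
user_input = '" onmouseover="alert(1)'

# Text-node escaping only: quotes are left alone.
text_only = html.escape(user_input, quote=False)
# Attribute escaping: double and single quotes are encoded as well.
attr_safe = html.escape(user_input, quote=True)

print(f'<input value="{text_only}">')  # the quote ends the attribute early
print(f'<input value="{attr_safe}">')  # the payload stays inside the value
```

Even the second form only covers quoted HTML attributes; values placed in URLs, inline scripts, or style rules need the escaping rules of those contexts.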

Sample HTML

The following sample shows the same HTML fragment in encoded and decoded form.

Encoded HTML

&lt;!doctype html&gt;
&lt;html&gt;
  &lt;head&gt;
    &lt;title&gt;This is the title of the webpage!&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;This is an example paragraph. Anything in the &lt;strong&gt;body&lt;/strong&gt; tag will appear on the page, just like this &lt;strong&gt;p&lt;/strong&gt; tag and its contents.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;

Decoded HTML

<!doctype html>
<html>
  <head>
    <title>This is the title of the webpage!</title>
  </head>
  <body>
    <p>This is an example paragraph. Anything in the <strong>body</strong> tag will appear on the page, just like this <strong>p</strong> tag and its contents.</p>
  </body>
</html>
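
A round trip like the one in the sample can be sketched with Python's standard `html` module. The shortened `fragment` string is an assumption made for brevity; the sketch also shows what happens when already-encoded text is encoded a second time.

```python
import html

fragment = "<p>Anything in the <strong>body</strong> tag will appear on the page.</p>"

# Encoding replaces the angle brackets with named references.
encoded = html.escape(fragment)
print(encoded)

# Decoding restores the original characters exactly.
round_trip = html.unescape(encoded)
print(round_trip == fragment)  # True

# Encoding twice is a common mistake: the "&" in each reference
# is encoded again, so "&lt;" becomes "&amp;lt;".
double = html.escape(encoded)
print(double.startswith("&amp;lt;p&amp;gt;"))  # True
```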

Code Samples

Below are code samples that demonstrate HTML entity encoding and decoding in several programming languages.

C#

Encode HTML in C# code sample

Use HTML encoding before displaying literal markup as text.

using System;
using System.Net;

class Program
{
    static void Main()
    {
        string input = "<html><head><title>T</title></head></html>";
        string encodedString = WebUtility.HtmlEncode(input);
        Console.WriteLine(encodedString);
    }
}

Decode HTML in C# code sample

Decode character references when you intentionally need the original text back.

using System;
using System.Net;

class Program
{
    static void Main()
    {
        string encodedString = "&lt;html&gt;&lt;head&gt;&lt;title&gt;T&lt;/title&gt;&lt;/head&gt;&lt;/html&gt;";
        string decodedString = WebUtility.HtmlDecode(encodedString);
        Console.WriteLine(decodedString);
    }
}

JavaScript

Encode HTML in JavaScript code sample

Encode reserved characters before inserting plain text into HTML examples.

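function htmlEncode(input) {
  const textArea = document.createElement('textarea');
  // textContent stores the string verbatim; innerText would turn newlines into <br>.
  textArea.textContent = input;
  // Serializing the text node escapes &, <, and >; quotes are left as-is.
  return textArea.innerHTML;
}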

Decode HTML in JavaScript code sample

Decode only when the result should be treated as plain text, not trusted markup.

function htmlDecode(input) {
  const textArea = document.createElement('textarea');
  textArea.innerHTML = input;
  return textArea.value;
}

Python

Encode HTML in Python code sample

Escape HTML-sensitive characters before showing raw snippets in a page.

import html

print(html.escape("<p>hello</p>"))

Decode HTML in Python code sample

Unescape character references when reviewing encoded text or examples.

import html

print(html.unescape('&#xA3;682m'))

Go

Encode HTML in Go code sample

Encode text before rendering examples that contain angle brackets or ampersands.

package main

import (
    "fmt"
    "html"
)

func main() {
    s := `"<script>alert(123);</script>"`
    fmt.Println(html.EscapeString(s))
}

Decode HTML in Go code sample

Decode encoded text when you need to inspect the original characters.

package main

import (
    "fmt"
    "html"
)

func main() {
    s := `&lt;html&gt;&lt;head&gt;&lt;title&gt;T&lt;/title&gt;&lt;/head&gt;&lt;/html&gt;`
    fmt.Println(html.UnescapeString(s))
}

Java

Encode HTML in Java code sample

Encode markup-sensitive text before displaying it as literal content.

import org.apache.commons.text.StringEscapeUtils;

public class Main {
    public static void main(String[] args) {
        String text = "a > b && a < c";
        String encoded = StringEscapeUtils.escapeHtml4(text);
        System.out.println(encoded);
    }
}

Decode HTML in Java code sample

Decode character references when converting encoded text back to readable characters.

import org.apache.commons.text.StringEscapeUtils;

public class Main {
    public static void main(String[] args) {
        String text = "a &#x3E; b &amp;&amp; a &#x3C; c";
        String decoded = StringEscapeUtils.unescapeHtml4(text);
        System.out.println(decoded);
    }
}

Rust

Encode HTML in Rust code sample

Encode text when a page needs to show HTML syntax rather than interpret it.

extern crate html_escape;

use html_escape::encode_text;

fn main() {
    let text = "a > b && a < c";
    let encoded = encode_text(text);
    assert_eq!(encoded, "a &gt; b &amp;&amp; a &lt; c");
}

Decode HTML in Rust code sample

Decode character references before comparing or reading the original text.

extern crate html_escape;

use html_escape::decode_html_entities;

fn main() {
    let text = "a &#x3E; b && a &#x3C; c";
    let decoded = decode_html_entities(text);
    assert_eq!(decoded, "a > b && a < c");
}

Scala

Encode HTML in Scala code sample

Encode ampersands, angle brackets, and quotes before presenting snippets as text. This sample targets Scala.js and relies on a binding for the escape-html npm package.

import io.scalajs.npm.escapehtml._

EscapeHtml("foo & bar") // "foo &amp; bar"

Decode HTML in Scala code sample

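Decode encoded snippets only when the readable characters are the intended output. This sample runs on the JVM and uses the legacy Apache Commons Lang 2.x class; Apache Commons Text offers the equivalent unescapeHtml4 method.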

import org.apache.commons.lang.StringEscapeUtils

val decodeHtml = (html: String) => {
  StringEscapeUtils.unescapeHtml(html)
}