How to use the HTML Entity Decoder
Decode HTML entities in four steps:
1
Paste your entity-encoded content
Enter any text containing HTML entity references: named like &copy;, decimal like &#169;, hex like &#xA9;, or any mix of the three.
2
View the decoded source
The Source tab shows decoded characters instantly. Switch to Preview to render the decoded HTML visually in the browser.
3
Inspect the Breakdown table
Click the Breakdown tab to see a full table of every unique entity found in your input — showing the entity code, its decoded character, its type (named/decimal/hex), and how many times it appears.
4
Copy or download
Click Copy to grab the decoded text or Download to save the result as an HTML file.
When to use this tool
Use this tool when you need to decode entities beyond basic HTML encoding:
- Decoding content from RSS feeds or Atom feeds that use extensive named entities for typography
- Converting HTML entity-encoded data from third-party APIs back to readable Unicode characters
- Auditing CMS or database content that may mix named, decimal, and hex entity formats inconsistently
- Decoding WYSIWYG editor output that converts typographic characters to named entities
- Reverse-engineering minified or obfuscated HTML that uses numeric entity encoding
- Debugging internationalization issues where accented characters have been entity-encoded instead of stored as UTF-8
Frequently asked questions
Q: What is the difference between HTML Entity Decoder and HTML Decoder?
HTML Decoder focuses on reversing the encodings of the five core HTML special characters (& < > " ') and is the counterpart to the HTML Encoder tool, which makes it ideal for decoding basic HTML-escaped content. HTML Entity Decoder handles the full spectrum of HTML entity references, including all 2,000+ named HTML5 entities, decimal numeric references, and hexadecimal references. It also adds the entity Breakdown table for auditing. Use HTML Decoder for simple tasks and HTML Entity Decoder for complex or mixed entity content.
Q: What are the three types of HTML entity references?
HTML has three entity reference formats that all produce the same decoded result: (1) Named entities use an ampersand, an entity name, and a semicolon, like &eacute; for é or &copy; for ©. (2) Decimal numeric character references use &# followed by the Unicode code point as a base-10 integer and a semicolon, like &#233; for é. (3) Hexadecimal numeric character references use &#x followed by the code point in hex and a semicolon, like &#xE9; for é. This tool decodes all three interchangeably.
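To see the equivalence outside the tool, here is a minimal sketch in Python. It uses the standard library's html.unescape as a stand-in decoder, which is an assumption made for illustration and not the tool's own code.

```python
import html

# Three spellings of the same character, é (U+00E9).
named = "&eacute;"
decimal = "&#233;"
hexadecimal = "&#xE9;"

# Decode each spelling with the standard-library decoder.
decoded = [html.unescape(ref) for ref in (named, decimal, hexadecimal)]

# The numeric forms are simply the code point written in base 10 or base 16.
code_point = ord("é")
same_number = code_point == 233 == 0xE9
```

After running this, decoded holds the same single character three times, and same_number is True.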
Q: What does the entity Breakdown table show?
The Breakdown tab displays a table of every unique HTML entity reference found in your input. For each entity it shows: the raw entity string as it appears in the source (e.g. &eacute;), the decoded character it represents (é), a color-coded type badge indicating whether it is a named, decimal, or hex entity, and the number of times that entity appears in the full input. Entities are sorted by occurrence count (most frequent first), making it easy to spot which characters dominate an encoded document.
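The same kind of report can be approximated in a few lines of Python. This is a rough sketch, not the tool's implementation: the names ENTITY_RE, classify, and breakdown are hypothetical, and it assumes only semicolon-terminated references should be counted.

```python
import html
import re
from collections import Counter

# Matches hex (&#x7B;), decimal (&#123;), and named (&name;) references.
# Assumption: only semicolon-terminated references are counted.
ENTITY_RE = re.compile(r"&(?:#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def classify(entity: str) -> str:
    """Return 'hex', 'decimal', or 'named' for a raw entity string."""
    if entity[:3].lower() == "&#x":
        return "hex"
    if entity.startswith("&#"):
        return "decimal"
    return "named"


def breakdown(text: str) -> list[tuple[str, str, str, int]]:
    """List (raw entity, decoded character, type, count), most frequent first."""
    counts = Counter(ENTITY_RE.findall(text))
    rows = []
    # most_common() orders by count, highest first.
    for entity, count in counts.most_common():
        decoded = html.unescape(entity)
        if decoded == entity:
            continue  # looks like an entity but is not a known reference
        rows.append((entity, decoded, classify(entity), count))
    return rows


rows = breakdown("caf&eacute; &copy; caf&eacute; &#233; &#xE9; &bogus;")
```

Here &eacute; appears twice and so sorts first, while &bogus; is dropped because the decoder does not recognize it.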
Q: Can this tool decode entities from RSS, Atom, or XML feeds?
Yes — RSS and Atom feeds commonly use HTML entity encoding for typographic characters like smart quotes (&ldquo; &rdquo;), dashes (&mdash; &ndash;), and special symbols. XML natively supports only the five core entities (&amp; &lt; &gt; &quot; &apos;) but many feed generators use numeric references like &#8220; for curly quotes. This tool decodes all these formats. Paste the raw feed content and the Breakdown table will show you exactly which entities your feed contains.
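The XML limitation is easy to reproduce. The sketch below uses Python's standard xml.etree.ElementTree and html modules. The title element is a made-up feed fragment, and neither module is what this tool uses internally.

```python
import html
import xml.etree.ElementTree as ET

# Numeric references are plain XML, so a strict XML parser accepts them.
numeric_title = ET.fromstring(
    "<title>&#8220;Hello&#8221; &#8212; world</title>"
).text

# Named HTML entities such as &ldquo; are not defined in XML,
# so the same parser rejects them.
try:
    ET.fromstring("<title>&ldquo;Hello&rdquo; &mdash; world</title>")
    named_parsed = True
except ET.ParseError:
    named_parsed = False

# An HTML-aware decoder handles the named spelling directly.
decoded = html.unescape("&ldquo;Hello&rdquo; &mdash; world")
```

Both numeric_title and decoded end up as the same text with curly quotes and a long dash, while named_parsed is False. This is why feed content with named entities usually needs an HTML-aware decoding pass and not just an XML parser.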
Q: How does this tool decode entities, and does it have a lookup table?
Rather than maintaining a static lookup table, this tool uses the browser's native HTML parser by setting the innerHTML of a textarea element and reading back its value property. This approach gives complete HTML5 coverage: all 2,000+ named entities the browser supports are decoded correctly without any custom mapping. It also means the tool automatically handles edge cases the parser tolerates, such as legacy entities written without a trailing semicolon (for example &copy instead of &copy;) that some older content may use.
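Outside a browser there is no textarea to lean on. As a rough analogue, Python's html.unescape applies the HTML5 character reference rules and shows the same tolerance for missing semicolons. Its behavior is close to a browser's but should not be assumed identical in every case.

```python
import html

# Legacy named entities are recognized without the trailing semicolon.
legacy = html.unescape("&copy 2024 Caf&eacute")

# Numeric references are also accepted without it.
numeric = html.unescape("&#169 &#xA9")

# Names outside the legacy set need the semicolon; without it the
# text is left unchanged.
untouched = html.unescape("&mdash")
```

Here legacy becomes "© 2024 Café", numeric becomes "© ©", and untouched stays as the literal text &mdash.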
Q: My decoded output contains garbled characters. What went wrong?
Garbled characters after decoding usually have one of three causes: (1) The content is double-encoded. Run it through the decoder a second time to remove the additional layer. (2) The content uses a non-Unicode encoding like Windows-1252 or ISO-8859-1, and the characters were pasted with an encoding mismatch. Try copying from a source that specifies the correct character encoding. (3) The content contains numeric entity references that point to control characters or invalid Unicode code points. The Breakdown table will show you exactly which entities are present to help diagnose the issue.
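Causes (1) and (3) can be checked with a short script. The sketch below again uses Python's html.unescape as a stand-in decoder. The helper name decode_fully and its max_rounds limit are hypothetical choices for illustration.

```python
import html


def decode_fully(text: str, max_rounds: int = 5) -> tuple[str, int]:
    """Unescape repeatedly until the text stops changing.

    Returns the final text and the number of rounds that changed it.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
        rounds += 1
    return text, rounds


# Triple-encoded input: each round peels off one layer.
result, rounds = decode_fully("Caf&amp;amp;eacute;")

# A reference to an invalid code point decodes to U+FFFD,
# the replacement character, which often renders as garbage.
invalid = html.unescape("bad&#0;ref")
has_replacement = "\ufffd" in invalid
```

With this input, result is "Café" after three rounds. Decoding until nothing changes is convenient for diagnosis, but it can over-decode text that intentionally shows an escaped entity, so counting the rounds is safer than applying the loop blindly.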