HTML Entity Encoder/Decoder
Convert text to HTML entities or decode entities back to text. Supports named and numeric forms, with a reference of 80+ common entities.
| Name | Char | Named | Numeric |
|---|---|---|---|
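| Ampersand | & | `&amp;` | `&#38;` |
| Less than | < | `&lt;` | `&#60;` |
| Greater than | > | `&gt;` | `&#62;` |
| Double quote | " | `&quot;` | `&#34;` |
| Apostrophe | ' | `&apos;` | `&#39;` |
| Non-breaking space | (invisible) | `&nbsp;` | `&#160;` |
| Copyright | © | `&copy;` | `&#169;` |
| Registered | ® | `&reg;` | `&#174;` |
| Trademark | ™ | `&trade;` | `&#8482;` |
| Em dash | — | `&mdash;` | `&#8212;` |
| En dash | – | `&ndash;` | `&#8211;` |
| Ellipsis | … | `&hellip;` | `&#8230;` |
| Left single quote | ‘ | `&lsquo;` | `&#8216;` |
| Right single quote | ’ | `&rsquo;` | `&#8217;` |
| Left double quote | “ | `&ldquo;` | `&#8220;` |
| Right double quote | ” | `&rdquo;` | `&#8221;` |
| Degree | ° | `&deg;` | `&#176;` |
| Plus-minus | ± | `&plusmn;` | `&#177;` |
| Times | × | `&times;` | `&#215;` |
| Divide | ÷ | `&divide;` | `&#247;` |
| Cent | ¢ | `&cent;` | `&#162;` |
| Pound | £ | `&pound;` | `&#163;` |
| Euro | € | `&euro;` | `&#8364;` |
| Yen | ¥ | `&yen;` | `&#165;` |
| Left arrow | ← | `&larr;` | `&#8592;` |
| Right arrow | → | `&rarr;` | `&#8594;` |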
What HTML entities are for
HTML uses certain characters as syntax: < opens a tag, > closes one, & introduces an entity. To put these characters in text content (rather than markup), you encode them as entities. Without encoding, the browser would try to interpret them as HTML.
Five characters have special meaning in HTML: &, <, >, ", and '. In body text, encode &, <, and >; the two quote characters only need encoding inside attribute values. Beyond those, encoding is optional but useful for characters that are hard to type or easy to misread in source (em dashes, copyright, accented letters).
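As a minimal sketch of the encoding described above, Python's standard `html` module can escape the syntax characters and decode them again. The sample string is made up for illustration; `quote=False` leaves quote characters alone, which is enough for body text.

```python
import html

# Text that contains HTML syntax characters (illustrative sample).
raw = "if a < b && b > c: print('ok')"

# quote=False escapes only &, <, and >, which is sufficient for body text.
encoded = html.escape(raw, quote=False)
print(encoded)  # if a &lt; b &amp;&amp; b &gt; c: print('ok')

# Decoding reverses the encoding exactly.
decoded = html.unescape(encoded)
print(decoded == raw)  # True
```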
Named vs numeric entities
Three forms for the copyright symbol:
- Named: `&copy;` is readable, but only predefined names work (HTML 4 defined about 250).
- Numeric decimal: `&#169;` is universal and works for any Unicode code point.
- Numeric hexadecimal: `&#xA9;` is the same as decimal but in hex (matches Unicode notation, U+00A9).
All three render identically. Named entities are easier to read; numeric entities are more portable, because they do not depend on the parser knowing the name. HTML5 extends the named list to more than 2,000 entries.
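The three forms can be derived from a character's code point. The sketch below uses a hypothetical helper, `entity_forms`, built on the standard library's `html.entities.codepoint2name` table, which only covers the HTML 4 names; characters without an entry there get no named form.

```python
import html
from html.entities import codepoint2name  # HTML 4 names only


def entity_forms(ch: str) -> dict:
    """Return the named, decimal, and hex entity forms for one character.

    Hypothetical helper for illustration; 'named' is None when the
    character has no HTML 4 name.
    """
    cp = ord(ch)
    name = codepoint2name.get(cp)
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{cp};",
        "hex": f"&#x{cp:X};",
    }


forms = entity_forms("©")
print(forms)  # {'named': '&copy;', 'decimal': '&#169;', 'hex': '&#xA9;'}

# All three forms decode to the same character.
print({html.unescape(f) for f in forms.values()})  # {'©'}
```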
When to encode
- Outputting user-generated content: critical to prevent XSS. Always encode before injecting into HTML.
- Attribute values: encode quotes (`&quot;` in double-quoted attributes, `&#39;` or `&apos;` in single-quoted).
- Embedding HTML in other HTML: like showing code samples on a tutorial site.
- Serializing XML/HTML for storage or transport.
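Two of the cases above, sketched in Python under the assumption that markup is assembled by hand. `html_attr` and `code_sample` are hypothetical helper names; `html.escape` with its default `quote=True` also encodes both quote characters.

```python
import html


def html_attr(name: str, value: str) -> str:
    """Build a double-quoted attribute with the value safely encoded."""
    return f'{name}="{html.escape(value, quote=True)}"'


def code_sample(source: str) -> str:
    """Wrap HTML source so it is displayed as text instead of rendered."""
    return f"<pre><code>{html.escape(source, quote=False)}</code></pre>"


print(html_attr("title", 'He said "hi" & left'))
# title="He said &quot;hi&quot; &amp; left"

print(code_sample("<b>bold</b>"))
# <pre><code>&lt;b&gt;bold&lt;/b&gt;</code></pre>
```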
When NOT to encode
- JavaScript context: inside <script> blocks, use JavaScript escaping instead; HTML entities aren't parsed there. Inline handlers such as onclick are attribute values, so they need JavaScript escaping first and HTML attribute encoding on top.
- JSON: JSON has its own escaping rules (\", \\, \n). Don't HTML-encode JSON.
- URLs: use URL percent-encoding (URL Encoder/Decoder), not HTML entities.
- CSS: CSS has its own escaping (\ + hex code point). Don't mix.
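To illustrate why the contexts above should not be mixed, this sketch runs one made-up value through three different standard-library escapers. Each produces a different result, and none is a substitute for the others.

```python
import html
import json
from urllib.parse import quote

value = 'Tom & "Jerry" <3'  # illustrative sample

as_html = html.escape(value)    # for HTML body or attribute text
as_json = json.dumps(value)     # for a JSON string literal
as_url = quote(value, safe="")  # for a URL path or query component

print(as_html)  # Tom &amp; &quot;Jerry&quot; &lt;3
print(as_json)  # "Tom & \"Jerry\" <3"
print(as_url)   # Tom%20%26%20%22Jerry%22%20%3C3
```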
Common entities you should memorize
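- `&amp;`: & (always encode in HTML body)
- `&lt;` / `&gt;`: < and >
- `&quot;`: " (in attribute values)
- `&nbsp;`: non-breaking space
- `&copy;`: © (copyright)
- `&mdash;` / `&ndash;`: — and –
- `&hellip;`: …
- `&ldquo;` / `&rdquo;`: “ and ” (smart quotes)
- `&times;` / `&divide;`: × and ÷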
Security: when entities save you
XSS (cross-site scripting) attacks work by tricking your site into rendering attacker-controlled JavaScript. The basic defense: HTML-encode user input before placing it in HTML output. <script>alert(1)</script> becomes harmless text after encoding the angle brackets.
But this encoding only covers the HTML body context; different contexts need different escaping. Prefer a framework or templating library with auto-escaping (React, Vue, Angular, Django templates, Rails ERB): they escape interpolated values for HTML by default. Hand-rolling it for production is risky.
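For illustration only, here is what the basic defense looks like when written by hand in Python. `render_comment` is a hypothetical function, and it is only safe for the HTML body context; it is a sketch of the idea, not a replacement for a framework's auto-escaping.

```python
import html


def render_comment(user_input: str) -> str:
    """Place untrusted text inside a paragraph, encoded for HTML body context."""
    return f"<p>{html.escape(user_input)}</p>"


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The browser displays the payload as literal text because no `<` survives to open a tag.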
For other text/encoding tools: URL Encoder/Decoder, Base64 Encoder/Decoder, and JSON Formatter.