Unicode Normalizer

Normalize Unicode strings into NFC, NFD, NFKC, and NFKD forms and inspect every code point. Useful for debugging identifier comparison issues, spotting hidden homoglyphs or zero-width characters, and understanding how combining marks and compatibility decompositions change text. Everything runs in your browser.
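
As a rough illustration of what per-code-point inspection involves, here is a minimal Python sketch built on the standard `unicodedata` module. The helper name `inspect_code_points` and the sample string are hypothetical and chosen only for this example; this is not the tool's own implementation.

```python
import unicodedata


def inspect_code_points(text: str) -> list[tuple[str, str, str]]:
    """Return (U+XXXX, general category, character name) for each code point."""
    rows = []
    for ch in text:
        # Some code points (e.g. control characters) have no name; use a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", unicodedata.category(ch), name))
    return rows


# Hypothetical sample: 'a', an invisible ZERO WIDTH SPACE, then 'b'.
sample = "a\u200bb"
for row in inspect_code_points(sample):
    print(*row)
```

Listing the general category next to each name makes invisible format characters (category `Cf`) stand out even though they render as nothing.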

Which form should I use?

NFC (canonical composition): preferred for storage and display. é stays as one code point (U+00E9).
NFD (canonical decomposition): splits é into e + combining acute accent (U+0065 U+0301). Useful for accent-insensitive search, because the combining marks can then be stripped.
NFKC / NFKD (compatibility forms): additionally replace compatibility characters, for example ﬁ → fi, ½ → 1⁄2, full-width ASCII → ASCII. Use them for matching identifiers, but they are lossy, so never use them for round-trip storage.
Homoglyph attacks (Cyrillic а vs Latin a) are not collapsed by any normalization form. Detecting them requires confusables data such as the Unicode Security Mechanisms confusables list (UTS #39).
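
The differences between the forms can be checked with a short Python sketch using the standard `unicodedata.normalize` function. The helpers `all_forms` and `strip_accents` are hypothetical names introduced for this example, and stripping combining marks after NFD is only one simple approach to accent-insensitive matching, not necessarily what this tool does.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def all_forms(text: str) -> dict[str, str]:
    """Normalize text into each of the four Unicode normalization forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


def strip_accents(text: str) -> str:
    """Decompose with NFD, then drop combining marks (nonzero combining class)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Canonical forms: composed vs decomposed é.
print(code_points(unicodedata.normalize("NFC", "e\u0301")))   # composed form
print(code_points(unicodedata.normalize("NFD", "\u00e9")))    # decomposed form

# Compatibility forms fold the fi ligature; canonical forms leave it alone.
print(all_forms("\ufb01"))

# Accent-insensitive comparison key.
print(strip_accents("café") == "cafe")

# Cyrillic а (U+0430) survives every form, so it never equals Latin a.
print(all(v == "\u0430" for v in all_forms("\u0430").values()))
```

For identifier matching, comparing `unicodedata.normalize("NFKC", a)` against `unicodedata.normalize("NFKC", b)` while storing the original strings unchanged keeps the lossy step out of the stored data.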
