N-gram Text Analyzer

Tokenize text into character or word n-grams, count their frequencies, and inspect the most common patterns. Useful for language detection, autocomplete, plagiarism heuristics, and exploratory NLP.
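A minimal sketch of this tokenization, assuming whitespace word splitting and overlapping character windows (the function names are illustrative, not the tool's actual API):

```python
from collections import Counter

def char_ngrams(text, n):
    # Slide a window of n characters across the text (overlapping).
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n):
    # Naive whitespace tokenization; a real analyzer may also
    # lowercase and strip punctuation before splitting.
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Count frequencies and inspect the most common patterns.
counts = Counter(word_ngrams("the quick brown fox jumps over the lazy dog", 2))
print(counts.most_common(3))
```

`Counter.most_common` is what backs a frequency table like the one below: it returns (n-gram, count) pairs sorted by count.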

Statistics

The analyzer reports: Tokens, N-grams, Unique, Type/Token, Entropy (bits)

Frequency Table

Columns: #, N-gram, Count, %

About N-grams

An n-gram is a contiguous sequence of n items from a text. Word n-grams capture phrase patterns ("the quick" is a bigram). Character n-grams are great for language detection, fuzzy matching, and signature hashing because they tolerate misspellings. Type/Token Ratio measures lexical diversity (unique grams / total grams). Shannon entropy reflects the average information per n-gram in bits.
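The two summary metrics above can be sketched directly from their definitions; this assumes the n-grams have already been extracted into a list:

```python
import math
from collections import Counter

def type_token_ratio(grams):
    # Lexical diversity: unique n-grams divided by total n-grams.
    return len(set(grams)) / len(grams)

def shannon_entropy(grams):
    # H = -sum(p * log2(p)) over the n-gram frequency distribution,
    # i.e. the average information per n-gram in bits.
    counts = Counter(grams)
    total = len(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

grams = ["ab", "bc", "ab", "cd"]
print(type_token_ratio(grams))  # 3 unique / 4 total = 0.75
print(shannon_entropy(grams))   # p = (0.5, 0.25, 0.25) -> 1.5 bits
```

Entropy is maximal when every n-gram is equally likely (log2 of the number of unique grams) and drops toward zero as a few grams dominate.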