Tokenizer — Software that converts text into tokens, the discrete units an AI model processes. BPE (Byte Pair Encoding) is the most common algorithm. Different models use different tokenizers, so "hello world" might be 2 tokens under one model's tokenizer and 3 under another's. Token count determines API cost and how much of the context window a piece of text consumes.
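To illustrate the BPE idea, here is a minimal toy sketch (not any real model's tokenizer): start from individual characters and repeatedly merge the most frequent adjacent pair into a single token. Real tokenizers learn thousands of merges from a large corpus and operate on bytes; this example only shows the core merge loop.

```python
from collections import Counter

def bpe_tokenize(text, num_merges=3):
    """Toy BPE sketch: begin with characters, then repeatedly merge
    the most frequent adjacent pair of tokens into one token."""
    tokens = list(text)
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so further merges gain nothing
        merged, i = [], 0
        while i < len(tokens):
            # replace each occurrence of the chosen pair with the merged token
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

print(bpe_tokenize("aaabdaaabac"))
```

A trained tokenizer applies a fixed, learned merge table in order rather than recomputing frequencies per input, which is why the same string can split into different token counts under different models: each model's merge table was learned from a different corpus.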
