PHP Text Tokenizer for GPT models
A PHP toolkit that tokenizes text the same way the GPT family of models does.
Forked from semji/gpt3-tokenizer-php to add bug fixes and improvements.
- PHP 8.1
- mbstring extension (see the PHP manual for installation instructions)
First, install the package using Composer:
composer require mehrab-wj/tiktoken-php
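use TikToken\Encoder;

// Load Composer's autoloader (adjust the path if this script is not in the project root).
require __DIR__ . '/vendor/autoload.php';
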
$prompt = "Ai is cool";
$encoder = new Encoder();
$tokens = $encoder->encode($prompt); // [32, 72, 318, 3608]
// Get the token count:
echo count($tokens); // 4
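
A common reason to count tokens is to check a prompt against a model's context limit before sending it. The following is a minimal sketch under stated assumptions: `remainingTokens()` is a hypothetical helper that is not part of this package, and the 4096-token limit is only an illustrative value. It works on any array of token ids, such as the one returned by `$encoder->encode($prompt)`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): returns how many tokens
 * are left in a budget after counting the given token ids.
 * A negative result means the budget is exceeded by that many tokens.
 */
function remainingTokens(array $tokens, int $maxTokens): int
{
    return $maxTokens - count($tokens);
}

// In practice this array would come from $encoder->encode($prompt);
// the literal here is the result shown above for "Ai is cool".
$tokens = [32, 72, 318, 3608];

// Assumed limit for illustration only; check the documentation of your model.
$maxTokens = 4096;

$remaining = remainingTokens($tokens, $maxTokens);

echo $remaining >= 0
    ? "Prompt fits, {$remaining} tokens left" . PHP_EOL
    : 'Prompt is too long by ' . abs($remaining) . ' tokens' . PHP_EOL;
```

Keeping the helper independent of the encoder means it can be reused with any token list, and the limit can be adjusted per model without touching the tokenizing code.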