Hello, I am really expanding what I can build now that I can use a vLLM-compatible AI service; it's amazing to be able to use DeepSeek and pass vLLM parameters. One thing I am missing is the /tokenize endpoint from vLLM. Would it be possible to expose this endpoint? Here is my use case.
I am running in a browser environment, so I can't easily run a tokenizer locally.
I tokenize a '<<BREAK>>' string, or find some token like <unk> or a special reserved token for that model, that won't normally show up in text generated by the AI during inference.
I then append that break string to the end of every chat message.
After that I can truncate the oldest user/assistant messages to a target token length, because I can count how many tokens each chat message contains by finding the break-string token in the tokenized output.
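For reference, this is roughly the request/response shape I have in mind, modeled on vLLM's /tokenize; the exact field names here are my assumption and may differ from the real API:

```ts
// Hypothetical request/response shapes, modeled on vLLM's /tokenize endpoint.
// Field names are assumptions on my part and may not match the real API exactly.
interface TokenizeRequest {
  model: string;   // model to tokenize with
  prompt: string;  // text to tokenize, e.g. "<<BREAK>>"
}

interface TokenizeResponse {
  tokens: number[]; // token IDs for the prompt
  count: number;    // number of tokens in the prompt
}
```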
Here is a sketch of the code where I use the tokenized output to truncate chat messages to a target context length; the base URL, model name, and response field names are placeholders for my setup.
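```ts
// Sketch of how I'd use a /tokenize endpoint to keep chat history under a token budget.
// BASE_URL, MODEL, and the response fields are placeholders / assumptions for my setup.
const BASE_URL = "https://my-vllm-host/v1";
const MODEL = "deepseek-chat";
const BREAK = "<<BREAK>>";
const MAX_CONTEXT_TOKENS = 8000;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Ask the server to tokenize a string and return its token IDs.
async function tokenize(text: string): Promise<number[]> {
  const res = await fetch(`${BASE_URL}/tokenize`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: MODEL, prompt: text }),
  });
  const data = await res.json();
  return data.tokens; // assumed response field
}

// Append the break string to every message, tokenize the whole history once,
// then split the token stream on the break-token sequence to get a per-message
// token count. Drop the oldest non-system messages until the total fits.
async function truncateHistory(messages: ChatMessage[]): Promise<ChatMessage[]> {
  const breakTokens = await tokenize(BREAK);
  const allTokens = await tokenize(messages.map((m) => m.content + BREAK).join(""));

  // Count tokens per message by scanning for the break-token sequence.
  const counts: number[] = [];
  let start = 0;
  for (let i = 0; i <= allTokens.length - breakTokens.length; i++) {
    if (breakTokens.every((t, j) => allTokens[i + j] === t)) {
      counts.push(i - start);          // tokens in this message, excluding the break
      start = i + breakTokens.length;
      i = start - 1;
    }
  }

  let total = counts.reduce((a, b) => a + b, 0);
  const kept = [...messages];
  // Remove the oldest user/assistant messages until we are under the budget.
  while (total > MAX_CONTEXT_TOKENS && kept.length > 1) {
    const idx = kept.findIndex((m) => m.role !== "system");
    if (idx === -1) break;
    total -= counts[idx];
    kept.splice(idx, 1);
    counts.splice(idx, 1);
  }
  return kept;
}
```

The key point is that all token counting happens server-side, so the browser never needs to ship a tokenizer for each model.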