Rayrun
← Back to tools

Byte Pair (BPE) Encoder

A byte pair encoder tool encodes text using byte pair encoding (BPE), a data compression technique that replaces the most frequently occurring character pairs with a single, unused character, reducing the size of the data. It is used by a lot of Transformer models, including GPT, GPT-2, RoBERTa, BART, and DeBERTa.

Use Ctrl + k + "Tools" to quickly access all tools.

Options

Dictionary used for tokenization.

Output format.

Feedback or suggestions? Send it to [email protected]

Thank you!
Was this helpful?

Open Source Packages

Install with https://npmjs.com/:

  • gpt3-tokenizer

Related Tools

TwitterGitHubLinkedIn
AboutQuestionsDiscord ForumBrowser ExtensionTagsQA Jobs

Rayrun is a community for QA engineers. I am constantly looking for new ways to add value to people learning Playwright and other browser automation frameworks. If you have feedback, email [email protected].