Understanding Tokens
Learn how tokens work and affect pricing
What are Tokens?
Tokens are the units that AI models use to process input and generate output. Since AI models cannot interpret raw data directly, the data must first be converted into a format they can understand. This conversion process is called tokenization.
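As an illustrative sketch, here is what tokenization looks like using OpenAI's open-source tiktoken library (the library and encoding are assumptions for demonstration; the exact tokenizer depends on the model you're using):

```python
# pip install tiktoken
import tiktoken

# Load a tokenizer; cl100k_base is the encoding used by several
# OpenAI models. (Assumption: your model may use a different one.)
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokens are the units that AI models use."
token_ids = enc.encode(text)

print(token_ids)              # a list of integer token IDs
print(len(token_ids))         # how many tokens the model would process
print(enc.decode(token_ids))  # decodes back to the original text
```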
How to Interpret Tokens
- On average, 1 token equals ~4 characters of text
- Different types of content are tokenized differently (a rough estimator is sketched after this list):
  - Text: ~4 characters per token
  - Images: Generally 1,000-1,500 tokens per image
  - Audio/Video: Varies by length (e.g., a 5-minute video typically uses around 50,000 tokens)
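To make these rules of thumb concrete, here is a hypothetical back-of-the-envelope estimator; the helper name and constants are illustrative, not an official formula, and real counts vary by model:

```python
def estimate_tokens(text: str = "", images: int = 0,
                    video_minutes: float = 0.0) -> int:
    """Rough token estimate from the rules of thumb above.

    Assumptions: ~4 characters per text token, ~1,250 tokens per image
    (midpoint of 1,000-1,500), and ~10,000 tokens per minute of video
    (~50,000 tokens / 5 minutes).
    """
    text_tokens = len(text) / 4
    image_tokens = images * 1250
    video_tokens = video_minutes * 10_000
    return round(text_tokens + image_tokens + video_tokens)

# Example: a 1,000-character prompt plus one image
print(estimate_tokens(text="x" * 1000, images=1))  # ~1,500 tokens
```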
You can experiment with how text gets converted to tokens using OpenAI’s tokenizer tool.
Input vs Output Tokens
Input Tokens
Input tokens include everything the AI processes:
- Any details you've provided for the AI to remember
- Your current message
- All previous messages in the conversation (we keep at most the last 5 turns)
- Any additional context from tools or files
Each AI interaction reprocesses the entire conversation context, including any time the AI uses a tool and then continues replying within the same turn. Because input tokens compound in this way, we've taken steps to protect you from unexpected costs: we implement smart context management to optimize token usage.
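A minimal sketch of why input tokens compound, assuming a chat API that receives the full message history on every turn (token counts use the ~4 characters per token rule of thumb from above):

```python
# Each turn re-sends the entire history, so input tokens grow with
# conversation length. (Rough count: ~4 characters per token.)
history = []

def send(message: str) -> int:
    history.append(message)
    # The model processes *all* accumulated messages as input.
    input_chars = sum(len(m) for m in history)
    return input_chars // 4  # approximate input tokens this turn

for turn in range(1, 4):
    tokens = send("A typical chat message of about forty chars.")
    print(f"Turn {turn}: ~{tokens} input tokens")
# Turn 1: ~11, Turn 2: ~22, Turn 3: ~33; each turn costs more than the last
```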
Output Tokens
Output tokens are more straightforward: they simply represent the AI's response to you.
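Since input and output tokens are typically billed at different rates, a simple cost calculation might look like the following sketch (the per-token prices are placeholders, not real pricing; substitute your model's actual rates):

```python
# Hypothetical prices per million tokens (placeholders).
INPUT_PRICE_PER_M = 3.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M output tokens

def interaction_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one interaction, in USD."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 2,000 input tokens (prompt + history) and 500 output tokens
print(f"${interaction_cost(2000, 500):.4f}")  # $0.0135
```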