token-count¶
The token-count command calculates the number of OpenAI API tokens in a text input. This is useful for checking that a text fits within a model's maximum context length and for estimating API costs.
Usage¶
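A minimal invocation sketch, assuming the input file is passed as an optional positional argument:
token-count [input_file]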
If no input file is specified, the command reads from standard input (stdin).
Examples¶
Count Tokens in File¶
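A minimal sketch, assuming the file is passed as a positional argument:
# Count tokens in a text file
token-count input.txt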
Count Tokens from Stdin¶
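Text can also be piped to the command, which reads stdin when no file is given:
# Pipe text to token-count via stdin
echo "Hello, world" | token-count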
Pipeline Usage¶
# Count tokens after processing
cat input.txt | tnh-fab punctuate | token-count
# Count tokens at multiple stages
cat input.txt | tee >(token-count >&2) | \
tnh-fab punctuate | tee >(token-count >&2) | \
tnh-fab process -p format_xml | token-count
Output¶
Prints a single integer: the number of tokens in the input text, as calculated by the OpenAI tokenizer.
Notes¶
- Uses the same tokenizer as GPT-4
- Counts match the token usage reported by the OpenAI API
- Useful for:
- Cost estimation (see the sketch after this list)
- Context window planning
- Processing pipeline optimization
- Model input validation
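As an illustration of cost estimation, the token count can be combined with a per-token price in a short shell script. The rate below is a placeholder, not a real OpenAI price, and the positional file argument is assumed:
# Hypothetical cost estimate: $0.03 per 1,000 input tokens (placeholder rate)
TOKENS=$(token-count input.txt)
echo "scale=4; $TOKENS / 1000 * 0.03" | bc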
See Also¶
- OpenAI tokenizer documentation
- TNH Scholar API documentation for token counting
- tnh-fab documentation for text processing