token-count

The token-count command calculates the OpenAI API token count for a text input. This is useful for checking that a text fits within a model's context window and for estimating API costs.

Usage

token-count [INPUT_FILE]

If no input file is specified, the command reads from standard input (stdin).

Examples

Count Tokens in File

token-count input.txt

Count Tokens from Stdin

echo "Sample text" | token-count
cat input.txt | token-count

Pipeline Usage

# Count tokens after processing
cat input.txt | tnh-fab punctuate | token-count

# Count tokens at multiple stages
cat input.txt | tee >(token-count >&2) | \
  tnh-fab punctuate | tee >(token-count >&2) | \
  tnh-fab process -p format_xml | token-count

Output

Returns a single integer representing the number of tokens in the input text, calculated using the OpenAI tokenizer.

Notes

  • Uses the same tokenizer as GPT-4
  • Counts match the token usage reported by the OpenAI API for models that share this tokenizer
  • Useful for:
      • Cost estimation
      • Context window planning
      • Processing pipeline optimization
      • Model input validation
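For cost estimation, the integer the command prints can feed a simple per-token calculation. A minimal sketch with hypothetical placeholder prices (check OpenAI's current pricing page for real values):

```python
# Placeholder prices in USD per 1,000 input tokens -- NOT real OpenAI pricing.
PRICE_PER_1K_TOKENS = {
    "gpt-4": 0.03,
    "gpt-4-32k": 0.06,
}


def estimate_cost(token_count: int, model: str = "gpt-4") -> float:
    """Estimate the input cost in USD for a prompt of `token_count` tokens."""
    return token_count / 1000 * PRICE_PER_1K_TOKENS[model]
```

For example, with these placeholder prices a 1,500-token prompt to "gpt-4" would cost roughly $0.045.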

See Also

  • OpenAI tokenizer documentation
  • TNH Scholar API documentation for token counting
  • tnh-fab documentation for text processing