token-count

The token-count command calculates the OpenAI API token count for a text input. This is useful for checking that a text fits within a model's context window and for estimating API costs.

Usage

token-count [INPUT_FILE]

If no input file is specified, the command reads from standard input (stdin).

Examples

Count Tokens in File

token-count input.txt

Count Tokens from Stdin

echo "Sample text" | token-count
cat input.txt | token-count

Pipeline Usage

# Count tokens after processing
cat input.txt | tnh-fab punctuate | token-count

# Count tokens at multiple stages
cat input.txt | tee >(token-count >&2) | \
  tnh-fab punctuate | tee >(token-count >&2) | \
  tnh-fab process -p format_xml | token-count

Output

Returns a single integer representing the number of tokens in the input text, calculated using the OpenAI tokenizer.

Notes

  • Uses the same tokenizer as GPT-4
  • Counts match the token usage reported by the OpenAI API for models that share this tokenizer
  • Useful for:
      • Cost estimation
      • Context window planning
      • Processing pipeline optimization
      • Model input validation
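For cost estimation, the integer the command prints can feed a simple per-token calculation. A minimal sketch with hypothetical placeholder prices (check OpenAI's current pricing page for real values):

```python
# Placeholder prices in USD per 1,000 input tokens -- NOT real OpenAI pricing.
PRICE_PER_1K_TOKENS = {
    "gpt-4": 0.03,
    "gpt-4-32k": 0.06,
}


def estimate_cost(token_count: int, model: str = "gpt-4") -> float:
    """Estimate the input cost in USD for a prompt of `token_count` tokens."""
    return token_count / 1000 * PRICE_PER_1K_TOKENS[model]
```

For example, with these placeholder prices a 1,500-token prompt to "gpt-4" would cost roughly $0.045.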

See Also

  • OpenAI tokenizer documentation
  • TNH Scholar API documentation for token counting
  • tnh-fab documentation for text processing