num_tokens_from_gpt_image

num_tokens_from_gpt_image(
    image_data: str | ForwardRef('Image'),
    model: str = 'gpt-4-vision',
    low_quality: bool = False
) -> int

Calculate the number of tokens required to process an image, based on its dimensions after scaling. Supports “gpt-4-vision”, “gpt-4o”, and “gpt-4o-mini”.
This function first scales the image so that its longest edge is at most 2048 pixels and its shortest edge is at most 768 pixels (for “gpt-4-vision”). It then counts the 512x512 tiles needed to cover the scaled image and computes the total token count from that number of tiles.
Reference: https://openai.com/api/pricing/
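
The scaling and tiling steps described above can be sketched as follows. This is a minimal illustration working on pixel dimensions only, not the library's actual implementation. The helper names (`scale_dimensions`, `count_tiles`, `estimate_tokens`) are hypothetical, truncating scaled dimensions with `int()` is an assumption, and the token constants (85 base tokens plus 170 per tile) are assumed from OpenAI's published high-detail figures for GPT-4-class vision models. This page does not state them, and other models such as “gpt-4o-mini” may use different values.

```python
import math

TILE_SIZE = 512  # tile edge in pixels, as stated in the description above


def scale_dimensions(width, height, max_edge=2048, min_edge=768):
    """Shrink (width, height) so the longest edge is <= max_edge
    and the shortest edge is <= min_edge. Images are never enlarged."""
    if max(width, height) > max_edge:
        factor = max_edge / max(width, height)
        width, height = int(width * factor), int(height * factor)
    if min(width, height) > min_edge:
        factor = min_edge / min(width, height)
        width, height = int(width * factor), int(height * factor)
    return width, height


def count_tiles(width, height):
    """Number of 512x512 tiles needed to cover the scaled image."""
    w, h = scale_dimensions(width, height)
    return math.ceil(w / TILE_SIZE) * math.ceil(h / TILE_SIZE)


def estimate_tokens(width, height, base_tokens=85, tokens_per_tile=170):
    """Token estimate; both constants are assumptions, not taken from this page."""
    return base_tokens + tokens_per_tile * count_tiles(width, height)


print(scale_dimensions(2048, 4096))  # (768, 1536)
print(count_tiles(2048, 4096))       # 6
print(estimate_tokens(1024, 1024))   # 765
```

Passing the constants as keyword arguments keeps the sketch usable for a model with different per-tile costs, without hard-coding values that this page does not document.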

Parameters:

Name    Description

image_data
    Union[str, Image.Image]: The image data, which can be a base64-encoded string, a URL, a file path, or a PIL Image object.
    Type: str | ForwardRef('Image.Image')

model
    str: The model being used for image processing. Can be “gpt-4-vision”, “gpt-4o”, or “gpt-4o-mini”.
    Type: str
    Default: 'gpt-4-vision'

low_quality
    Type: bool
    Default: False

Returns:

Type    Description

int
    int: The total number of tokens required for processing the image.