autogen.agentchat.contrib.img_utils.num_tokens_from_gpt_image

num_tokens_from_gpt_image

num_tokens_from_gpt_image(
    image_data: str | Image.Image,
    model: str = 'gpt-4-vision',
    low_quality: bool = False
) -> int

Calculates the number of tokens required to process an image, based on its dimensions after scaling. Supported models are “gpt-4-vision”, “gpt-4o”, and “gpt-4o-mini”.
The function first scales the image so that its longest edge is at most 2048 pixels and its shortest edge is at most 768 pixels (the limits for “gpt-4-vision”). It then counts the 512x512 tiles needed to cover the scaled image and computes the total token count from that number of tiles.
Reference: https://openai.com/api/pricing/
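The scaling and tiling steps described above can be sketched in plain Python. This is an illustrative approximation, not the library's implementation: the helper name `estimate_image_tokens` is hypothetical, and the token constants (85 base tokens plus 170 per tile) are assumptions taken from OpenAI's published high-detail pricing for “gpt-4-vision”, not from this page. Other models may use different constants, and the low-quality path is not modelled.

```python
from math import ceil


def estimate_image_tokens(
    width: int,
    height: int,
    max_edge: int = 2048,        # longest-edge limit stated in the description
    min_edge: int = 768,         # shortest-edge limit stated in the description
    tile_size: int = 512,        # tile edge length stated in the description
    base_tokens: int = 85,       # ASSUMPTION: fixed base cost per image
    tokens_per_tile: int = 170,  # ASSUMPTION: cost per 512x512 tile
) -> int:
    """Hypothetical helper that estimates image tokens from pixel dimensions."""
    # Step 1: shrink the image so its longest edge fits within max_edge.
    if max(width, height) > max_edge:
        scale = max_edge / max(width, height)
        width, height = int(width * scale), int(height * scale)

    # Step 2: shrink again so its shortest edge fits within min_edge.
    if min(width, height) > min_edge:
        scale = min_edge / min(width, height)
        width, height = int(width * scale), int(height * scale)

    # Step 3: count the tiles needed to cover the scaled image.
    tiles = ceil(width / tile_size) * ceil(height / tile_size)

    # Step 4: combine the base cost with the per-tile cost.
    return base_tokens + tokens_per_tile * tiles


if __name__ == "__main__":
    print(estimate_image_tokens(1024, 1024))
    print(estimate_image_tokens(2048, 4096))
```

Under these assumed constants, a 1024x1024 image is scaled to 768x768, covered by 4 tiles, and estimated at 85 + 170 × 4 = 765 tokens. A 2048x4096 image is scaled to 768x1536, covered by 6 tiles, and estimated at 1105 tokens.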

Parameters:

Name	Description
`image_data`	The image data: a base64-encoded string, a URL, a file path, or a PIL Image object. Type: str | Image.Image
`model`	The model used for image processing: “gpt-4-vision”, “gpt-4o”, or “gpt-4o-mini”. Type: str. Default: 'gpt-4-vision'
`low_quality`	Whether to use low-quality processing. Type: bool. Default: False

Returns:

Type	Description
int	The total number of tokens required to process the image.
