Various LLM providers offer functionality for constraining the structure of the messages generated by LLMs, and AG2 enables this by propagating the response_format entry in your agents' LLM configuration to the underlying client.

You define the JSON structure of the output in the response_format field of the LLM configuration. To assist in determining the JSON structure, you can generate a valid schema by calling .model_json_schema() on a predefined Pydantic model. Your schema should be OpenAPI specification compliant and define a title field for the root model, which will be loaded as the response_format for the agent. For more info on structured outputs, see our documentation.
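For reference, the MathReasoning schema used later in this notebook can be generated from Pydantic models such as these (a sketch; the class and field names mirror the schema shown in the configuration below):

from pydantic import BaseModel

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

# Produces the OpenAPI-compliant JSON schema, including the root "title" field
print(MathReasoning.model_json_schema())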
Install ag2:
pip install -U "ag2[openai]"
Note: If you have been using autogen or ag2, all you need to do is upgrade it using:
pip install -U autogen
or
pip install -U ag2
as autogen and ag2 are aliases for the same PyPI package.
For more information, please refer to the installation guide.

Supported clients

AG2 has structured output support for the following client providers:

- OpenAI (openai)
- Anthropic (anthropic)
- Google Gemini (google)
- Ollama (ollama)
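The configuration entry has the same shape regardless of provider; only api_type, model, and credentials change. For instance, an Anthropic entry might look like the following (a hypothetical sketch; the model name is an assumption, and the ellipsis stands for the same MathReasoning schema shown in the next section):

{
    "api_type": "anthropic",
    "model": "claude-3-5-sonnet-latest",
    "api_key": "<YOUR_ANTHROPIC_API_KEY>",
    "response_format": { ... }
}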

Set your API Endpoint

The LLMConfig.from_json method loads a list of configurations from an environment variable or a JSON file. Here is an example of a configuration using the gpt-4o-mini model that will use a MathReasoning response format. To use it, paste it into your OAI_CONFIG_LIST file and set the api_key to your OpenAI API key.
[
    {
        "model": "gpt-4o-mini",
        "api_key": "<YOUR_OPENAI_API_KEY>",
        "response_format": {
            "$defs": {
                "Step": {
                    "properties": {
                        "explanation": {
                            "title": "Explanation",
                            "type": "string"
                        },
                        "output": {
                            "title": "Output",
                            "type": "string"
                        }
                    },
                    "required": [
                        "explanation",
                        "output"
                    ],
                    "title": "Step",
                    "type": "object"
                }
            },
            "properties": {
                "steps": {
                    "items": {
                        "$ref": "#/$defs/Step"
                    },
                    "title": "Steps",
                    "type": "array"
                },
                "final_answer": {
                    "title": "Final Answer",
                    "type": "string"
                }
            },
            "required": [
                "steps",
                "final_answer"
            ],
            "title": "MathReasoning",
            "type": "object"
        },
        "tags": ["gpt-4o-mini-response-format"]
    }
]
import autogen

# Load the configuration including the response format
llm_config = autogen.LLMConfig.from_json(path="OAI_CONFIG_LIST", cache_seed=42).where(tags=["gpt-4o-mini-response-format"])

# Output the configuration, showing that it matches the configuration file.
llm_config
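If you build the configuration in code rather than loading it from a file, AG2 also accepts a Pydantic model class directly as the response_format (a sketch, assuming the MathReasoning model defined earlier):

llm_config = autogen.LLMConfig(
    api_type="openai",
    model="gpt-4o-mini",
    response_format=MathReasoning,  # Pydantic model class instead of a JSON schema
)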
Learn more about configuring LLMs for agents here.

Example: math reasoning

Using structured output, we can enforce chain-of-thought reasoning in the model to output an answer in a structured, step-by-step way.

Define chat actors

Now we can define the agents that will solve the posed math problem. We will keep this example simple: a UserProxyAgent to input the math problem and an AssistantAgent to solve it. The AssistantAgent is constrained to solving the problem step-by-step through the MathReasoning response format defined above. The response_format is part of the LLM configuration, which is then applied to the agent.
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    human_input_mode="NEVER",
    code_execution_config=False,  # no code execution is needed in this example
)

assistant = autogen.AssistantAgent(
    name="Math_solver",
    llm_config=llm_config,  # Response Format is in the configuration
)

Start the chat

Let’s now start the chat and prompt the assistant to solve a simple equation. The assistant agent should return a response solving the equation using a step-by-step MathReasoning model.
summary = user_proxy.initiate_chat(
    assistant, message="how can I solve 8x + 7 = -23", max_turns=1, summary_method="last_msg"
).summary

summary
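Since the reply conforms to the schema, it can be validated back into the Pydantic model (a sketch, assuming the MathReasoning model from earlier and that the last message is the raw JSON string):

reasoning = MathReasoning.model_validate_json(summary)
for step in reasoning.steps:
    print(f"{step.explanation} -> {step.output}")
print("Final answer:", reasoning.final_answer)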