Use ReasoningAgent for o1-style reasoning in agentic workflows with LLMs using AG2
ReasoningAgent is designed to enhance language models' reasoning capabilities through systematic exploration of thought processes. By implementing the Tree of Thoughts (ToT) framework, it enables LLMs like GPT-4 and Llama to break down complex problems into manageable steps and explore multiple solution paths simultaneously.
This notebook demonstrates the key features and capabilities of the ReasoningAgent, showing how it can effectively reason about problems.
ReasoningAgent supports multiple search strategies for exploring the reasoning space. With beam search, for example, it keeps the k most promising paths alive at each step. The strategy is selected through the reason_config dictionary passed at initialization. The examples in this notebook also use the last_meaningful_msg summary function and a user_proxy agent to drive the chats.
With method="dfs" in the reason_config, the agent will:
1. Generate one reasoning step at a time
2. Follow that single path until reaching a conclusion
3. Never explore alternative branches
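As a minimal sketch, the DFS behavior above is requested through the reason_config dictionary. The key names follow this notebook's own usage; the agent construction itself is shown only as a comment, since it requires a configured LLM endpoint:

```python
# A reason_config selecting depth-first search: one reasoning step at a
# time, a single path followed to its conclusion, no alternative branches.
reason_config = {"method": "dfs"}

# Hypothetical usage (requires a real llm_config / API key):
# reasoning_agent = ReasoningAgent(
#     name="reasoner",
#     llm_config=llm_config,
#     reason_config=reason_config,
# )
```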
Note: The effectiveness depends on the underlying model’s training.
Models not specifically trained for step-by-step reasoning may show
limited improvement with this approach.
Note 2: To enable the execution of each selected step before generating the next step suggestions, pass "interim_execution": True in reason_config.
With a beam_size greater than 1, the agent can maintain several candidate
solutions at each step, evaluating them based on their potential to lead
to the best final answer. This method is particularly effective when the
solution space is large and complex, as it balances exploration and
exploitation, ensuring that promising paths are prioritized while still
considering alternative options.
In this approach, the agent generates multiple reasoning steps in
parallel, allowing it to compare different trajectories and select the
most promising ones for further exploration. This can lead to more
robust and accurate conclusions, especially in scenarios where
intermediate evaluations are critical to the final outcome.
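A beam-search configuration can be sketched as below. The "method" and "beam_size" keys follow this notebook's usage; "max_depth" is an assumed option for limiting tree depth, so check it against your installed AG2 version:

```python
# Beam search reason_config: keep the 3 most promising candidate
# trajectories alive at each reasoning step.
reason_config = {
    "method": "beam_search",
    "beam_size": 3,   # number of candidate paths kept per step
    "max_depth": 4,   # assumed depth limit; verify against your AG2 version
}
```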
To grade all candidate nodes in a single LLM call, pass "batch_grading": True in the reason_config. By default, batch_grading is set to False, meaning each node is graded individually, without batching.
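Batch grading is just another key in the same dictionary, as in this sketch (key names as quoted in this notebook):

```python
# Grade all sibling candidate nodes in one LLM call instead of one
# grading call per node (the default, batch_grading=False).
reason_config = {
    "method": "beam_search",
    "beam_size": 3,
    "batch_grading": True,
}
```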
To use a different model for grading, pass the grader_llm_config argument when initializing the ReasoningAgent. This ensures that the grading of trajectories is performed using the specified configuration from the config_list, separate from the main llm_config.
You can enable interim_execution by setting it to True in
. This allows intermediate steps to be executed during
the reasoning process, promoting more effective step-by-step thinking
and enabling future steps to be informed by the outputs of earlier ones.
By default, interim_execution is False, which means that the selected steps won't be executed during reasoning.
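Enabling it is a one-key change to the same dictionary, for example:

```python
# Execute each selected step before generating the next step suggestions,
# so later steps can be informed by earlier outputs.
reason_config = {
    "method": "dfs",
    "interim_execution": True,  # default is False
}
```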
Pass code_execution_config to the reasoning agent to enable code execution during reasoning. By default, code_execution_config=False, which means the agent will not execute code while reasoning. Note that to allow code execution, interim_execution must be set to True in the reason_config.
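Putting the two requirements together, a sketch might look like this. The "work_dir" and "use_docker" keys are the ones AG2's legacy code-execution config commonly uses, but verify them against your installed version:

```python
# interim_execution must be True, otherwise there is no executed
# intermediate step for the code executor to act on.
reason_config = {"method": "dfs", "interim_execution": True}

# Assumed code-execution settings: run code locally in ./coding.
code_execution_config = {"work_dir": "coding", "use_docker": False}

# Hypothetical usage (requires a real llm_config):
# reasoning_agent = ReasoningAgent(
#     name="reasoner",
#     llm_config=llm_config,
#     reason_config=reason_config,
#     code_execution_config=code_execution_config,
# )
```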
Note: pip install may not be sufficient for all operating systems. In some cases, you might need to manually download and install Graphviz.
pip install graphviz
Pass the max_turns parameter to execute the chat multiple times.
When creating a ReasoningAgent, a scope parameter can be specified during initialization. This parameter provides valuable context about the agent's intended use, the reasoning process it should follow, and any constraints or pitfalls to avoid. This information is incorporated into the agent's thought process to guide its behavior more effectively.
Note: The scope differs from the system_message in that it informs the agent's reasoning throughout the entire thinking process, whereas the system_message is used solely when generating the final response.
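A scope is just a descriptive string passed at initialization; the text below is purely illustrative, and the construction is shown as a comment since it needs a real LLM config:

```python
# A scope string that guides the entire reasoning process (unlike
# system_message, which only shapes the final response).
scope = (
    "You assess mathematical proofs for correctness. "
    "Avoid unjustified leaps; flag any step that lacks justification."
)

# Hypothetical usage:
# reasoning_agent = ReasoningAgent(
#     name="reasoner",
#     llm_config=llm_config,
#     scope=scope,
# )
```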