agentchat.contrib.reasoning_agent
ThinkNode
__init__
A node in a tree structure representing a step in the reasoning process.
This class implements a tree node that stores content (text describing a reasoning step), maintains parent-child relationships, tracks node statistics, and provides utilities for traversing/visualizing the reasoning path.
Arguments:
content
str - The text content/description for this reasoning step.
parent
Optional[ThinkNode] - The parent node in the tree, if any.
Attributes:
- content (str) - The text content/description for this reasoning step.
- value (Optional[float]) - A numeric score/value assigned to this node.
- parent (Optional[ThinkNode]) - Reference to the parent node.
- reflection (str) - A string containing reflections on the reasoning process.
- rating_details (str) - A string providing details about the rating of this node.
- depth (int) - The depth of this node in the tree (root = 0).
- children (List[ThinkNode]) - List of child nodes.
- visits (int) - Number of times this node has been visited during search.
The node automatically maintains the tree structure by:
- Setting its depth based on the parent’s depth + 1.
- Adding itself to the parent’s children list if the parent exists.
- Providing trajectory utilities to get the full path from root to this node.
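The bookkeeping described above can be illustrated with a minimal sketch. `SketchNode` and `path_from_root` are hypothetical names used only for this illustration; this is not the library's `ThinkNode` implementation, and it omits the scoring attributes.

```python
from typing import List, Optional


class SketchNode:
    """Hypothetical stand-in for a reasoning-tree node (not the library class)."""

    def __init__(self, content: str, parent: Optional["SketchNode"] = None) -> None:
        self.content = content
        self.parent = parent
        self.children: List["SketchNode"] = []
        # The root sits at depth 0; every other node is one level below its parent.
        self.depth = parent.depth + 1 if parent is not None else 0
        # Register with the parent so the tree stays consistent in both directions.
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self) -> List[str]:
        """Collect the content of every node from the root down to this node."""
        steps: List[str] = []
        node: Optional["SketchNode"] = self
        while node is not None:
            steps.append(node.content)
            node = node.parent
        return steps[::-1]


root = SketchNode("What is 6 * 7?")
step1 = SketchNode("Rewrite as 6 * (5 + 2).", parent=root)
step2 = SketchNode("30 + 12 = 42.", parent=step1)
print(step2.depth, step2.path_from_root())
```

Linking the child into the parent inside the constructor means callers never have to update `children` by hand, which is the behavior the list above describes.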
trajectory
Get a formatted string representation of the path from root to this node.
Returns:
str
- A formatted string showing the question and each step in the reasoning process
backpropagate
Update the score of this node and its parents using a moving average.
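One common way to realize such an update is an incremental mean over visit counts, applied to the node and each of its ancestors. The sketch below assumes that form; `ScoredNode` and the function signature are hypothetical, and the library's exact update rule may differ.

```python
from typing import Optional


class ScoredNode:
    """Hypothetical node carrying only the fields needed for the update."""

    def __init__(self, parent: Optional["ScoredNode"] = None) -> None:
        self.parent = parent
        self.value = 0.0
        self.visits = 0


def backpropagate(node: ScoredNode, reward: float) -> None:
    """Fold `reward` into the running mean of `node` and every ancestor."""
    current: Optional[ScoredNode] = node
    while current is not None:
        current.visits += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        current.value += (reward - current.value) / current.visits
        current = current.parent


root = ScoredNode()
leaf_a = ScoredNode(parent=root)
leaf_b = ScoredNode(parent=root)
backpropagate(leaf_a, 1.0)
backpropagate(leaf_b, 0.5)
print(root.visits, root.value)
```

With this rule, the root's value is the mean of all rewards that passed through it, so a parent reflects the average quality of its explored subtree.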
to_dict
Convert ThinkNode to dictionary representation.
Returns:
Dict
- Dictionary containing all node attributes and recursive children
from_dict
Create ThinkNode from dictionary representation.
Arguments:
data
Dict - Dictionary containing node data
parent
Optional[ThinkNode] - Parent node to attach to
Returns:
ThinkNode
- Reconstructed node with all children
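The recursive round trip between a node tree and nested dictionaries can be sketched as follows. `DictNode` is a hypothetical, reduced class that serializes only `content` and `children`; the real dictionary carries all node attributes, and its key names are not specified here.

```python
from typing import Any, Dict, List, Optional


class DictNode:
    """Hypothetical reduced node used to show recursive (de)serialization."""

    def __init__(self, content: str, parent: Optional["DictNode"] = None) -> None:
        self.content = content
        self.parent = parent
        self.children: List["DictNode"] = []
        if parent is not None:
            parent.children.append(self)

    def to_dict(self) -> Dict[str, Any]:
        # Children are serialized recursively; the parent link is omitted
        # because it is implied by the nesting.
        return {
            "content": self.content,
            "children": [child.to_dict() for child in self.children],
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any], parent: Optional["DictNode"] = None) -> "DictNode":
        # Creating the node with `parent` re-attaches it to the tree,
        # then each child dictionary is rebuilt beneath it.
        node = cls(data["content"], parent=parent)
        for child_data in data.get("children", []):
            cls.from_dict(child_data, parent=node)
        return node


original = DictNode("question")
DictNode("step A", parent=original)
DictNode("step B", parent=original)
restored = DictNode.from_dict(original.to_dict())
```

Leaving the parent reference out of the dictionary avoids circular structures, which is why `from_dict` takes `parent` as a separate argument.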
visualize_tree
Visualize the tree of thoughts using graphviz.
extract_sft_dataset
Extract the best trajectory or multiple equally good trajectories for SFT training.
Arguments:
root
- The root node of the tree.
Returns:
List of best trajectories, where each trajectory is a pair of instruction and response.
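As a rough illustration of how several equally good trajectories can arise, the sketch below follows the top-valued child at each level of a tree and branches on ties. The dictionary layout, the greedy per-level selection, and the `instruction`/`response` key names are all assumptions made for this example, not the library's documented behavior.

```python
from typing import Any, Dict, List


def best_paths(node: Dict[str, Any]) -> List[List[str]]:
    """Return every root-to-leaf path that follows a top-valued child at each level."""
    if not node["children"]:
        return [[node["content"]]]
    top = max(child["value"] for child in node["children"])
    paths: List[List[str]] = []
    for child in node["children"]:
        if child["value"] == top:  # ties produce multiple trajectories
            for tail in best_paths(child):
                paths.append([node["content"]] + tail)
    return paths


def to_sft_pairs(root: Dict[str, Any]) -> List[Dict[str, str]]:
    """Turn each best path into an instruction/response pair (hypothetical keys)."""
    return [
        {"instruction": path[0], "response": "\n".join(path[1:])}
        for path in best_paths(root)
    ]


tree = {
    "content": "Q",
    "value": 0.0,
    "children": [
        {"content": "a", "value": 0.9, "children": [
            {"content": "a1", "value": 0.7, "children": []},
        ]},
        {"content": "b", "value": 0.9, "children": []},
        {"content": "c", "value": 0.2, "children": []},
    ],
}
pairs = to_sft_pairs(tree)
```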
extract_rlhf_preference_dataset
Extract and generate preference pairs for RLHF training by comparing sibling nodes.
Arguments:
root
- The root node of the tree.
contrastive_threshold
float - A value in (0, 1): the minimum score gap between two sibling nodes at which one can confidently be labeled positive (preferred) and the other negative.
Returns:
A list of preference pairs, where each pair contains two responses and indicates which one is preferred.
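The sibling comparison can be sketched as below, assuming the threshold is applied to the absolute difference between two siblings' scores. The function name, the `(response, value)` tuple input, and the output key names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def sibling_preference_pairs(
    instruction: str,
    scored_siblings: List[Tuple[str, float]],
    contrastive_threshold: float,
) -> List[Dict[str, str]]:
    """Pair up siblings whose scores differ by at least the threshold."""
    pairs: List[Dict[str, str]] = []
    for (resp_a, val_a), (resp_b, val_b) in combinations(scored_siblings, 2):
        # Skip pairs that are too close to call.
        if abs(val_a - val_b) < contrastive_threshold:
            continue
        if val_a > val_b:
            preferred, dispreferred = resp_a, resp_b
        else:
            preferred, dispreferred = resp_b, resp_a
        pairs.append(
            {
                "instruction": instruction,
                "preferred_response": preferred,
                "dispreferred_response": dispreferred,
            }
        )
    return pairs


siblings = [("A", 0.9), ("B", 0.4), ("C", 0.8)]
result = sibling_preference_pairs("Q", siblings, contrastive_threshold=0.3)
```

A larger threshold yields fewer but cleaner pairs, since only clearly separated siblings are kept.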
ReasoningAgent
__init__
Initialize a ReasoningAgent that uses tree-of-thought reasoning.
Arguments:
- name - Name of the agent
- llm_config - Configuration for the language model
- grader_llm_config - Optional separate configuration for the grader model. If not provided, uses llm_config
- max_depth (int) - Maximum depth of the reasoning tree
- beam_size (int) - DEPRECATED. Number of parallel reasoning paths to maintain
- answer_approach (str) - DEPRECATED. Either "pool" or "best" - how to generate final answer
- verbose (bool) - Whether to show intermediate steps
- reason_config (dict) - Configuration for the reasoning method. Supported parameters:
  - method (str) - The search strategy to use. Options:
    - "beam_search" (default): Uses beam search with parallel paths
    - "mcts": Uses Monte Carlo Tree Search for exploration
    - "lats": Uses Language Agent Tree Search with per-step rewards
    - "dfs": Uses depth-first search (equivalent to beam_search with beam_size=1)
  Common parameters:
  - max_depth (int) - Maximum depth of reasoning tree (default: 3)
  - forest_size (int) - Number of independent trees to maintain (default: 1)
  - rating_scale (int) - Scale for grading responses, e.g. 1-10 (default: 10)
  Beam Search specific:
  - beam_size (int) - Number of parallel paths to maintain (default: 3)
  - answer_approach (str) - How to select final answer, "pool" or "best" (default: "pool")
  MCTS/LATS specific:
  - nsim (int) - Number of simulations to run (default: 3)
  - exploration_constant (float) - UCT exploration parameter (default: 1.41)
  Example configs:
  - {"method": "beam_search", "beam_size": 5, "max_depth": 4}
  - {"method": "mcts", "nsim": 10, "exploration_constant": 2.0}
  - {"method": "lats", "nsim": 5, "forest_size": 3}
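To make the documented defaults concrete, the sketch below merges a user-supplied `reason_config` over them. The helper `resolve_reason_config` and the constant names are hypothetical and not part of the library; the default values and method names are the ones listed above. Forcing `beam_size` to 1 for "dfs" is one way to express the documented equivalence, not necessarily how the agent does it.

```python
from typing import Any, Dict, Optional

# Defaults as listed in the parameter documentation above.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "method": "beam_search",
    "max_depth": 3,
    "forest_size": 1,
    "rating_scale": 10,
    "beam_size": 3,
    "answer_approach": "pool",
    "nsim": 3,
    "exploration_constant": 1.41,
}
SUPPORTED_METHODS = {"beam_search", "mcts", "lats", "dfs"}


def resolve_reason_config(reason_config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper: overlay user settings on the documented defaults."""
    resolved = {**DOCUMENTED_DEFAULTS, **(reason_config or {})}
    if resolved["method"] not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported method: {resolved['method']!r}")
    if resolved["method"] == "dfs":
        # Assumption: model "dfs" as beam search with a single path.
        resolved["beam_size"] = 1
    return resolved


mcts_config = resolve_reason_config(
    {"method": "mcts", "nsim": 10, "exploration_constant": 2.0}
)
dfs_config = resolve_reason_config({"method": "dfs"})
```

A dictionary such as `{"method": "mcts", "nsim": 10, "exploration_constant": 2.0}` is what would be passed as `reason_config` when constructing the agent; unspecified keys fall back to the defaults shown.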
generate_forest_response
Generate a response using tree-of-thought reasoning.
Arguments:
messages
- Input messages to respond to
sender
- Agent sending the messages
config
- Optional configuration
Returns:
Tuple[bool, str]: Success flag and generated response
rate_node
Rate the quality of a reasoning path using the grader agent.
Arguments:
node
ThinkNode - Node containing the reasoning trajectory to evaluate
is_outcome
bool - Indicates whether the rating is for an outcome (final answer) or a process (thinking trajectory).
Returns:
float
- Normalized score between 0 and 1 indicating trajectory quality
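One plausible way to map a grader's raw rating onto the 0 to 1 range is a linear rescale of the 1..rating_scale interval. The sketch below assumes that mapping and assumes the scale starts at 1 (as in the "1-10" example); `normalize_rating` is a hypothetical helper, and the library's actual parsing and normalization may differ.

```python
def normalize_rating(raw_rating: float, rating_scale: int = 10) -> float:
    """Linearly map a rating in [1, rating_scale] to [0, 1] (assumed scheme)."""
    if rating_scale <= 1:
        raise ValueError("rating_scale must be greater than 1")
    # Clamp out-of-range ratings so the result always stays within [0, 1].
    clamped = min(max(raw_rating, 1), rating_scale)
    return (clamped - 1) / (rating_scale - 1)


lowest = normalize_rating(1)
highest = normalize_rating(10)
middle = normalize_rating(5.5)
```

Normalizing this way keeps scores comparable across different `rating_scale` settings.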