Use Cases
- Use cases
- Notebooks
- All Notebooks
- Websockets: Streaming input and output using websockets
- Perplexity Search Tool
- Auto Generated Agent Chat: Task Solving with Code Generation, Execution, Debugging & Human Feedback
- Using a local Telemetry server to monitor a GraphRAG agent
- Auto Generated Agent Chat: Task Solving with Provided Tools as Functions
- RealtimeAgent with gemini client
- Agentic RAG workflow on tabular data from a PDF file
- Language Agent Tree Search
- Config loader utility functions
- Wikipedia Agent
- Google Drive Tools
- FSM - User can input speaker transition constraints
- DeepResearchAgent
- Group Chat with Retrieval Augmented Generation
- AgentOptimizer: An Agentic Way to Train Your LLM Agent
- Using RetrieveChat with Qdrant for Retrieve Augmented Code Generation and Question Answering
- RealtimeAgent with WebRTC connection
- Agent with memory using Mem0
- Preprocessing Chat History with `TransformMessages`
- RealtimeAgent in a Swarm Orchestration
- RAG OpenAI Assistants in AG2
- Using Guidance with AG2
- Using RetrieveChat Powered by MongoDB Atlas for Retrieve Augmented Code Generation and Question Answering
- Using Neo4j's graph database with AG2 agents for Question & Answering
- Use AG2 in Databricks with DBRX
- Wikipedia Search Tools
- Solving Complex Tasks with Nested Chats
- DeepSeek: Adding Browsing Capabilities to AG2
- Solving Complex Tasks with A Sequence of Nested Chats
- Nested Chats for Tool Use in Conversational Chess
- OpenAI Assistants in AG2
- Group Chat with Coder and Visualization Critic
- Agent Chat with Multimodal Models: LLaVA
- SocietyOfMindAgent
- Conversational Workflows with MCP: A Marie Antoinette Take on The Eiffel Tower
- Agent Chat with Multimodal Models: DALLE and GPT-4V
- StateFlow: Build Workflows through State-Oriented Actions
- Small, Local Model (IBM Granite) Multi-Agent RAG
- Group Chat with Tools
- DuckDuckGo Search Tool
- Load the configuration including the response format
- RealtimeAgent in a Swarm Orchestration
- MCP Clients
- Agent Tracking with AgentOps
- SQL Agent for Spider text-to-SQL benchmark
- AutoBuild
- Automatically Build Multi-agent System from Agent Library
- Generate Dalle Images With Conversable Agents
- Agent Observability with OpenLIT
- Using RetrieveChat Powered by Couchbase Capella for Retrieve Augmented Code Generation and Question Answering
- A Uniform interface to call different LLMs
- OptiGuide with Nested Chats in AG2
- Conversational Chess using non-OpenAI clients
- Chat Context Dependency Injection
- Agent with memory using Mem0
- `run` function examples with event processing
- From Dad Jokes To Sad Jokes: Function Calling with GPTAssistantAgent
- Mitigating Prompt hacking with JSON Mode in Autogen
- Group Chat with Customized Speaker Selection Method
- Trip planning with a FalkorDB GraphRAG agent using a Swarm
- Tavily Search Tool
- Structured output
- Adding Google Search Capability to AG2
- Task Solving with Provided Tools as Functions (Asynchronous Function Calls)
- Conversational Workflows with MCP: A French joke on a random Wikipedia article
- Auto Generated Agent Chat: Using MathChat to Solve Math Problems
- Agent Chat with custom model loading
- Auto Generated Agent Chat: Function Inception
- Use AG2 to Tune ChatGPT
- Auto Generated Agent Chat: Group Chat with GPTAssistantAgent
- Using FalkorGraphRagCapability with agents for GraphRAG Question & Answering
- Solving Multiple Tasks in a Sequence of Async Chats
- Discord, Slack, and Telegram messaging tools
- RealtimeAgent in a Swarm Orchestration using WebRTC
- Runtime Logging with AG2
- WebSurferAgent
- Use MongoDBQueryEngine to query Markdown files
- Conversational Workflows with MCP: A Shakespearean Take on arXiv Abstracts
- RAG with DocAgent
- Solving Multiple Tasks in a Sequence of Chats
- Cross-Framework LLM Tool for CaptainAgent
- Web Scraping using Apify Tools
- Auto Generated Agent Chat: Collaborative Task Solving with Coding and Planning Agent
- Currency Calculator: Task Solving with Provided Tools as Functions
- Use ChromaDBQueryEngine to query Markdown files
- Using RetrieveChat for Retrieve Augmented Code Generation and Question Answering
- Writing a software application using function calls
- Using OpenAI’s Web Search Tool with AG2
- Using RetrieveChat Powered by PGVector for Retrieve Augmented Code Generation and Question Answering
- Enhanced Swarm Orchestration with AG2
- Perform Research with Multi-Agent Group Chat
- Auto Generated Agent Chat: Teaching AI New Skills via Natural Language Interaction
- Usage tracking with AG2
- Using Neo4j's native GraphRAG SDK with AG2 agents for Question & Answering
- Groupchat with Llamaindex agents
- Assistants with Azure Cognitive Search and Azure Identity
- Swarm Orchestration with AG2
- Tools with Dependency Injection
- Structured output from json configuration
- Solving Multiple Tasks in a Sequence of Chats with Different Conversable Agent Pairs
- Chat with OpenAI Assistant using function call in AG2: OSS Insights for Advanced GitHub Data Analysis
- Auto Generated Agent Chat: Collaborative Task Solving with Multiple Agents and Human Users
- Adding YouTube Search Capability to AG2
- Chatting with a teachable agent
- Making OpenAI Assistants Teachable
- RealtimeAgent in a Swarm Orchestration
- Translating Video audio using Whisper and GPT-3.5-turbo
- Run a standalone AssistantAgent
- Auto Generated Agent Chat: Task Solving with Langchain Provided Tools as Functions
- Use LLamaIndexQueryEngine to query Markdown files
- Auto Generated Agent Chat: GPTAssistant with Code Interpreter
- Interactive LLM Agent Dealing with Data Stream
- Agent Chat with Async Human Inputs
- ReasoningAgent - Advanced LLM Reasoning with Multiple Search Strategies
- Auto Generated Agent Chat: Solving Tasks Requiring Web Info
- Use AG2 to Tune OpenAI Models
- Engaging with Multimodal Models: GPT-4V in AG2
- Supercharging Web Crawling with Crawl4AI
- Use AG2 in Microsoft Fabric
- Cross-Framework LLM Tool Integration with AG2
- Demonstrating the `AgentEval` framework using the task of solving math problems as an example
- Group Chat
- Adding Browsing Capabilities to AG2
- CaptainAgent
- (Legacy) Implement Swarm-style orchestration with GroupChat
- Task Solving with Code Generation, Execution and Debugging
- RealtimeAgent with local websocket connection
- Community Gallery
Generate Dalle Images With Conversable Agents
Generate images with conversable agents.
This notebook illustrates how to add the image generation capability to a conversable agent.
Some extra dependencies are needed for this notebook, which can be installed via pip:
pip install ag2[openai,lmm]
For more information, please refer to the installation guide.
First, let’s import all the required modules to run this example.
import os
from IPython.display import display
from PIL.Image import Image
import autogen
from autogen.agentchat.contrib import img_utils
from autogen.agentchat.contrib.capabilities import generate_images
Let’s define our LLM configs.
gpt_config = {
"config_list": [{"model": "gpt-4-turbo-preview", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
gpt_vision_config = {
"config_list": [{"model": "gpt-4-vision-preview", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
dalle_config = {
"config_list": [{"model": "dall-e-3", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
Learn more about configuring LLMs for agents here.
Our system will consist of 2 main agents: 1. Image generator agent. 2. Critic agent.
The image generator agent will carry a conversation with the critic, and generate images based on the critic’s requests.
CRITIC_SYSTEM_MESSAGE = """You need to improve the prompt of the figures you saw.
How to create an image that is better in terms of color, shape, text (clarity), and other things.
Reply with the following format:
CRITICS: the image needs to improve...
PROMPT: here is the updated prompt!
If you have no critique or a prompt, just say TERMINATE
"""
def _is_termination_message(msg) -> bool:
# Detects if we should terminate the conversation
if isinstance(msg.get("content"), str):
return msg["content"].rstrip().endswith("TERMINATE")
elif isinstance(msg.get("content"), list):
for content in msg["content"]:
if isinstance(content, dict) and "text" in content:
return content["text"].rstrip().endswith("TERMINATE")
return False
def critic_agent() -> autogen.ConversableAgent:
return autogen.ConversableAgent(
name="critic",
llm_config=gpt_vision_config,
system_message=CRITIC_SYSTEM_MESSAGE,
max_consecutive_auto_reply=3,
human_input_mode="NEVER",
is_termination_msg=lambda msg: _is_termination_message(msg),
)
def image_generator_agent() -> autogen.ConversableAgent:
# Create the agent
agent = autogen.ConversableAgent(
name="dalle",
llm_config=gpt_vision_config,
max_consecutive_auto_reply=3,
human_input_mode="NEVER",
is_termination_msg=lambda msg: _is_termination_message(msg),
)
# Add image generation ability to the agent
dalle_gen = generate_images.DalleImageGenerator(llm_config=dalle_config)
image_gen_capability = generate_images.ImageGeneration(
image_generator=dalle_gen, text_analyzer_llm_config=gpt_config
)
image_gen_capability.add_to_agent(agent)
return agent
We’ll define extract_img
to help us extract the image generated by the
image generator agent.
def extract_images(sender: autogen.ConversableAgent, recipient: autogen.ConversableAgent) -> Image:
images = []
all_messages = sender.chat_messages[recipient]
for message in reversed(all_messages):
# The GPT-4V format, where the content is an array of data
contents = message.get("content", [])
for content in contents:
if isinstance(content, str):
continue
if content.get("type", "") == "image_url":
img_data = content["image_url"]["url"]
images.append(img_utils.get_pil_image(img_data))
if not images:
raise ValueError("No image data found in messages.")
return images
Start the conversation
dalle = image_generator_agent()
critic = critic_agent()
img_prompt = "A happy dog wearing a shirt saying 'I Love AG2'. Make sure the text is clear."
# img_prompt = "Ask me how I'm doing"
result = dalle.initiate_chat(critic, message=img_prompt)
Let’s display all the images that was generated by Dalle
images = extract_images(dalle, critic)
for image in reversed(images):
display(image.resize((300, 300)))