TL;DR:
- We introduce Teachable Agents so that users can teach their LLM-based assistants new facts, preferences, and skills.
- We showcase examples of teachable agents learning and later recalling facts, preferences, and skills in subsequent chats.
Introduction
Conversational assistants based on LLMs can remember the current chat with the user, and can also demonstrate in-context learning of user teachings during the conversation. But the assistant’s memories and learnings are lost once the chat is over, or when a single chat grows too long for the LLM to handle effectively. Then in subsequent chats the user is forced to repeat any necessary instructions over and over.
Teachability
addresses these limitations by persisting user teachings across chat boundaries in long-term memory implemented as a vector database. Instead of copying all of memory into the context window, which would eat up valuable space, individual memories (called memos) are retrieved into context as needed. This allows the user to teach frequently used facts and skills to the teachable agent just once, and have it recall them in later chats.
Any instantiated agent
that inherits from ConversableAgent
can be made teachable by instantiating a Teachability
object and calling its add_to_agent(agent)
method.
In order to make effective decisions about memo storage and retrieval, the Teachability
object calls an instance of TextAnalyzerAgent
(another AutoGen agent) to identify and reformulate text as needed for remembering facts, preferences, and skills. Note that this adds extra LLM calls involving a relatively small number of tokens, which can add a few seconds to the time a user waits for each response.
Run It Yourself
AutoGen contains four code examples that use Teachability
.
-
Run chat_with_teachable_agent.py to converse with a teachable agent.
-
Run test_teachable_agent.py for quick unit testing of a teachable agent.
-
Use the Jupyter notebook agentchat_teachability.ipynb to step through examples discussed below.
-
Use the Jupyter notebook agentchat_teachable_oai_assistants.ipynb to make arbitrary OpenAI Assistants teachable through GPTAssistantAgent
.
Basic Usage of Teachability
- Install dependencies
Please install autogen with the [teachable] option before using Teachability
.
- Import agents
- Create llm_config
- Create the agents
- Chat with the teachable agent
Example 1 - Learning user info
A user can teach the agent facts about themselves.
(Note that due to their finetuning, LLMs can be reluctant to admit that they know personal information.)
In a later conversation, the user can check whether the teachable agent remembers their name. (For readability, the user prompts and some logged notices are not repeated below.)
Example 2 - Learning new facts
A user can teach the agent more complex, related facts.
Then in a later chat the teachable agent can answer questions about the facts it has been taught.
(Remember to first close the previous chat by typing ‘exit’.)
Example 3 - Learning user preferences
A user can teach the agent how they prefer to have things done.
Be aware that a message like the next one cannot be entered as a single message through a command line because it contains a newline character.
Such messages can be entered in a Jupyter notebook, or through some UI layer like that of ChatGPT.
--------------------------------------------------------------------------------
user (to teachable_agent):
Please summarize this abstract.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
--------------------------------------------------------------------------------
teachable_agent (to user):
AutoGen is an open-source framework designed to enable developers to create LLM applications with multiple customizable agents that can converse with each other to complete tasks. These agents can operate using different combinations of LLMs, human inputs, and tools, allowing developers to define agent interaction behaviors flexibly. AutoGen supports programming flexible conversation patterns using both natural language and code, making it suitable for building diverse applications with varying complexities and LLM capacities. Its effectiveness has been demonstrated through empirical studies across various domains including mathematics, coding, operations research, decision-making, and entertainment.
--------------------------------------------------------------------------------
Please summarize this abstract. When I'm summarizing an abstract, I try to make the summary contain just three short bullet points: the title, the innovation, and the key empirical results.
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic infrastructure to build diverse applications of various complexities and LLM capacities. Empirical studies demonstrate the effectiveness of the framework in many example applications, with domains ranging from mathematics, coding, question answering, operations research, online decision-making, entertainment, etc.
--------------------------------------------------------------------------------
teachable_agent (to user):
- Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- Innovation: Open-source framework for creating customizable LLM applications through agent conversations, supporting various modes and interaction behaviors.
- Key Empirical Results: Demonstrated effectiveness across diverse application domains, including mathematics, coding, question answering, and more.
--------------------------------------------------------------------------------
Then in later chats the teacher doesn’t need to reiterate their detailed preferences.
--------------------------------------------------------------------------------
user (to teachable_agent):
Please summarize this abstract.
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.
--------------------------------------------------------------------------------
teachable_agent (to user):
- Title: Sparks of Artificial General Intelligence: Early experiments with GPT-4
- Innovation: GPT-4, an LLM with remarkable capabilities, demonstrates human-level performance across various domains, like math, coding, vision, medicine, law, and psychology.
- Key results: GPT-4 significantly surpasses prior models, suggesting it may be an early version of AGI; limitations and challenges toward deeper AGI are also discussed.
--------------------------------------------------------------------------------
Example 4 - Learning new skills
Users can extend the teachable agent’s capabilities by teaching it new skills for accomplishing challenging tasks. It usually works best to first describe the task, then (in the same turn) provide a hint or advice for approaching the task.
The Sparks of AGI paper evaluated GPT-4 on math problems like the following, which it could only solve 32% of the time. We first show a failure case, then teach the agent a strategy which lifts GPT-4’s success rate above 95%.
In a later chat the user doesn’t need to repeat the detailed advice.
Planned improvements
- Understanding user instructions distributed over multiple turns.
- Learning from the agent’s own experience, to reduce dependence on explicit user teachings.
- Learning skills built on top of previously learned skills.
Conclusion
Teachability
is still under active research and development. For any problems you find or improvements you have in mind, please join our discussions in this repo and on our Discord channel. We look forward to seeing how you and the rest of the community can use and improve teachable agents in AutoGen!