Cohere
Cohere is a cloud-based platform serving its own LLMs, in particular the Command family of models.
Cohere’s API differs from OpenAI’s, which is the API AutoGen uses natively, so to use Cohere’s LLMs you need to use this client class.
You will need a Cohere account and an API key; see their website for details on creating one.
Features
When using this client class, AutoGen’s messages are automatically tailored to accommodate the specific requirements of Cohere’s API.
Additionally, this client class provides support for function/tool calling and will track token usage and cost correctly as per Cohere’s API costs (as of July 2024).
Getting started
First you need to install the `pyautogen` package to use AutoGen with the Cohere API library.
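For example, assuming the `cohere` extra follows AutoGen’s usual pattern for optional provider dependencies:

```bash
pip install "pyautogen[cohere]"
```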
Cohere provides a number of models to use, included below. See the list of models here.
See the sample `OAI_CONFIG_LIST` below showing how the Cohere client class is used by specifying the `api_type` as `cohere`.
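A sketch of such a list (the model names are examples of Cohere models; entries for other providers can sit alongside them):

```python
[
    {
        "model": "command-r-plus",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere"
    },
    {
        "model": "command-r",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere"
    }
]
```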
As an alternative to the `api_key` key and value in the config, you can set the environment variable `COHERE_API_KEY` to your Cohere key.
Linux/Mac:
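```bash
export COHERE_API_KEY="your_cohere_api_key_here"
```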
Windows:
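```
set COHERE_API_KEY=your_cohere_api_key_here
```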
API parameters
The following parameters can be added to your config for the Cohere API. See this link for further information on them and their default values.
- temperature (number > 0)
- p (number 0.01..0.99)
- k (number 0..500)
- max_tokens (null, integer >= 0)
- seed (null, integer)
- frequency_penalty (number 0..1)
- presence_penalty (number 0..1)
- client_name (null, string)
Example:
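For illustration, a config entry with all of these parameters set (the values are arbitrary examples, not recommendations):

```python
[
    {
        "model": "command-r",
        "api_key": "your Cohere API Key goes here",
        "api_type": "cohere",
        "client_name": "autogen-cohere",  # optional request identifier sent to Cohere
        "temperature": 0.5,
        "p": 0.2,
        "k": 100,
        "max_tokens": 2048,
        "seed": 42,
        "frequency_penalty": 0.5,
        "presence_penalty": 0.2
    }
]
```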
Two-Agent Coding Example
In this example, we run a two-agent chat with an AssistantAgent (primarily a coding agent) that generates code to count the number of prime numbers between 1 and 10,000; the code is then executed.
We’ll use Cohere’s Command R model, which is suitable for coding.
Importantly, we have tweaked the system message so that the model doesn’t return the termination keyword (which we’ve changed to FINISH) in the same response as the code block.
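A minimal sketch of this setup, assuming a local command-line code executor and the FINISH termination keyword described above (the system message wording is illustrative):

```python
import os
from pathlib import Path

from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import LocalCommandLineCodeExecutor

config_list = [{"model": "command-r", "api_key": os.environ["COHERE_API_KEY"], "api_type": "cohere"}]

# Code written by the assistant will be executed in this local directory
workdir = Path("coding")
workdir.mkdir(exist_ok=True)
code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)

# The user proxy executes the code and ends the chat when it sees FINISH
user_proxy_agent = UserProxyAgent(
    name="User",
    code_execution_config={"executor": code_executor},
    is_termination_msg=lambda msg: "FINISH" in (msg.get("content") or ""),
)

# Note the instruction to hold back FINISH until after the code has been run
system_message = """You are a helpful AI assistant who writes code that the user executes.
Reply only with code blocks the user can run. After the user has executed your code
and shared the result, reply with the word FINISH. Do NOT output FINISH together
with a code block."""

assistant_agent = AssistantAgent(
    name="Cohere_Assistant",
    system_message=system_message,
    llm_config={"config_list": config_list},
)

chat_result = user_proxy_agent.initiate_chat(
    assistant_agent,
    message="Provide code to count the number of prime numbers from 1 to 10000.",
)
```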
Tool Call Example
In this example, instead of writing code, we will show how Cohere’s Command R+ model can perform parallel tool calling, where it recommends calling more than one tool at a time.
We’ll use a simple travel agent assistant program where we have a couple of tools for weather and currency conversion.
We start by importing libraries and setting up our configuration to use Command R+ and the `cohere` client class.
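For example (a sketch; setting `cache_seed` to `None` simply disables response caching for the demo):

```python
import os
from typing import Literal

from typing_extensions import Annotated

import autogen

config_list = [
    {
        "api_type": "cohere",
        "model": "command-r-plus",
        "api_key": os.environ["COHERE_API_KEY"],
        "cache_seed": None,  # disable caching for this demo
    }
]
```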
Create our two agents.
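A sketch, assuming the phrase “HAVE FUN!” as the termination signal (any distinctive phrase would work):

```python
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="""For currency exchange and weather forecasting tasks,
    only use the functions you have been provided with.
    Output 'HAVE FUN!' when an answer has been provided.""",
    llm_config={"config_list": config_list},
)

# The user proxy runs the tools and stops when it sees the termination phrase
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: "HAVE FUN!" in (x.get("content") or ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)
```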
Create the two functions, annotating them so that those descriptions can be passed through to the LLM.
We associate them with the agents using `register_for_execution` for the `user_proxy` so it can execute the function, and `register_for_llm` for the `chatbot` (powered by the LLM) so it can pass the function definitions to the LLM.
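A sketch of the two tools, registered via the decorator forms of `register_for_execution` and `register_for_llm` (the exchange rates and weather values are hard-coded purely for illustration):

```python
CurrencySymbol = Literal["USD", "EUR"]


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    # Hard-coded rates standing in for a real data source
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{quote_amount:.2f} {quote_currency}"


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(location: Annotated[str, "City name"]) -> str:
    # A canned response standing in for a real weather API call
    weather = {"location": location, "temperature": "11", "unit": "fahrenheit"}
    return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"
```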
We pass through our customer’s message and run the chat.
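For example (the `summary_method` argument asks the LLM to reflect on and summarise the conversation once it ends):

```python
res = user_proxy.initiate_chat(
    chatbot,
    message="What's the weather in New York and how much is 123.45 EUR in USD, "
    "so I can spend it on my holiday?",
    summary_method="reflection_with_llm",
)
```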
Finally, we ask the LLM to summarise the chat and print that out.
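The summary produced by `reflection_with_llm` is then available on the chat result:

```python
print(res.summary)
```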
We can see that Command R+ recommended we call both tools and passed through the right parameters. The `user_proxy` executed them and the results were passed back to Command R+ to interpret and respond to. Finally, Command R+ was asked to summarise the whole conversation.