autogen.agentchat.contrib.vectordb.mongodb.MongoDBAtlasVectorDB
MongoDBAtlasVectorDB
A Collection object for MongoDB.
Initialize the vector database.
Name | Description |
---|---|
connection_string | Type: str Default: '' |
database_name | Type: str Default: 'vector_db' |
embedding_function | Type: Callable[..., Any] \| None Default: None |
collection_name | Type: str Default: None |
index_name | Type: str Default: 'vector_index' |
overwrite | Type: bool Default: False |
wait_until_index_ready | Type: float \| None Default: None |
wait_until_document_ready | Type: float \| None Default: None |
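A minimal construction sketch using only the parameters listed above. The helper name `build_vector_db`, the collection name `my_docs`, and the reading of the two `wait_until_*` values as seconds are assumptions for illustration; the import is done lazily so the snippet loads even when the optional MongoDB dependencies are not installed.

```python
from typing import Any


def build_vector_db(connection_string: str) -> Any:
    """Construct a MongoDBAtlasVectorDB with explicit keyword arguments."""
    # Lazy import: the class lives at the module path shown at the top of this page.
    from autogen.agentchat.contrib.vectordb.mongodb import MongoDBAtlasVectorDB

    return MongoDBAtlasVectorDB(
        connection_string=connection_string,  # your Atlas connection URI
        database_name="vector_db",            # documented default
        collection_name="my_docs",            # hypothetical collection name
        index_name="vector_index",            # documented default
        overwrite=False,
        wait_until_index_ready=120.0,         # assumed to be a timeout in seconds
        wait_until_document_ready=120.0,      # assumed to be a timeout in seconds
    )
```

Calling `build_vector_db("<your connection string>")` needs a reachable Atlas deployment, so it is left out here.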
Class Attributes
active_collection
embedding_function
type
Instance Methods
create_collection
Create a collection in the vector database and create a vector search index in the collection.
Name | Type | Description |
---|---|---|
collection_name | str | The name of the collection. |
overwrite | bool | Whether to overwrite the collection if it exists. Default: False |
get_or_create | bool | Whether to get or create the collection. Default: True |
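The table does not spell out how `overwrite` and `get_or_create` interact. The sketch below shows one plausible reading, using a plain dict as a stand-in for the database. Everything in it is hypothetical, in particular the choice to raise `ValueError` when the collection exists and both flags are False.

```python
def create_collection_sketch(
    store: dict,
    collection_name: str,
    overwrite: bool = False,
    get_or_create: bool = True,
) -> list:
    """Mimic the flag handling with a dict mapping names to document lists."""
    exists = collection_name in store
    if exists and overwrite:
        # Overwrite: drop the old contents and start fresh.
        store[collection_name] = []
    elif exists and not get_or_create:
        # Assumed behavior: refuse to silently reuse an existing collection.
        raise ValueError(f"Collection {collection_name} already exists.")
    elif not exists:
        store[collection_name] = []
    return store[collection_name]
```

With the defaults, a second call for the same name returns the existing collection untouched.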
create_index_if_not_exists
Creates a vector search index on the specified collection in MongoDB.
Name | Description |
---|---|
index_name | Type: str Default: 'vector_index' |
collection | The MongoDB collection to create the index on. Defaults to None. Type: Collection Default: None |
create_vector_search_index
Create a vector search index in the collection.
Name | Description |
---|---|
collection | An existing Collection in the Atlas Database. Type: Collection |
index_name | Vector Search Index name. Type: str \| None Default: 'vector_index' |
similarity | Algorithm used for measuring vector similarity. Type: Literal['euclidean', 'cosine', 'dotProduct'] Default: 'cosine' |
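To make the `similarity` option concrete, here is a sketch that builds an Atlas Vector Search index definition as a plain dict. The helper name, the `embedding` field path, and the `dimensions` argument are assumptions for illustration; how this class assembles its own definition is not shown on this page.

```python
from typing import Any, Literal, get_args

Similarity = Literal["euclidean", "cosine", "dotProduct"]


def build_index_definition(
    dimensions: int,
    similarity: Similarity = "cosine",
    path: str = "embedding",  # assumed name of the field holding the vector
) -> dict[str, Any]:
    """Return a vector search index definition in the Atlas 'fields' format."""
    if similarity not in get_args(Similarity):
        raise ValueError(f"Unsupported similarity: {similarity!r}")
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": similarity,
            }
        ]
    }
```

The `dimensions` value has to match the output size of whatever embedding function is in use.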
delete_collection
Delete the collection from the vector database.
Name | Type | Description |
---|---|---|
collection_name | str | The name of the collection. |
delete_docs
Delete documents from the collection of the vector database.
Name | Type | Description |
---|---|---|
ids | List[ItemID] (list[str \| int]) | A list of document ids. Each id is a typed ItemID. |
collection_name | str | The name of the collection. Default: None |
**kwargs | | |
get_collection
Get the collection from the vector database.
Name | Type | Description |
---|---|---|
collection_name | str | The name of the collection. If None, return the current active collection. Default: None |
Returns:
Type | Description |
---|---|
Collection | The collection object. |
get_docs_by_ids
Retrieve documents from the collection of the vector database based on the ids.
Name | Type | Description |
---|---|---|
ids | List[ItemID] (list[str \| int]) | A list of document ids. If None, all documents are returned. Default: None |
collection_name | str | The name of the collection. Default: None |
include | List[str] | The fields to include. If None, ["metadata", "content"] are included; ids are always included. Use include to control whether embedding and metadata are returned. Default: None |
**kwargs | | |
Returns:
Type | Description |
---|---|
List[Document] | The results. |
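The `include` rule described above (ids always present, `["metadata", "content"]` when `include` is None) can be sketched as a small projection over a plain dict. The helper name and the `id` key are assumptions for illustration, not the library's internals.

```python
from typing import Any, Optional


def project_document(
    doc: dict[str, Any], include: Optional[list[str]] = None
) -> dict[str, Any]:
    """Keep the id plus the requested fields of a document-like dict."""
    fields = ["metadata", "content"] if include is None else include
    projected = {"id": doc["id"]}  # the id is always returned
    for field in fields:
        if field in doc:
            projected[field] = doc[field]
    return projected
```

For example, passing `include=["embedding"]` returns only the id and the embedding.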
insert_docs
Insert Documents and Vector Embeddings into the collection of the vector database.
For large numbers of Documents, insertion is performed in batches.
Name | Type | Description |
---|---|---|
docs | List[Document] | A list of documents. Each document is a TypedDict Document. |
collection_name | str | The name of the collection. Default: None |
upsert | bool | Whether to update the document if it exists. Default: False |
batch_size | int | Number of documents to insert in each batch. Default: 100000 |
**kwargs | Any | |
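Batched insertion amounts to slicing the document list into chunks of at most `batch_size`. The generator below is a generic sketch of that idea; the name `iter_batches` is hypothetical and the library's own batching code may differ.

```python
from collections.abc import Iterator, Sequence
from typing import Any


def iter_batches(
    docs: Sequence[Any], batch_size: int = 100000
) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of docs, each holding at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(docs), batch_size):
        yield docs[start : start + batch_size]
```

Each yielded slice would then be written in a single bulk operation, which bounds the size of any one request.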
list_collections
List the collections in the vector database.
Returns:
Type | Description |
---|---|
List[str] | The list of collections. |
retrieve_docs
Retrieve documents from the collection of the vector database based on the queries.
Name | Type | Description |
---|---|---|
queries | List[str] | A list of queries. Each query is a string. |
collection_name | str | The name of the collection. Default: None |
n_results | int | The number of relevant documents to return. Default: 10 |
distance_threshold | float | The threshold for the distance score; only results with a distance smaller than it are returned. No filtering is applied if it is < 0. Default: -1 |
**kwargs | Any | |
Returns:
Type | Description |
---|---|
QueryResults (list[list[tuple[Document, float]]]) | For each query string, a list of nearest documents and their scores. |
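The `distance_threshold` rule can be illustrated on the result list for one query. This sketch assumes that a negative threshold (such as the default of -1) disables filtering and that smaller scores mean closer matches. The helper name is hypothetical, and the backend's actual comparison is not shown on this page.

```python
from typing import Any


def filter_by_distance(
    results: list[tuple[Any, float]], distance_threshold: float = -1
) -> list[tuple[Any, float]]:
    """Keep (document, score) pairs whose score is below the threshold."""
    if distance_threshold < 0:
        # Assumed: a negative threshold means "do not filter".
        return list(results)
    return [(doc, score) for doc, score in results if score < distance_threshold]
```

The comparison is strict, so a result whose score equals the threshold is dropped.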
update_docs
Update documents, including their embeddings, in the Collection.
An upsert option can be passed as a keyword argument.
The docs are deep-copied before processing, so the caller's docs are not modified.
Name | Type | Description |
---|---|---|
docs | List[Document] | A list of documents. |
collection_name | str | The name of the collection. Default: None |
**kwargs | Any | |
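The point of the deep copy is that filling in embeddings must not leak back into the caller's objects. A minimal sketch of that pattern follows, with documents as plain dicts; the helper name and the single-string `embed` callable are assumptions for illustration.

```python
import copy
from typing import Any, Callable


def prepare_updates(
    docs: list[dict[str, Any]], embed: Callable[[str], list[float]]
) -> list[dict[str, Any]]:
    """Return copies of docs with a missing embedding computed from content."""
    prepared = copy.deepcopy(docs)  # nested metadata is copied too
    for doc in prepared:
        if doc.get("embedding") is None:
            doc["embedding"] = embed(doc["content"])
    return prepared
```

A shallow copy would not be enough here, because nested values such as metadata dicts would still be shared with the caller.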