ChromaDBQueryEngine

ChromaDBQueryEngine(
    host: str | None = 'localhost',
    port: int | None = 8000,
    settings: ForwardRef('Settings') | None = None,
    tenant: str | None = None,
    database: str | None = None,
    embedding_function: Optional[EmbeddingFunction[Any]] = None,
    metadata: dict[str, Any] | None = None,
    llm: ForwardRef('LLM') | None = None,
    collection_name: str | None = None
)

This engine uses ChromaDB to persist document embeddings in a named collection and LlamaIndex’s VectorStoreIndex to index and retrieve documents and to generate answers to natural language queries. A collection can be regarded as an abstraction of a group of documents in the database.
It expects a ChromaDB server to be running and accessible at the specified host and port.
Refer to this link for running ChromaDB in a Docker container.
If the host and port are not provided, the engine creates an in-memory ChromaDB client.
Initializes the ChromaDBQueryEngine with the connection parameters, metadata, embedding function, and LLM.

Parameters:

host
    Type: str | None
    Default: 'localhost'

port
    Type: int | None
    Default: 8000

settings
    Type: ForwardRef('Settings') | None
    Default: None

tenant
    Type: str | None
    Default: None

database
    Type: str | None
    Default: None

embedding_function
    Type: Optional[EmbeddingFunction[Any]]
    Default: None

metadata
    Type: dict[str, typing.Any] | None
    Default: None

llm
    Type: ForwardRef('LLM') | None
    Default: None

collection_name
    Type: str | None
    Default: None
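
The choice between a server-backed client and the in-memory client described above can be decided before the engine is constructed. The following is a minimal sketch, not part of the library: `server_reachable` and `connection_kwargs` are hypothetical helper names, and the sketch assumes that passing `None` for both `host` and `port` is what selects the in-memory client. A successful TCP connection only shows that something is listening on the port, not that it is a ChromaDB server.

```python
import socket
from typing import Any, Callable


def server_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def connection_kwargs(
    host: str = "localhost",
    port: int = 8000,
    probe: Callable[[str, int], bool] = server_reachable,
) -> dict[str, Any]:
    """Build the host/port keyword arguments for the engine constructor."""
    if probe(host, port):
        return {"host": host, "port": port}
    # Assumption: None for both values makes the engine use an in-memory client.
    return {"host": None, "port": None}


# Hypothetical usage; the import path of ChromaDBQueryEngine depends on your installation:
# engine = ChromaDBQueryEngine(collection_name="docs", **connection_kwargs())
```

Silently falling back to the in-memory client is only one possible policy; raising an error may be preferable when persistence matters.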

Instance Methods

add_docs

add_docs(
    self,
    new_doc_dir: Path | str | None = None,
    new_doc_paths_or_urls: Sequence[Path | str] | None = None,
    *args: Any,
    **kwargs: Any
) -> None

Add new documents to the underlying database and to the index.

Parameters:

new_doc_dir
    A directory of input documents that are used to create the records in the database.
    Type: pathlib.Path | str | None
    Default: None

new_doc_paths_or_urls
    A sequence of input documents that are used to create the records in the database. A document can be a path to a file or a URL.
    Type: Sequence[pathlib.Path | str] | None
    Default: None

*args
    Any additional arguments.
    Type: Any

**kwargs
    Any additional keyword arguments.
    Type: Any
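
Because `new_doc_paths_or_urls` mixes file paths and URLs, it can help to filter the inputs before passing them in. The following is a minimal sketch under stated assumptions: `looks_like_url` and `usable_inputs` are hypothetical helpers, and treating only `http` and `https` strings as URLs is an assumption, since the accepted URL schemes are not specified here.

```python
from pathlib import Path
from typing import Sequence, Union
from urllib.parse import urlparse

PathOrUrl = Union[Path, str]


def looks_like_url(item: PathOrUrl) -> bool:
    """Treat only http(s) strings with a host part as URLs (an assumption)."""
    if isinstance(item, Path):
        return False
    parsed = urlparse(item)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def usable_inputs(items: Sequence[PathOrUrl]) -> list[PathOrUrl]:
    """Keep URLs and paths to existing files; drop everything else."""
    kept: list[PathOrUrl] = []
    for item in items:
        if looks_like_url(item) or Path(item).is_file():
            kept.append(item)
    return kept


candidates = ["https://example.com/report.pdf", "definitely_missing_file_123.txt"]
print(usable_inputs(candidates))

# Hypothetical usage with an already constructed engine:
# engine.add_docs(new_doc_paths_or_urls=usable_inputs(candidates))
```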

connect_db

connect_db(
    self,
    *args: Any,
    **kwargs: Any
) -> bool

Connect to the database.
It does not overwrite the existing collection in the database.
It takes the following steps:

  1. Set up ChromaDB and LlamaIndex storage.
  2. Create the LlamaIndex vector store index for querying or inserting documents later.

Parameters:

*args
    Any additional arguments.
    Type: Any

**kwargs
    Any additional keyword arguments.
    Type: Any

Returns:

bool
    True if the connection is successful.

get_collection_name

get_collection_name(self) -> str

Get the name of the collection used by the query engine.

Returns:

str
    The name of the collection.

init_db

init_db(
    self,
    new_doc_dir: Path | str | None = None,
    new_doc_paths_or_urls: Sequence[Path | str] | None = None,
    *args: Any,
    **kwargs: Any
) -> bool

Initialize the database with the input documents or records.
It overwrites the existing collection in the database.
It takes the following steps:

  1. Set up ChromaDB and LlamaIndex storage.
  2. Insert documents and build indexes upon them.

Parameters:

new_doc_dir
    A directory of input documents that are used to create the records in the database.
    Type: pathlib.Path | str | None
    Default: None

new_doc_paths_or_urls
    A sequence of input documents that are used to create the records in the database. A document can be a path to a file or a URL.
    Type: Sequence[pathlib.Path | str] | None
    Default: None

*args
    Any additional arguments.
    Type: Any

**kwargs
    Any additional keyword arguments.
    Type: Any

Returns:

bool
    True if initialization is successful.
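
Since `init_db` overwrites the existing collection while `connect_db` does not, callers may want the destructive path to require an explicit opt-in. The following is a minimal sketch of that idea: `SupportsDbSetup`, `prepare_engine`, and `FakeEngine` are hypothetical names introduced for illustration, and `FakeEngine` is a stand-in that only records calls so the sketch runs without a ChromaDB installation.

```python
from pathlib import Path
from typing import Any, Optional, Protocol, Sequence, Union

PathOrStr = Union[Path, str]


class SupportsDbSetup(Protocol):
    """Structural type covering only the two documented setup methods."""

    def init_db(
        self,
        new_doc_dir: Optional[PathOrStr] = None,
        new_doc_paths_or_urls: Optional[Sequence[PathOrStr]] = None,
        *args: Any,
        **kwargs: Any,
    ) -> bool: ...

    def connect_db(self, *args: Any, **kwargs: Any) -> bool: ...


def prepare_engine(
    engine: SupportsDbSetup,
    doc_dir: Optional[PathOrStr] = None,
    rebuild: bool = False,
) -> None:
    """Rebuild the collection from doc_dir, or attach to the existing one."""
    if rebuild:
        # init_db overwrites the existing collection, so it needs an explicit opt-in.
        ok = engine.init_db(new_doc_dir=doc_dir)
    else:
        # connect_db leaves the existing collection untouched.
        ok = engine.connect_db()
    if not ok:
        raise RuntimeError("engine setup reported failure")


class FakeEngine:
    """Hypothetical stand-in that only records calls; not the real engine."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def init_db(self, new_doc_dir=None, new_doc_paths_or_urls=None, *args, **kwargs) -> bool:
        self.calls.append("init_db")
        return True

    def connect_db(self, *args, **kwargs) -> bool:
        self.calls.append("connect_db")
        return True


engine = FakeEngine()
prepare_engine(engine)                                # attaches without overwriting
prepare_engine(engine, doc_dir="docs", rebuild=True)  # rebuilds the collection
print(engine.calls)
```

Checking the returned `bool` in one place keeps a failed setup from going unnoticed until the first query.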

query

query(self, question: str) -> str

Retrieve information from indexed documents by processing a query using the engine’s LLM.

Parameters:

question
    A natural language query string used to search the indexed documents.
    Type: str

Returns:

str
    A string containing the response generated by the LLM.
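
Because `query` takes one question and returns one string, asking several questions is a simple loop. The following is a minimal sketch: `SupportsQuery`, `ask_all`, and `CannedEngine` are hypothetical names, and `CannedEngine` returns fixed text in place of a real LLM response so the sketch runs on its own.

```python
from typing import Iterable, Protocol


class SupportsQuery(Protocol):
    """Structural type covering only the documented query method."""

    def query(self, question: str) -> str: ...


def ask_all(engine: SupportsQuery, questions: Iterable[str]) -> dict[str, str]:
    """Map each non-empty question to the string returned by engine.query."""
    answers: dict[str, str] = {}
    for raw in questions:
        question = raw.strip()
        if not question:
            continue  # skip blank entries rather than sending them to the LLM
        answers[question] = engine.query(question)
    return answers


class CannedEngine:
    """Hypothetical stand-in returning fixed text; the real engine calls an LLM."""

    def query(self, question: str) -> str:
        return f"stub answer for: {question}"


answers = ask_all(CannedEngine(), ["What is a collection?", "  ", "Which port is used?"])
for question, answer in answers.items():
    print(f"{question} -> {answer}")
```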