- Ingest documents from a local file or URL
- Answer questions with RAG capability
Installation
To get started with the document agent integration in AG2, install AG2 with the `rag` extra (for example, `pip install "ag2[rag]"`).
- DocAgent only queries ingested documents, which ensures that it won’t make up information it can’t find.
- Answers may not be accurate for documents that cannot be parsed correctly to Markdown format.
Documents supported
The following documents can be ingested:
- PDF
- DOCX (DOCX, DOTX, DOCM, DOTM)
- XLSX
- PPTX (PPTX, POTX, PPSX, PPTM, POTM, PPSM)
- HTML
- ASCIIDOC (ADOC, ASCIIDOC, ASC)
- MD (MD, MARKDOWN)
- XML (XML, NXML)
- TXT
- JSON
- CSV
- IMAGE (BMP, JPG, JPEG, PNG, TIFF, TIF)

You can also have the DocAgent use a web page by giving it a URL to ingest.
Inside the DocAgent

- Triage Agent: Decides what type of task to perform from user requests.
- Task Manager Agent: Manages the tasks and initiates actions.
- Data Ingestion Agent: Ingests the documents.
- Query Agent: Answers user questions based on ingested documents.
- Error Agent: If anything fails, the error agent will report the problem back.
- Summary Agent: Generates a summary of the completed tasks.
- Triage User Requests: The Triage Agent categorizes the tasks into ingestions and queries.
- Task Management: The Task Manager Agent manages the tasks and ensures they are executed in the correct sequence.
- Data Ingestion: The Data Ingestion Agent processes any document ingestion tasks.
- Query Execution: The Query Agent answers any user queries.
- Summary Generation: The Summary Agent generates a summary of the completed tasks.
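
The hand-offs listed above can be pictured with a small toy model. This is only a sketch in plain Python, not AG2's implementation: `Task`, `ToyDocPipeline`, and the `(kind, payload)` request format are hypothetical names invented for illustration, and the rule that ingestions run before queries is an assumption about what "the correct sequence" means.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    kind: str     # "ingest" or "query"
    payload: str  # a file path or URL for ingestions, a question for queries


@dataclass
class ToyDocPipeline:
    """Toy stand-in for the agent hand-offs (hypothetical, not AG2 code)."""

    store: list = field(default_factory=list)  # "ingested" documents
    log: list = field(default_factory=list)    # records of completed tasks

    def triage(self, requests):
        # Triage step: split (kind, payload) requests into ingestions and queries.
        ingestions = [Task("ingest", p) for k, p in requests if k == "ingest"]
        queries = [Task("query", p) for k, p in requests if k == "query"]
        return ingestions, queries

    def run(self, requests):
        ingestions, queries = self.triage(requests)
        # Task management step: assumed ordering, ingestions before queries.
        for task in ingestions + queries:
            if task.kind == "ingest":
                # Data ingestion step: remember the document.
                self.store.append(task.payload)
                self.log.append(f"ingested {task.payload}")
            else:
                # Query step: a real agent would retrieve and answer here.
                self.log.append(
                    f"answered '{task.payload}' using {len(self.store)} document(s)"
                )
        # Summary step: report what was completed.
        return "; ".join(self.log)


pipeline = ToyDocPipeline()
summary = pipeline.run([
    ("query", "What was the revenue?"),
    ("ingest", "report.pdf"),
])
print(summary)
```

Even though the query is submitted first, the toy pipeline ingests the document before answering, which mirrors the sequencing role described for the Task Manager Agent.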
Code example
Ingesting local documents and answering questions
[With Citations Support] Ingesting local documents and answering questions
Fetching a webpage and answering questions
Multiple DocAgents in a Swarm
Now we’re going to use multiple DocAgents, each responsible for its own data. An `nvidia_agent` will ingest NVIDIA’s financial report and query it. Similarly, an `amd_agent` will do the same with AMD’s financial report. Although a single agent could ingest and query both documents, we want to ensure that each agent’s queries aren’t tainted by the other company’s documents. So we keep them separate and give each a unique `collection_name`, which, in turn, creates individual data stores (see more on Chroma collections).
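
To illustrate why separate collections keep queries from being tainted, here is a minimal sketch using a toy in-memory store. `ToyVectorStore` is a hypothetical stand-in, not the Chroma or AG2 API, and it matches on keywords rather than embeddings; the collection names and placeholder texts are likewise invented for the example.

```python
from collections import defaultdict


class ToyVectorStore:
    """Hypothetical in-memory store with named collections (not Chroma's API)."""

    def __init__(self):
        self._collections = defaultdict(list)

    def ingest(self, collection_name, text):
        # Each document is stored only under its own collection name.
        self._collections[collection_name].append(text)

    def query(self, collection_name, keyword):
        # Search only the named collection; other collections are never read.
        docs = self._collections.get(collection_name, [])
        return [t for t in docs if keyword.lower() in t.lower()]


store = ToyVectorStore()
# Placeholder texts; real figures would come from the ingested reports.
store.ingest("nvidia_financial_report", "NVIDIA GAAP revenue: <from NVIDIA report>")
store.ingest("amd_financial_report", "AMD GAAP revenue: <from AMD report>")

nvidia_hits = store.query("nvidia_financial_report", "GAAP revenue")
amd_hits = store.query("amd_financial_report", "GAAP revenue")
print(nvidia_hits)
print(amd_hits)
```

Both documents match the same keyword, yet each query returns only the text from its own collection, which is the isolation a unique `collection_name` per agent is meant to provide.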
Tips for DocAgent
- When asking for information from ingested documents, be precise. For example, asking for revenue for the quarter in the previous example could retrieve a number of different revenue values, so we ask for “GAAP revenue” for the specific quarter.
- If you have ingested documents in previous runs and just need to query the information, tell the DocAgent explicitly that it doesn’t need to ingest the documents you refer to.
- Ensure that any files to be ingested can be accessed by the process you are running.
- Ingestion takes time, so use `collection_name` to reuse collections that already contain the ingested documents.
- You can review the Markdown files in the `parsed_docs` folder to see how effective the conversion to Markdown was. This will help you investigate any query issues.
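
As a way to act on the file-access tip before handing a path to the DocAgent, a pre-flight check along these lines may help. It is a sketch under stated assumptions: `preflight` and `SUPPORTED_EXTENSIONS` are hypothetical names, the extension set is copied from the "Documents supported" list above, and URLs are simply passed through unchecked.

```python
import os
from pathlib import Path

# Extensions taken from the "Documents supported" list in this document.
SUPPORTED_EXTENSIONS = {
    "pdf", "docx", "dotx", "docm", "dotm", "xlsx",
    "pptx", "potx", "ppsx", "pptm", "potm", "ppsm",
    "html", "adoc", "asciidoc", "asc", "md", "markdown",
    "xml", "nxml", "txt", "json", "csv",
    "bmp", "jpg", "jpeg", "png", "tiff", "tif",
}


def preflight(path_or_url):
    """Return a list of problems; an empty list means the input looks ingestible."""
    if path_or_url.startswith(("http://", "https://")):
        return []  # URLs are not checked locally in this sketch
    path = Path(path_or_url)
    problems = []
    if not path.is_file():
        problems.append("file not found")
    elif not os.access(path, os.R_OK):
        problems.append("file not readable by this process")
    if path.suffix.lower().lstrip(".") not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    return problems


print(preflight("https://example.com/page"))
print(preflight("/nonexistent/report.xyz"))
```

Running a check like this first avoids spending an ingestion attempt on a file the process cannot open or on a format outside the supported list.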