DocAgent Performance
This page summarizes our testing of DocAgent. The AG2 version (or code date) used is included for reference.
If you want to run the test code, install AG2 with the rag extra.
Test Results
Code version: as of March 3rd, 2025 (to be included in the release after v0.8.0b1)
For the InMemoryQueryEngine, the LLM used is OpenAI’s GPT-4o mini.
# | Task | Ingested | In-memory Query Engine | Chroma Query Engine |
---|---|---|---|---|
1 | URL to Markdown file, query to summarize | ✅ | ✅ | ✅ |
2 | URL to Microsoft Word document, query highlights | ✅ | ✅ | ✅ |
3 | URL to PDF annual report, query specific figures | ✅ | ✅ | ✅ |
4 | URL to PDF document, query to explain | ✅ | ✅ | ✅ |
5 | Local file, PDF, query to explain | ✅ | ✅ | ✅ |
6 | URL to JPG of scanned invoice, query a figure | ❌ | 🔶 | 🔶 |
7 | Local file, PNG of scanned invoice, query a figure | ❌ | ❌ | ❌ |
8 | URL to XLSX using a redirect URL, query table | ✅ | 🔶 | 🔶 |
9 | URL to XLSX, query data | ❌ | 🔶 | ✅ |
10 | URL to CSV, query a figure | ❌ | N/A | N/A |
11 | URL to CSV, query to summarize | ✅ | ✅ | ✅ |
12 | URL with CSV, query unrelated | ✅ | ✅ | ✅ |
13 | Local files, 2 x Markdowns, Query to compare | ✅ | ✅ | ✅ |
14 | Local file, Markdown, unrelated query | ✅ | ✅ | ✅ |
15 | Local file, Markdown, unrelated query but general knowledge | ✅ | ✅ | ✅ |
16 | No files to ingest but has query | N/A | ✅ | ✅ |
17 | Local file, PDF of annual report, query a figure | ✅ | ✅ | ✅ |
18 | Local file, Microsoft Word, query a figure | ✅ | ✅ | ❌ |
19 | URL to web page with query to summarize | ✅ | ✅ | ✅ |
20a | Local files, PDF and DOCX, one query to cover both | ✅ | ✅ | ✅ |
20b | Follow-up query to DocAgent | N/A | ✅ | ❌ |
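As a quick way to read the table above, the following minimal Python sketch tallies the outcome markers per column. The rows are transcribed from the table; the names `ROWS`, `COLUMNS`, and `tally` are illustrative only and are not part of AG2.

```python
from collections import Counter

# Rows transcribed from the summary table: (task, ingested, in-memory, chroma).
ROWS = [
    ("1", "✅", "✅", "✅"), ("2", "✅", "✅", "✅"), ("3", "✅", "✅", "✅"),
    ("4", "✅", "✅", "✅"), ("5", "✅", "✅", "✅"), ("6", "❌", "🔶", "🔶"),
    ("7", "❌", "❌", "❌"), ("8", "✅", "🔶", "🔶"), ("9", "❌", "🔶", "✅"),
    ("10", "❌", "N/A", "N/A"), ("11", "✅", "✅", "✅"), ("12", "✅", "✅", "✅"),
    ("13", "✅", "✅", "✅"), ("14", "✅", "✅", "✅"), ("15", "✅", "✅", "✅"),
    ("16", "N/A", "✅", "✅"), ("17", "✅", "✅", "✅"), ("18", "✅", "✅", "❌"),
    ("19", "✅", "✅", "✅"), ("20a", "✅", "✅", "✅"), ("20b", "N/A", "✅", "❌"),
]

COLUMNS = ("Ingested", "In-memory Query Engine", "Chroma Query Engine")


def tally(rows):
    """Count how often each marker appears in each result column."""
    counts = {name: Counter() for name in COLUMNS}
    for _task, *markers in rows:
        for name, marker in zip(COLUMNS, markers):
            counts[name][marker] += 1
    return counts


if __name__ == "__main__":
    for name, counter in tally(ROWS).items():
        print(name, dict(counter))
```

The tally counts markers exactly as they appear in the table and does not interpret what 🔶 means.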
Task 1: URL to Markdown file, query to summarize
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 2: URL to Microsoft Word document, query highlights
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 3: URL to PDF annual report, query specific figures
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 4: URL to PDF document, query to explain
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 5: Local file, PDF, query to explain
Task Message:
Note: This is the same document as Task 4, just stored locally.
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 6: URL to JPG of scanned invoice, query a figure
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
❌ | 🔶 | 🔶 |
Notes:
- During ingestion, the image was downloaded correctly, but Docling could not OCR it accurately: it missed much of the text and replaced the British pound symbol with a “2”.
- The InMemoryQueryEngine and VectorChromaQueryEngine successfully read the converted text and answered consistently with it. However, because the converted text was wrong, the answer itself is erroneous.
Sample output:
Task 7: Local file, PNG of scanned invoice, query a figure
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
❌ | ❌ | ❌ |
Notes:
- During ingestion, the image file was read correctly, but Docling could not OCR it, missing all the text and tables.
- The InMemoryQueryEngine and VectorChromaQueryEngine read the converted file but could not gather any information from it, so they could not answer the question.
Sample output:
Task 8: URL to XLSX using a redirect URL, query table
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | 🔶 | 🔶 |
Notes:
- The ingested Excel file was converted to Markdown correctly.
- The InMemoryQueryEngine’s LLM, GPT-4o mini, pulled out two rows of data correctly but could not compile the 20 rows of data asked for by the broad query.
- The VectorChromaQueryEngine pulled out one row of data correctly but could not compile the 20 rows of data asked for by the broad query.
Sample output:
Task 9: URL to XLSX, query data
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
❌ | 🔶 | ✅ |
Notes:
- The ingested Excel file was not converted to Markdown tables correctly: some trailing columns were moved into new tables.
- The InMemoryQueryEngine could not answer the question because of the incorrectly parsed Markdown.
- Interestingly, the VectorChromaQueryEngine was able to determine the correct answer.
Sample output:
Task 10: URL to CSV, query a figure
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
❌ | N/A | N/A |
Notes:
- This URL could not be downloaded correctly, though it works in a browser.
Task 11: URL to CSV, query to summarize
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 12: URL with CSV, query unrelated (should not answer)
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 13: Local files, 2 x Markdowns, Query to compare
Task Message:
Files:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 14: Local file, Markdown, unrelated query (should not answer)
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 15: Local file, Markdown, unrelated query but general knowledge (should not answer)
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 16: No files to ingest but has query
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
N/A | ✅ | ✅ |
Sample output:
Task 17: Local file, PDF of annual report, query a figure
Task Message:
Files used:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 18: Local file, Microsoft Word, query a figure
Task Message:
Files used:
MSFT FY25Q2 10-Q FINAL.docx, from the Microsoft Quarterly Report asset package
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ❌ |
Notes:
- VectorChromaQueryEngine could not answer the question.
Sample output:
Task 19: URL to web page with query to summarize
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 20a: Local files, PDF and DOCX, one query to cover both
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
✅ | ✅ | ✅ |
Sample output:
Task 20b: Follow-up query to DocAgent
Task Message:
Ingested | InMemoryQueryEngine Query | Chroma Query Engine |
---|---|---|
N/A | ✅ | ❌ |
Notes:
- VectorChromaQueryEngine could not correctly answer the Microsoft operating income figure; it could only provide a note about the amount by which it increased.
Sample output:
Test Code
Below is basic Python code in line with the tests above. You will need to have OPENAI_API_KEY set in your environment variables.
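A minimal sketch of such a test run might look like the following. It is not the original test code. The helper names `require_api_key` and `run_task` and the collection name are hypothetical, and the `DocAgent` import path, constructor arguments, and `run(..., max_turns=1)` call are assumptions based on the AG2 documentation from around this code date, so check them against your installed version.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error if unset."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the tests.")
    return key


def run_task(task_message, collection_name="docagent_tests"):
    """Send one task message to a DocAgent (needs AG2 installed with the rag extra)."""
    # Imported lazily so the key check above can be used without AG2 installed.
    # Assumption: this import path and these arguments match your AG2 version.
    from autogen.agents.experimental import DocAgent

    # GPT-4o mini is the model named on this page for the InMemoryQueryEngine.
    llm_config = {"model": "gpt-4o-mini", "api_key": require_api_key()}
    agent = DocAgent(llm_config=llm_config, collection_name=collection_name)
    return agent.run(task_message, max_turns=1)
```

To try it, call `run_task` with a message that names a document and a question, in the style of the task messages on this page. Checking the key first means a missing OPENAI_API_KEY fails fast with a readable error rather than partway through ingestion.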