
AG2 is providing the key tools required to move towards automated AI-driven Cosmology and Astrophysics research. As cosmologists, and more generally as researchers, our workflows involve working with complex software tools, large complex datasets, while also keeping up with advances in our fields. Using AG2 we have implemented CMBAGENT, a multi-agent system that automates some of the key tasks involved in research workflows. Our fully open-source research-ready work, based on a Planning and Control strategy, has been successfully applied not only to cosmology tasks but also to data analysis tasks in economics, banking and education.Boris Bolliet, Agentic AI Lead - Infosys-Cambridge AI Centre

Overview
Our AG2 application is called CMBAGENT. It’s a multi-agent system for science with foundation models backends (e.g., Large Language Models; LLMs). It is built with the goal of having an AI system for Autonomous Scientific Discovery, with a focus on our research in Cosmology, which is the domain of origin of our team. In Cosmology, our goal is to process data from telescopes to learn about the fundamental properties of the universe. This is an example of an “inverse problem” known as parameter inference, where we are given the data and, based on assumptions on how the universe works (expansion of space, formation of galaxies), we extract the most likely values of our model parameters. For instance, from observations of the cosmic microwave background, and assuming Einstein’s theory of General Relativity, we can measure the age of the universe. The AG2 framework has allowed CMBAGENT to automate the workflow needed to solve the cosmological parameter inference problem.Challenge
Measuring the fundamental properties of the universe based on telescope data requires running computationally expensive simulations and comparing their output with observations. The software packages required to run the simulations and confront them with measurements are research-level libraries that take PhD students several years to learn. Furthermore, cosmological analyses must be informed by the latest results in our field, for instance to know whether a new model should be considered. To automate these research challenges we are designing AI agents that can act as expert users on the libraries of interest and that can inform the analysis at hand with results from the scientific literature. Also, to achieve full automation the agents must be capable of running experiments (e.g., data analysis pipelines) and simulations, as well as generating and selecting hypotheses.Solution: AG2 Integration
The very first version of CMBAGENT allowed us to exactly reproduce a cutting-edge cosmological data analysis in about 100 times less time and without writing the data analysis pipeline ourselves. The entire codebase was written by CMBAGENT. However, our initial version had a human-in-the-loop at all stages, i.e., human feedback was provided after every LLM-agent output. This was presented in Laverick et al (2024).
- Multi-LLM calls: we utilize different LLM for different tasks. Current examples include gpt-4o-mini for simple straightforward tasks, claude-3.7 for reviewing or critique tasks, o3-mini-high for reasoning, gpt-4o for planning, o3-min or gemini-2.5-pro-exp for coding.
- Swarm orchestration: to handle agent transitions we implemented CMBAGENT using swarm orchestration, where transitions are done based on conditions on a set of context variables that evolve and define the state of each agent.
- Tool calls allow us to operate on data, either by executing code locally or storing relevant information into context. With AG2 we are able to control these tool or function calls strictly.
- Structured Output is a key feature that we found to dramatically increase the robustness of CMBAGENT. For instance, during the planning stage of the workflow, the plan has a very specific structure with steps corresponding to sub-tasks, the agent in charge and the corresponding instructions. These are then injected into the agent’s context using swarm capabilities.
- Retrieval Augmented Generation agents: The GPTAssitantAgent in AG2 is a simple and powerful interface to OpenAI assistant with vector stores containing research software packages, documentations, tutorials and scientific papers of interest. Upon initialization, CMBAGENT creates a vector embedding of software package documentation or cosmology papers that is pushed on the OpenAI platform.
What’s Next
CMBAGENT is only the start of our research programme and we consider it as a prototype for future systems that will be able to carry out scientific research with super-human capabilities. Our next areas of development include evaluation and the development of an end-to-end research framework.- Evaluation with Inspect.AI: we are collaborating with Lorenzo Pacchiardi and Irene Testini at the Leverhulme Centre for the Future of Intelligence to develop an evaluation framework consisting of extensive benchmarks for cosmology research in order to assess quantitatively and systematically how AI agents perform as they evolve. With such a framework, we will be able to make quantifiable positive steps in the development of our AI agents and, even more importantly, to trust them fully. To illustrate how such a benchmark can be developed, we released CosmoPaperQA (developed by Adrian Dimitrov) on the ASTROANTS HuggingFace space, inspired by the work of FutureHouse.
- End-to-end cosmology research with ASTROPILOT. We have integrated CMBAGENT within ASTROPILOT. The ASTROPILOT project implements not only the analysis, but also research idea generation and paper writing. The starting point is simply the dataset and a description of the data. The rest is done by ASTROPILOT completely automatically.