FathomGPT: A Natural Language Interface for Interactively Exploring Ocean Science Data

| Demo | Paper (coming soon) |

FathomGPT is an open-source system for interactively investigating ocean science images and data through a natural language web interface. It was designed in close collaboration with marine scientists to enable researchers and ocean enthusiasts to explore and analyze the FathomNet database.

FathomGPT introduces a custom information retrieval pipeline that leverages OpenAI’s GPT technologies to enable: the creation of complex database queries to retrieve images, taxonomic information, and scientific measurements; mapping common names and morphological features to scientific names; generating interactive charts on demand; and searching by image or specified patterns within an image. In designing FathomGPT, particular emphasis was placed on the user experience, facilitating free-form exploration and optimizing response times.

[Video: fathomgpt-preview.mp4]

Usage Scenarios

In this scenario, a scientist wants to compare information about deep-sea creatures. FathomGPT first uses name resolution to find creatures that match the description in the prompt. It then generates a scatterplot of the data to help the user understand the relationship between temperature and pressure levels.

[Images: usecase-deepsea, usecase-scatterplot]

FathomGPT can also search for images similar to one the user uploads. This helps the user identify species and ask follow-up questions to learn more.

[Image: usecase-similar-jellyfish]

Architecture

The diagram below shows, at a high level, how the system processes an input prompt into a JSON response, which is then sent to the frontend webpage for rendering.

[Image: Architecture]
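To make the JSON output step concrete, here is a minimal sketch of how a backend might serialize one reply for the frontend. Every field name (`outputType`, `responseText`, `images`) and the example URL are assumptions made for illustration; the actual response format used by FathomGPT is not specified here.

```python
import json
from typing import Optional


def build_response(text: str, images: Optional[list] = None) -> str:
    """Serialize one reply for the frontend. All field names are hypothetical."""
    payload = {
        # Tell the frontend which renderer to use for this reply.
        "outputType": "images" if images else "text",
        "responseText": text,
        "images": images or [],
    }
    return json.dumps(payload)


reply = build_response(
    "Found 1 image of Aurelia aurita.",
    images=[{"url": "https://example.org/image-1.png", "concept": "Aurelia aurita"}],
)
print(reply)
```

A single envelope with an explicit type field lets the frontend pick a renderer (text, image grid, chart) without inspecting the payload contents.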

We use an LLM (GPT-3.5) to understand the user prompt and determine which functions to call (e.g., name resolution, SQL generation, fetching the taxonomy tree).
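The function-selection step can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the project's implementation: the function names `resolve_name` and `get_taxonomy_tree`, their signatures, and the shape of the tool-call JSON are all hypothetical, and the model's output is simulated with a hard-coded string rather than a real API call.

```python
import json


# Hypothetical tools; names, signatures, and data are illustrative only.
def resolve_name(description: str) -> list:
    """Placeholder for name resolution: returns matching scientific names."""
    lookup = {"moon jelly": ["Aurelia aurita"]}
    return lookup.get(description.lower(), [])


def get_taxonomy_tree(scientific_name: str) -> dict:
    """Placeholder for a taxonomy lookup."""
    return {"name": scientific_name, "children": []}


# Registry mapping the function names the LLM may choose to Python callables.
TOOLS = {
    "resolve_name": resolve_name,
    "get_taxonomy_tree": get_taxonomy_tree,
}


def dispatch(tool_call_json: str):
    """Run the function the LLM selected, given {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown function: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Simulated model output; a real system would obtain this from the LLM.
simulated_call = '{"name": "resolve_name", "arguments": {"description": "Moon jelly"}}'
result = dispatch(simulated_call)
print(result)
```

Keeping the registry explicit means the model can only trigger functions that the backend has deliberately exposed.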

Name resolution converts common names or descriptions into scientific names that can be used to fetch data from the FathomNet database. We use knowledge graph alignment between the prompt and the species data to resolve descriptions (e.g., habitat, morphology, predator/prey relations).
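As a rough intuition for aligning a description with species data, the sketch below scores each species by how many (relation, value) facts it shares with the prompt. This simple overlap count is a stand-in chosen for illustration, not the alignment method FathomGPT uses, and the tiny `SPECIES_FACTS` table is made up for the example rather than taken from FathomNet.

```python
# Toy knowledge graph: each species maps to a set of (relation, value) facts.
# Illustrative entries only, not FathomNet data.
SPECIES_FACTS = {
    "Aurelia aurita": {
        ("habitat", "coastal"),
        ("morphology", "bell"),
        ("prey", "plankton"),
    },
    "Vampyroteuthis infernalis": {
        ("habitat", "deep sea"),
        ("morphology", "webbed arms"),
        ("prey", "marine snow"),
    },
}


def resolve(prompt_facts: set, top_k: int = 1) -> list:
    """Rank species by the number of facts shared with the prompt."""
    scores = []
    for name, facts in SPECIES_FACTS.items():
        overlap = len(prompt_facts & facts)
        scores.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [name for overlap, name in scores[:top_k] if overlap > 0]


# Facts assumed to have been extracted from a prompt such as
# "a deep sea animal with webbed arms".
matches = resolve({("habitat", "deep sea"), ("morphology", "webbed arms")})
print(matches)
```

In practice the prompt facts would be extracted by a model and the matching would need to tolerate synonyms and partial matches, which exact set intersection does not.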

We use an LLM to convert natural language prompts into SQL queries. We fine-tuned it to work specifically for the FathomNet database.
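The sketch below illustrates the general shape of a text-to-SQL step: a prompt that includes the schema, followed by a check and execution of the generated query. Several things here are assumptions: the `observations` table and its columns are a simplified, hypothetical schema, the model's answer is hard-coded instead of being produced by a fine-tuned LLM, and the single-SELECT check is an added precaution rather than something this project is known to do.

```python
import sqlite3

# Hypothetical, simplified schema; the real FathomNet schema may differ.
SCHEMA = """
CREATE TABLE observations (
    id INTEGER PRIMARY KEY,
    concept TEXT,
    depth_meters REAL,
    temperature_celsius REAL
);
"""


def build_prompt(question: str) -> str:
    """Combine the schema and the user question into one LLM prompt."""
    return f"Schema:\n{SCHEMA}\nWrite one SQLite SELECT statement answering: {question}"


def is_safe_select(sql: str) -> bool:
    """Accept only a single SELECT statement (an illustrative guard)."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Validate and execute SQL produced by the model."""
    if not is_safe_select(sql):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# In-memory database with two made-up rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO observations (concept, depth_meters, temperature_celsius) VALUES (?, ?, ?)",
    [("Aurelia aurita", 12.0, 14.5), ("Vampyroteuthis infernalis", 850.0, 4.1)],
)

# Stand-in for the model's answer to "Which concepts were seen below 500 m?"
generated_sql = "SELECT concept FROM observations WHERE depth_meters > 500;"
rows = run_generated_sql(conn, generated_sql)
print(rows)
```

Including the schema in the prompt gives a general-purpose model the table and column names it needs; fine-tuning on database-specific examples, as described above, reduces how much of that context must be supplied each time.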