Project summary

Funding - £500k via the UKRI DRI Phase 2 call

Timeline - October 2025 to March 2026

Aim - FRAME-FM aims to enable end users in economic sectors including energy, food, finance and logistics to process diverse, complex and very large (petabyte-scale) environmental datasets quickly and easily. We will achieve this by delivering a framework that facilitates the development of Foundation Models (FMs): machine learning (ML) models that can encapsulate the information contained in large datasets, such as those held in the data archives of NERC’s Environmental Data Service (EDS), and can be fine-tuned to perform specialised tasks. The framework will enable experts in environmental science to create such models without requiring deep expertise in ML. Within the project, we will demonstrate how such models can be built and applied to real-world use cases.

Core project team - Alberto Arribas Herranz, Ag Stephens, Lily Gouldsbrough, Andrew Kingdon

Previous research - The project runs alongside the High5 for AI project. Some of the work, on making environmental data accessible to ML workflows, follows on from the previous EDS project, API4AI.

Main goals of the project

  • Develop a software framework to enable building of ML models from environmental data
  • Explore and develop improved interfaces to large geospatial datasets
  • Demonstrate use of the framework using real datasets
  • Demonstrate how the resulting ML models can be applied to downstream tasks

Why are we doing this? / What are we trying to solve?

The UK Government and UKRI have identified AI as a strategic priority for growth across a range of sectors, including science. AI and ML have the potential to help address some of the biggest environmental challenges. At present, the environmental research community is building its capabilities in this domain, but many scientists are constrained both by a lack of ML knowledge and by technical barriers that make it difficult to get started.

What are we doing to fix it? (In this project)

FRAME-FM aims to lower the barriers to accessing ML for a range of environmental scientists. We will do this by:

  • Providing a software and configuration framework that allows scientists to plug their datasets and models together without detailed knowledge of compute platforms, storage systems and data formats/APIs
  • Providing recipes and guidance on selecting appropriate techniques, model architectures and datasets
  • Improving interfaces to large geospatial datasets for ML workflows

Why it matters

For society

Global environmental issues, such as climate change, are having a profound effect on communities, economies and the Earth system. Improving our understanding, prediction and management of the environment has never been more important.

For users of the data (e.g. researchers, developers) 

  • Scientists can get to the data, build models and apply them to real-world issues in less time
  • Programmers and data scientists can plug into the data interfaces and components of the framework to speed up their development and use of environmental data
  • Improved efficiency in the configuration and training of ML models can reduce the energy and financial costs associated with this work
  • Decision-makers can benefit from the faster and more accurate outputs that result

For the EDS

  • UK research outputs, as disseminated by the EDS, are made more readily available
  • Improved data interfaces can be applied to more datasets in future
  • EDS scientists and data scientists are upskilled and can explore further use of ML to improve their services and outputs

Project reports 

To be added after project end date.

Next steps for FRAME-FM

To be added after project end date.