Making IPCC data FAIR
It is indisputable that human activities are causing climate change, making extreme climate events, including heatwaves, heavy rainfall, and droughts, more frequent and severe.
This was the conclusion announced by the Intergovernmental Panel on Climate Change (IPCC) as part of its Sixth Assessment Report, released in August 2021. The IPCC provides policymakers with regular scientific assessments of climate change.
For the first time, the data underpinning the key figures of the Summary for Policymakers (SPM) section of the report were made freely available by staff from the NERC Environmental Data Service (EDS) on the same day as the publication of the report. The EDS staff, based at the Centre for Environmental Data Analysis (CEDA), were critical members of the team that ensured these data were instantly and easily accessible to everyone.
Work to publish the data is ongoing, and CEDA staff expect to receive data for approximately 150 figures in total. The IPCC Secretariat have described the team’s contribution as "an important milestone in the open science dimension of IPCC." Real-time data publication of key figures from important reports is a service that the EDS can offer in the future - ensuring data are transparent and available for those who need them.
IPCC data need to be easily understood by a non-technical audience, especially the SPM figures that are used by policymakers and media outlets. The FAIR principles - standing for Findable, Accessible, Interoperable, and Reusable - are key guidelines the EDS aims to follow. The role of the EDS was to act as data custodian and to advise on following the FAIR principles, for example by clearly describing the data so that they are easily Findable, and by providing best-practice examples of this.
Catalogue records, containing metadata (information about the data), put the data into context and explain technical terms. They provide a Digital Object Identifier (DOI) to ensure data are easily findable and citable, and allow data authors to be credited for their work.
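As a rough illustration of what such a record might hold, the sketch below models a catalogue record as a small Python data class and lists any fields that are still empty. The field names, the example values and the `missing_fields` helper are all hypothetical, chosen only to mirror the description above; they are not the CEDA catalogue schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CatalogueRecord:
    """Hypothetical catalogue record; field names are illustrative only."""
    title: str = ""
    abstract: str = ""   # plain-language description that gives context
    doi: str = ""        # persistent identifier used to find and cite the data
    authors: list = field(default_factory=list)   # people to credit
    glossary: dict = field(default_factory=dict)  # technical term -> explanation


def missing_fields(record):
    """Return the names of fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values, not a real record.
record = CatalogueRecord(
    title="Example figure data",
    abstract="Data behind an example figure.",
    authors=["A. Author"],
)
print(missing_fields(record))  # ['doi', 'glossary']
```

A check of this kind is one simple way a record could be compared against an agreed template before publication.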
To ensure consistency across the data, CEDA staff worked closely with the technical support team at the IPCC to develop a ‘gold standard’ example of a metadata catalogue record. The technical team used this example in training workshops for the report authors, helping them to prepare data in a standardised way.
EDS experts helped to provide consistency checks based on standard conventions that were used in the main report. These quality checks ensured the data provided were relevant and clear.
To aid data re-use and maximise the impact of the report, journalists were provided with a ‘press package’ a few days before the report was officially released. This included the first snapshot of data, allowing media outlets to make graphs in their own style for their own audiences. The first release of the data proved to be very popular - it accounted for 40% of webpage views across all CEDA services in the week following the release of the report.
The versioning system our experts applied to the data enabled us to ‘upgrade’ the catalogue records with rich metadata. The first ‘snapshots’, or versions, of the data had compromises on interoperability. For example, detailed file metadata was included in a 'readme' file rather than in each data file. This was the most suitable area for compromise as the first data versions only needed to be understood by people as opposed to computers; formatting for computer interoperability was not an initial priority. IPCC reports have a stringent and long review process which includes iterations made by scientists and governments over many months, and can include changes to the data file metadata. CEDA's versioning system can accommodate these changes.
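The step from a readme-based first version to a later version with metadata in each data file could look something like the sketch below. It is only an illustration under stated assumptions: the `embed_metadata` helper, the CSV layout, the comment-line convention and all values are hypothetical placeholders, not the format or tooling CEDA used.

```python
def embed_metadata(csv_text, metadata):
    """Prefix CSV content with '# key: value' comment lines."""
    header = [f"# {key}: {value}" for key, value in metadata.items()]
    return "\n".join(header) + "\n" + csv_text


# Version 1: metadata lives only in a separate readme (modelled here as a dict).
# All names and values below are placeholders.
readme_metadata = {"units": "example unit", "source": "example figure"}
v1_file = "year,value\n2000,0.0\n"

# Version 2: the same metadata is written into the data file itself,
# so software can read it without consulting the readme.
v2_file = embed_metadata(v1_file, readme_metadata)
print(v2_file)
```

Keeping the data rows unchanged between versions, and only adding metadata, means a later version stays comparable with the first snapshot.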
The work undertaken by EDS staff ensured that the data behind the IPCC assessment were freely available for anyone to re-use on the same day as report publication - a first for the IPCC.
Read the report here
Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.), 2021. Summary for Policymakers. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, pp. 3−32. DOI: http://doi.org/10.1017/9781009157896.001
Further information on IPCC data management
Stockhause, M., Al Khourdajie, A., Alegria, A., Chen, R., Huard, D., Juckes, M., Pascoe, C., Pirani, A., Matthews, R., Poloczanska, E., 2020. IPCC Sixth Assessment approaches towards FAIR data and an enhanced data reuse. Earth and Space Science Open Archive. DOI: https://doi.org/10.1002/essoar.10504782.1
Stockhause, M., Juckes, M., Chen, R., Moufouma Okia, W., Pirani, A., Waterfield, T., Xing, X. and Edmunds, R., 2019. Data Distribution Centre Support for the IPCC Sixth Assessment. Data Science Journal, 18(1), p.20. DOI: http://doi.org/10.5334/dsj-2019-020
Further examples of data re-use and supporting open science
This thread on Twitter, @openclimatedata, showcases multiple examples that demonstrate the importance of open data.