Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens


UEA CRU climate data is a free data issue too

I’ve been researching the apparent hack of the University of East Anglia’s Climatic Research Unit (CRU), in which a huge amount of email going back more than a decade, plus huge numbers of documents, has been released onto the internet. They’re indexed in searchable form on various sites and through Wikileaks, for example.

What I find interesting is some of the discussion around it. There have been multiple freedom of information (FOI) requests to the CRU from people who want to examine the underlying data behind its analysis of human-driven global warming.

You’d think it would be straightforward. Science operates by data leading to theory leading to prediction leading to test against data, with a parallel process of independent test against the same data. So you’d think that access to the data would be a key thing.

Lots of FOI requests have thus come into the CRU demanding (that’s the word) the original data used for the papers. But the CRU has turned them down. Why? Because, UEA says, the data came from weather organisations which charge for their datasets and restrict those datasets’ redistribution.

Read for yourself at the CRU Data Availability page:

Since the early 1980s, some NMSs [national meteorological services], other organizations and individual scientists have given or sold us (see Hulme, 1994, for a summary of European data collection efforts) additional data for inclusion in the gridded datasets, often on the understanding that the data are only used for academic purposes with the full permission of the NMSs, organizations and scientists and the original station data are not passed onto third parties.

And:

In some of the examples given, it can be clearly seen that our requests for data from NMSs have always stated that we would not make the data available to third parties. We included such statements as standard from the 1980s, as that is what many NMSs requested.

The inability of some agencies to release climate data held is not uncommon in climate science. The Dutch Met Service (KNMI) run the European Climate Assessment and Dataset (ECA&D, http://eca.knmi.nl/) project. They are able to use much data in their numerous analyses, but they cannot make all the original daily station temperature and precipitation series available because of restrictions imposed by some of the data providers

CRU insists it wants to make the data available:

We receive numerous requests for these station data (not just monthly temperature averages, but precipitation totals and pressure averages as well). Requests come from a variety of sources, often for an individual station or all the stations in a region or a country. Sometimes these come because the data cannot be obtained locally or the requester does not have the resources to pay for what some NMSs charge for the data. These data are not ours to provide without the full permission of the relevant NMSs, organizations and scientists. We point enquirers to the GHCN web site. We hope in the future that we may be able to provide these data, jointly with the UK Met Office Hadley Centre, subject to obtaining consent for making them available from the rights holders. In developing gridded temperature datasets it is important to use as much station data as possible to fully characterise global- and regional-scale changes. Hence, restricting the grids to only including station data that can be freely exchanged would be detrimental to the gridded products in some parts of the world.

The problem arises because the centre has been running in this way since the 1980s, before the internet had reached most universities, and when the culture of “pay for data” (because data was so hard to acquire, and so jealously guarded) was much more ingrained.

But it is a problem that needs to be overcome. The CRU has all sorts of PR difficulties because it hasn’t grasped this nettle, and it needs to grasp it to finally get past the questions about its research. Some people aren’t satisfied with being told that the data needed to check a scientific paper can’t be passed on because of long-lost contracts. (We wouldn’t be very impressed if we were told that either.)

Paying for public data: it’s never a good idea. Especially when it creates problems like this.
