Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens

Archive for the 'General' Category

Freeing the oceans’ data: new Estonian startup aims to do just that

Tuesday, July 10th, 2012

Ben Rooney, writing in the Wall Street Journal in an article titled “Public Ocean Data Made Available”:

An Estonian start-up is offering a single point of access to much of the world’s available oceanographic data.

Marinexplore, based in Tallinn, has pulled together a large number of public and commercial datasets, to provide over three petabytes (3 million gigabytes) of public ocean data.

“The tools and working processes for ocean exploration have changed little over the last 15 years,” said Rainer Sternfeld, founder and CEO of Marinexplore.

That’s a hell of an ambitious plan. It’s not just the cost of getting oceanographic data out there; it’s also the challenge of getting around the multitude of copyright entanglements in them. When the Free Our Data campaign was in full swing (or doing its jazz hands if you prefer), we prodded at the UK Hydrographic Office a few times – principally over tide data, since that’s hardly what you’d call “international” – but soon discovered that the people there are even more aware of the problems in disentangling quite who owns what. Making the UKHO data free might end up just meaning less data availability altogether.

“Most of the data comes from public sources”, the story says, which is telling in itself. There are lots of sources, but the problem often comes about in determining which is canonical – that is, which ones you really trust. One of the challenges for European countries has been making sure that simple things like sea levels are the same across borders. It’s a bit embarrassing if the data say it dips or rises by a few centimetres when you take one step along the shore from one country to another.

Which is why it’s no surprise to find that near the end of the story we learn that

Most of the data is from public data sets such as the US NOAA (National Oceanic and Atmospheric Administration) stations.

Yes, that’s the US taxpayers-pay-government-collects-and-gives-back model again.

This of course means that the datasets, when it comes to locations such as the UK, may be severely restricted in use, detail and timeliness. We haven’t examined it yet – if anyone gets a chance to look at Marinexplore’s data and assess it, please let us know.

(As a sidenote: isn’t it fascinating how the US government and its citizens are perfectly happy to run a system where everyone contributes to the cost of collecting data about the oceans and then gets it back for free at the point of use – even if some won’t use it, for example because they live in landlocked states – but cannot countenance the idea of everyone contributing to the cost of healthcare which they then get back for free at the point of use? One is the people’s right to what their tax dollars are spent on; the other is, apparently, socialism. Beats us. We still like the US model for data payment/collection/use.)

Environment Agency… sells our data?

Wednesday, June 15th, 2011

We asked about a year ago whether we could declare the campaign done, finished, over. And the answer was clearly no, we can’t.

We said at the time that we’d consider it done when the Environment Agency began making its data available for free.

Well, the big red phone has been flashing in the Free Our Data batcave. Things are not how they should be. We’re hearing reports that it is selling data.

Here’s an email we received today:

“As you might know, the Environment Agency in England and Wales manage/collect/produce several datasets. Back in 2007 you wrote an article about ononemap using EA’s flood maps and how they forced the website admin to remove it from ononemap website.

(Yes, we remember that.)

“Unfortunately, EA has gone far beyond that.” (We’ve redacted the rest for confidentiality reasons.)

We’ll file a Freedom of Information request to see if we can get more details. Stay tuned.

Vive les données publiques ouvertes!

Tuesday, July 13th, 2010

Many thanks to the organisers of the excellent LIFT10 conference in Marseille for inviting us to tell the story of our campaign. There’s an account here:

Best of luck to our co-campaigners in France and elsewhere.

Is the campaign won? What do you think?

Friday, June 4th, 2010

Choices. Photo by anemoneprojectors on Flickr. Some rights reserved
Simon Rogers writes at the Guardian about the release of a snapshot of the Treasury’s COINS (Combined Online Information System):

“known universally as Coins [it] is the most detailed record of public spending imaginable. Some 24m individual spending items in a CSV file of 120GB presents a unique picture of how the government does its business.”

That 120GB is partly an artefact, though: the file was supplied in UTF-16, which would be great if it were encoding Mandarin, but which here simply meant that every second byte was null – so the Guardian’s developers did a quick conversion to UTF-8, halving the size at a stroke.

He notes:

“This is a different kind of database. It shows how the government actually works; the millions of tiny items that make up the billions of public expenditure every year. It could well be the government’s largest database: if you know of anything of equivalent size and complexity let us know – we can’t come up with anything.”

And then he comments:

“It was only 2006 that the Guardian launched the Free Our Data campaign to push for the government to release public data that we’ve paid for but was previously hidden behind paywalls or official secrecy. Now, that battle is won.”

That’s an interesting one. For me, Coins isn’t actually the marker for the point where we can hang up our campaigning clothes. The Ordnance Survey data release was a huge step. But actually, for me the point at which I’ll feel we’ve really reached the parts that we want to reach is when the Environment Agency makes its flood map data available for free commercial re-use. (I haven’t asked Mike Cross. Perhaps he’ll pitch in.)

But what do you think? It’s your campaign too. Is everything that needs to be done, done? Or is there more to be done, and if so, what?

Local council? Want to publish your data? Here’s how

Wednesday, June 2nd, 2010

In a very timely fashion, a blogpost has appeared explaining, for any local authorities who want to know (and are listening/reading), how to publish itemised local authority expenditure.

Publishing raw data quickly is an immediate priority, but in the medium term local authorities should work towards structured, regularly updated data published on the Web using open standards. Subject to other issues below, our immediate advice to local authorities is:

  • Users will be interested in the core information held in the accounts system – such as expenditure code, amount paid, transaction date, beneficiary, and payment reference number. The expenditure code has to be explained and steps taken to help users identify the …
  • As a first stage, publish the raw data and any lookup table needed to interpret it in a spreadsheet as a CSV or XML file as soon as possible. This should be put on the council’s website as a document for anyone to download. Or even published in a service such as Google Docs
  • There is not yet a national approach for publishing local authority expenditure data. This should not stop publication of data in its raw, machine-readable form. Observing such raw data being used is the only route to a national approach, should one be required
  • Publishing raw data will allow the panel and others to assess how that data could/should be presented to users. Sight of the data is worth a hundred meetings. Members of the panel will study the data, take part in the discussion and revise this advice.
  • As a second stage, informed by the discussion, the panel and users can then give feedback about publishing data (RDF, CSV, etc) in a way that can be consistent across all local authorities, involving structured, regularly updated data published on the Web using open standards.
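As a sketch of that first stage, here is how a raw expenditure extract with the core fields listed above might be written out as CSV. The field names and records are our own illustration, not any published council schema:

```python
import csv

# Illustrative records only: real data would come from the council's
# accounts system. Fields follow the core information listed above.
payments = [
    {"expenditure_code": "ENV01", "amount_paid": "1250.00",
     "transaction_date": "2010-05-14", "beneficiary": "Acme Paving Ltd",
     "payment_reference": "PR-000123"},
    {"expenditure_code": "EDU02", "amount_paid": "760.50",
     "transaction_date": "2010-05-17", "beneficiary": "Books Direct",
     "payment_reference": "PR-000124"},
]

# newline="" is required by the csv module to avoid blank rows on Windows.
with open("expenditure.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=list(payments[0]))
    writer.writeheader()        # first row names the columns
    writer.writerows(payments)  # one row per payment over the threshold
```

The lookup table needed to interpret the expenditure codes could be published the same way, as a second CSV alongside.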

There’s more too, about the experiences of some of the councils which have already published data. Well worth reading.

The open data flood begins here: Cameron writes to central and local government

Wednesday, June 2nd, 2010

OK – stand by: the data flood is about to get started.

David Cameron has declared his intention to make a lot more data open. First there was the podcast (with, helpfully, transcript), and then a letter to government departments.

Remember that this is following on from the Government Transparency page of the Coalition programme. (We added our own comment there, as you’ll see.)

The podcast first – which was recorded on the Saturday as David Laws was caught in the web of expenses:

“If there’s one thing I’ve noticed since doing this job, it’s how all the information about government – the money it spends, where it spends it, the results it achieves – how so much of it is locked away in a vault marked sort of private for the eyes of ministers and officials only.

“I think this is ridiculous. It’s your money, your government, you should know what’s going on.

“So we’re going to rip off that cloak of secrecy and extend transparency as far and as wide as possible. By bringing information out into the open, you’ll be able to hold government and public services to account. You’ll be able to see how your taxes are being spent. Judge standards in your local schools and hospitals. Find out just how effective the police are at fighting crime in your community.

“Now I think that’s going to do great things. It’s certainly going to save us money.”

That’s just an extract – it’s quite forceful, though of course there’s a big gap between speaking forcefully and actually getting a result. (It’s called politics.)

Then there’s his letter that he wrote to government departments:

Central government spending transparency

  • Historic COINS spending data to be published online in June 2010.
  • All new central government ICT contracts to be published online from July 2010.
  • All new central government tender documents for contracts over £10,000 to be published on a single website from September 2010, with this information to be made available to the public free of charge.
  • New items of central government spending over £25,000 to be published online from November 2010.
  • All new central government contracts to be published in full from January 2011.
  • Full information on all DFID international development projects over £500 to be published online from January 2011, including financial information and project documentation.

Local government spending transparency

  • New items of local government spending over £500 to be published on a council-by-council basis from January 2011.
  • New local government contracts and tender documents for expenditure over £500 to be published in full from January 2011.

Other key government datasets

  • Crime data to be published at a level that allows the public to see what is happening on their streets from January 2011.
  • Names, grades, job titles and annual pay rates for most Senior Civil Servants with salaries above £150,000 to be published in June 2010.
  • Names, grades, job titles and annual pay rates for most Senior Civil Servants and NDPB officials with salaries higher than the lowest permissible in Pay Band 1 of the Senior Civil Service pay scale to be published from September 2010.
  • Organograms for central government departments and agencies that include all staff positions to be published in a common format from October 2010.

Can’t wait, personally. COINS will be an amazing resource – see what the Guardian’s Datablog has to say on that.

An interesting side note on the sort of persuasion that still needs to be done can be found among the comments on that Government Transparency page, from “Algernon Arry”:

“Has the policy of forcing councils to publish details of their spends over £500 been thoroughly thought through. A Government committed to reducing waste and here it is proposing that local councils add to their bureaucratic processes. If this is forced onto local councils I forecast that they will need an army of bureaucrats to carry it out.”

“Logistically, think of all the times when a payment greater than £500 will be made. A couple come immediately to mind. What about everytime a child or a vulnerable person is taken into care, we all know the cost of nursing/care homes do you expect their details to be given out? (Probably covered by Data Protection legislation, so that will have to be changed.) Or everytime an order is placed to repair an piece of defective tarmac on a road surface. And in both cases what added value does it provide, and honestly how many residents are actually ineterested [sic]. Most councils have scrutiny and audit committees where these issues can be debated.

“I don’t want to pay a penny more in Council-tax, so the Government can make an instant saving by dropping this stupid idea.”

The idea that it will take extra human effort to make this happen, and that it will have to be done by hand – rather than being a question of adding an output stream between the existing accounting system and the web – may also have some traction among some of the more hidebound councils, one suspects. Data wranglers beware; some concrete examples may be needed.

And quote of the day goes to…

Wednesday, March 24th, 2010

A simply marvellous response from Martin Daly, chief technology officer of Cadcorp, commenting on the AGI’s response to the Prime Minister (read it here, because you’ll want to).

He remarked on Twitter:

Loving this AGI letter re OS … with pp-d signature. It takes some cojones to pp a letter to THE FREAKING PRIME MINISTER.

And you know what? He’s absolutely right. The second signature is pp’d. “Yeah, Prime Minister, I want you to read it, but you know what? I can’t actually be there to sign it. Can’t think how I could electronically put my signature on it either. So I got that person in the office who does stuff to sign it.”

Roll on, digital age. But mind you bring my quills too.

ESRI to government: aren’t you being a little hasty in making this OS data free?

Wednesday, March 24th, 2010

ESRI has sent an open letter to government – with a number of co-signatories who are in effect competitors – fretting about the proposal (commitment really) to make a number of OS datasets free.

Below is the letter, which I’ve blockquoted to make it clear what’s the letter and what’s not. I’ve inserted some comments, based on my personal knowledge; and of course some of this is coloured by my advocacy of the Free Our Data campaign.

Dear Sir,

Following the Prime Minister’s announcement on November 17th 2009 which set out his vision for Making Public Data Public, a consultation has been taking place on three options for making certain Ordnance Survey datasets available for free with no restrictions on re-use. Submissions to the consultation must be received by March 17th.

We the undersigned, represent companies who employ 630 staff in the UK GIS industry. We represent over 50% of this market and as such are effectively competitors. Whilst each of our companies has submitted an individual response to the consultation, we are writing this letter to express our shared, serious concerns about both the manner in which this consultation is being undertaken and the potential negative impacts that could result, not for our companies, but for the Ordnance Survey and for the UK economy.

Before saying anything else, we wish to record our full support for the role of the Ordnance Survey as a world leader in the collection, maintenance and distribution of the highest quality geographic information and mapping.

CA: That’s in common with the Free Our Data campaign, which has always praised the OS’s work in collecting data and said that it should remain in government ownership.

The Government press release entitled “Re-mapping the future for Ordnance Survey – making public data public” stated that “…any change would be implemented from April 2010.” We insist that following the close of the consultation on 17th March, adequate time be allowed for a full analysis of the submissions, prior to any decision being taken. We are very concerned that the decision to release a selection of mid- and small- scale products for free appears to have already been made by the Prime Minister’s office.

CA: Actually, it was made before the consultation – he announced it back in November. The consultation was in effect asking if more should be done, or if there might be some overwhelming reason why the old OS model should be retained, with all the fun of derived data restrictions and so on that have given us so much enjoyment over the years.

The consultation document states that were this decision to be taken, it would “drive improved transparency and accountability of government and, by facilitating greater innovation, create new economic and social value”. Whilst this may drive transparency and create social value, we do not consider that any significant economic value would result.

CA: This is an interesting assertion, but it’s not backed up by any evidence. It was trivial for us to show that it costs more to restrict the use of the CodePoint database by Jobcentre Plus than the restriction actually benefits the economy (because the fees paid to lawyers are greater than the cost of the database licence, and of the benefits that would be paid to someone who can’t find a job). As for mid-scale data, it’s similarly easy to argue that people who are put off by (a) licence prices and (b) derived data rules would embrace the economic possibilities of free data.

Another question might be: what’s the specific difference between social value and economic value?

We work on a daily basis with a very wide range of public and private sector customers who need large scale Ordnance Survey data in order to improve the effectiveness and efficiency of their organisations. Currently the cost of this data puts it beyond the reach of many organisations, particularly in the private sector. If we are to create real economic value for the UK, it is the availability of large scale datasets, at reasonable prices, that needs to be addressed, rather than releasing mid- and small- scale datasets for free.

CA: the consultation does point out that the public sector has been getting its data cheap, and the private sector in effect overcharged. That’s likely to be realigned – but equally, some public sector organisations may opt out of some OS products.

The consultation document says very little about the revenue impact on the Ordnance Survey of releasing datasets for free. The Ordnance Survey is a Trading Fund which is completely self-funded from the revenues it receives from licensing its datasets. If any of these datasets were to be made freely available, this lost revenue would need to be replaced. We would be extremely concerned if this were achieved by increasing the charges for large scale datasets to either the public or private sector or both.

CA: basically, the concern is that prices of the large-scale [MasterMap level] data will be increased to pay for the free stuff.

We cannot see how any decision to release Ordnance Survey datasets for free can be made without putting the necessary changes in place to compensate the Ordnance Survey for this lost revenue. In the present economic climate we cannot see the Government or Ordnance Survey public sector customers making up this short-fall and our experience tells us that the private sector will not pay more. Furthermore it is already evident that as a result of this proposed change, Ordnance Survey is feeling considerable pressure to generate additional revenue streams, through the rapid introduction of additional innovative and/or automated product lines. These pressures may encourage Ordnance Survey to risk its long term strategies of high quality geospatial data production for the sake of short term revenue gains.

CA: well, this may be one of those extremely rare times where being a journalist, with the access it sometimes provides to decision-makers, is more useful than being in business. From speaking to ministers and the people who are working on making the data free, I’m confident that the Treasury has accepted the idea that it needs to fund the shortfall. (But there’s also an interesting question to be posed to ESRI: if OS data is so price-inelastic – that is, people will pay anything for it – and the GIS data so useful, then why can’t ESRI and the others contemplate a rise in OS pricing? After all, they’d all be using the same datasets, have the same prices, and so be able to pass them on to customers equally, because the usefulness of the datasets would have the same price inelasticity. I suspect ESRI’s sources on this are within Ordnance Survey. Mine are within Whitehall. Whitehall is where the money lives.)

If the Government is serious about making Government geospatial data more readily available, it also needs to look beyond the Ordnance Survey. Current restrictions on the availability of postal address file data from The Royal Mail and Communities and Local Government identifiers for properties and streets need to be removed. Making all such data more readily available would lead to greater innovation and play an important role in placing the UK at the forefront of the knowledge economy in the twenty-first century.

CA: that sounds like a complaint about derived data, but it’s not quite clear what “restrictions… need to be removed” actually means.

We appear to be moving towards a decision to release a selection of mid- and small- scale Ordnance Survey products for free, without due regard being given to the future funding of the Ordnance Survey or whether this will drive real economic growth for the UK. This gives us serious cause for concern.

Dr Richard Waite, managing director, ESRI (UK) Ltd; Howard Papworth, executive director SG&I Western Europe, Intergraph; Dr Michael Sanderson, executive chairman, 1Spatial Group Ltd; Mike O’Neill, managing director, Cadcorp Ltd.

CA: what’s interesting about this is the lack of numbers. I’d have thought there was enough detail in the consultation that these companies could have set out their worries in financial terms – especially with a budget coming. I find the lack of hard numbers intriguing.

OS data: ‘you will like it’

Tuesday, March 23rd, 2010

Yesterday Gordon Brown fired a sort-of starting gun for the election, at least in part, by trying to get the geek vote: loads more non-personal government data to be made available for free commercial reuse, bus franchises to be obliged in future to make timetables available to others, tons of broadband. (Read the speech.)

The interesting clause, of course, being this one:

And following the strong support in our recent consultation, I can confirm that from 1st April, we will be making a substantial package of information held by ordnance survey freely available to the public, without restrictions on re-use. Further details on the package and government’s response to the consultation will be published by the end of March.

(While I was typing this, I had a call from a PR representing a group of users of OS data who are upset that within days of the consultation ending Brown seems to be heading off down one of the routes that is not the “leave it as it was” one. More to come…)

Afterwards I asked Nigel Shadbolt and Michael Wills – the latter being the minister at the Ministry of Justice who was, we think, the first inside government to give serious consideration to the ideas of the Free Our Data campaign (read the interview with him from July 2007).

So, I asked, what’s going to be in the free OS package? Raster data? Or vector data too? To what scale? From Explorer (1:25,000) upwards, or what?

Furrowed brows. But plenty of assurances that “you will like it”. Await more in the coming days…

Ed Parsons collects a list of OS consultation responses – with Goldilocks and elephants galore (updated)

Saturday, March 20th, 2010

Ed Parsons, once upon a time the chief technology officer of Ordnance Survey but now at Google, has blogged about the OS consultation, which he describes as a fairy tale:

The consultation understandably and in my mind quite rightly has remained focused on the specifics of making OS data free, and in the great tradition of Civil Service options papers offers a Goldilocks Choice; one too cold, one too hot and one option just about acceptable.

Option 1 appears to maintain the status-quo and I don’t see anyone outside the Romsey Road distortion field supporting it.

Option 2 is perhaps something that may be achievable in the long term with continued technological change and changing market requirements; however, at the moment this would put Ordnance Survey in a position where its current operational processes are financially unsustainable.

So Option 3 represents the obvious compromise, some small scale data for free while allowing the cash cow of MasterMap to continue to fund a reduced but largely similar OS to the one we have today.

He also points to the elephants in the room – the issue of derived data (which is mentioned only once in the whole consultation, but which impinges crucially on so much geospatial work), and the lack of a national address register:

This is an issue bigger than Ordnance Survey, although OS has had its part to play in the current mess. This really does need strategic leadership from the centre.

Helpfully, he’s also collected a list of so-far published responses to the consultation, which we’ll shamelessly republish here. One large search engine’s response seems to be missing though… must be an oversight.

Let us (and him) know if there are more public responses to the consultation. We’re expecting to hear something relevant on Monday, by the way. More closer to the time.

…and APPSI comes out swinging

Thursday, March 11th, 2010

The government’s Advisory Panel on Public Sector Information has come out with a strong response to the OS consultation.

Headline points: look at the whole picture, not just at OS; resolve the “fundamental contradictions” in information policy; and move towards a free data regime. “In particular, OS should not have any intellectual property rights in derived data.”

Oh, and sort out the “national scandal” of the lack of a comprehensive free address register.

There’s more here.

It’s a good read.

And while we’re talking about Land Registry… here comes UKMap

Friday, March 5th, 2010

Verrry interesting post over at the UKMapping blog. We’ll include the full content because it’s short and very relevant.

Great news was received today at UKMap HQ. After much review and testing Land Registry have confirmed [and are happy for us to tell people] that they are happy to accept registrations based on UKMap.

Great news for all those consultants, government types and property professional using UKMap.

Full text below

“Following a review of UKMap, Land Registry is able to confirm that UKMap meets Land Registry’s requirements as a mapping base on which registration applications can be made and is comfortable accepting registration applications based on UKMap. Such registration applications must still follow relevant guidance as set out in Public Guide 40.”

What does this mean? That LR doesn’t necessarily have to rely on Ordnance Survey maps for registrations. That, too, pours sand into any engines that might be getting started looking to fund OS through higher LR transaction fees. Savvy users would just go to UKMap.

And an earlier post on that same blog includes these interesting new clients:

  • London Fire, serious users with a serious need to get better mapping. Now they have that with UKMap.
  • London Borough of Islington – managing their green environment, UKMap’s trees and Land Use makes all the difference for them.
  • Mott Macdonald – detailed city centre mapping, all in glorious 3D.
  • Promap – the UK’s leading mapping portal for the Land and Property market have announced they are taking UKMap

London councils and emergency services? That’s what I think you call an inroad into OS’s market, isn’t it?

A spoke in the wheels of Land Registry transaction fees to pay for Ordnance Survey?

Thursday, March 4th, 2010

An interesting debate about the prospect of redundancies at Land Registry on Wednesday brought Michael Wills, of the Ministry of Justice – who is also in charge of the (trading fund) Land Registry – to the chamber.

It’s a long debate, but here’s the nitty-gritty:

Transaction levels, which are the key factor for the Land Registry, will have fallen from 16.1 million in 2007-08 to a projected 10 million in 2009-10. The Land Registry receives no central funding because it is a trading fund; it depends on the fees that it receives for services rendered. It made a loss of £130 million in 2008-09 compared with a surplus of about £70 million in 2007-08.

Very interesting – and this comes against a backdrop where it’s being suggested that government is being charged too little for OS data (compared to the private sector); and where there have been suggestions (hi, Robert Barr) that we should have an excess added to LR transactions in order to fund OS.

I think this really needs to be costed. It would be interesting too to see how much Land Registry pays to OS…

How GIS reveals discrimination in urban planning

Saturday, January 2nd, 2010

Does this sound at all familiar?

Not all local governments appreciate the rise of GIS-driven advocacy, especially when their own data is used as a hammer against them, and they have begun to restrict public access. Some have pulled data off the Web in the alleged interest of national security; others charge exorbitant fees to produce it or deliver jumbled masses of data that are difficult to manage or decipher.

Turns out though that it’s not from the UK, but the US, from a fascinating article about how GIS helps to demonstrate discrimination being practised by towns and cities – and how when that is revealed by mapping, the reaction tends not to be to get rid of the discrimination, but to get rid of the troublesome access to the data that reveals it. After all, it’s so much cheaper to do the one than the other:

Mebane, the Cedar Grove Institute’s first case study of municipal discrimination, passed an Infrastructure Information Security Policy shortly after the study was published; the policy limited infrastructure data access to qualified engineering firms and town agencies. The city of Modesto, Calif., locked in a legal underbounding battle, pulled its infrastructure data off the Internet after the lawsuit was filed, citing national security grounds. “There’s no conceivable national security interest in where the traffic lights are in Modesto,” scoffs Ben Marsh, the institute’s chief mapmaker. A recent appellate ruling in California rejected a similar national-security rationale, as well as a copyright argument by Santa Clara County, but whether that opinion stands as precedent remains to be seen.

Though restrictions on access to government data could prove troublesome, advocacy groups that use GIS have already been finding data sources outside of government. In particular, data collected by community residents have become an effective supplement to the “official story,” as University of Washington professor Sarah Elwood calls government data.

Elwood has used GIS not only to map problems but to build the capacity of underserved and disadvantaged communities to advocate on their own behalf. Simple walking surveys that catalogue infrastructural deficiencies — potholes in sidewalks, missing stop signs, burned-out streetlights — fill gaps in the public record that mask actual conditions on the ground. With locally produced data, Elwood says, “You can tell a very detailed and very current, compelling story about neighborhood needs.”

If that reminds you at all of fixmystreet, it ought to – that’s precisely the sort of idea it sprang from.

    UEA CRU climate data is a free data issue too

    Tuesday, December 22nd, 2009

    I’ve been researching the apparent hack of the University of East Anglia’s Climate Research Unit (CRU), in which a huge amount of email going back more than a decade, plus a huge number of documents, has been released onto the internet – they’re indexed in searchable form on various sites and through Wikileaks, for example.

    What I find interesting is some of the discussion around it. There have been multiple freedom of information (FOI) requests to the CRU from people who want to examine the underlying data used to make the analysis about human-driven global warming.

    You’d think it would be straightforward. Science operates by data leading to theory leading to prediction leading to test against data, with a parallel process of independent test against the same data. So you’d think that access to the data would be a key thing.

    Lots of Freedom of Information requests have thus come into the CRU demanding (that’s the word) the original data used for the papers. But the CRU has turned them down. Why? Because, UEA says, it came from weather organisations which charge for their datasets – and restrict those datasets’ redistribution.

    Read for yourself at the CRU Data Availability page:

    Since the early 1980s, some NMSs [national meteorological services], other organizations and individual scientists have given or sold us (see Hulme, 1994, for a summary of European data collection efforts) additional data for inclusion in the gridded datasets, often on the understanding that the data are only used for academic purposes with the full permission of the NMSs, organizations and scientists and the original station data are not passed onto third parties.


    In some of the examples given, it can be clearly seen that our requests for data from NMSs have always stated that we would not make the data available to third parties. We included such statements as standard from the 1980s, as that is what many NMSs requested.

    The inability of some agencies to release climate data held is not uncommon in climate science. The Dutch Met Service (KNMI) run the European Climate Assessment and Dataset (ECA&D) project. They are able to use much data in their numerous analyses, but they cannot make all the original daily station temperature and precipitation series available because of restrictions imposed by some of the data providers.

    CRU insists it wants to make the data available:

    We receive numerous requests for these station data (not just monthly temperature averages, but precipitation totals and pressure averages as well). Requests come from a variety of sources, often for an individual station or all the stations in a region or a country. Sometimes these come because the data cannot be obtained locally or the requester does not have the resources to pay for what some NMSs charge for the data. These data are not ours to provide without the full permission of the relevant NMSs, organizations and scientists. We point enquirers to the GHCN web site. We hope in the future that we may be able to provide these data, jointly with the UK Met Office Hadley Centre, subject to obtaining consent for making them available from the rights holders. In developing gridded temperature datasets it is important to use as much station data as possible to fully characterise global- and regional-scale changes. Hence, restricting the grids to only including station data that can be freely exchanged would be detrimental to the gridded products in some parts of the world.

    The problem arises because the centre has been running in this way since the 1980s – before the internet reached even most universities, and when the culture of “pay for data” (because it was so hard to acquire, and so jealously guarded) was much more ingrained.

    But it is a problem that needs to be overcome. The CRU has all sorts of PR difficulties because it hasn’t grasped this nettle – and it needs to, so that it can finally get past any questions about its research. There are people who aren’t satisfied at being told that the data needed to investigate a scientific paper can’t be passed on because of long-lost contracts. (We wouldn’t be very impressed by that if we were told it either.)

    Paying for public data: it’s never a good idea. Especially when it creates problems like this.