Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens

Archive for June, 2010

Is the campaign won? What do you think?

Friday, June 4th, 2010

Choices. Photo by anemoneprojectors on Flickr. Some rights reserved.

Simon Rogers writes at the Guardian about the release of a snapshot of the Treasury’s COINS (Combined Online Information System):

“known universally as Coins [it] is the most detailed record of public spending imaginable. Some 24m individual spending items in a CSV file of 120GB presents a unique picture of how the government does its business.”

Although it’s only that big because it was presented in UTF-16, which would be great if it were encoding Mandarin (in-joke?), but which actually meant that every second byte was blank – so the Guardian’s developers simply did a quick conversion to UTF-8, halving the size at a keystroke.
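For anyone curious what that conversion looks like, here is a minimal Python sketch. It is not the Guardian's actual code, which isn't described here; the function name, the chunk size and the sample rows are all made up for illustration. It streams the text through in chunks, so a very large file never has to fit in memory, and it shows the roughly 2:1 saving you get when the content is plain ASCII-range text.

```python
import io


def transcode_utf16_to_utf8(src, dst, chunk_chars=1 << 20):
    """Copy text from a UTF-16 byte stream to a UTF-8 byte stream in chunks."""
    # newline="" stops Python from rewriting line endings on the way through.
    reader = io.TextIOWrapper(src, encoding="utf-16", newline="")
    while True:
        chunk = reader.read(chunk_chars)  # up to chunk_chars characters
        if not chunk:
            break
        dst.write(chunk.encode("utf-8"))
    reader.detach()  # hand the underlying stream back without closing it


# Hypothetical sample rows, not real COINS data.
sample = "department,amount\nDept A,1200.50\nDept B,87.00\n" * 1000
utf16_bytes = sample.encode("utf-16")  # includes a 2-byte byte-order mark

out = io.BytesIO()
transcode_utf16_to_utf8(io.BytesIO(utf16_bytes), out)
utf8_bytes = out.getvalue()

ratio = len(utf8_bytes) / len(utf16_bytes)
print(f"UTF-16: {len(utf16_bytes)} bytes, UTF-8: {len(utf8_bytes)} bytes")
```

For real files you would pass two file objects opened in binary mode (`"rb"` and `"wb"`) instead of the in-memory buffers. The halving only holds for text in the ASCII range; accented or non-Latin characters take more than one byte in UTF-8.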

He notes:

“This is a different kind of database. It shows how the government actually works; the millions of tiny items that make up the billions of public expenditure every year. It could well be the government’s largest database: if you know of anything of equivalent size and complexity let us know – we can’t come up with anything.”

And then he comments:

“It was only 2006 that the Guardian launched the Free Our Data campaign to push for the government to release public data that we’ve paid for but was previously hidden behind paywalls or official secrecy. Now, that battle is won.”

That’s an interesting one. For me, Coins isn’t the marker for the point where we can hang up our campaigning clothes. The Ordnance Survey data release was a huge step. But the point at which I’ll feel we’ve really reached the parts that we want to reach is when the Environment Agency makes its flood map data available for free commercial re-use. (I haven’t asked Mike Cross. Perhaps he’ll pitch in.)

But what do you think? It’s your campaign too. Is everything that needs to be done, done? Or is there more to be done, and if so, what?

Local council? Want to publish your data? Here’s how

Wednesday, June 2nd, 2010

In a very timely fashion, has come up with a blogpost explaining for any local authorities who want to know (and are listening/reading) how to publish itemised local authority expenditure.

Publishing raw data quickly is an immediate priority, but in the medium term local authorities should work towards structured, regularly updated data published on the Web using open standards. Subject to other issues below, our immediate advice to local authorities is:

  • Users will be interested in the core information held in the accounts system – such as expenditure code, amount paid, transaction date, beneficiary, and payment reference number. The expenditure code has to be explained and steps taken to help users identify the
  • As a first stage, publish the raw data and any lookup table needed to interpret it in a spreadsheet as a CSV or XML file as soon as possible. This should be put on the council’s website as a document for anyone to download. Or even published in a service such as Google Docs
  • There is not yet a national approach for publishing local authority expenditure data. This should not stop publication of data in its raw, machine-readable form. Observing such raw data being used is the only route to a national approach, should one be required
  • Publishing raw data will allow the panel and others to assess how that data could/should be presented to users. Sight of the data is worth a hundred meetings. Members of the panel will study the data, take part in the discussion and revise this advice.
  • As a second stage, informed by the discussion, the panel and users can then give feedback about publishing data (RDF, CSV, etc) in a way that can be consistent across all local authorities involving structured, regularly updated data published on the Web using open standards.
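The "first stage" in that advice, raw data plus a lookup table as CSV, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names are my reading of the fields listed in the advice, and the codes, payees and amounts are invented, not taken from any council.

```python
import csv
import io

# Column names are assumptions based on the fields named in the advice.
PAYMENT_FIELDS = [
    "expenditure_code",
    "amount_paid",
    "transaction_date",
    "beneficiary",
    "payment_reference",
]
LOOKUP_FIELDS = ["expenditure_code", "description"]

# Hypothetical rows standing in for an export from the accounts system.
payments = [
    {"expenditure_code": "H01", "amount_paid": "1250.00",
     "transaction_date": "2010-05-14", "beneficiary": "Example Surfacing Ltd",
     "payment_reference": "PAY-0001"},
    {"expenditure_code": "L03", "amount_paid": "742.10",
     "transaction_date": "2010-05-17", "beneficiary": "Example Books Ltd",
     "payment_reference": "PAY-0002"},
]

# The lookup table that explains what each expenditure code means.
code_lookup = [
    {"expenditure_code": "H01", "description": "Highways maintenance"},
    {"expenditure_code": "L03", "description": "Library services"},
]


def to_csv(rows, fieldnames):
    """Render a list of dicts as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


payments_csv = to_csv(payments, PAYMENT_FIELDS)
lookup_csv = to_csv(code_lookup, LOOKUP_FIELDS)
print(payments_csv)
print(lookup_csv)
```

Saving the two strings as files on the council's website would cover the "document for anyone to download" part; keeping the lookup table separate means the raw payments file stays exactly as the accounts system produced it.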

There’s more too, about experiences of some of the councils which have published data. Very well worth reading.

The open data flood begins here: Cameron writes to central and local government

Wednesday, June 2nd, 2010

OK – stand by: the data flood is about to get started.

David Cameron has declared his intention to make a lot more data open. First there was the podcast (with, helpfully, a transcript), and then a letter to government departments.

Remember that this is following on from the Government Transparency page of the Coalition programme. (We added our own comment there, as you’ll see.)

The podcast first – which was recorded on the Saturday as David Laws was caught in the web of expenses:

“If there’s one thing I’ve noticed since doing this job, it’s how all the information about government – the money it spends, where it spends it, the results it achieves – how so much of it is locked away in a vault marked sort of private for the eyes of ministers and officials only.

“I think this is ridiculous. It’s your money, your government, you should know what’s going on.

“So we’re going to rip off that cloak of secrecy and extend transparency as far and as wide as possible. By bringing information out into the open, you’ll be able to hold government and public services to account. You’ll be able to see how your taxes are being spent. Judge standards in your local schools and hospitals. Find out just how effective the police are at fighting crime in your community.

“Now I think that’s going to do great things. It’s certainly going to save us money.”

That’s just an extract – it’s quite forceful, though of course there’s a big gap between speaking forcefully and actually getting a result. (It’s called politics.)

Then there’s the letter he wrote to government departments:

Central government spending transparency

  • Historic COINS spending data to be published online in June 2010.
  • All new central government ICT contracts to be published online from July 2010.
  • All new central government tender documents for contracts over £10,000 to be published on a single website from September 2010, with this information to be made available to the public free of charge.
  • New items of central government spending over £25,000 to be published online from November 2010.
  • All new central government contracts to be published in full from January 2011.
  • Full information on all DFID international development projects over £500 to be published online from January 2011, including financial information and project documentation.

Local government spending transparency

  • New items of local government spending over £500 to be published on a council-by-council basis from January 2011.
  • New local government contracts and tender documents for expenditure over £500 to be published in full from January 2011.

Other key government datasets

  • Crime data to be published at a level that allows the public to see what is happening on their streets from January 2011.
  • Names, grades, job titles and annual pay rates for most Senior Civil Servants with salaries above £150,000 to be published in June 2010.
  • Names, grades, job titles and annual pay rates for most Senior Civil Servants and NDPB officials with salaries higher than the lowest permissible in Pay Band 1 of the Senior Civil Service pay scale to be published from September 2010.
  • Organograms for central government departments and agencies that include all staff positions to be published in a common format from October 2010.

Can’t wait, personally. COINS will be an amazing resource – see what the Guardian’s Datablog has to say on that.

An interesting side note on the sort of thing that needs to be done can be found among the comments on that Government Transparency page, from “Algernon Arry”:

“Has the policy of forcing councils to publish details of their spends over £500 been thoroughly thought through. A Government committed to reducing waste and here it is proposing that local councils add to their bureaucratic processes. If this is forced onto local councils I forecast that they will need an army of bureaucrats to carry it out.”

“Logistically, think of all the times when a payment greater than £500 will be made. A couple come immediately to mind. What about everytime a child or a vulnerable person is taken into care, we all know the cost of nursing/care homes do you expect their details to be given out? (Probably covered by Data Protection legislation, so that will have to be changed.) Or everytime an order is placed to repair an piece of defective tarmac on a road surface. And in both cases what added value does it provide, and honestly how many residents are actually ineterested [sic]. Most councils have scrutiny and audit committees where these issues can be debated.

“I don’t want to pay a penny more in Council-tax, so the Government can make an instant saving by dropping this stupid idea.”

The idea that it will take extra human effort to make this happen, and that it will have to be done by hand – rather than being a question of adding an output stream between the existing accounting system and the web – may also have some traction among some of the more hidebound councils, one suspects. Data wranglers beware; it may require some concrete examples.
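In that spirit, here is one possible concrete example: a rough Python sketch of an "output stream" that takes rows already held by an accounting system and emits only the payments over £500. Everything specific in it is an assumption for illustration, including the column names, the sample ledger rows and the `withheld_codes` set, which stands in for whatever categories a council decides it must not publish (the care placements the commenter worries about, say). No real accounting package's export format is implied.

```python
import csv
import io
from decimal import Decimal

THRESHOLD = Decimal("500")  # the "over £500" figure from the proposals

# Assumed column names; a real export would have its own.
FIELDS = ["expenditure_code", "amount_paid", "transaction_date",
          "beneficiary", "payment_reference"]


def publishable(rows, threshold=THRESHOLD, withheld_codes=frozenset()):
    """Yield rows strictly over the threshold, skipping withheld categories."""
    for row in rows:
        if row["expenditure_code"] in withheld_codes:
            continue
        if Decimal(row["amount_paid"]) > threshold:
            yield row


def write_feed(rows, out):
    """Write rows as CSV to a text stream and return how many were written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical ledger rows, invented for this sketch.
ledger = [
    {"expenditure_code": "H01", "amount_paid": "1250.00",
     "transaction_date": "2010-05-14", "beneficiary": "Example Surfacing Ltd",
     "payment_reference": "PAY-0001"},
    {"expenditure_code": "H01", "amount_paid": "500.00",  # not *over* 500
     "transaction_date": "2010-05-15", "beneficiary": "Example Signs Ltd",
     "payment_reference": "PAY-0002"},
    {"expenditure_code": "S12", "amount_paid": "3200.00",  # withheld category
     "transaction_date": "2010-05-16", "beneficiary": "Example Care Home",
     "payment_reference": "PAY-0003"},
    {"expenditure_code": "L03", "amount_paid": "742.10",
     "transaction_date": "2010-05-17", "beneficiary": "Example Books Ltd",
     "payment_reference": "PAY-0004"},
]

out = io.StringIO(newline="")
count = write_feed(publishable(ledger, withheld_codes={"S12"}), out)
feed = out.getvalue()
print(feed)
```

Run on a schedule against the ledger, something like this needs no army of bureaucrats; the human effort goes into deciding once which categories belong in the withheld set.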