Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens


Archive for January, 2007

Ed Parsons, ex-Ordnance Survey: ‘data belongs to citizens’

Friday, January 26th, 2007

Ed Parsons, who left OS at the end of December, has an interesting post following his speech to the e-Government awards about “the potential impact of web 2.0 approaches and the development of mash-up applications to future e-Government service”. (We’d love to see the notes from the speech, Ed, or even a post about it…)

In his post, he points out that

a perfect example of what I was suggesting as a future approach was announced yesterday by the US Environmental Protection Agency, who are taking their first steps by publishing the locations of some contaminated land sites in XML on their website, with the specific intention of allowing citizens to analyse the data themselves. Of course raw data has always been more available in the US and I’m not getting into that debate… the difference here is that by publishing data in XML the EPA are opening up the data for people to manipulate using their own lightweight applications.

I’ll just give you some of the text from that News.com story, because it’s worth quoting directly:

The pilot piece of that effort, posted early Wednesday morning, is a single XML file containing information on about 1,600 locales relegated to the Superfund National Priorities List. As required by Congress since 1980, the EPA uses that list to locate, investigate and clean up the worst-offending landfills, chemical plants, radiation sites and other areas known or suspected of releasing contaminants, pollutants and other hazardous substances.

By the end of the year, the EPA hopes to expand its offerings to include data on at least 100,000 sites from across its many different regulatory programs, including hazardous waste storage and treatment sites, air pollution trends and toxic chemical releases.

And that single XML file is only 146KB, at present, and you can even feed back on data points you think are wrong.
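The point of publishing in XML is exactly this kind of lightweight reuse. The EPA’s real schema isn’t reproduced here, so the element and attribute names below are invented for illustration; this is a minimal sketch of what “citizens analysing the data themselves” looks like in practice:

```python
# Sketch: parsing a hypothetical contaminated-sites feed like the EPA's
# Superfund XML. Element/attribute names are illustrative, not the real schema.
import xml.etree.ElementTree as ET

feed = """
<sites>
  <site name="Example Landfill" lat="51.50" lon="-0.12" status="cleanup"/>
  <site name="Old Chemical Works" lat="51.60" lon="-0.10" status="investigation"/>
</sites>
"""

def parse_sites(xml_text):
    """Return a list of (name, lat, lon, status) tuples from the feed."""
    root = ET.fromstring(xml_text)
    return [(s.get("name"), float(s.get("lat")), float(s.get("lon")), s.get("status"))
            for s in root.findall("site")]

for name, lat, lon, status in parse_sites(feed):
    print(f"{name}: ({lat}, {lon}) - {status}")
```

A dozen lines, and the raw file becomes something a citizen can filter, sort or map.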

Now, with some talk about people in London wondering about radioactivity beneath a potential Olympic site, wouldn’t we all be glad of a similar XML file here? (This is the point I was making to Steven Feldman in the comments of the previous post: while it’s good to have websites with data, what makes the whole system mesh is XML feeds, so that computers can talk to computers, controlled by people who filter the data or decide what to cross-correlate; not just making data available on a website to someone who has to laboriously cross-match sites by hand. The latter is web 1.0; the former, web 2.0.)
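Cross-correlation of that sort is trivial once the data is machine-readable. As a hedged sketch (the site list, venue coordinates and 5km radius are all invented for illustration), here is the whole “which contaminated sites are near the proposed venue?” question:

```python
# Sketch: filtering a feed of sites to those near a point of interest,
# e.g. a proposed Olympic venue. All figures here are illustrative.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

sites = [("Site A", 51.54, -0.02), ("Site B", 52.48, -1.90)]  # (name, lat, lon)
venue = (51.5386, -0.0166)  # roughly Stratford

nearby = [name for name, lat, lon in sites
          if haversine_km(venue[0], venue[1], lat, lon) < 5]
print(nearby)  # Site A is within 5 km; Site B (Birmingham) is not
```

With a feed, that check runs itself every night; with a website, someone does it by hand, once.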

Anyway… Ed concludes

we should not forget that they are simple and cheap approaches to providing greater levels of information to the citizen by allowing the citizen to carry out the analysis themselves.

Another key point I made was that the next generation of citizens, “Generation Y” if you like, are in many ways more open to sharing data than today’s, having grown up defining their characters online on MySpace and Bebo. However, this willingness to share data with others, even government, comes from the fact that as authors they “own” their own data and are free to modify, correct and update it.

For anyone delivering the citizen services of the future here is an important lesson – it is NOT your data, it is the citizens’ and they must feel true ownership of it.

You know what? We agree. And since we own those data… we should have unfettered access to them. [Update: Ed clarifies his definition in the comments for this post: he means citizen-*provided* data such as names and addresses.]

Local planning applications: free data, but hard to collate… until now

Thursday, January 25th, 2007

We haven’t focussed much on local government so far in this campaign, because it has to be said that many local councils do pretty good work in terms of making data available – possibly because they’ve often, in the past, been required to do so.

Planning applications, for example, are generally available on the web. The problem is finding out whether one in your area will affect you: if it’s a few streets away you might well care about it, but you won’t receive a letter telling you about it.

Which drove Richard Pope to write Planningalerts.com (note you need the “www”, as in the link; the http://planningalerts.com site is just a placeholder). It took him five days over Christmas, but he reckons the template – which uses a screen scraper – can be applied very widely. So although he’s only got 41 or so councils hooked up, the other 300 or so should be quite easy to add, because there are a limited number of software packages used by councils to put planning data online.

Pope’s idea: you put in your postcode and email, and the site will contact you when something is applied for in your area.
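Planningalerts’ internals aren’t published, so the following is only a sketch of the matching step at the heart of such a service – the subscriber list, field names and prefix-matching rule are all invented for illustration:

```python
# Sketch of an alert-matching step: which subscribers should be emailed
# about a newly scraped planning application? Names and the postcode
# prefix rule are invented; the real site's internals aren't published.
subscriptions = {
    "alice@example.org": "E3 2",   # postcode prefix the subscriber cares about
    "bob@example.org": "SW1A",
}

def subscribers_for(application_postcode):
    """Return the emails whose registered prefix matches the application's postcode."""
    return [email for email, prefix in subscriptions.items()
            if application_postcode.startswith(prefix)]

print(subscribers_for("E3 2AB"))  # ['alice@example.org']
```

The scraping is the fiddly part, which varies per council software package; the alerting, as above, is almost nothing.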

Simple? Yes (by computing standards). Clever? We think so. Could it be done by government? Well, it sort of is – except at much greater expense, and with the data put into the hands of a commercial company (Emap) which says it retains the copyright on the data it offers through the National Planning Application Register.

In Don’t panic: we’ll email if someone plans to demolish your house, today’s Guardian Technology explains what Pope would like (an API/XML feed from councils that would obviate the need for screen-scraping).

The irony is that government already offers a similar service to search for planning applications through its national planning portal at planningportal.gov.uk. But rather than five days, it has taken a year to build; it doesn’t send out proactive alerts; and a formidable copyright notice says that the National Planning Application Register is copyright of a commercial company, Emap Glenigan (whose website is used for the searches).

By contrast, Pope hasn’t worried much about copyright: “This information should be available to all.”

We’ve got nothing against Emap Glenigan using the same data that’s widely available – but it’s everyone’s, not Emap’s.

Is the new Statistics Board the right model for free data?

Thursday, January 18th, 2007

In this week’s Guardian Technology, Michael Cross examines what the proposed changes to the structure and oversight of the Office for National Statistics mean for data access.

In Statistics are free – now let’s work on the rest of the data, he notes

national statistics are an important example of public sector information being posted free on the web. We would like to see all impersonal data collected by government made available this way, for the benefit both of democracy and the knowledge economy. Second, the governance regime now before Parliament could be applied to other types of data, from maps to weather forecasts.

What do you think?

How OS costs hold back local authorities and housing associations: the view from the ground

Monday, January 15th, 2007

An interesting post that we came across by letting Blogdigger trawl the web for mentions of “Ordnance Survey”. Extract:

In the field I work in, housing (asset management), the use of maps would be particularly beneficial. On top of this, I also help manage some patch related data for our housing management team. Being able to link all this into some automapping system would make many tasks I perform so much easier. In addition, visual tools are staggeringly useful when persuading people or explaining things to a new audience.

Being able to show a map which outlined that 80% of our Decent Homes failures are in one half of our stock would be a lot more powerful when persuading our board to release extra funds, for instance. The human mind responds strongly to visual stimuli, and maps in particular plug straight into a particular part of the brain.

Of course, to do all this, we need a certain amount of geographical data. There’s the maps themselves and then there’s geocoding our properties. Which leads to enquiries like the above. No doubt we could afford that, but in good conscience can we really spend £16k on what is effectively a glorified A to Z? £16k after all could supply brand new kitchens to four families, or ensure 8 homes have brand new central heating systems which will cut fuel bills and keep people warm should the weather change. We have to consider the opportunity costs.
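The writer’s “80% of failures in one half of our stock” example is worth dwelling on: once the properties are geocoded, that analysis is a couple of lines. A minimal sketch, with invented property records and area labels:

```python
# Sketch: once housing stock is geocoded (or even just area-tagged),
# the "where do our Decent Homes failures cluster?" question is trivial.
# The records below are invented for illustration.
properties = [
    # (property_id, area, fails_decent_homes)
    ("P1", "north", True), ("P2", "north", True),
    ("P3", "north", True), ("P4", "north", True),
    ("P5", "south", True), ("P6", "south", False),
]

failures = [p for p in properties if p[2]]
north_share = sum(1 for p in failures if p[1] == "north") / len(failures)
print(f"{north_share:.0%} of Decent Homes failures are in the north half")
```

The hard (and, here, expensive) part is never the analysis; it’s getting the base geography in the first place.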

As the post title says, this isn’t the ‘tragedy of the commons’ (where making something free means it gets abused, as in email carriage and spam). It’s the opposite: by enclosing the land, you keep out precisely the people who could make the best use of it. It’s like having farmland and then charging anyone who’s a farmer or wants to learn thousands of pounds to work it.

The writer concludes:

And so what will we do? As is often the case, my answer is DIY: we’ll collect the data ourselves. As ridiculous as it sounds, because of the restrictions on the data we’ll be much better off simply collecting the information ourselves and using any one of a number of open source applications to generate the maps ourselves. Or so is my intention.

GPS equipment is now within the reach of the average citizen and a procedure for geocoding our properties could easily be included within our stock condition procedure or even included in with caretaker duties or a void routine.

But… isn’t this a bit ridiculous? Aren’t we going against a sensible division of labour? Instead of information being collected by experts en masse, we’re going to be taking a piecemeal amateur approach.

Yes, that’s how creating an ‘internal market’ generates inefficiencies. The council’s data might not be as good as the OS’s, and you’ll have duplicated work. Stupid.

Advisory Panel on PSI notes Free Our Data campaign again

Monday, January 15th, 2007

As Oscar Wilde said, the only thing worse than being talked about is not being talked about. So it’s encouraging to note that in its latest minutes from the November 2006 meeting (PDF, 185KB) the Advisory Panel on Public Sector Information (APPSI), which previously noted the campaign, has once more noted the contribution we’re making:

A member sought clarification from the Chair on whether APPSI intends writing to the Guardian in response to its “Free Our Data” campaign. The Chair stated that there were no plans to do so. He said that the Guardian articles were providing a useful debate on issues relating to public sector information. The articles have been written by different contributors setting out their own viewpoints. APPSI did not consider it appropriate to get involved in commenting on specific issues, or when an article contained inaccuracies.

(Of course, The Guardian has a policy of correcting errors as soon as possible, and welcomes emails on that subject sent to reader*at*guardian.co.uk.)

The APPSI minutes contain some interesting points, including that of whether PhDs are subject to copyright, and if so whose.

Other interesting points:

Mike Clark stated that his paper “Fee or Free”, which is to be published in Business Information Review, was written in a personal capacity and it would contain a disclaimer stating that it did not represent the views of Government or APPSI. This led on to a discussion about the accuracy of the figures used to compare the EU and US PSI markets that were included in the PIRA Report on PSI for the EC in 2000. If the figure for the size of the US PSI market is incorrect it could have a significant bearing on the arguments that have been used to comment on the UK government’s approach to PSI and on other studies that use the figures cited by PIRA. A member suggested that APPSI should consider publishing a bibliography created by Mike Clark on its website with useful PSI references. Another member suggested that the bibliography could be linked to the actual articles.

Certainly if the numbers from the PIRA study aren’t right, it could have a big effect on the arguments. But it could push them in either direction.

Why making statistics free can save lives

Thursday, January 11th, 2007

In today’s Guardian, in Uncovering global inequalities through innovative statistics, we look at Hans Rosling’s call for governments to stop hiding away their potentially useful data:

Despite the encouragement that the internet provides, and the hunger of the public for better ways to analyse that data, governments are reluctant to open their databases to the world and make them searchable. “People put prices on them and stupid passwords,” says Rosling. “And this won’t work.”

Rosling has a very interesting interactive system at gapminder.org where you can plot all sorts of UN data for various countries against each other – such as carbon dioxide emissions vs gross national income, or child mortality against internet connectivity (is there a link? The data should show it).
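You don’t need gapminder’s animations to ask Rosling’s question of any two data series: a correlation coefficient is a few lines of code. A sketch, with figures invented for illustration rather than drawn from UN data:

```python
# Sketch: is there a link between child mortality and internet connectivity?
# Pearson's r on two series; the figures below are invented, not UN data.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

mortality = [120, 90, 40, 15, 5]     # deaths per 1,000 live births
connectivity = [1, 3, 15, 45, 70]    # internet users per 100 people

r = pearson(mortality, connectivity)
print(f"r = {r:.2f}")  # strongly negative: more connectivity, lower mortality
```

Which is precisely the point: once the data is open and searchable, the analysis is the easy part.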

There’s also his enormously impressive TED talk – watch this, and then you might start to see the point of free data, if you haven’t already.

(Our thanks to David O’Brien of Glasgow for pointing it out to us.)

Update: you can also see the (rather longer, at an hour) video of a Google campus talk by the Gapminder team on the same subject, which covers the same ground as Rosling at TED but in more depth. (You can also download it for Windows, Macs, video iPods and PSPs if you want some offline viewing.)

Public money paid for it – but the public can’t view because of crown copyright

Thursday, January 4th, 2007

The longer this campaign goes on, the more we seem to generate headlines like the one on this post. The latest example is a project by University College London’s Centre for Advanced Spatial Analysis.

(image from the CASA blog; see below for link)
In Copyright fight sinks virtual planning, Michael Cross points out that the Virtual London project can only put clips on YouTube and small examples on its blog, because it is barred – by, yes, the licensing restrictions imposed by Ordnance Survey – from putting the whole project online, which would let any of us zoom through a virtual London, see how the Olympics projects might look, model flooding or planning, or carry out a host of other truly useful activities.

That’s because the model lays the OS’s Mastermap (with details of all buildings and heights in the UK) over a Google Maps system. For London, it’s very impressive – see the CASA blog.

Is this, strictly, OS’s fault? Not really – it’s the fault of a government attitude which insists that every bit of data must be sweated as an asset; OS must cut its cloth to fit that insistence.

The real obstacle is crown copyright. For data gathered with taxpayers’ help, and by organisations answerable to the government, crown copyright makes less and less sense in a world where the free movement of data enables more activity.

After all, isn’t this the same administration which abolished museum charges? What was the rationale for that? Interestingly, less than a year after doing so, museum visits were up by 62%. We suspect that if you scrapped data charging you’d see a lot more than a 62% rise in the use of data such as the OS’s. (If anyone can find the cost of the free museums initiative, we’d like to hear.)