Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens

Tim Berners-Lee and Nigel Shadbolt on the benefits of open data

Tim Berners-Lee and Nigel Shadbolt, who are together pushing the open data idea along in government, have an article in the Times (London) of November 18 about the benefits of free data, following on from yesterday's announcement:

Data has a particular value in that you can combine it with other data to discover new things. When in 1854 John Snow took the deaths from a cholera outbreak in London and plotted them on a map, he was able to illustrate the connection between the quality of the source of water and cholera — the world changed. In March the Department for Transport released three years’ worth of data about the location of accidents involving cyclists. Within 24 hours someone had converted this data to create cycle-accident route planners that avoid the black spots.

Government data is a valuable resource that we have already paid for. We are not talking about personal data but data that tells us, for example, about the amount and type of traffic on our roads, where the accidents are, how much is spent on areas where these accidents occur. This is data that has already been collected and paid for by the taxpayer, and the internet allows it to be distributed much more cheaply than before. Governments can unlock its value by simply letting people use it. This is beginning to happen in a number of countries, notably in the US under the Obama Administration, and in June Gordon Brown asked us to advise the Government on how to make rapid progress here.

(The fun thing here being that Ordnance Survey would argue that its data has not been collected and paid for by the taxpayer, because it's a trading fund. Unfortunately that argument doesn't hold up, because (a) almost all of its data was collected while it was not a trading fund, and (b) half its revenues do come from the taxpayer, in the form of licences paid by public organisations.)

As all of this data becomes available, we have to look for the joins between it. A new set of standards for the web is emerging that allows us to link data from different sources. Everyone knows that web pages have addresses that identify them, allowing you to navigate around and find what you want. To make the web of linked open data work we also need to give identifying addresses to the objects and properties that make up the basic information in pages, spreadsheets or databases.

Think about the practical applications. If Companies House referred to companies using these new open, uniform identifiers, then anyone else who needed to talk about companies could use the same identifiers. If all websites that make data available about companies point to the same identifier for a company, then it's possible to pull that data together very easily, whether it's data about a stock price, a product or a director. This is one of the core principles at the heart of the web of linked data.
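To see why a shared identifier makes the join trivial, here is a minimal sketch in Python. The URI and the data are entirely made up (Companies House does not publish these records in this form); the point is only that when every source keys its data on the same open identifier, merging is a simple lookup rather than a fuzzy name-matching exercise.

```python
# Hypothetical open identifier for one company, standing in for the kind
# of URI Companies House might mint. Purely illustrative.
ACME_ID = "http://data.example.org/company/01234567"

# Three independent "websites", each publishing data keyed by that ID.
stock_prices = {ACME_ID: {"stock_price_gbp": 4.52}}
products = {ACME_ID: {"product": "widgets"}}
directors = {ACME_ID: {"director": "A. Smith"}}

def pull_together(company_id, *sources):
    """Merge every record that points at the same identifier."""
    merged = {"id": company_id}
    for source in sources:
        merged.update(source.get(company_id, {}))
    return merged

record = pull_together(ACME_ID, stock_prices, products, directors)
print(record)
```

With inconsistent identifiers ("Acme Ltd" vs "ACME Limited" vs a ticker symbol) each join would need error-prone reconciliation; with one agreed identifier, the data from every source simply snaps together.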

None of this works unless the data is there in the first place. But when it is, innovation flourishes. Maybe someone uses the web to show schools close to you and their Ofsted reports, or the planning applications that might affect you, or the allotments available to use, or the crime rates in your area. Data is beginning to drive the Government’s websites. But without a consistent policy to make it available to others, without the use of open standards and unrestrictive licences for reuse, information stays compartmentalised and its full value is lost.

So there you have it: the free data concept is right there at the heart of government, with extra semantic web power from the person who invented it. That’s good. That’s very good.
