Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens


Archive for the 'General' Category

…and APPSI comes out swinging

Thursday, March 11th, 2010

The government’s Advisory Panel on Public Sector Information has come out with a strong response to the OS consultation.

Headline points: look at the bigger picture, not just at OS, resolve the “fundamental contradictions” in information policy and move towards a free data regime. “In particular, OS should not have any intellectual property rights in derived data.”

Oh, and sort out the “national scandal” of the lack of a comprehensive free address register.

There’s more here. http://bit.ly/91MY2q

It’s a good read.

And while we’re talking about Land Registry… here comes UKMap

Friday, March 5th, 2010

Verrry interesting post over at the UKMapping blog. We’ll include the full content because it’s short and very relevant.

Great news was received today at UKMap HQ. After much review and testing Land Registry have confirmed [and are happy for us to tell people] that they are happy to accept registrations based on UKMap.

Great news for all those consultants, government types and property professionals using UKMap.

Full text below

“Following a review of UKMap, Land Registry is able to confirm that UKMap meets Land Registry’s requirements as a mapping base on which registration applications can be made and is comfortable accepting registration applications based on UKMap. Such registration applications must still follow relevant guidance as set out in Public Guide 40.”

What does this mean? That LR doesn’t necessarily have to rely on Ordnance Survey maps for registrations. That, too, pours sand into any engines that might be getting started looking to fund OS through higher LR transaction fees. Savvy users would just go to UKMap.

And an earlier post on that same blog includes these interesting new clients:

  • London Fire, serious users with a serious need to get better mapping. Now they have that with UKMap.
  • London Borough of Islington – managing their green environment, UKMap’s trees and Land Use makes all the difference for them.
  • Mott Macdonald – detailed city centre mapping, all in glorious 3D.
  • Promap – the UK’s leading mapping portal for the Land and Property market have announced they are taking UKMap

London councils and emergency services? That’s what I think you call an inroad into OS’s market, isn’t it?

A spoke in the wheels of Land Registry transaction fees to pay for Ordnance Survey?

Thursday, March 4th, 2010

An interesting debate about the prospect of redundancies at Land Registry on Wednesday brought Michael Wills, of the Ministry of Justice – who is also in charge of the (trading fund) Land Registry – to the chamber.

It’s a long debate, but here’s the nitty-gritty:

Transaction levels, which are the key factor for the Land Registry, will have fallen from 16.1 million in 2007-08 to a projected 10 million in 2009-10. The Land Registry receives no central funding because it is a trading fund; it depends on the fees that it receives for services rendered. It made a loss of £130 million in 2008-09 compared with a surplus of about £70 million in 2007-08.

Very interesting – and this comes against a backdrop where it’s being suggested that government is being charged too little for OS data (compared to the private sector); and where there have been suggestions (hi, Robert Barr) that we should have an excess added to LR transactions in order to fund OS.

I think this really needs to be costed. It would be interesting too to see how much Land Registry pays to OS…

How GIS reveals discrimination in urban planning

Saturday, January 2nd, 2010

Does this sound at all familiar?

Not all local governments appreciate the rise of GIS-driven advocacy, especially when their own data is used as a hammer against them, and they have begun to restrict public access. Some have pulled data off the Web in the alleged interest of national security; others charge exorbitant fees to produce it or deliver jumbled masses of data that are difficult to manage or decipher.

Turns out though that it’s not from the UK, but the US, from a fascinating article about how GIS helps to demonstrate discrimination being practised by towns and cities – and how when that is revealed by mapping, the reaction tends not to be to get rid of the discrimination, but to get rid of the troublesome access to the data that reveals it. After all, it’s so much cheaper to do the one than the other:

Mebane, the Cedar Grove Institute’s first case study of municipal discrimination, passed an Infrastructure Information Security Policy shortly after the study was published; the policy limited infrastructure data access to qualified engineering firms and town agencies. The city of Modesto, Calif., locked in a legal underbounding battle, pulled its infrastructure data off the Internet after the lawsuit was filed, citing national security grounds. “There’s no conceivable national security interest in where the traffic lights are in Modesto,” scoffs Ben Marsh, the institute’s chief mapmaker. A recent appellate ruling in California rejected a similar national-security rationale, as well as a copyright argument by Santa Clara County, but whether that opinion stands as precedent remains to be seen.

However…

Though restrictions on access to government data could prove troublesome, advocacy groups that use GIS have already been finding data sources outside of government. In particular, data collected by community residents have become an effective supplement to the “official story,” as University of Washington professor Sarah Elwood calls government data.

Elwood has used GIS not only to map problems but to build the capacity of underserved and disadvantaged communities to advocate on their own behalf. Simple walking surveys that catalogue infrastructural deficiencies — potholes in sidewalks, missing stop signs, burned-out streetlights — fill gaps in the public record that mask actual conditions on the ground. With locally produced data, Elwood says, “You can tell a very detailed and very current, compelling story about neighborhood needs.”

If that reminds you at all of fixmystreet, it ought to – that’s precisely the sort of idea it sprang from.

UEA CRU climate data is a free data issue too

Tuesday, December 22nd, 2009

I’ve been researching the apparent hack of the University of East Anglia’s Climate Research Unit (CRU), where a huge amount of email going back more than a decade, plus huge numbers of documents, have been released onto the internet – they’re indexed on various sites in searchable form and through Wikileaks, for example.

What I find interesting is some of the discussion around it. There have been multiple freedom of information (FOI) requests to the CRU from people who want to examine the underlying data used to make the analysis about human-driven global warming.

You’d think it would be straightforward. Science operates by data leading to theory leading to prediction leading to test against data, with a parallel process of independent test against the same data. So you’d think that access to the data would be a key thing.

Lots of Freedom of Information requests have thus come into the CRU demanding (that’s the word) the original data used for the papers. But the CRU has turned them down. Why? Because, UEA says, it came from weather organisations which charge for their datasets – and restrict those datasets’ redistribution.

Read for yourself at the CRU Data Availability page:

Since the early 1980s, some NMSs [national meteorological services], other organizations and individual scientists have given or sold us (see Hulme, 1994, for a summary of European data collection efforts) additional data for inclusion in the gridded datasets, often on the understanding that the data are only used for academic purposes with the full permission of the NMSs, organizations and scientists and the original station data are not passed onto third parties.

And:

In some of the examples given, it can be clearly seen that our requests for data from NMSs have always stated that we would not make the data available to third parties. We included such statements as standard from the 1980s, as that is what many NMSs requested.

The inability of some agencies to release climate data held is not uncommon in climate science. The Dutch Met Service (KNMI) run the European Climate Assessment and Dataset (ECA&D, http://eca.knmi.nl/) project. They are able to use much data in their numerous analyses, but they cannot make all the original daily station temperature and precipitation series available because of restrictions imposed by some of the data providers.

CRU insists it wants to make the data available:

We receive numerous requests for these station data (not just monthly temperature averages, but precipitation totals and pressure averages as well). Requests come from a variety of sources, often for an individual station or all the stations in a region or a country. Sometimes these come because the data cannot be obtained locally or the requester does not have the resources to pay for what some NMSs charge for the data. These data are not ours to provide without the full permission of the relevant NMSs, organizations and scientists. We point enquirers to the GHCN web site. We hope in the future that we may be able to provide these data, jointly with the UK Met Office Hadley Centre, subject to obtaining consent for making them available from the rights holders. In developing gridded temperature datasets it is important to use as much station data as possible to fully characterise global- and regional-scale changes. Hence, restricting the grids to only including station data that can be freely exchanged would be detrimental to the gridded products in some parts of the world.

The problem arises because the centre has been running in this way since the 1980s – before the internet reached even most universities, and when the culture of “pay for data” (because it was so hard to acquire, and so jealously guarded) was much more ingrained.

But it is a problem that needs to be overcome. The CRU has all sorts of PR difficulties because it hasn’t grasped this nettle – which needs to be grasped so that it can finally get past any questions about its research. There are people who aren’t satisfied at being told that the data needed to investigate a scientific paper can’t be passed on because of long-lost contracts. (We wouldn’t be very impressed by that if we were told it either.)

Paying for public data: it’s never a good idea. Especially when it creates problems like this.

PDFs are bad for open government, says Sunlight Foundation in US

Saturday, November 7th, 2009

This is always worth remembering:

Government releasing data in PDF tends to be catastrophic for Open Government advocates, journalists and our readers because of the amount of overhead it takes to get data out of it. When a government agency publishes its data and documents as PDFs, it makes us Open Government advocates and developers cringe, tear our hair out, and swear a little (just a little). Most earmark requests by members of congress are published as PDF files of scanned letters, leading the Sunlight Foundation and others to write custom parsers for each letter.

I know that a lot of the efforts going on in the data.gov.uk channels are about finding effective ways of parsing data. The hope has to be though that very little of that involves finding ways of reversing data that has been output to PDF. The point being of course that turning PDF into useful data is, in the famous quote, “about as easy as turning hamburger into cow”.

Back to the Sunlight Foundation again:

Here at Sunlight we want the government to STOP publishing bills, and data in PDFs and Flash and start publishing them in open, machine readable formats like XML and XSLT. What’s most frustrating is, Government seems to transform documents that are in XML into PDF to release them to the public, thinking that that’s a good thing for citizens. Government: We can turn XML into PDFs. We can’t turn PDFs into XML.
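To make the contrast concrete, here is a minimal sketch of what "machine readable" buys you. It assumes a made-up XML layout for earmark requests (the element names and figures are illustrative, not any real government schema) and reads it with Python's standard library in a handful of lines. Nothing comparable exists for a scanned letter in a PDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for two earmark requests. The element names and
# values are illustrative assumptions, not a real government schema.
SAMPLE = """
<earmarks>
  <earmark>
    <member>Rep. Example A</member>
    <amount>250000</amount>
    <purpose>Bridge repair</purpose>
  </earmark>
  <earmark>
    <member>Rep. Example B</member>
    <amount>100000</amount>
    <purpose>Library computers</purpose>
  </earmark>
</earmarks>
"""


def parse_earmarks(xml_text):
    """Return a list of dicts, one per <earmark> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for node in root.findall("earmark"):
        rows.append({
            "member": node.findtext("member"),
            "amount": int(node.findtext("amount")),
            "purpose": node.findtext("purpose"),
        })
    return rows


def total_requested(rows):
    """Sum the requested amounts across all parsed earmarks."""
    return sum(row["amount"] for row in rows)


earmarks = parse_earmarks(SAMPLE)
```

With the data in that shape, questions such as "how much was requested in total?" become one-liners like total_requested(earmarks), with no custom parser per letter.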

And another word for Flash. Ah, Flash:

Flash isn’t off the hook either. Government has spent lots of time and money developing flash tools to allow citizens to view charts and graphs online, and while we’re happy the government is interested in allowing citizens to do this, Government’s primary method of disclosure should not be these visualizations, but rather publishing the APIs and datasets that allow citizens to make their own.

The comments are worth it too, such as Adrian Holovaty: “If I had a dollar for each hour I’ve spent trying to finagle raw data out of PDFs, I could afford Adobe Photoshop.”

And the rather scary one from Michael Friis: “Here in Denmark Parliament publishes many ancillary documents as PNGs.” Though that is in line with Ordnance Survey’s tendency to release FOI responses as TIFFs.

Digital engagement, widening and public data getting analysed… in private

Friday, October 30th, 2009

Stephen Timms reports that there’s been good progress in Making Public Data Public.

As the Digital Engagement blog notes:

So far our request for developers to “get excited and make things” has so far exceeded our initial expectations. Not only is the number of people signing up to the developer forum higher (currently more than 1,300), but also the discussion board is very active with a healthy list of ideas for the site and, perhaps most excitingly, a few applications are beginning to see the light of day.

And also:

Working in partnership with Guardian Professional, we held 3 developer days hosted at The Guardian’s Kings Place offices in central London on the 14th-16th September. As an organisation they were best placed to help us undertake this task, having built a community of talented developers and opened up their API. You can have a look here at the excellent postcode paper concept and the rather wonderful traffic data visualisations here, which were just two of the many ideas for applications that emerged over the course of the camp. Ideas about their priorities for further data releases (to add to the 1,100 datasets currently on the site) were shared and important foundations for further iterations of the HMG Data site were laid.

There’s a certain irony in the fact that the sessions at the Guardian were held under such secrecy that I didn’t find out about them until the week after. More posts on that later…

Tim Berners-Lee to help UK government build single data access point

Thursday, October 29th, 2009

Computer Weekly reports that Tim Berners-Lee has been asked by the government to develop a single point of access for public data – as Stephen Timms, who has taken over where Tom Watson left off in the Cabinet Office, reports progress in “making public data public” (a concept that, when you think about it, seems a bit strange – as in “shouldn’t that have been done from the outset?”).

According to Computer Weekly, Timms told an RSA/Intellect event that

information is the “essential raw material” of a new digital society. “Government must play its part by setting a framework for new approaches to using data and ‘mashing’ data from different sources to provide new services which enhance our lives. In particular, we want government information to be accessible and useful for the widest possible spectrum of people.”

Well, minister, if that’s truly what you want, then you’ll make it free of charge, and free of copyright restrictions. It’s as simple as that. Could we suggest something like Creative Commons? The US government seems to find it amenable.

Timms said, “We are supporting Sir Tim in a major new project, aiming for a single online point of contact for government data, and to extend access to data from the wider public sector. We want this project for ‘Making Public Data Public’ to put UK businesses and other organisations at the forefront of the new semantic web, and to be a platform for developing new technologies and new services.”

Fine words. We’d like some actions to go with them. We’re hearing plenty of sticks being wielded over how people use the net – Lord Mandelson’s threats to file-sharers, for example – but the carrots for companies to build on something that really would benefit Britain, by using British data, seem to be stuck on a really slow train.

Part of the problem, of course, is that it’s almost impossible to put a figure on the opportunity cost of the lack of access to this data – whereas the music industry can much more easily point to figures it’s produced (though you may argue about their provenance) to suggest precisely how much harm it’s suffering through untrammelled downloading.

Interesting to contrast, though, that when we asked the Royal Mail to specify precisely how much harm it was suffering through the use by ernestmarples.com of the postcode to lat/long conversion, it robustly declined to say.

Of course there is the Cambridge trading funds report, with its analysis of the opportunity cost of the trading funds regime. But this goes much wider – the Cambridge analysis didn’t look at the Royal Mail and postcodes, for example, which have become embedded into many systems’ location processing.

Computer Weekly again:

So far, 1,300 people have signed up to the developer forum and contributed to the discussion board on what the data could be used for. The Cabinet Office also held a developers’ camp where ideas were shared.

We’ll have more about the devcamp in a future post.

Kent County Council wants you to recycle its data

Tuesday, October 20th, 2009


Have a look at http://picandmix.org.uk/:

Pic and Mix aims to increase public access to Kent-related datasets including those generated by Kent County Council (KCC). For the purposes of the pilot, we have brought together a sample of the most useful information. Where possible, it’s been provided in a format that allows it to be ‘mashed’ and customised. Please help us shape this initiative by suggesting additional data and ways in which we can improve this site. And if you do anything clever with the data, we’d like you to share that with us too!

The About page has more:

Last year, Kent County Council won Innovate08! Our idea had three elements:

  • To make publicly available information – things like crime statistics, employment information, business information – more accessible.
  • We also wanted to provide tools that would enable people to ‘pic and mix’ data to create customised information.
  • And last but not least, we wanted to provide a platform where people could share this information and discuss ways in which it could be used.

Winning Innovate08 meant we were given funding for a pilot project to see how people in Kent would respond to a resource of this kind. Our pilot project was initially launched with 25 small Kent-based businesses. With this new site we hope to get the wider community involved.

So, how could Pic and Mix benefit you? Well, there’s a lot of information out there in a lot of different places. Rather than spend ages tracking down the information you need, we want you to come to a single place – picandmix.org.uk. For example, you may be looking for a care home for an elderly relative. You might want to mix this information with GP locations and bus routes. By plotting this information on a map you will be able to see which care homes are close to a GP surgery, and the bus routes. Another example might be a security company deciding where to focus its marketing efforts. They may want to mix office premises with crime statistics and use the information to plan a campaign.

Fascinating. We await developments – and news of same.

Time for local government to think harder about opening its data

Wednesday, September 30th, 2009

Chris Taggart gave a presentation earlier this month to APPSI – the Advisory Panel on Public Sector Information – about opening up local government data.

Even without the actual talk (is it online anywhere in some form?), the slides make compelling reading. Local government, of course, can sometimes be just as bad as central government (or indeed trading funds) about hanging grimly on to its data, enforcing dubious or unnecessary copyright, and basically making people’s lives harder when it should be making them easier.

You can also read my thoughts on how local government could open itself up in an article for Society Guardian here, which has attracted some useful comments – and links to interesting sites.

But now, here’s the lecture. Flash required, of course.

Do you know where your postboxes are?

Tuesday, September 15th, 2009

As an example of how getting data out there can just be plain useful, let’s return to one of the winners of the Show Us A Better Way competition (remember that?).

Prizewinner: postbox locations.

Obstacle: Royal Mail wouldn’t release the data of the location of its 116,000 postboxes.

Solution: Freedom of Information request.

Obstacle: incomplete geographic information in the response (a postcode, not long/lat, plus a mystical Royal Mail reference per box); no collection times.

Solution: FOI request for the collection times and a bit of data marriage.

Obstacle: still don’t know where the postboxes actually are.

Solution: crowdsource it! Get people to pinpoint the locations of what they think are the postboxes onto an OpenStreetMap map. So far about 26,000 have been done – have you done the ones near you?

Obstacle: Royal Mail says it still holds all the rights to the locations of the postboxes.

Solution: actually, you don’t really need a solution. Toothpaste is notoriously hard to put back into the tube.
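For the curious, the "data marriage" step above amounts to a join on the per-box Royal Mail reference. The sketch below assumes each FOI response arrived as a CSV; the column names, references, postcodes and times are all invented for illustration and are not taken from the real responses.

```python
import csv
import io

# Hypothetical extracts of the two FOI responses. Column names and
# values are made up for illustration only.
LOCATIONS_CSV = """ref,postcode
BOX001,AB1 2CD
BOX002,EF3 4GH
BOX003,IJ5 6KL
"""

COLLECTIONS_CSV = """ref,last_collection
BOX001,17:30
BOX003,16:45
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def marry(locations, collections):
    """Join the two responses on the per-box reference.

    Boxes with no collection time are kept, with None in that field,
    so gaps in the data stay visible rather than silently vanishing.
    """
    times = {row["ref"]: row["last_collection"] for row in collections}
    return [
        {
            "ref": row["ref"],
            "postcode": row["postcode"],
            "last_collection": times.get(row["ref"]),
        }
        for row in locations
    ]


merged = marry(read_rows(LOCATIONS_CSV), read_rows(COLLECTIONS_CSV))
```

Keeping unmatched boxes in the output is a deliberate choice here: the gaps are exactly what a crowdsourcing effort would then need to fill.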

And as Matthew Somerville pointed out to us, knowing the locations of the postboxes means that one might be able to do “travelling salesman” analyses on the routes – which could have huge potential savings for the Royal Mail. How much does it spend on fuel and time doing collections every day? How much might it save with a proper analysis? Who knows? We won’t until we see all the postboxes put in their place.
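As a rough illustration of the kind of analysis that becomes possible, here is a minimal sketch of a nearest-neighbour route: start at a depot and always drive to the closest box not yet visited. This is a simple heuristic, not an optimal travelling-salesman solution, and it uses straight-line distance rather than real roads. The depot and box coordinates are invented, not real postbox locations.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_neighbour_route(depot, boxes):
    """Greedy tour: from the depot, always go to the closest unvisited box."""
    route, here, remaining = [], depot, list(boxes)
    while remaining:
        nxt = min(remaining, key=lambda p: haversine_km(here, p))
        remaining.remove(nxt)
        route.append(nxt)
        here = nxt
    return route


def route_length_km(depot, route):
    """Total distance: depot, each stop in order, then back to the depot."""
    stops = [depot] + list(route) + [depot]
    return sum(haversine_km(p, q) for p, q in zip(stops, stops[1:]))


# Made-up coordinates for illustration; not real postbox locations.
DEPOT = (51.500, -0.100)
BOXES = [(51.530, -0.100), (51.510, -0.100), (51.520, -0.100), (51.505, -0.100)]

greedy = nearest_neighbour_route(DEPOT, BOXES)
```

Comparing route_length_km(DEPOT, greedy) with route_length_km(DEPOT, BOXES) shows how much shorter the greedy order is than the arbitrary listed order. A serious study would use road distances, collection-time windows and a proper solver, but none of that can start without the locations.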

And that’s why it’s better to rely on making government data available – free, in both senses of the word – than to try to create artificial “value” from it by charging.

Price does two things: it implies that what you are pricing has value; and it puts a barrier between the thing being “sold” and its potential users. If the users don’t want it enough, they won’t ever go across the barrier. If you take down the barrier, then you get every user you could ever get. And some of them will do really useful things with your product – that’s possible if it’s data.

Sounds like a good idea: Sir Tim Berners-Lee goes to Downing Street to talk open data

Tuesday, September 15th, 2009

Well, Sir Tim Berners-Lee (he invented the web, you know) seems to be getting stuck in. He has gone to Downing Street along with Nigel Shadbolt (whose name always reminds me of a Harry Potter character – apologies: he’s actually professor of artificial intelligence at the University of Southampton) to talk to Gordon Brown.

About what?

Mr Berners-Lee and Mr Shadbolt presented an update to Cabinet on their work advising the Government on how to make data more accessible to the public.

Gordon Brown has already spoken publicly about his aim of making the UK a world leader in opening up government information on the internet, an important element of Building Britain’s Future.

He could have asked us. We’d have told him back in 2006. Or 2007. Or 2008.

Sir Tim Berners-Lee told Cabinet about the goal of delivering a single online access point to Government information, similar to the one introduced by the Obama administration in the US.

Don’t we sort of have that already through the work of OPSI and its data portal? Sometimes it seems like the work of Carol Tullo and John Sheridan et al has just been swept down a plughole – or perhaps memory hole, a la 1984.

He also spoke about proposals to extend the “open data” approach, ensuring greater transparency in government and improving the efficiency of public services.

It would be interesting if the “efficiency of public services” meant “to stop different bits of government squabbling over the data they collect like children in a playground and instead start to share it freely, rather as we adults advise children to do so they can discover the benefits of sharing”.

But there’s a suspicion it’s really code for “cut public services while saying what’s being cut will be replaced by something else at some time in the future”.

The Government hopes the data project will benefit the UK by creating jobs, driving new economic growth and allowing the re-use of government data to encourage the development of new, innovative information-based businesses and services.

Hold on just a moment there. The government hopes all these things, does it? Is that because it’s taking the Cambridge study seriously, and looking at its potential benefits to the economy? So we’re not going to see terrible approximations like the OS’s “hybrid” strategy, then?

It is also expected to help increase the transparency of government and empower citizens to get more out of public service by tailoring it to their needs.

What I don’t like here is the description of it as a “data project” as though it were something that sat apart from what should actually be a process – and a core process at that. It shouldn’t be “what part of this data shall we release” but “is there any of this that shouldn’t be released?”

After the update from Sir Tim and Professor Shadbolt, The Prime Minister confirmed his full support for the next phase of their work.

It would be nice to know what that next phase included. Anyone seen a copy of the timetable?

You cannot charge for property searches, councils told, and you might have to pay some back

Thursday, August 6th, 2009

Interesting decision by the Information Commissioner: property searches are environmental data, and as such should be made available by councils under Freedom of Information regulations.

This is pretty big – particularly for estate agents.

Thanks to EPSIPlus forum for the pointer:

As the head of the IPSA noted:

The ICO has published two section 50 rulings today against Local Authorities in England.

East Riding of Yorkshire – The ICO has ruled Building Control and Traffic data is EIR and the Local Authority must make the data available in 35 days.

Stoke City Council – The ICO has ruled Building Control and Traffic data is EIR and the Local Authority must make the data available in 35 days.

Failure to comply by either Local Authority may result in the ICO making written certification of this fact to the High Court (or the Court of Session in Scotland) pursuant to section 54 of the Act and may be dealt with as a contempt of court. Data must be made available under the pricing terms of EIR. The ICO is not satisfied by the ‘made available under another means’ (CON29R requests) and the payment of a full Local Authority fee. This is because the Charging Regulations (CPSR) acts as a barrier to the data.

The Property Search Industry will now seek reimbursement of fees paid under duress / under protest. (emphasis added).

Now, that could get rather interesting. And for cash-strapped councils, not being able to charge for property searches (or even parts of them, but particularly the environmental data side of them) is going to make a difference. If anyone knows how much councils make from those charges, we’d be very interested to know more.

Free our data, says Lords info committee

Thursday, August 6th, 2009

Simon Dickson has picked up what we were remiss in missing: the Lords Information Committee. He describes it as Free our data, says Lords info committee.

He notes that its final report

couldn’t really have been more in favour of the free our bills [as pushed by They Work For You, which would show you details of bills in progress in committee] agenda.

The key recommendations, among those listed in the press release:

(I’ve copied and pasted these from puffbox.com. All credit to Simon for what’s below, apart from any mistakes in the stuff in [italics], which are my additions.)

  • information and documentation related to the core work of the House of Lords should be produced and made available online in an open standardised electronic format (not pdf) that enables people outside Parliament to analyse and re-use the data
  • the integration of information on Parliament’s website, eg biographical info on Members to be linked to their voting record, their register of interests, questions tabled, etc [basically, like They Work For You]
  • Bills should be presented on Parliament’s website in a way that makes the legislative process more transparent and easier to understand [=Free Our Bills]
  • an online system enabling people to sign up to receive electronic alerts and updates about particular Bills [rather like planningalerts, but for legislation]
  • a requirement on the Government to start producing Bills in an electronic format which both complies with “open standards” and is readily reusable [a bit like the Conservatives' suggestions]
  • an online database to increase awareness of Members’ areas of expertise
  • an online debate to run in parallel with a debate in the Lords Chamber
  • greater access to Parliament for factual filming
  • a trial period during which voting in the Lords is filmed from within the voting lobbies
  • all public meetings of Lords committees to be webcast with video and audio
  • a review of the parliamentary language used in the House of Lords to make it easier for people outside the House to understand

Let’s see how it pans out. Is there time for this to be implemented before the election? Or would either of the main parties put it onto their agenda – or even manifesto?

Naughty, very naughty: Ernest Marples frees the postcodes

Saturday, July 11th, 2009

An interesting new site – ernestmarples.com/ – is trying to make postcodes free.

The people behind it (the whois details tell you that it’s registered to a location in SW1A 1AA, which happens to be Buckingham Palace) are Harry Metcalfe and Richard Pope.

They insist, when asked the question of “where does the data come from?” that

We’re not saying. But, just to be clear: we don’t hold a copy of the postcode database ourselves, neither in complete form nor as part of a cache.

But their aim is clear enough:

Post codes are really useful, but the powers that be keep them closed unless you have loads of money to pay for them. Which makes it hard to build useful websites (and that makes Ernest sad).

So we are setting them free and using them to run PlanningAlerts.com and Jobcentre Pro Plus. We’re doing the same as everyone’s been doing for years, but just being open about it.

Hopefully the Government and Royal Mail will realise the value of this service and license us to offer it officially and for free. If not, and this website gets shut down, we’ll close the websites we’ve made that make use of this site’s lookup service. Permanently.

There’s a long list of people who have supported it. We’ll add our voice. The Free Our Data campaign thinks it’s a good idea to make postcodes freely available.