Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens

Archive for the 'Cambridge study' Category

Goodbye Gordon Brown: but thanks for the data … and the campaign goes on

Wednesday, May 12th, 2010

Gordon Brown’s stint as prime minister is over. But we can thank him for one thing he left behind: the commitment by the Treasury to fund free data from the Ordnance Survey (by my understanding, for at least five years – which I think is at least what it will take for really useful commercial applications to emerge from the availability of the data).

That’s a huge step. When Mike Cross and I started the Free Our Data campaign in March 2006, Tony Blair was prime minister. We knew that there was a strong reason for it, but it took time to get traction. Our first meeting with a minister was with Baroness (Cathy) Ashton at the Ministry of Justice; she didn’t seem too interested.

Once Gordon Brown came into office and there was a change at ministerial level, things changed dramatically. We got audiences and found ministers who were largely sympathetic. Brown too understood the idea – which simply took off when he found himself sitting beside Tim Berners-Lee at a dinner and started making conversation.

Brown asked: “What’s the most important technology right now? How should the UK make the best use of the internet?”

To which the invigorated Berners-Lee replied: “Just put all the government’s data on it.”

To his surprise, Brown simply said “OK, let’s do it.”

Now the Conservatives and Liberal Democrats are in power. The Conservative manifesto contains a pledge on access to government data:

Drawing inspiration from administrations around the world which have shown that being transparent can transform the effectiveness of government, we will create a powerful new right to government data, enabling the public to request – and receive – government datasets in an open and standardised format. Independent estimates suggest this could provide a £6 billion boost to the UK economy. We will open up Whitehall recruitment by publishing central government job vacancies online, saving costs and increasing transparency.

That £6bn number comes, I believe, from the Cambridge study – though that included making OS Mastermap data free. I don’t think that that will be done, given the commitment to extra public spending it would involve in the short term and the long-term payback it would need.

I also think that while it’s nice to publish central government jobs online, there will be problems in how you slice it so that people can find the jobs they want. You might find that it’ll become something that other sites – and of course businesses – will exploit as a raw data feed and sell access to, or improve. (Yes, I know it’s also a scheme which is about chopping the funding of the Guardian’s public-sector jobs supplement off at the knees; there have been elements within the Tory party which have wanted to do this for years.)

In short: the campaign continues, but the Con-Lib coalition has indicated that it has a lot of the right instincts. Once we know which ministers we need to lobby – and once they know what their viewpoints are – we’ll be pushing the campaign again. There’s still so much data in there which needs to be freed.

Free Our Data response to the DCLG consultation on OS Free

Friday, February 26th, 2010

We’ve written up our response to the DCLG consultation, and emailed it over this afternoon. Make sure you respond too! Deadline is 17 March – a Wednesday, for no obvious reason.

Feel free to crib from this or build on it. Your comments welcome (though of course they’ll only be useful after the fact…)

Response to Ordnance Survey consultation from the Free Our Data campaign, 25 February 2010

The Free Our Data campaign was co-founded in March 2006 by Charles Arthur and Michael Cross with the aim of persuading government that non-personal datasets created by government-owned agencies and companies and organisations should be made available for free reuse without licence restrictions.

The rationale for this approach is that citizens have already funded the existence of these agencies, and the collection of their data, through taxes paid over past years. (This includes historical data; Ordnance Survey, for example, has been a trading fund for some time but on its incarnation as a trading fund immediately used data previously collected at public expense.) Furthermore, the private and non-profit sector can imagine better ways of using data than government can because they have a direct interest in using it – but price and licensing are significant barriers to the development of those applications.

The campaign is apolitical. It is not aligned, associated with or funded by any political party or outside group; its (very small) costs are paid by the co-founders out of pocket.

We are delighted that government has chosen to accept the campaign’s rationale with its plans to create the OS Free products. Our only caution is that it must ensure that the model used to fund it can be widely applied to other non-personal datasets within government. OS Free should not be a one-off, but instead should be the basis for a wider sharing of data.

One final, general point: the Free Our Data campaign believes that Ordnance Survey provides an excellent map-generating service and must remain a government-owned asset whose public task includes the continual mapping of the UK’s geography and built environment. Any moves to privatise any part of its operation would be retrograde and threaten both OS’s future usefulness and the UK’s economy. We would oppose such moves.

Question 1: What are your views or comments on the policy drivers for this consultation?
The need to reduce overt government spending, allied to the growth in personal computing power owned and controlled by the public at large, creates an entirely new opportunity to let citizens analyse, understand and benefit from the data that the government collects on their behalf. This is a two-way process.

Clearly the UK government is rapidly recognising the benefits of transparency – that actions are not just seen to be done, but that the reasoning for the actions can also be interrogated and understood. This is one key policy driver. (Hereafter PD1.)

There is a second policy driver (hereafter PD2): the need to reduce the public sector deficit in coming years. This is best done through a reduction in public spending and an increase in tax revenue from the private sector.

There is also an untapped private-sector entrepreneurial market whose entire existence depends on the successful implementation of this consultation and future ones like it. When government-collected data is treated as a limited asset which must be priced to create an artificial shortage, government constrains the private sector which generated the taxes used by the government to create the data. Clearly, that then constrains the tax base, because not all companies (extant or proposed) can afford to buy the data. Therefore total taxes are lowered by pricing data. This is inefficient, and constrains entrepreneurship based around the effective use of data.

Therefore making government-owned data like this free for reuse (including commercial reuse) will bring in larger tax revenues as long as HMRC is vigilant in collection of owed taxes from individuals and companies.

How the consultation will reduce public sector spending in the context of the Ordnance Survey’s financial model (as a trading fund) is less obvious, but the reduction still exists. The consultation itemises the costs of making these mapping data free. However, it does not itemise the potential benefits through reduced costs to local councils, police forces and local health authorities, for example, of being able to provide map-linked data on public websites, without paying, at the Landranger and Explorer scale; this has been a consistent bugbear to local councils, to police forces and public health observatories which want to share their work with the public.

Making such data free also obviates the legal examination of any instance in which those bodies wish to share their work – a cost which is also unpriced in the consultation document. While these costs may not match the millions of pounds directly attributable in lost revenue from sales of Explorer and Landranger-scale data, they are significant in the cultural sense too – because they enable those bodies to operate in a more transparent manner as well, satisfying PD1 above.

Question 2: What are your views on how the market for geographic information has evolved recently and is likely to develop over the next 5-10 years?
The geographic information market has been completely transformed in the past 10 years by
-the opening of GPS (Global Positioning System) data to the world by the US military (an excellent example of treating “information as infrastructure”, in which the US government bears the cost of supplying, in effect, location data to non-US-taxpaying people in the UK and elsewhere); and
-the ability to create “crowdsourced” maps, such as OpenStreetMap (OSM), which are accessible via the internet without copyright restriction for consultation, addition or editing.

In the future the GI market will be further transformed by
-the growing number of smartphones with built-in GPS
-the falling cost of mapping large areas with great precision due to improving satellite photography systems
-mapping information, including road and route data, becoming a commodity, where only value-added forms can be effectively charged for.

Sales of satnav devices provide a clear indication that “knowing where you are” is a key piece of information for people: estimates suggest that between 4 million and 7.5m such devices have been sold in the past 10 years in the UK alone.

The commoditisation of route and road information, which has previously been supplied only by Ordnance Survey, will continue. If satnav makers decided that the OSM mapping was good enough, and that the pricing model it offers (of zero cost) was attractive, they might choose to use its database (or even to improve its database while using it) and neglect the OS version. Under the present OS funding model, the only way for OS to recover its costs would be to raise prices for its existing clients, including the public sector – which would not fit PD2.

Therefore it is essential that OS does provide the OS Free data to encourage the growth of the UK geographical information sector, and develops its own high-quality mapping as part of the public task for which it exists.

Question 3: What are your views on the appropriate pricing model for Ordnance Survey products and services?
Given the name of the campaign, our obvious answer would be “all should be free”. But we recognise that there are pragmatic and political problems with this.

The question assumes a great deal about OS products and services, and its charging regimen. However, as noted in the consultation, OS has consistently declined to separate out the costs, revenues and profits of its “raw” and “value-added” products and services, which makes it difficult to take anything but a Gordian Knot approach to finding appropriate models.

The question would be better framed as “which products and services should OS produce, and what should it charge for, and how should the charging regime be set?”

The more logical approach is to ask what OS’s public task should be, what products and services flow from that, how far those should be self-financing (using, say, a trading fund method) and what other products and services are seen as a public good which should be funded out of general taxation.

OS’s public task is clearly to map the geography of the UK; arguably this also includes the built environment. MasterMap provides an appropriate starting point for the public task, comprising a detailed scale of the UK.

For the moment we find the proposed model – with MasterMap and non-OS Free products’ prices aligned for the public and private sectors – to be equitable. However, as the costs of updating maps and built environment detail fall (due to pervasive GPS feedback systems such as smartphones and cheaper satellite imagery allied to automated updating of map databases), this may need review to see whether more detailed scale products closer to MasterMap level can also be offered free.

Question 4: What are your views and comments on public sector information regulation and policy, and the concepts of public task and good governance as they apply to Ordnance Survey?
PSI regulation and policy suffer from the problem that where public organisations decline to comply with them, neither method of enforcement is satisfactory.
-If OPSI or other organisations demand compliance using non-legal recourse (e.g. asking for “good practice”), the non-complying organisation can ignore it; or
-if OPSI or other organisations seek legal recourse for compliance, the exercise is extremely costly for all concerned and is concluded so slowly due to legal process that private organisations in particular are at risk of going out of business first. (The instance of Getmapping’s complaint against OS in the early 2000s is illustrative.)

It is absurd that OS has written its own definition of its public task – with or without the consent of its minister in DCLG. With the release of OS Free, it is time for the job of defining OS’s public task, which impinges on huge parts of British life and the economy, to be put in the hands of a body entirely outside OS.

Question 5: What are your views on and comments on the products under consideration for release for free re-use and the rationale for their inclusion?
It is essential that there should be both raster graphics and vector graphics. The former allow easy use on websites to create Google Maps-style interfaces (where the map can be “dragged” to a location). The latter allow dynamic scaling. Though no rationale has been offered for their inclusion, they seem to fit the “mid-scale” requirement.

The inclusion of Code-Point and Boundary-Line datasets, with licences that allow free reuse (including commercial reuse), is essential to the creation of useful, effective and profit-generating applications.

Question 6: How much do you think government should commit to funding the free product set? How might this be achieved?
This is a key question – and how the government chooses to implement this will demonstrate whether it is truly committed to the idea that “information is infrastructure” by creating a model of funding that will be applicable to other data-collecting trading funds and parts of citizen-funded government, or if it is simply choosing a short-term fix for the problem of the desperate need for free access to OS data.

It is easiest to start by indicating what the government should not do.

– It should not raise prices within government for the non-free OS datasets above those charged to commercial organisations outside government. This would create tensions under which government organisations would naturally seek third-party solutions to reduce their costs (because of PD2). That would undermine income for OS and jeopardise the quality of all its data. In extreme cases, price rises might deter local authorities and other public bodies from using high quality geographic information to deploy their resources more efficiently and end up costing the public purse more in the long term.

– It should not raise prices for commercial organisations above those charged to government for the same datasets. This too will tend to exacerbate any drift to third-party solutions for high-value datasets. (Although it should be expected that these will occur naturally due to new entrants in the market.)

Therefore government should commit exactly the “funding gap” that making the datasets mentioned free will cause – apart from the paper maps. OS will presumably continue to sell paper maps, and will be able to rely on its brand to benefit from their sales and consequent profits. Therefore Treasury should fund the “gap” in revenues out of general tax funding, rather than by levying greater charges for OS data from other public sector sources.

The government’s own argument that “information is infrastructure” should be applied here. Roads, for example, are physical infrastructure. Government sees their provision as a public good and commits to fund their building from general taxation. It does not charge higher road tax prices to government-owned vehicles to offset the fact that government has built the roads and provides free access to them. (Nor is road tax hypothecated towards road-building.)

In the same way, other public sector organisations that use OS data should not be charged over the amount that private sector groups would be, and their payments should not be hypothecated towards any “funding gap”. The amounts being discussed – £19-£24m pa – are comparatively small when set against overall public spending.

The benefits, admittedly, are difficult to enumerate. It is possible that, as with GPS, the benefits will not be immediately visible, and may not appear in the same place as the investment. It would therefore be sensible for government to commission regular studies to evaluate the growth of business predicated on use of the OS Free products.

By adopting a “non-hypothecation” approach to funding OS Free, government will be greatly simplifying the process required for the subsequent release of other datasets from other government-owned bodies. The pressure to release OS Free arose because the trading fund model is too restrictive: it cannot prime the market.

To draw an analogy, the search engine Google could not be profitable if it were to use Microsoft’s Windows to power its multiple thousands of servers that store its index of the internet. It would have to pay a Windows licence on each of those servers, and for each additional one. The cost would outweigh its profits. Instead, Google uses the free Linux operating system for those servers. We are suggesting that using the OS trading fund model for products is akin to licensing Windows: it limits the size of the market for their use, and the speed with which companies can grow while using those products.

Question 7: What are your views on how free data from Ordnance Survey should be delivered?
The key to the datasets being useful will be (a) availability (b) reliability (c) accessibility.
Availability: where are the datasets stored? If OS hosts the files, it will need to create an entirely new system to support hundreds or thousands of concurrent accesses. That is inefficient, and outside OS’s remit. It would be more sensible for the datasets to be uploaded to a cloud facility such as Amazon’s S3 storage or Google’s cloud facility where copies could be downloaded. This is a comparatively low-cost solution where OS would only have to pay for downloads, rather than setting up its own hosting service.

Furthermore, it is clear that the datasets will be subject to change over time. It would be inefficient to upload a complete set every day, for example. A more effective method would be to upload a “diff” file of differences from the previous full upload every so often (daily, weekly, monthly). This would reduce the total amount that would be needed for an up-to-date download and simultaneously create new opportunities for applications showing what has changed on a map or dataset over time. A full dataset incorporating the diffs from the last full upload could be provided every, say, six months.

Reliability: so users can be confident that the files come from OS, they should be cryptographically signed.

Accessibility: the files should be made available in formats that are readable using open-source software: that will ensure that they will be usable by the widest possible range of users and applications.
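To illustrate the “diff” idea under Availability, here is a minimal sketch in Python. It is only an illustration: the record layout (a dictionary keyed by a feature id), the three-part diff structure and the function names are all assumptions made for the example, not anything OS has specified. It shows a diff being computed against the last full upload and then applied by someone who already holds that full upload.

```python
from typing import Dict

# Hypothetical layout: feature id -> attributes. Not an OS format.
Dataset = Dict[str, dict]


def make_diff(base: Dataset, current: Dataset) -> dict:
    """Describe how `current` differs from the last full upload `base`."""
    return {
        "added": {k: v for k, v in current.items() if k not in base},
        "changed": {
            k: v for k, v in current.items() if k in base and base[k] != v
        },
        "removed": sorted(k for k in base if k not in current),
    }


def apply_diff(base: Dataset, diff: dict) -> Dataset:
    """Rebuild the up-to-date dataset from the full upload plus one diff."""
    updated = dict(base)  # copy, so the downloaded full upload is untouched
    for feature_id in diff.get("removed", []):
        updated.pop(feature_id, None)
    updated.update(diff.get("changed", {}))
    updated.update(diff.get("added", {}))
    return updated


# Made-up sample records, purely for illustration.
full_upload = {"A1": {"name": "High Street"}, "B2": {"name": "Mill Lane"}}
todays_data = {"A1": {"name": "High St"}, "C3": {"name": "New Road"}}

diff = make_diff(full_upload, todays_data)
latest = apply_diff(full_upload, diff)
```

The diff itself doubles as a record of what changed, which is the kind of raw material a “what has changed on this map” application would need.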

Question 8: What are your views on the impact Ordnance Survey Free will have on the market?
Resellers of OS data will not be pleased. But this will force them to focus on value-added services rather than promulgating a system which perpetuates the extension of copyright limitations that are not sustainable in the age of the internet.

Some map providers have already cut their prices in response to the expectation of OS Free. As in Canada (in the example cited in the consultation) we should expect that mapmakers will take the opportunity to create specialised maps for different niche groups (climbers, walkers, and other outdoors pursuits are likely to be the first to take advantage of this).

The provision of Code-Point will galvanise a market that has been held back by the problems of creating fast, cheap and legal lookups for geocodes. Although organisations such as Yahoo offer them, using those leaves providers dependent on outside groups, when they would prefer to do their own lookup. Code-Point is an essential part of the package.

The provision of Boundary-Line will be highly important in the forthcoming election. It will also be important for online organisations which depend on mapping electoral constituencies.

Question 9: What are your comments on the proposal for a single National Address Register and suggestions for mechanisms to deliver it?
The absence of a working National Address Register (due mainly to intellectual property claims by publicly owned bodies) has created the absurd situation of the Office for National Statistics being forced to spend millions of pounds creating a one-off register for use in the 2011 census and then discarding it afterwards.

Government should retain the ONS census register for future reuse and treat it as a resource with huge ability to create value for the economy.

Question 10: What are your views on the options outlined in this consultation?
Option 1 – allowing OS to continue with its planned “hybrid” strategy – is deeply unsatisfying. The strategy proposal has received no proper oversight; it has not been debated in Parliament; its financial assumptions are at best weak and at worst flawed; and the creation of an “attached” private company that would sell OS-branded goods is anticompetitive because it offers no transparency on pricing, while having sole advantage of the OS brand.

Option 2 – releasing large-scale data for free reuse – would cross a Rubicon. Although the Free Our Data campaign would support this, we are concerned that government and Treasury have not shown sufficient commitment to the idea of vote-funded data collection and parsing by OS, and that this strategy could endanger the long-term future of OS. Furthermore, it could undermine would-be commercial competitors, and would create substantial upheaval in the geographic information market. Change is good, but too much change can be unpalatable.

Option 3 – releasing “mid-scale” data as suggested, and considering a transition to further release – seems to offer a path towards the long-term future of OS while providing the opportunity to prove the benefit that would accrue to the private sector, and thus the Treasury through tax receipts, of freeing data. We find this the most pragmatic approach – but reiterate that the government’s aim should be to pursue a path where it releases data for the use of citizens without cost impairment.

Question 11: For local authorities: What will be the balance of impact of these proposals on your costs and revenues?

Question 12: Will these proposals have any impact on race, gender or disability equalities?
We see no impact on those inequalities.

Charles Arthur & Michael Cross, 26 February 2010.

Impact assessment of making OS ‘mid-scale’ data free puts cost at 47m-58m pounds

Wednesday, December 23rd, 2009

What interesting reading the impact assessment of the DCLG consultation on making OS data free is. Clearly some arms have been twisted in the Treasury to make it happen – Liam Byrne, chief secretary to the Treasury, almost surely in the driving seat there.

On the option being chosen (which is explicitly not the one that was examined in the “Cambridge study”, which looked at the benefits of releasing large-scale data, not the “mid-scale” data being proposed) the cost seems to be that government costs rise somewhat, while costs to the commercial sector fall.

From the document (on the impact assessment page):


Lost OS revenue from OS Free data being made free: £19-24m (govt would fund this on a cost plus basis, amounting to £6-9m).

Increased government charges for large-scale data: £28-34m (price rebalancing based on number of datasets used by public and private sector).

One-off (Transition) Yrs: £ tbc

Average Annual Cost (excluding one-off): £47-58m

Total Cost (PV) £391-482m

Other key non-monetised costs by ‘main affected groups’ Transition costs to Ordnance Survey, government departments and businesses of moving to new model. There would be impacts on third party providers (see Competition Assessment, Annex 1).


Description and scale of key monetised benefits by ‘main affected groups’: gain to business and consumers from OS large-scale data being made cheaper: £28-34m if assume price rebalancing is revenue neutral.

Gain from OS Free data being made available: £19-24m.

Average Annual Benefit (excluding one-off) £47-58m

Total Benefit (PV) £391-482m

Other key non-monetised benefits by ‘main affected groups’: The lower charges to businesses and consumers for large-scale data, and the free data should increase demand and hence welfare. Entry and innovation should occur in the market for geographical information. These welfare benefits have not been quantified (Pollock report focuses on releasing large-scale data).

And finally:

Key Assumptions/Sensitivities/Risk: Modelling assumptions: some substitution from paid-for to free data; lost revenue by OS due to competition from new derived products. Not yet determined how the revenue shortfall will be covered from government (i.e. who will pay and how). So for now assume no change in demand, but will estimate this for the final IA.

Price Base Year 2009

Time Period Years 10

Net Benefit Range (NPV) –

NET BENEFIT (NPV Best estimate) £0
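As a rough check on how those figures hang together, here is a minimal sketch in Python. The annual range is simply the sum of the two cost lines quoted above. The 10-year total looks like that annual figure discounted back to present value; the 3.5% discount rate and the end-of-year discounting used here are my assumptions, since the excerpt does not state either.

```python
# Annual figures quoted in the impact assessment, in millions of pounds.
LOST_OS_REVENUE = (19, 24)            # OS Free data being made free
INCREASED_GOVT_CHARGES = (28, 34)     # price rebalancing on large-scale data

# Assumptions, NOT stated in the excerpt: 3.5% a year, discounted from year 1.
DISCOUNT_RATE = 0.035
YEARS = 10  # "Time Period Years 10"


def annuity_factor(rate: float, years: int) -> float:
    """Present value of 1 pound a year, paid at the end of each year."""
    return sum(1 / (1 + rate) ** t for t in range(1, years + 1))


annual_low = LOST_OS_REVENUE[0] + INCREASED_GOVT_CHARGES[0]
annual_high = LOST_OS_REVENUE[1] + INCREASED_GOVT_CHARGES[1]

factor = annuity_factor(DISCOUNT_RATE, YEARS)
pv_low = annual_low * factor
pv_high = annual_high * factor

print(f"Annual cost: {annual_low}-{annual_high}m pounds")
print(f"Present value over {YEARS} years: {pv_low:.0f}-{pv_high:.0f}m pounds")
```

Under those assumptions the sums reproduce the quoted annual and present-value ranges. Because the benefits are assumed to equal the costs exactly, the net benefit comes out at zero, which is why the best estimate is £0.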

Hurrah! Ordnance Survey consultation is live!

Wednesday, December 23rd, 2009

Thanks to a little bird at an interested organisation, we now know that the DCLG has opened its consultation on OS data.

It’s at, where we learn that the closing date is 17 March 2010 (and the opening date is today, 23 December 2009).

Consultation paper on the Government’s proposal to open up Ordnance Survey’s data relating to electoral and local authority boundaries, postcode areas and mid scale mapping information.

The consultation document itself weighs in at 2.2MB of PDF and 91 pages.

As ever, let us know your thoughts.

Update: and don’t miss the Impact Assessment paper – here’s the PDF of the Impact Assessment – which for some strange reason isn’t linked from the main page.

Tim Berners-Lee to help UK government build single data access point

Thursday, October 29th, 2009

Computer Weekly reports that Tim Berners-Lee has been asked by the government to develop a single point of access for public data – as Stephen Timms, who has taken over where Tom Watson left off in the Cabinet Office, reports progress in “making public data public” (a concept that, when you think about it, seems a bit strange – as in “shouldn’t that have been done from the outset?”).

According to Computer Weekly, Timms told an RSA/Intellect event that

information is the “essential raw material” of a new digital society. “Government must play its part by setting a framework for new approaches to using data and ‘mashing’ data from different sources to provide new services which enhance our lives. In particular, we want government information to be accessible and useful for the widest possible spectrum of people.”

Well, minister, if that’s truly what you want, then you’ll make it free of charge, and free of copyright restrictions. It’s as simple as that. Could we suggest something like Creative Commons? The US government seems to find it amenable.

Timms said, “We are supporting Sir Tim in a major new project, aiming for a single online point of contact for government data, and to extend access to data from the wider public sector. We want this project for ‘Making Public Data Public’ to put UK businesses and other organisations at the forefront of the new semantic web, and to be a platform for developing new technologies and new services.”

Fine words. We’d like some actions to go with them. We’re hearing plenty of sticks being wielded over how people use the net – Lord Mandelson’s threats to file-sharers, for example – but the carrots for companies to build on something that really would benefit Britain, by using British data, seem to be stuck on a really slow train.

Part of the problem, of course, is that it’s almost impossible to put a figure on how much opportunity is lost through the lack of access to this data – whereas the music industry can much more easily point to figures it’s produced (though you may argue about their provenance) to suggest precisely how much harm it’s suffering through untrammelled downloading.

Interesting to contrast, though, that when we asked the Royal Mail to specify precisely how much harm it was suffering through the use by of the postcode to lat/long conversion, it robustly declined to say.

Of course there is the Cambridge trading funds report, with its analysis of the opportunity cost of the trading funds regime. But this goes much wider – the Cambridge analysis didn’t look at the Royal Mail and postcodes, for example, which have become embedded into many systems’ location processing.

Computer Weekly again:

So far, 1,300 people have signed up to the developer forum and contributed to the discussion board on what the data could be used for. The Cabinet Office also held a developers’ camp where ideas were shared.

We’ll have more about the devcamp in a future post.

Sounds like a good idea: Sir Tim Berners-Lee goes to Downing Street to talk open data

Tuesday, September 15th, 2009

Well, Sir Tim Berners-Lee (he invented the web, you know) seems to be getting stuck in. He has gone to Downing Street along with Nigel Shadbolt (whose name always reminds me of a Harry Potter character – apologies: he’s actually professor of artificial intelligence at the University of Southampton) to talk to Gordon Brown.

About what?

Mr Berners-Lee and Mr Shadbolt presented an update to Cabinet on their work advising the Government on how to make data more accessible to the public.

Gordon Brown has already spoken publicly about his aim of making the UK a world leader in opening up government information on the internet, an important element of Building Britain’s Future.

He could have asked us. We’d have told him back in 2006. Or 2007. Or 2008.

Sir Tim Berners-Lee told Cabinet about the goal of delivering a single online access point to Government information, similar to the one introduced by the Obama administration in the US.

Don’t we sort of have that already through the work of OPSI and its data portal? Sometimes it seems like the work of Carol Tullo and John Sheridan et al has just been swept down a plughole – or perhaps memory hole, a la 1984.

He also spoke about proposals to extend the “open data” approach, ensuring greater transparency in government and improving the efficiency of public services.

It would be interesting if the “efficiency of public services” meant “to stop different bits of government squabbling over the data they collect like children in a playground and instead start to share it freely, rather as we adults advise children to do so they can discover the benefits of sharing”.

But there’s a suspicion it’s really code for “cut public services while saying what’s being cut will be replaced by something else at some time in the future”.

The Government hopes the data project will benefit the UK by creating jobs, driving new economic growth and allowing the re-use of government data to encourage the development of new, innovative information-based businesses and services.

Hold on just a moment there. The government hopes all these things, does it? Is that because it’s taking the Cambridge study seriously, and looking at its potential benefits to the economy? So we’re not going to see terrible approximations like the OS’s “hybrid” strategy, then?

It is also expected to help increase the transparency of government and empower citizens to get more out of public service by tailoring it to their needs.

What I don’t like here is the description of it as a “data project” as though it were something that sat apart from what should actually be a process – and a core process at that. It shouldn’t be “what part of this data shall we release” but “is there any of this that shouldn’t be released?”

After the update from Sir Tim and Professor Shadbolt, The Prime Minister confirmed his full support for the next phase of their work.

It would be nice to know what that next phase included. Anyone seen a copy of the timetable?

OS chairman’s speech: internal study shows “free” OS would cost government 500m-1bn pounds – but won’t publish

Thursday, May 14th, 2009

The following is the text – as captured in shorthand contemporaneously – of a speech by Sir Rob Margetts, chairman of Ordnance Survey on Tuesday May 12. It is not complete but does capture the major themes and quotations.

The context is that Sir Rob was explaining to an invited audience, including many existing customers of OS, how the new “hybrid” strategy had been determined as the best one for its future development. He took some pains to emphasise that the “free data” model had not been rejected out of hand; but that instead a special study had been commissioned to investigate it.

These are my shorthand notes of what was said. My own comments are at the end.

There were major issues affecting the sustainability of OS as it goes through its proposed strategy.

We examined the complete range of options very impartially and objectively. That includes the free data, utility model where you would make data available to anybody [for free]. We examined the fully commercial model.. and alternatives within that range.

Our study of the utility [free data] model was done because some hold that that is a good strategy, and some of us weren’t indifferent to it. Some [of the study team] going in thought it could be interesting.

The study was fully costed for the government, calculating the costs of change to the residual value.

We came to conclusion that the cost to government in the first five years would be between £500m and £1 billion. That wasn’t the only reason that we discarded it. We did, with outside help, a review of equivalent organisations around the world.

We wanted sustainability and high [data] quality and came to the conclusion that at nearly every organisation that had gone to free data model, the quality had declined and that users and customers were increasingly dissatisfied with the product.

And the attractiveness to staff and recruitment and retention had also reduced. We found no evidence that this model actually worked elsewhere.

Those that work had a user-pays model. We tried to understand and explain why. Think that comes to the responsiveness to needs of the organisation. [ie: the responsiveness of the organisation to needs.]

If customers are required to pay then they specify needs very clearly and give feedback on whether they have got value [for money].

Customer stimulation is a vital part of any organisation because it’s sustainable.

And of course [there’s] recruiting and retaining quality staff.. they want to work for a quality organisation and respond to real customer needs.

That’s why we didn’t pursue [the free data model] but can affirm that we looked at it in detail.

We also looked at a fully commercial model but weren’t satisfied it would fulfil the fundamental strategy [for OS].

We believe use [of geographical data] has expanded dramatically and changed.. but that potential is still considerably underexploited.

Our No. 1 aim is to improve capacity of OS to assist the exploitation of geographic information and be one of fundamental enablers of that [exploitation] in the UK for social and individual benefit.

With the proviso that by doing that we have to keep a sustainable organisation that not only covers its costs but also has enough left over… about £20m per annum.. to invest in the products that the market needs for customers, whether private individuals or business enterprises.

Commentary: Well, we’re fascinated to learn that OS found that there’s absolutely nobody out there who is making a free data model work. We have already emailed the South African mapping organisation, about which we wrote in 2007, to find out whether they were contacted by OS, and if so what they told them.

We will also pursue Freedom Of Information enquiries to find out which organisations OS spoke to and what their responses were. Since these are all free data models, there can’t be any commercial confidentiality for the foreign organisations, can there?

The “£500m – £1bn” range is extremely wide, and we’d like to see the detailed working. I asked the minister with responsibility for OS, Iain Wright, who was there, if he would order OS to release its full study. He said that if there weren’t any commercial-in-confidence implications… I wonder if we’ll see it? Again, we’ll ready some FOI requests.

There were questions at the end, and one interesting one came from Bob Barr, who pointed out that there is always the possibility of “pay to change” – that when you have a database of 460m features with (to give the statistics that Vanessa Lawrence, OS’s chief executive, read) 5,000 changes daily, why not charge those who are changing it? (We’ve looked at that model before, though I would like to see some more recent Land Registry figures.)

Here’s the question as I recorded it.

Robert Barr: “this hybrid financing.. it seems to be today that payment will be at the point of use. Usually [in other online systems] there’s a model where you pay to change the database. Doesn’t it make sense for data to be paid for where you change it?”

Peter ter Harr of OS: “This is a model we have been looking at. There are advantages and disadvantages. It’s not always the user who pays [in the current model]. There are many OS products which are free at the point of use. It’s the information provider who puts it online who pays. We have been looking at the model in various other countries. It works well in cases where it’s part of the statutory process.”

And that’s it? We really, really need to see that OS internal study, as it contradicts pretty much every study that’s been published. It’s going to be fascinating tracking it down.

One other thing: the cost to the government isn’t quite the same as the benefit to the economy, nor the eventual benefit to the government through taxation. It was the latter (actually, both) that the Cambridge study looked at. We are perfectly happy to generate tweaked versions of the “free data” model that could keep OS charging for some products (such as MasterMap) while freeing other data sets. Now that would be a truly hybrid model.

If anyone has had sight of that OS study, or any part of it, do please drop me an email. Or upload it to Wikileaks and let us know. We think it’s so important it ought to be out there, not locked away in an OS cupboard.

A quick roundup to start the new year

Thursday, January 1st, 2009

Hope you’ve all come through the new year without suffering too many leap-last-year problems. I thought it would be interesting to round up a few things that I’ve seen but not really had enough brainpower to turn into anything more than notes.

First, Public Data Sets stored on Amazon Web Services. (Via Richard Allan.) An interesting idea: got public datasets? Well, why not get them stored somewhere really cheap where people can access them but you only pay per download. It’s the ultimate outsourcing, and you also get to see how many people are downloading it without the capital costs of the servers.

Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.

It’s already got the Human Genome and the US Census data. The idea of hosting UK public datasets on AWS was floated in the Cambridge Economics report released with the Budget back in March. Any takers?

Second, Municipalities open their GIS systems to citizens (thanks, Gerry Gavigan, who points out that “As well as innovation and the other usual unexpected benefits, it points to the existence, alas without quantification, of financial benefits.”) The article explains:

For instance, the online burning permit sales service of the Minnesota Department of Natural Resources (DNR) allows citizens to declare precisely where they would like to burn woody debris. High precision is essential in deciding whether a permit is obtainable, as well as when and under what conditions: if there is a high fire risk in the area and day for which a user asks for a permit, the software must refuse it. The Web site, however, makes it easy to enter the location with the greatest possible resolution: users first type an address into a form to get an approximate location on the map, then zoom at will and finally click on the exact spot for which they are applying for a permit.

And, more pertinently:

The success of initiatives like OpenStreetMap or the availability of Yahoo! and Google Maps APIs may make you think that people may create services like these and many more all by themselves, without getting any bureaucrat involved. However, in order to benefit the most from digital maps and other spatial data, citizens need such data to be officially inserted in, and completely integrated with, the maps and databases public administrations use to plan roads, zoning, and everything else.

Citizens may use Web sites like those mentioned here to request services as different as bus stops, trekking permits, or new post offices. Other uses may include signalling construction abuses, damages to public property, or illegal dumpsters. We may draw our preferred public bus routes on a map in our City Council official Web site.

Of course, to make all this work in practice, public administrations should also clarify the data ownership situation. Who owns data directly and freely provided from citizens? What license should apply to those data or any derived ones? This, however, is a separate issue, not really related to open source software.

And finally, some interesting questions are being asked in Parliament by John Howell (of the Tories) about Ordnance Survey income from local authorities, and on use of OS vector data for commercial use (and, previously, about discussions between OS and Google over mapping licences; Ed Parsons, formerly OS and now Google, says the minister’s answer is wrong); while Mike Gapes, a Labour MP, has asked about London mapping payments to OS and payments to OS for use of its data by various government authorities.

We’d be interested in any comments on what Gapes and Howell are trying to unearth here… and of course your comments on anything else. And happy new year!

APPSI to examine free data model, it says

Wednesday, June 11th, 2008

The Advisory Panel on Public Sector Information has put out its latest annual report. APPSI, you’ll recall, is now headed by David Rhind, the previous head of the Ordnance Survey (who also testified to the Treasury on the next census). But this report was signed off by Richard Susskind.

Kable has a short story on it:

Outgoing chair of APPSI, Professor Richard Susskind, said: “2007 was a pivotal year for the UK in relation to the re-use of PSI. Above all, we saw a marked increase across central government in the level of debate over the re-use of PSI. In particular, APPSI warmly welcomed the growing interest amongst ministers.”

The annual report itself is more interesting: re the Cambridge report, it says

APPSI can already confirm, however, that we welcome the tone and rigour of the Cambridge study – it is the kind of detailed and systematic economic analysis of trading funds and PSI re-use that we have been recommending since 2003; and we hope this represents the beginning of a new era of open and sophisticated thinking about the economics of PSI.

That’s encouraging. More transparency helps. And as for pressure upwards:

We intend, more frequently than we have in the past, to provide practical briefings to our Minister at the Ministry of Justice. These will cover key issues such as evidence, statistics and data relating to the impact of PSI; the governance of PSI and principles underpinning its re-use; the enforcement of the PSI Regulations; models and case studies clarifying the economics of PSI; the findings of ongoing horizon scanning by APPSI; and the adequacy and scope of information management activities across the public sector.

Basically, much more focus on the economics of PSI. It also says it will “follow up” on progress from the recommendations of the reports into PSI such as The Power Of Information and the Cambridge study. And to that end…

To stimulate and widen debate about the future exploitation of PSI, we will conduct an initial inquiry into the implications of introducing a regime under which public bodies would be subject to some kind of obligation to make their PSI available for re-use. There is no such obligation today under the PSI Regulations.

Which is telling, isn’t it?

Government answers (cagily) on free data questions

Tuesday, April 22nd, 2008

The Green Party has been doing some sterling work trying to get the government to answer questions arising from the Cambridge economics study into the potential benefits of making data sets available for free.

We draw your attention to the exchange in the Lords (where the Greens have a representative, Lord Beaumont of Whitley) over this. Though you may find the answers uninspiring, at first:

Lord Beaumont of Whitley asked Her Majesty’s Government: Whether they intend to make the Ordnance Survey’s MasterMap available free of financial or legal restrictions. [HL2714]

The Parliamentary Under-Secretary of State, Department for Communities and Local Government (Baroness Andrews): As announced in the Budget, the Government will look closely at public sector information held by trading funds including Ordnance Survey, to distinguish more clearly what is required by government for public tasks and ensure that this information is made available as widely as possible for use in downstream markets. In the lead up to the next spending review, the Government will ensure that information collected for public purposes is priced so that the need for access is balanced with ensuring that customers pay a fair contribution to the cost of collecting this information in the long term. In the mean time Ordnance Survey will continue to generate the revenue it requires to cover its costs, to fund investments and to provide a return to government, from sales of paper mapping and from licensing use of the Crown copyright and Crown database rights in its data, including OS MasterMap.

This is the same “customers pay a fair contribution” line we’ve been hearing since the report came out (first used by DBERR, as we recall).

Of course the key to really good questions in Parliament is to ask something that the government can’t disagree with, but which isn’t part of its policy at present – because that puts it into the logical bind in which it should take up that policy. The case of economic advice given to the government is the classic one. (But you hear it all the time at Prime Minister’s Questions – the ne plus ultra of this game – in which opposing MPs try to expose gaps like this.)

Undeterred, Lord Beaumont pressed on soon afterwards (April 3):

Lord Beaumont of Whitley asked Her Majesty’s Government:

  • Whether they intend to make unrefined information held by trading funds available free of legal and financial restrictions, as recommended in the recently commissioned study, Models of Public Sector Information Provision via Trading Funds; and [HL2713]
  • Why certain financial information in the Ordnance Survey and United Kingdom Hydrographic Office sections were redacted as confidential in the recent Models of Public Sector Information Provision via Trading Funds report; and [HL2715]
  • Whether they intend to make the Met Office’s unrefined information available free of legal and financial restrictions, following the finding of the recently commissioned study Models of Public Sector Information Provision via Trading Funds (p76) that this would provide a net benefit to society of £1.03 million; and [HL2733]
  • Whether in light of the finding of the recently commissioned study Models of Public Sector Information Provision via Trading Funds that there is a net benefit to society if trading funds release unrefined information at marginal cost, they will review the financial and legal restrictions on all unrefined information held by public bodies. [HL2734]
  • Whether they will define the “public tasks” for trading funds in order to identify information that should be released free of financial and legal restrictions, as highlighted in the recently commissioned study Models of Public Sector Information Provision via Trading Funds (chapter 3, paragraph 3.49). [HL2735]

Lord Davies of Oldham:

As announced in the Budget, the Government will look closely at public sector information held by trading funds to distinguish more clearly what is required by Government for public tasks and ensure that this information is made available as widely as possible for use in downstream markets. In the lead-up to the next spending review the Government will ensure that information collected for public purposes is priced so that the need for access is balanced with ensuring that customers pay a fair contribution to the cost of collecting this information in the long term.

For central government bodies other than trading funds, the clear policy is that raw information should, subject to any statutory provision, be freely available or provided at the marginal cost of dissemination.

In drafting the report Models of Public Sector Information Provision via Trading Funds, Cambridge University relied on the co-operation of and provision of data from trading funds. Some of the financial data provided was commercially confidential and was therefore not published in the report. This did not alter the overall conclusions of the report.

Tom Chance, a coordinator for the Green Party (who helped draft the questions), notes on his blog that the commitment to make information freely available “subject to any statutory provision” is encouraging:

That’s good to know, and backs up the Green Party’s case for making it accessible as well, e.g. Parliamentary procedure in an open, machine-readable format rather than plain HTML, or key data on domestic energy use in one place as a canonical source rather than being scattered across different sections of government departments (Defra, BERR, CLG, etc.)

One other thing he remarks on (about the Guardian’s mapping):

It would be nice if we could supply those guys with a decent set of OpenStreetMap graphics for use in articles rather than using non-free sources too!

It’s actually an avenue we’re exploring – I’ve swapped emails with Steve Coast, who says that a change in the licensing for OSM may make this possible in the near future. It would certainly cut some of our mapping costs.

If you get free data, what will you do with it?

Saturday, April 19th, 2008

Our challenge to you: if you get free data, what will you do with it?

The question has some urgency because if you can think of what you’d like to do with data from the Land Registry, Companies House or the Met Office, then you could be in line to be the first to benefit from it – and show the benefits of making more data free.

The Cambridge report noted three sets of data that could be made free with minimal revenue impact: Land Registry, Companies House, and Met Office.

Let’s revisit them, so you can think what to do with them.

For Land Registry, the analysis only looks at “Property Data Services” – which are the ‘Property Price Data’ and ‘Polygons’, whose respective revenues were £893k, the majority of which was from a bulk form of the product, and £405k. (That compares to Land Registry’s total revenues of XXX, 86% of which comes from compulsory registrations.)

Those, it should be noted, are tiny compared to its revenues. Land Registry’s fee income in 2006/7 was £474million (in 2005/6, £395m). Its costs are very high, but it still had an operating surplus of £96m – nearly as large as Ordnance Survey’s entire revenues.

The reason: you’re obliged to tell LR when you buy or sell land or put a charge (such as a mortgage) on registered land.

The property data could surely be used for some imaginative analysis – though note that Land Registry bans the use of its data for unsolicited mailshots. (An interesting question is how, if one moved to a free data model, one would spot uses which broke rules like that. Would you drop the rule, or include intentionally fake data which would tip you off if you received a mailshot addressed to it?)
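For the curious, the "intentionally fake data" idea can be sketched in a few lines. This is only an illustration under assumptions of my own, not anything Land Registry is known to do: the function names, the `secret` value and the "Canary Lane" address format are all made up for the example. Each recipient of the data gets a copy seeded with one fake entry unique to them, so a mailshot arriving at that address points back to whoever received that copy.

```python
import hashlib

def canary_address(licensee, secret="change-me"):
    """Derive a fake address that is unique to one licensee (hypothetical scheme)."""
    digest = hashlib.sha256(f"{secret}:{licensee}".encode()).hexdigest()
    house_number = int(digest[:4], 16) % 200 + 1
    # The hex suffix keeps addresses for different licensees distinct.
    return f"{house_number} Canary Lane {digest[4:8].upper()}"

def seed(records, licensee):
    """Return a copy of the dataset with one fake record added for this licensee."""
    fake = {"address": canary_address(licensee), "canary": True}
    return records + [fake]

def trace(mail_address, licensees):
    """Return the licensee whose copy contained this address, or None."""
    for licensee in licensees:
        if canary_address(licensee) == mail_address:
            return licensee
    return None

# Example: a mailshot turns up at the fake address planted in one recipient's copy.
recipients = ["Acme Mailers", "Bravo Data"]
copy_for_acme = seed([{"address": "1 Real Street", "canary": False}], "Acme Mailers")
print(trace(copy_for_acme[-1]["address"], recipients))
```

The catch, of course, is that this only works while somebody is monitoring the fake addresses, and it says nothing about whether the rule itself is worth keeping.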

Companies House is next: £72m revenue, again almost all from obligatory registrations. (Although search is the most profitable area – as you’d expect: it’s easier to search data than to accept and check it.) Unfortunately most of the data there is marked confidential in the analysis – but again, one can imagine that it might be useful to find people who are persistent directors of companies that aren’t acting lawfully…

Finally there’s the Met Office. Can anyone think what you’d do with a lot of weather data?

More analysis and suggestions of how to use these three organisations’ output data – if it were free – are welcome.

Cambridge economics report (briefly) debated in Parliament

Wednesday, April 2nd, 2008

We note (via theyworkforyou’s excellent email alerts system) that the issue of trading funds – and specifically, the Ordnance Survey model – was briefly debated in the Commons yesterday.

The main protagonists: Iain Wright, who says he is

the Minister for Ordnance Survey responsible for the shareholder relationship between the Department and the agency, dealing with strategic and day-to-day issues arising in connection with its activities, particularly in terms of financial and Government matters. My ministerial colleague the noble Baroness Andrews leads for the Department on issues relating to the purchase of Ordnance Survey products and services.

Robert Key (Con, Salisbury) asks:

there is continuing confusion between [OS’s] public duty and the private competition that it has to have as a trading fund. The pan-government agreement, which regulates how different Government Departments and agencies use Ordnance Survey, came to an end yesterday. We have no news of what is going to be put in its place, so will he tell us? When will the regulatory framework be updated and amended to bring an end to all this confusion, which is getting in the way of Ordnance Survey’s excellent work?

The reply:

In respect of his important point about the pan-government agreement, that was established, as he is aware, to ensure that the Government have access to mapping data in order to develop and implement policy at a reasonable price. We are looking into that, and I will update the House accordingly.

Then more interestingly from David Taylor (Lab, NW Leicestershire):

There is an argument that [OS] information should be made more freely available, free of charge. Has he read the book which was published alongside the Budget, “Models of Public Sector Information via Trading Funds”—quite a racy read—and which rebuts the claim that a move to free data would damage the work of Ordnance Survey? It should be made freely available to citizens of this country, and that can be done in a way that produces funds rather than absorbs them.

The minister’s reply:

As a fellow accountant, I can imagine that I would find it racy as well. [Ah, Parliamentary humour – CA] My hon. Friend raises an important point about the provision of data. He said that Ordnance Survey breaks even as a trading fund. In fact, it provides about £6.2 million in surplus that is then passed back to the public purse via dividends. That is to be encouraged. The business model, with changing market conditions and technology, is being considered and, as Minister with responsibility for Ordnance Survey, I will continue to do so. [Emphasis added – CA]

Interesting: that the Cambridge report is now getting debate time, that the free data model is being considered, and that the minister responsible is considering the business model “with changing market conditions”.

Could a surcharge on planning applications fund free data from Ordnance Survey?

Tuesday, April 1st, 2008

Since we’re playing with hypothecated taxes – which we should point out is not a central tenet of the Free Our Data campaign; we think government should fund this centrally, but recognise that in tight economic times it’s hard to persuade government to spend more rather than doing revenue-neutral “experiments” – let’s examine the suggestion (made elsewhere) that OS’s free data should be funded by a surcharge on planning applications.

After all, the logic goes, planning applications are voluntary; and they require OS detail. (One has to submit a number of copies of OS maps of the area the application applies to.)

[A disclosure up front: I have a planning application in process. However, the following simply emerges directly from the available statistics.]

Let’s recall the amount that needs to be raised, according to the Cambridge study, to compensate OS for the loss of revenues from selling its “raw” or “public task” data: £12m – £30m.

Now, how many planning applications were there for the UK? For England and Wales, the total applications received looks like this:

This is only for England and Wales, of course, but shows that planning applications, even at their busiest, are an order of magnitude less than Land Registry transactions. True, this excludes Scotland, but even if that has as many planning applications as England and Wales (which seems unlikely), that would only be a total of about 1.3m applications in the busiest time.

That in turn would imply a surcharge per application received of roughly £10 to £30. Whether that’s small or large compared to the cost of the overall transaction (which might be putting up an extension on a house or a change in use, requiring no actual building) isn’t clear.

(Locally, making an application costs between £135 – £265. It may vary in other locations. The surcharge would also have to be collected from multiple councils, which would impose an administrative overhead; by comparison, there’s only one Land Registry.)

But as an actual amount per transaction, it’s comparatively large, meaning it would be very sensitive to variations in the number of applications. The figures there vary from 501,000 to 689,000 – a 27% or 38% variation, depending which you take as your denominator.
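To see how sensitive the surcharge is, here is a rough sketch of the sums. It uses only the figures quoted in this post (the £12m to £30m shortfall, the 501,000 and 689,000 application counts, and the generous 1.3m guess for the whole UK); the function and variable names are mine, and a flat surcharge per application is an assumption.

```python
OS_SHORTFALL = (12_000_000, 30_000_000)  # pounds, range quoted from the Cambridge study

def surcharge_range(applications):
    """Surcharge per application (pounds) needed to cover the low and high shortfall."""
    low, high = OS_SHORTFALL
    return low / applications, high / applications

def swing(quietest, busiest):
    """Gap between quietest and busiest year, as a share of each in turn."""
    gap = busiest - quietest
    return gap / busiest, gap / quietest

scenarios = {
    "England and Wales, quietest year": 501_000,
    "England and Wales, busiest year": 689_000,
    "Generous UK-wide guess": 1_300_000,
}
for label, count in scenarios.items():
    low, high = surcharge_range(count)
    print(f"{label}: £{low:.2f} to £{high:.2f} per application")

of_busiest, of_quietest = swing(501_000, 689_000)
print(f"Swing: {of_busiest:.0%} of the busiest year, {of_quietest:.0%} of the quietest")
```

Only the generous UK-wide guess gets the surcharge down towards the bottom of the range; on the England and Wales counts alone it is noticeably higher.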

Why might this be? Perhaps Land Registry transactions include commercial purchases, which occur more frequently. And also that people buy and sell houses more frequently than they do things to them. Either way, it doesn’t look like the optimum way forward. It’s certainly a lot more expensive – per transaction – than a Land Registry surcharge.

[Table: planning applications received (thousands), by year or quarter; figures not reproduced. * provisional. Source: DCLG]

Land Registry surcharge could fund free OS data surprisingly cheaply

Thursday, March 27th, 2008

One suggestion that has been made by Robert Barr (of Manchester Geomatics) and echoed recently by Ed Parsons on his blog (though I think Ed came up with it independently) is that Ordnance Survey’s non-refined data (that is, the stuff it does as part of its public task, which the Cambridge economics study of trading funds interpreted to be its MasterMap and Large Scale Topo) could be made available for free by making up any funding shortfall from a surcharge on Land Registry transactions.

The reasoning: most LR transactions involve OS mapping.

According to the study, that would cost between £12m and £30m in foregone revenue.

So how much would you have to add to Land Registry transactions to make up that amount? It sounds like an awful lot of money to generate.

Here are the figures I’ve culled from the Land Registry’s performance data for the past three years on the number of transactions.

Number of registrations | 2004/5 | 2005/6 | 2006/7 | Mean 2004/5 to 2006/7 | As % of total
first registrations | 297,405 | 309,609 | 304,391 | 303,802 | 4.3
discharges | 2,486,875 | 2,502,318 | 2,605,620 | 2,531,604 | 35.8
mortgages | 2,680,128 | 2,627,999 | 2,723,530 | 2,677,219 | 37.9
transfers for value | 1,378,200 | 1,270,867 | 1,480,819 | 1,376,629 | 19.5
leases | 167,234 | 173,610 | 197,546 | 179,463 | 2.5
Total | 7,009,842 | 6,884,403 | 7,311,906 | 7,068,717 | 100
Total w/o discharges | 4,522,967 | 4,382,085 | 4,706,286 | 4,537,113 | 64.2

With millions of transactions, it looks like raising £12m – £30m wouldn’t actually be too hard. “Discharges” are the ending of a claim to a legal title – generally, though not always, the end of a mortgage. They attract no fee at present. Other LR charges range from £2 (for a search) to £700 (for first non-voluntary registration of a pricey parcel of land). Most of the charges, though, are £20 – £40 and upwards.

So raising the £12m that the trading funds report suggests OS would lose, solely from non-discharge transactions, would mean adding £2.65 to the cost of each LR transaction.

If we take the loss in revenue to OS as £30m, then it means adding £6.61 to each transaction. It’s not more than the cost of any transaction (except searches – which aren’t the same as the “searches” one does when buying a house; those go through your local authority), and compared to the cost of the typical transaction – say, the average £180,000 house purchase – it’s peanuts.
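For anyone who wants to check the sums, here is a minimal sketch using the three-year means from the table above. The names are mine, and it assumes a flat surcharge spread evenly across every fee-paying transaction type. With plain rounding it comes out at about £2.64 and £6.61, within a penny of the figures quoted above.

```python
# Mean annual Land Registry transactions, 2004/5 to 2006/7, from the table above.
MEAN_TRANSACTIONS = {
    "first registrations": 303_802,
    "discharges": 2_531_604,
    "mortgages": 2_677_219,
    "transfers for value": 1_376_629,
    "leases": 179_463,
}

def surcharge_per_transaction(revenue_to_replace, transactions):
    """Flat surcharge (in pounds) needed on each fee-paying transaction."""
    return revenue_to_replace / transactions

# Discharges attract no fee at present, so leave them out of the chargeable base.
chargeable = sum(n for kind, n in MEAN_TRANSACTIONS.items() if kind != "discharges")

for shortfall in (12_000_000, 30_000_000):
    per_transaction = surcharge_per_transaction(shortfall, chargeable)
    print(f"£{shortfall / 1e6:.0f}m shortfall: £{per_transaction:.2f} per transaction")
```

Swapping in a different base (say, transfers for value only) is a one-line change to the filter, which makes it easy to see how the surcharge climbs as the base shrinks.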

Right – that’s the analysis done. Now we just need to find a minister who is in charge of Land Registry and Ordnance Survey and can tweak the legislation (it doesn’t need primary legislation, surely?) to make these changes. And we’re done.

This analysis also appears (without the fun table) in today’s Guardian: Land Registry holds key to free OS.

Trading Funds report first glance: economists, start here

Wednesday, March 12th, 2008

As the report was written by two economists (Professor David Newbery and Rufus Pollock) and a professor of law (Lionel Bently), all of Cambridge University, it’s not surprising that it contains a lot of economic calculations – the sort that require at least A-level maths to feel comfortable with. (Do we all still feel comfortable? Good.)

We like the start:

The contents of this document may be reproduced free of charge in any format or medium provided that it is reproduced accurately and not used in a misleading context. The material must be acknowledged as Crown Copyright and the title of the document given.

So, just to be clear, the 154-page report is called “Models of Public Sector Information Provision via Trading Funds”. (Please note: page numbers given here refer to those in the PDF, which often differ from the printed form.)

We began at the end, with the appendix “A General Argument for Selling Public Sector Products at Marginal Cost”. This is a pretty important part of the argument: why should the government give away stuff rather than selling it for a profit?

A crux point from the appendix which looks at pricing at marginal cost (p139):

taxing public production (by the difference between price and marginal cost) is inefficient if the production is an input into production, and unlikely to be part of an optimal commodity tax system when sold as a final good.

That is (to simplify again from the economics), if the public data get used to generate something else that is then used in the private sector, charging for them isn’t the most efficient way of raising tax revenue.

They add (p139):

Certainly it is hard to believe that taxing any PSI products would increase consumers willingness to undertake taxed labour activities, or that reducing their price would lead to an increase in leisure at the expense of paid employment.

As shown by the government’s 2003 Green Book, the authors say (p140):

The UK Government attaches importance to the distributional consequences of its actions, many of which are justified by the beneficial impact they have on distributional outcomes.

A key question then becomes the “marginal cost of public funds” – how much it costs the private economy to raise each extra £1 that is spent in the public sector. A 1992 study noted that (p146):

‘The MC(P)F ultimately depends not just on the tax, but also on the nature of the government expenditure under consideration.’ This is a particularly salient point in the case of government revenue subsidising trading funds in order to offer below average cost pricing. As an example, the lower cost of trading fund data may lead to greater innovation.

Which would mean? (p146-7)

On the one hand this could result in higher corporate incomes, which would contribute to subsequent higher government revenues and hence a lower MCPF.

(This is the Free Our Data argument.)

On the other hand the lower costs of trading fund data may be passed onto lower final goods prices. This case would leave the public with more income to spend on other goods and services, and could weaken incentives to supply labour. This time the lower government revenue would raise the MCPF.

(I have to admit I don’t follow the logic of the second sentence, unless it is that extra income to spend on other goods and services does not lead to extra government income because the same amount of money is being spent – all you’ve done is shift some spending from trading funds goods to other goods, without expanding the economy.)