Free Our Data: the blog

A Guardian Technology campaign for free public access to data about the UK and its citizens


A spoke in the wheels of Land Registry transaction fees to pay for Ordnance Survey?

March 4th, 2010

An interesting debate on Wednesday about the prospect of redundancies at Land Registry brought Michael Wills of the Ministry of Justice – the minister also in charge of the (trading fund) Land Registry – to the chamber.

It’s a long debate, but here’s the nitty-gritty:

Transaction levels, which are the key factor for the Land Registry, will have fallen from 16.1 million in 2007-08 to a projected 10 million in 2009-10. The Land Registry receives no central funding because it is a trading fund; it depends on the fees that it receives for services rendered. It made a loss of £130 million in 2008-09 compared with a surplus of about £70 million in 2007-08.

Very interesting – and this comes against a backdrop where it’s being suggested that government is being charged too little for OS data (compared to the private sector), and where there have been suggestions (hi, Robert Barr) that a surcharge should be added to LR transactions in order to fund OS.

I think this really needs to be costed. It would be interesting too to see how much Land Registry pays to OS…

What should be done with Royal Mail’s PAF? (From 1999)

March 2nd, 2010

This letter first appeared in the Financial Times on Friday July 30 1999. (You can see it at http://twitpic.com/z7y0m – we can’t embed it for technical reasons.) We’re using it here because it’s rather interesting historically – especially for the signatories, and particularly one of them. You’ll notice, of course, that it didn’t work out like they asked…


Sir
The financial and commercial freedom that plc status will confer on the Post Office may be a welcome contribution to national competitiveness. However, one Post Office asset, namely the Postcode Address File (PAF), a computerised and maintained list of all postal delivery addresses and their postcodes, is too important a part of the national information infrastructure to be handed over without safeguards.

At present the Post Office receives new address information from local authorities, attaches additional information to optimise the address for postal delivery and allocates a postcode. This information is compiled into the PAF, which is copyrighted and published by the Post Office. It is also made available through a number of Value Added Resellers (VARs). These data are used by thousands of commercial enterprises, large and small, for the maintenance of customer records and for a wide range of marketing and logistical purposes.

As a public corporation the Post Office has handled its monopoly position, as the national compiler of postal addresses, responsibly. However, some have questioned the price of the information and the control the Post Office has exercised over its reuse and resale. Once the Post Office is a plc, directors tasked with maximising shareholder value could be tempted to extract further advantage from the PAF by restricting competitors’ access to the data, placing constraints on the operations of VARs, or charging royalty payments for the use of addresses in other contexts.

To ensure that the Post Office cannot succumb to such temptations as a plc, we would propose that the production, maintenance and placement of the PAF in the public domain should become a regulatory requirement for the Post Office in exchange for the privilege of retaining monopoly rights for the delivery of letters. This would ensure that the national address file becomes a public good to be used for the benefit of all, rather than an unregulated private asset.

Robert Barr, sr lecturer, school of geography, University of Manchester
Keith Dugmore, managing director, Demographic Decisions
Philip Good, managing director, Hopewiser
Robert James, independent consultant
Vanessa Lawrence, chair, Association for Geographic Information
Christopher Roper, director, Landmark Information Group
Richard Webber, director, Experian

Free Our Data response to the DCLG consultation on OS Free

February 26th, 2010

We’ve written up our response to the DCLG consultation, and emailed it over this afternoon. Make sure you respond too! Deadline is 17 March – a Wednesday, for no obvious reason.

Feel free to crib from this or build on it. Your comments welcome (though of course they’ll only be useful after the fact…)


Response to Ordnance Survey consultation from the Free Our Data campaign, 25 February 2010

Introduction
The Free Our Data campaign was co-founded in March 2006 by Charles Arthur and Michael Cross with the aim of persuading government that non-personal datasets created by government-owned agencies, companies and organisations should be made available for free reuse without licence restrictions.

The rationale for this approach is that citizens have already funded these agencies, and their data collection, through taxes paid over past years. (This includes historical data: Ordnance Survey, for example, has been a trading fund for some time, but on becoming one it immediately drew on data previously collected at public expense.) Furthermore, the private and non-profit sectors can imagine better ways of using data than government can, because they have a direct interest in using it – but price and licensing are significant barriers to the development of those applications.

The campaign is apolitical. It is not aligned, associated with or funded by any political party or outside group; its (very small) costs are paid by the co-founders out of pocket.

We are delighted that government has chosen to accept the campaign’s rationale with its plans to create the OS Free products. Our only caution is that it must ensure that the model used to fund OS Free can be widely applied to other non-personal datasets within government. OS Free should not be a one-off, but the basis for a wider sharing of data.

One final, general point: the Free Our Data campaign believes that Ordnance Survey provides an excellent map-generating service and must remain a government-owned asset whose public task includes the continual mapping of the UK’s geography and built environment. Any moves to privatise any part of its operation would be retrograde and threaten both OS’s future usefulness and the UK’s economy. We would oppose such moves.

Question 1: What are your views or comments on the policy drivers for this consultation?
The need to reduce overt government spending, allied to the growth in personal computing power owned and controlled by the public at large, creates an entirely new opportunity to let citizens analyse, understand and benefit from the data that the government collects on their behalf. This is a two-way process.

Clearly the UK government is rapidly recognising the benefits of transparency – that actions are not just seen to be done, but that the reasoning for the actions can also be interrogated and understood. This is one key policy driver. (Hereafter PD1.)

There is a second policy driver (hereafter PD2): the need to reduce the public sector deficit in coming years. This is best done through a reduction in public spending and an increase in tax revenue from the private sector.

There is also an untapped private-sector entrepreneurial market whose entire existence depends on the successful implementation of this consultation and future ones like it. When government-collected data is treated as a limited asset which must be priced to create an artificial shortage, government constrains the private sector which generated the taxes used by the government to create the data. Clearly, that then constrains the tax base, because not all companies (extant or proposed) can afford to buy the data. Therefore total taxes are lowered by pricing data. This is inefficient, and constrains entrepreneurship based around the effective use of data.

Therefore making government-owned data like this free for reuse (including commercial reuse) will bring in larger tax revenues as long as HMRC is vigilant in collection of owed taxes from individuals and companies.

How the consultation will reduce public sector spending in the context of Ordnance Survey’s financial model (as a trading fund) is less obvious, but the savings are real. The consultation itemises the costs of making these mapping data free. However, it does not itemise the potential benefits: the reduced costs to local councils, police forces and local health authorities, for example, of being able to provide map-linked data on public websites, without paying, at the Landranger and Explorer scales – a consistent bugbear for bodies which want to share their work with the public.

Making such data free also removes the need for legal examination of every instance in which those bodies wish to share their work – a cost which is likewise unpriced in the consultation document. While these savings may not match the millions of pounds of revenue lost from sales of Explorer- and Landranger-scale data, they are culturally significant too, because they enable those bodies to operate more transparently, satisfying PD1 above.

Question 2: What are your views on how the market for geographic information has evolved recently and is likely to develop over the next 5-10 years?
The geographic information market has been completely transformed in the past 10 years by:
- the opening of GPS (Global Positioning System) data to the world by the US military (an excellent example of treating “information as infrastructure”, in which the US government bears the cost of supplying, in effect, location data to non-US-taxpaying people in the UK and elsewhere); and
- the ability to create “crowdsourced” maps, such as OpenStreetMap (OSM), which are accessible via the internet without copyright restriction for consultation, addition or editing.

In the future the GI market will be further transformed by:
- the growing number of smartphones with built-in GPS
- the falling cost of mapping large areas with great precision due to improving satellite photography systems
- mapping information, including road and route data, becoming a commodity, where only value-added forms can be effectively charged for.

Sales of satnav devices provide a clear indication that “knowing where you are” is a key piece of information for people: estimates suggest that between 4 million and 7.5 million such devices have been sold in the past 10 years in the UK alone.

The commoditisation of route and road information, which has previously been supplied only by Ordnance Survey, will continue. If satnav makers decided that the OSM mapping was good enough, and that its pricing model (zero cost) was attractive, they might choose to use its database (or even improve that database while using it) and neglect the OS version. Under the present OS funding model, the only way for OS to recover its costs would be to raise charges to its existing clients, including the public sector – which would not fit PD2.

Therefore it is essential that OS does provide the OS Free data to encourage the growth of the UK geographical information sector, and develops its own high-quality mapping as part of the public task for which it exists.

Question 3: What are your views on the appropriate pricing model for Ordnance Survey products and services?
Given the name of the campaign, our obvious answer would be “all should be free”. But we recognise that there are pragmatic and political problems with this.

The question assumes a great deal about OS products and services, and its charging regime. However, as noted in the consultation, OS has consistently declined to separate out the costs, revenues and profits of its “raw” and “value-added” products and services, which makes it difficult to take anything but a Gordian knot approach to finding appropriate models.

The question would be better framed as “which products and services should OS produce, and what should it charge for, and how should the charging regime be set?”

The more logical approach is to ask what OS’s public task should be, what products and services flow from that, how far those should be self-financing (using, say, a trading fund method) and what other products and services are seen as a public good which should be funded out of general taxation.

OS’s public task is clearly to map the geography of the UK; arguably this also includes the built environment. MasterMap, comprising detailed, large-scale mapping of the UK, provides an appropriate starting point for that public task.

For the moment we find the proposed model – with MasterMap and non-OS Free products’ prices aligned for the public and private sectors – to be equitable. However, as the costs of updating maps and built environment detail fall (due to pervasive GPS feedback systems such as smartphones, and cheaper satellite imagery allied to automated updating of map databases), this may need review to see whether more detailed products closer to MasterMap scale can also be offered free.

Question 4: What are your views and comments on public sector information regulation and policy, and the concepts of public task and good governance as they apply to Ordnance Survey?
PSI regulation and policy suffer from the problem that, where public organisations decline to comply, neither available method of enforcement is satisfactory:
- If OPSI or other organisations demand compliance using non-legal recourse (e.g. asking for “good practice”), the non-complying organisation can ignore it; or
- if OPSI or other organisations seek legal recourse for compliance, the exercise is extremely costly for all concerned and concluded so slowly, due to legal process, that private organisations in particular are at risk of going out of business first. (The instance of Getmapping’s complaint against OS in the early 2000s is illustrative.)

It is absurd that OS has written its own definition of its public task – with or without the consent of its minister in DCLG. With the release of OS Free, it is time for the job of defining OS’s public task, which impinges on huge parts of British life and the economy, to be put in the hands of a body entirely outside OS.

Question 5: What are your views on and comments on the products under consideration for release for free re-use and the rationale for their inclusion?
It is essential that both raster graphics and vector graphics are provided. The former allow easy use on websites to create Google Maps-style interfaces (where the map can be “dragged” to a location); the latter allow dynamic scaling. Though no rationale has been offered for their inclusion, they seem to fit the “mid-scale” requirement.

The inclusion of the Code-Point and Boundary-Line datasets, with licences that allow free reuse (including commercial reuse), is essential to the creation of useful, effective and profit-generating applications.

Question 6: How much do you think government should commit to funding the free product set? How might this be achieved?
This is a key question – and how the government chooses to implement this will demonstrate whether it is truly committed to the idea that “information is infrastructure” by creating a model of funding that will be applicable to other data-collecting trading funds and parts of citizen-funded government, or if it is simply choosing a short-term fix for the problem of the desperate need for free access to OS data.

It is easiest to start by indicating what the government should not do.

- It should not raise prices within government for the non-free OS datasets above those charged to commercial organisations outside government. This would create tensions under which government organisations would naturally seek third-party solutions to reduce their costs (because of PD2). That would undermine income for OS and jeopardise the quality of all its data. In extreme cases, price rises might deter local authorities and other public bodies from using high quality geographic information to deploy their resources more efficiently and end up costing the public purse more in the long term.

- It should not raise prices for commercial organisations above those charged to government for the same datasets. This too will tend to exacerbate any drift to third-party solutions for high-value datasets. (Although it should be expected that these will occur naturally due to new entrants in the market.)

Therefore government should commit to funding exactly the “funding gap” that making the named datasets free will cause – paper maps apart. OS will presumably continue to sell paper maps, and will be able to rely on its brand to benefit from their sales and consequent profits. The Treasury should therefore fund the “gap” in revenues out of general taxation, rather than by levying greater charges for OS data from other public sector sources.

The government’s own argument that “information is infrastructure” should be applied here. Roads, for example, are physical infrastructure. Government sees their provision as a public good and commits to fund their building from general taxation. It does not charge higher road tax prices to government-owned vehicles to offset the fact that government has built the roads and provides free access to them. (Nor is road tax hypothecated towards road-building.)

In the same way, other public sector organisations that use OS data should not be charged more than private sector groups would be, and their payments should not be hypothecated towards any “funding gap”. The amounts being discussed – £19-£24m pa – are comparatively small when set against overall public spending.

The benefits, admittedly, are difficult to enumerate. It is possible that, as with GPS, the benefits will not be immediately visible, and may not appear in the same place as the investment. It would therefore be sensible for government to commission regular studies to evaluate the growth of business predicated on use of the OS Free products.

By adopting a “non-hypothecation” approach to funding OS Free, government will be greatly simplifying the process required for the subsequent release of other datasets from other government-owned bodies. The pressure to release OS Free arose because the trading fund model is too restrictive: it cannot prime the market.

To draw an analogy, the search engine Google could not be profitable if it were to use Microsoft’s Windows to power its multiple thousands of servers that store its index of the internet. It would have to pay a Windows licence on each of those servers, and for each additional one. The cost would outweigh its profits. Instead, Google uses the free Linux operating system for those servers. We are suggesting that using the OS trading fund model for products is akin to licensing Windows: it limits the size of the market for their use, and the speed with which companies can grow while using those products.

Question 7: What are your views on how free data from Ordnance Survey should be delivered?
The key to the datasets being useful will be (a) availability, (b) reliability and (c) accessibility.
Availability: where are the datasets stored? If OS hosts the files, it will need to create an entirely new system to support hundreds or thousands of concurrent accesses. That is inefficient, and outside OS’s remit. It would be more sensible for the datasets to be uploaded to a cloud facility such as Amazon’s S3 storage or Google’s equivalent, from which copies could be downloaded. This is a comparatively low-cost solution in which OS pays only for the storage and bandwidth used, rather than setting up its own hosting service.
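
As a rough illustration of how lightweight access to such a cloud-hosted copy could be – a sketch only, using a hypothetical bucket and object name rather than anything OS has announced – a developer could fetch a dataset with the standard boto3 S3 client:

```python
import boto3

# Hypothetical bucket and object names, for illustration only.
BUCKET = "os-free-datasets"
OBJECT_KEY = "boundary-line/2010-03/full.zip"

# Download a copy of the dataset to the local machine.
s3 = boto3.client("s3")
s3.download_file(BUCKET, OBJECT_KEY, "boundary-line-full.zip")
print("Downloaded boundary-line-full.zip")
```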

Furthermore, it is clear that the datasets will change over time. It would be inefficient to upload a complete set every day, for example. A more effective method would be to upload a “diff” file of differences from the previous full upload at regular intervals (daily, weekly or monthly). This would reduce the download needed to stay up to date, and simultaneously create new opportunities for applications showing what has changed on a map or dataset over time. A full dataset incorporating the diffs since the last full upload could be provided every, say, six months.
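
To illustrate the kind of “diff” release described above – a sketch only, assuming a hypothetical CSV snapshot keyed by a feature ID rather than any actual OS file format – the changes between two snapshots could be computed like this:

```python
import csv

def load_snapshot(path):
    """Read a hypothetical CSV snapshot into a dict keyed by feature ID."""
    with open(path, newline="") as f:
        return {row["id"]: row for row in csv.DictReader(f)}

def diff_snapshots(old_path, new_path):
    """Return the rows added, removed or changed between two snapshots."""
    old, new = load_snapshot(old_path), load_snapshot(new_path)
    added = [new[k] for k in new.keys() - old.keys()]
    removed = [old[k] for k in old.keys() - new.keys()]
    changed = [new[k] for k in old.keys() & new.keys() if old[k] != new[k]]
    return added, removed, changed

# Publish only the weekly changes alongside the occasional full upload.
added, removed, changed = diff_snapshots("snapshot_week01.csv", "snapshot_week02.csv")
print(f"{len(added)} added, {len(removed)} removed, {len(changed)} changed")
```

Users would then apply the published diff to their last full download rather than re-fetching the whole dataset.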

Reliability: so users can be confident that the files come from OS, they should be cryptographically signed.
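
For example – a minimal sketch, assuming OS published an RSA public key in PEM form and a detached signature alongside each file (all filenames here are hypothetical) – a user could verify a download with the open-source Python cryptography library:

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

# Hypothetical filenames: the published signing key, the dataset and its detached signature.
with open("os_free_signing_key.pem", "rb") as f:
    public_key = serialization.load_pem_public_key(f.read())
data = open("boundary-line-full.zip", "rb").read()
signature = open("boundary-line-full.zip.sig", "rb").read()

# Raises cryptography.exceptions.InvalidSignature if the file was tampered with
# or was not signed by the holder of the corresponding private key.
public_key.verify(signature, data, padding.PKCS1v15(), hashes.SHA256())
print("Signature verified: the file is authentic and unmodified.")
```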

Accessibility: the files should be made available in formats that are readable using open-source software; that will ensure they are usable by the widest possible range of users and applications.

Question 8: What are your views on the impact Ordnance Survey Free will have on the market?
Resellers of OS data will not be pleased. But this will force them to focus on value-added services rather than perpetuating a system of copyright restrictions that is not sustainable in the age of the internet.

Some map providers have already cut their prices in anticipation of OS Free. As in Canada (the example cited in the consultation), we should expect mapmakers to take the opportunity to create specialised maps for niche groups; climbers, walkers and other outdoor enthusiasts are likely to be the first to benefit.

The provision of Code-Point will galvanise a market that has been held back by the problems of creating fast, cheap and legal geocode lookups. Although organisations such as Yahoo offer such lookups, relying on them leaves providers dependent on outside groups when they would prefer to do the lookups themselves. Code-Point is an essential part of the package.

The provision of Boundary-Line will be highly important in the forthcoming election. It will also be important for online organisations which depend on mapping electoral constituencies.

Question 9: What are your comments on the proposal for a single National Address Register and suggestions for mechanisms to deliver it?
The absence of a working National Address Register (due mainly to intellectual property claims by publicly owned bodies) has created the absurd situation of the Office for National Statistics being forced to spend millions of pounds creating a one-off register for use in the 2011 census and then discarding it afterwards.

Government should retain the ONS census address register for future reuse and treat it as a resource with huge potential to create value for the economy.

Question 10: What are your views on the options outlined in this consultation?
Option 1 – allowing OS to continue with its planned “hybrid” strategy – is deeply unsatisfactory. The strategy proposal has received no proper oversight; it has not been debated in Parliament; its financial assumptions are at best weak and at worst flawed; and the creation of an “attached” private company that would sell OS-branded goods is anticompetitive, because it offers no transparency on pricing while enjoying sole use of the OS brand.

Option 2 – releasing large-scale data for free reuse – would cross a Rubicon. Although the Free Our Data campaign would support this, we are concerned that government and the Treasury have not shown sufficient commitment to the idea of vote-funded data collection and parsing by OS, and that this strategy could endanger the long-term future of OS. Furthermore, it could undermine would-be commercial competitors, and would create substantial upheaval in the geographic information market. Change is good, but too much change can be unpalatable.

Option 3 – releasing “mid-scale” data as suggested, and considering a transition to further releases – seems to offer a path towards securing the long-term future of OS while providing the opportunity to prove the benefit that freeing data would bring to the private sector, and thus to the Treasury through tax receipts. We find this the most pragmatic approach – but reiterate that the government’s aim should be to pursue a path where it releases data for the use of citizens without cost barriers.

Question 11: For local authorities: What will be the balance of impact of these proposals on your costs and revenues?
N/A.

Question 12: Will these proposals have any impact on race, gender or disability equalities?
We see no impact on those inequalities.

Charles Arthur & Michael Cross, 26 February 2010.

A new No.10 petition: free PostZon

January 28th, 2010

Mark Goodge added this as a comment to the data.gov.uk post, but it seems worth making more visible. So here it is:

“While the launch of data.gov.uk is a big step in the right direction, the government’s response to the petition inspired by the forced closure of ernestmarples.com has been pathetic. As a consequence, I’ve created a new petition which seeks to focus more tightly on the Postzon data (the data used by ernestmarples in their API). This can be found at http://petitions.number10.gov.uk/geopostcode/.”

That’s one that’s definitely worth getting behind. Head over there and …do whatever the verb is for petitioning someone. Is it petition?

Data.gov.uk: now that’s what we call a result

January 25th, 2010

The official launch yesterday of data.gov.uk, with an index of 2,500 datasets provided by government departments, is fantastic news – and a significant milestone for the Free Our Data campaign.

It’s worth remembering how far we’ve come since 9 March 2006, when we kicked off the campaign in Guardian Technology with Give us back our crown jewels:

Imagine you had bought this newspaper for a friend. Imagine you asked them to tell you what’s in the TV listings – and they demanded cash before they would tell you. Outrageous? Certainly. Yet that is what a number of government agencies are doing with the data that we, as taxpayers, pay to have collected on our behalf. You have to pay to get a useful version of that data. Think of Ordnance Survey’s (OS) mapping data: useful to any business that wanted to provide a service in the UK, yet out of reach of startup companies without deep pockets.

This situation prevails across a number of government agencies. Its effects are all bad. It stifles innovation, enterprise and the creativity that should be the lifeblood of new business. And that is why Guardian Technology today launches a campaign – Free Our Data. The aim is simple: to persuade the government to abandon copyright on essential national data, making it freely available to anyone, while keeping the crucial task of collecting that data in the hands of taxpayer-funded agencies.

And further on:

[The consultancy] Pira [carrying out a study for the EU] pointed out that the US’s approach brings enormous economic benefits. The US and EU are comparable in size and population; but while the EU spent €9.5bn (£6.51bn) on gathering public sector data, and collected €68bn selling and licensing it, the US spent €19bn – twice as much – and realised €750bn – over 10 times more. [Peter] Weiss [who wrote a study comparing the US and UK] pointed out: “Governments realise two kinds of financial gain when they drop charges: higher indirect tax revenue from higher sales of the products that incorporate the … information; and higher income tax revenue and lower social welfare payments from net gains in employment.”

Happily, that argument has been driven through Whitehall by the efforts of Tim Berners-Lee and Professor Nigel Shadbolt. I interviewed Berners-Lee for the Guardian: see the video or read my account of how they did it.

So is that it? Is the campaign over? No, not at all. There are plenty of holdouts: the UK Hydrographic Office is complicated (because it buys in third-party data which it then resells), yet even so one would think there is information it collects about British coastal waters which could be released for public benefit.

Similarly postcodes, where there is some notable opposition to making any of the datasets free. The easiest one would be PostZon, which simply holds geolocations for each postcode plus data about which health and administrative boundary it lies inside; that’s nothing like as extensive (or valuable) as the full Postcode Address File (PAF).
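
To illustrate why even that lighter dataset matters – a sketch only, using a hypothetical CSV layout rather than the real PostZon record format – a basic postcode-to-location lookup needs nothing more than this:

```python
import csv

def load_postzon(path):
    """Load a hypothetical PostZon-style CSV with columns:
    postcode, lat, lon, health_authority, local_authority."""
    lookup = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            key = row["postcode"].replace(" ", "").upper()
            lookup[key] = (
                float(row["lat"]), float(row["lon"]),
                row["health_authority"], row["local_authority"],
            )
    return lookup

postcodes = load_postzon("postzon_sample.csv")  # hypothetical filename
lat, lon, health, council = postcodes["SW1A1AA"]
print(f"SW1A 1AA: {lat}, {lon} – {council} ({health})")
```

Anything from “find my nearest hospital” services to planning-alert maps can be built on a table that small, which is why freeing it matters far more than its modest size suggests.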

But there’s really strong resistance against making anything from the Royal Mail available for free, and one detects Lord Mandelson’s hand in this.

If you haven’t yet had your say on the OS consultation, Harry Metcalfe has created a terrific tool for doing precisely that at osconsult.ernestmarples.com. Go along and make your views heard.

How GIS reveals discrimination in urban planning

January 2nd, 2010

Does this sound at all familiar?

Not all local governments appreciate the rise of GIS-driven advocacy, especially when their own data is used as a hammer against them, and they have begun to restrict public access. Some have pulled data off the Web in the alleged interest of national security; others charge exorbitant fees to produce it or deliver jumbled masses of data that are difficult to manage or decipher.

It turns out, though, that it’s not from the UK but the US – from a fascinating article about how GIS helps to demonstrate discrimination being practised by towns and cities, and how, when that is revealed by mapping, the reaction tends to be not to get rid of the discrimination but to get rid of the troublesome access to the data that reveals it. After all, it’s so much cheaper to do the one than the other:

Mebane, the Cedar Grove Institute’s first case study of municipal discrimination, passed an Infrastructure Information Security Policy shortly after the study was published; the policy limited infrastructure data access to qualified engineering firms and town agencies. The city of Modesto, Calif., locked in a legal underbounding battle, pulled its infrastructure data off the Internet after the lawsuit was filed, citing national security grounds. “There’s no conceivable national security interest in where the traffic lights are in Modesto,” scoffs Ben Marsh, the institute’s chief mapmaker. A recent appellate ruling in California rejected a similar national-security rationale, as well as a copyright argument by Santa Clara County, but whether that opinion stands as precedent remains to be seen.

However…

Though restrictions on access to government data could prove troublesome, advocacy groups that use GIS have already been finding data sources outside of government. In particular, data collected by community residents have become an effective supplement to the “official story,” as University of Washington professor Sarah Elwood calls government data.

Elwood has used GIS not only to map problems but to build the capacity of underserved and disadvantaged communities to advocate on their own behalf. Simple walking surveys that catalogue infrastructural deficiencies — potholes in sidewalks, missing stop signs, burned-out streetlights — fill gaps in the public record that mask actual conditions on the ground. With locally produced data, Elwood says, “You can tell a very detailed and very current, compelling story about neighborhood needs.”

If that reminds you at all of fixmystreet, it ought to – that’s precisely the sort of idea it sprang from.

Fun facts from the DCLG / OS consultation

December 29th, 2009

A few things that strike us as we read through the consultation and impact assessment (links in previous posts).

Impact assessment:

Ordnance Survey generates most of its revenue from business and the public sector; in 2008/9 they each accounted for 46 per cent of the organisation’s total revenue. Consumers, through the sale of paper maps in retailing channels, accounted for the remaining 8 per cent of sales.

Impact assessment:

Ordnance Survey generates revenues from its products through licensing arrangements either directly with customers, or indirectly through licensed partners and through retail distributors. The direct customer channel accounts for two-thirds of Ordnance Survey’s trading revenue and includes various collective purchase agreements and major private sector users such as the utility companies. Approximately 25 per cent of Ordnance Survey’s trading revenue is generated through the indirect partner channel.

Impact assessment:

Separately, there are imbalances in Ordnance Survey’s current pricing model which may be causing inefficient allocation of resources. Firstly, Ordnance Survey currently charges private sector customers of its large-scale products significantly more than comparable government customers. The higher prices being paid by the private sector may potentially have restricted consumption to the less price sensitive users, impacting the economic benefit to the economy. Secondly, the payment allocation mechanism employed by government generates a weak price signal to Ordnance Survey from individual government users within the collective agreements.

Now that’s a really interesting one. Private sector pays more than government? I hadn’t heard that before. Payment mechanism generates a weak price signal?

Impact assessment:

[OS] already has a cost reduction programme underway as part of its existing business strategy, but any long-term strategic option would seek to introduce a framework that enhances cost transparency and provides incentives to pursue further efficiency gains.

More as we come across them…

Impact assessment of making OS ‘mid-scale’ data free puts cost at 47m-58m pounds

December 23rd, 2009

What interesting reading the impact assessment of the DCLG consultation on making OS data free is. Clearly some arms have been twisted in the Treasury to make it happen – Liam Byrne, chief secretary to the Treasury, almost surely in the driving seat there.

On the option being chosen (which is explicitly not the one examined in the “Cambridge study”, which looked at the benefits of releasing large-scale data, not the “mid-scale” data being proposed), the upshot seems to be that government costs rise somewhat, while costs to the commercial sector fall.

From the document (on the impact assessment page):

ANNUAL COSTS

Lost OS revenue from OS Free data being made free: £19-24m (govt would fund this on a cost plus basis, amounting to £6-9m).

Increased government charges for large-scale data: £28-34m (price rebalancing based on number of datasets used by public and private sector).

One-off (Transition) Yrs: £ tbc

Average Annual Cost (excluding one-off): £47-58m

Total Cost (PV) £391-482m

Other key non-monetised costs by ‘main affected groups’ Transition costs to Ordnance Survey, government departments and businesses of moving to new model. There would be impacts on third party providers (see Competition Assessment, Annex 1).

ANNUAL BENEFITS

Description and scale of key monetised benefits by ‘main affected groups’: gain to business and consumers from OS large-scale data being made cheaper: £28-34m if assume price rebalancing is revenue neutral.

Gain from OS Free data being made available: £19-24m.

Average Annual Benefit (excluding one-off) £47-58m

Total Benefit (PV) £391-482m

Other key non-monetised benefits by ‘main affected groups’: The lower charges to businesses and consumers for large-scale data, and the free data should increase demand and hence welfare. Entry and innovation should occur in the market for geographical information. These welfare benefits have not been quantified (Pollock report focuses on releasing large-scale data).

And finally:

Key Assumptions/Sensitivities/Risk: Modelling assumptions: some substitution from paid-for to free data; lost revenue by OS due to competition from new derived products. Not yet determined how the revenue shortfall will be covered from government (i.e. who will pay and how). So for now assume no change in demand, but will estimate this for the final IA.

Price Base Year 2009

Time Period Years 10

Net Benefit Range (NPV) -

NET BENEFIT (NPV Best estimate) £0
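
As a quick sanity check on those figures – a sketch only, and the 3.5 per cent discount rate over the stated 10-year period is our assumption, not something the impact assessment spells out – the annual ranges and the present values do line up:

```python
# Check that the quoted line items and present values are internally consistent.
# Assumption (ours, not the document's): a 3.5% annual discount rate over the stated
# 10-year period, with each year's cost discounted from the end of that year.

lost_revenue = (19, 24)        # £m pa: OS Free data being made free
increased_charges = (28, 34)   # £m pa: government charges for large-scale data

annual = tuple(a + b for a, b in zip(lost_revenue, increased_charges))
print("Average annual cost: £%d-%dm" % annual)   # £47-58m, matching the document

discount_factor = sum(1 / 1.035 ** year for year in range(1, 11))
present_value = tuple(round(x * discount_factor) for x in annual)
print("10-year present value: £%d-%dm" % present_value)   # about £391-482m, matching the stated PV
```

The benefits table simply mirrors the costs, which is why the best-estimate net benefit comes out at £0.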

Hurrah! Ordnance Survey consultation is live!

December 23rd, 2009

Thanks to a little bird at an interested organisation, we now know that the DCLG has opened its consultation on OS data.

It’s at http://www.communities.gov.uk/publications/corporate/ordnancesurveyconsultation, where we learn that the closing date is 17 March 2010 (and the opening date is today, 23 December 2009).

Consultation paper on the Government’s proposal to open up Ordnance Survey’s data relating to electoral and local authority boundaries, postcode areas and mid scale mapping information.

The consultation document itself weighs in at 2.2MB of PDF and 91 pages.

As ever, let us know your thoughts.

Update: and don’t miss the Impact Assessment paper – here’s the PDF of the Impact Assessment – which for some strange reason isn’t linked from the main page.

UEA CRU climate data is a free data issue too

December 22nd, 2009

I’ve been researching the apparent hack of the University of East Anglia’s Climate Research Unit (CRU), in which a huge amount of email going back more than a decade, plus a large number of documents, has been released onto the internet – they’re indexed in searchable form on various sites and through Wikileaks, for example.

What I find interesting is some of the discussion around it. There have been multiple freedom of information (FOI) requests to the CRU from people who want to examine the underlying data used to make the analysis about human-driven global warming.

You’d think it would be straightforward. Science operates by data leading to theory leading to prediction leading to test against data, with a parallel process of independent test against the same data. So you’d think that access to the data would be a key thing.

Lots of Freedom of Information requests have thus come into the CRU demanding (that’s the word) the original data used for the papers. But the CRU has turned them down. Why? Because, UEA says, it came from weather organisations which charge for their datasets – and restrict those datasets’ redistribution.

Read for yourself at the CRU Data Availability page:

Since the early 1980s, some NMSs [national meteorological services], other organizations and individual scientists have given or sold us (see Hulme, 1994, for a summary of European data collection efforts) additional data for inclusion in the gridded datasets, often on the understanding that the data are only used for academic purposes with the full permission of the NMSs, organizations and scientists and the original station data are not passed onto third parties.

And:

In some of the examples given, it can be clearly seen that our requests for data from NMSs have always stated that we would not make the data available to third parties. We included such statements as standard from the 1980s, as that is what many NMSs requested.

The inability of some agencies to release climate data held is not uncommon in climate science. The Dutch Met Service (KNMI) run the European Climate Assessment and Dataset (ECA&D, http://eca.knmi.nl/) project. They are able to use much data in their numerous analyses, but they cannot make all the original daily station temperature and precipitation series available because of restrictions imposed by some of the data providers

CRU insists it wants to make the data available:

We receive numerous requests for these station data (not just monthly temperature averages, but precipitation totals and pressure averages as well). Requests come from a variety of sources, often for an individual station or all the stations in a region or a country. Sometimes these come because the data cannot be obtained locally or the requester does not have the resources to pay for what some NMSs charge for the data. These data are not ours to provide without the full permission of the relevant NMSs, organizations and scientists. We point enquirers to the GHCN web site. We hope in the future that we may be able to provide these data, jointly with the UK Met Office Hadley Centre, subject to obtaining consent for making them available from the rights holders. In developing gridded temperature datasets it is important to use as much station data as possible to fully characterise global- and regional-scale changes. Hence, restricting the grids to only including station data that can be freely exchanged would be detrimental to the gridded products in some parts of the world.

The problem arises because the centre has been running in this way since the 1980s – before the internet reached even most universities, and when the culture of “pay for data” (because it was so hard to acquire, and so jealously guarded) was much more ingrained.

But it is a problem that needs to be overcome. The CRU now has all sorts of PR difficulties because it hasn’t grasped this nettle – and it needs to, so that it can finally put questions about its research behind it. There are people who aren’t satisfied at being told that the data needed to investigate a scientific paper can’t be passed on because of long-lost contracts. (Nor would we be very impressed if we were told that.)

Paying for public data: it’s never a good idea. Especially when it creates problems like this.

Consultation update: still invisible, but asked in Parliament

December 18th, 2009

Ordnance Survey says it’s the Department for Communities and Local Government that’s in charge of the consultation over making its data free…

According to this Parliamentary answer, DCLG thinks so too:

The question:

Mark Field (Cities of London & Westminster, Conservative)

To ask the Secretary of State for Communities and Local Government with reference to the announcement of 17 November 2009 on the Making Public Data Public initiative, when he expects to begin the consultation regarding access to Ordnance Survey data.

The answer from the DCLG minister responsible:

Ian Austin (Minister of State (the West Midlands), Regional Affairs; Dudley North, Labour)

We expect the consultation to be launched during the week beginning 14 December 2009.

That’s this week. This week is almost over. What, it takes a week to launch a consultation? There are international experts who can do it quicker. Meanwhile I tried phoning the DCLG press office (no reply on multiple lines) and emailing it (no response).

Helluva way to organise a consultation.

Anyone seen a consultation?

December 18th, 2009

The Department for Communities and Local Government is, apparently, in charge of the consultation over making OS data free.

The plan was of course that the consultation would begin “in December”.

December’s here and we haven’t seen much sign of it. Has anyone else? We’ve put a call in to find out…

(If you need reminding about the case for making data free, see our other pages on the site, such as the articles page or the “Case for free“. Perhaps we do need to update them in the light of the studies of the past couple of years…)

Postcodes to be free? But which ones?

December 9th, 2009

The BBC has a piece saying that “postcodes” will be free from 2010:

Currently organisations that want access to datasets that tie postcodes to physical locations cannot do so without incurring a charge.

Following a brief consultation, the postcode information is set to be freed in April 2010.

….

The dataset that is likely to be freed is that which ties postcodes to geographic locations. Many more commercial organisations use the Postcode Address File (PAF) that ties post codes to addresses. Currently access to either data set incurs a charge.

In October 2009 the Royal Mail took legal action that cut off the access many websites had to PAF data.

(You might remember that one.)

Sites that used the postcode feed included Job Centre Pro Plus, HealthWare (locates nearby pharmacies and hospitals), Planning alerts.com (monitors planning applications), Straight Choice (finds out who sent political leaflets).

That’s quickly contradicted, however, by an email noted by Steve Feldman, sent round by Giles Finnemore, Head of Marketing at Royal Mail’s Address Management Unit:

You may be aware of a story on the BBC website today that Government is planning to give anyone free access to postcode data.

Access to postcodes is already, and will continue to be, free to every citizen via www.royalmail.com/postcodes4free.

(Which is true, but nonsensical: the postcodes4free page requires registration and will only give you a limited number of postcode lookups in any 24-hour period. Which, if you think about it, is absurd: why would Royal Mail want to make it difficult to address letters? You need an address list if you want to generate postcodes – and if you didn’t already have the postcodes, where did you get the addresses?)

For the avoidance of doubt PAF(r), the Postcode Address File, remains the intellectual property of Royal Mail and is supplied and used under licence. The new and recently published licences come into effect from April 2010. There are no plans for that to change.

Maintaining a world class postal address file requires significant ongoing investment and it is right that organisations who obtain value from using the file pay to do so.

We are aware of no plans for Government to pay Royal Mail for businesses and organisations to use our address file.

And it’s also contradicted by the Royal Mail’s press release page, which at present (December 9) has nothing about postcodes.

However, it may well be that the PostZon file – or more precisely, the long/lat lookups for every postcode – will be available for free next April.

Daily Telegraph: making stuff free can create revenues

December 9th, 2009

Hey, look, even the Daily Telegraph – hardly a home of the idea of the free lunch – is reprinting Breaking Views pieces which point out that making data free brings bigger benefits.

UK map giveaway throws bread upon the waters pretty much sings from our songbook:

The Met Office and the Ordnance Survey are unlikely candidates to stimulate another revolution. The weather forecasts may be accurate (sometimes) and the maps beautiful, but as businesses, neither is going anywhere. This is no surprise, since neither is really suited to becoming a proper commercial enterprise.

Yet the data they own is, literally, invaluable. Made freely available, all sorts of would-be entrepreneurs could exploit it to build businesses beyond the dreams of the public sector. The slightly geeky approach needed to be a successful internet entrepreneur is commonplace among mapaholics and weather nuts. Given the raw material, they could make a thousand businesses bloom.

The proposal unveiled this week is vague – a consultation document is promised later this month. The ability of the civil servants to emasculate any good idea should never be underestimated. But this is one whose time has come.

Given their tiny profits, selling off the Ordnance Survey and Met Office would raise minimal amounts. Giving away the data will undermine profits, but the benefits in terms of corporate taxes should be much larger.

Thanks. We knew you couldn’t keep a good idea down.

Outrageous. Incredible. International expert was spoken word only even within OS

November 27th, 2009

Let’s just remind ourselves what it was that Sir Rob Margetts, chair of Ordnance Survey, said at the launch of OS’s proposed new strategy (which is now in little pieces all over the floor since Gordon Brown and Tim Berners-Lee announced the end of derived data and the freeing up of mid-scale mapping, but anyway) back in April:

“We came to conclusion that the cost to government in the first five years [of a free data model] would be between £500m and £1 billion. That wasn’t the only reason that we discarded it. We did, with outside help, a review of equivalent organisations around the world.“

Who, I then asked, was the “outside help”? OS responded:

With regard to the International Comparison of Geographical Information Trading Models Study, outside help was provided by senior officials of those Institutions contacted.

In the case of the United States of America, as senior officials of the United States Geological Survey (USGS) were unavailable, Mr. David Cowen, Distinguished Professor Emeritus at the University of South Carolina, kindly provided us with an in-depth overview of the state of public sector GI data in the United States, including USGS. Mr Cowen is a former chair of the Mapping Science Committee of the United States National Research Council and is chair of the National Research Council’s Committee for the study of Land Parcel Databases.

The document was also reviewed by an internationally recognised expert in Geographical Information and National Mapping who agreed with the analysis and conclusions.

This latter bit intrigued us. An “internationally recognised expert”, eh? Except it turned out that he or she did not want to be identified, although he or she works or has worked full-time for a foreign mapping agency, and read the study for free. And that OS transacted everything with the expert by spoken word:

A copy of the report was provided to the person concerned and engagement on this matter was conducted orally with no permanent record made of these conversations.

And now in response to my latest Freedom of Information request for

copies of all emails and/or documents internally relating to the decision to choose this person – for example, discussion of who would be suitable candidates or who would not be suitable candidates to carry out the review of the report

OS replies:

There was no decision process in place to find suitable candidates. An opportunity presented itself to request the opinion of a global expert in this field which was undertaken orally. The resultant opinion was expressed orally and there was no permanent record made of these conversations.

So here’s what happens. You have a report. You happen to bump into an old mate. “Hey, want to read my report?” you say. “Sure,” they say. They read it. “Seems OK,” they say. You go back to your office and tell people “I met X who says it’s fine.” Even though the report is a thrown-together farrago of disconnected information about various national mapping agencies and their charging methods, combined with an unrelated chunk of poorly displayed data about national GDP versus national R&D expenditure, which cannot by any reasonable measure be claimed to justify anything about any charging model.

This then becomes “The document was also reviewed by an internationally recognised expert in Geographical Information and National Mapping who agreed with the analysis and conclusions.”

If there is anyone at Ordnance Survey who is prepared to defend this course of events, could they please get in touch? Or even the international expert, who is very welcome to comment anonymously to explain whether they think OS’s representation of their opinion is justified. Comments are open.