Guest Blog: The ODI’s first incubator business on pioneering open data

We live in a world which is increasingly defined by data. My company, Mastodon C, are specialists in Hadoop (open source data technology) and data science (fancy name for finding insights in the data, sometimes using fancy maths). It’s a great time to be around for data lovers like us: all the problems just got very interesting. The software and hardware tools, the raw data, and a wider world that realises the value of it all are finally coming together in a very exciting way. Also, the Harvard Business Review just said I have the sexiest job of the 21st century. So what’s not to like?

But why bother to make data open?

There are a bunch of political and philosophical reasons, but there is also one very big and very concrete reason, speaking as someone who tries to extract insight from data for a living: it’s hard to know, when you first get a dataset, what the value of it might be. There’s always a period of experimentation, in which you try looking at it from different angles, matching it up with some different things, and analysing it using different tools. Sometimes the most exciting, glamorous dataset turns out to be relatively boring because there’s nothing you can find beneath the surface. On the other hand, dull and straightforward data can become something special when looked at through a novel analytical lens, understood by a domain expert, or combined with some other dataset which adds new layers.

If data is closed, that experimentation period can be sadly truncated. Only a couple of individuals inside an organisation’s walls get to do any of that experimentation, and the data only gets to be combined with other data and domain knowledge that are also already inside those walls. In a closed world, most data never encounters the analyst and the context that will really make it live and breathe.

We got a nice example of this recently while tinkering with a large set of open government data. Our initial intent was pretty idle: I wanted to stress-test our technology platform by throwing a big set of data at it and doing some big calculations at random. But by combining our data-crunching skills with interesting questions and insights from some domain experts about how this particular part of government works, we’re now looking at mapping and identifying government savings opportunities on a huge scale. The exact details are under wraps for the time being, while we cross-check everything, but it looks like we’ve come across something genuinely very valuable.

If the data had not been open, we as a company would never have been able to even start experimenting and learning about it, and we’d never have had the chance to build up its value and then give that back to users. We’re now looking at turning this into some real and very interesting products, all because the openness of the data made the friction low enough to try some new approaches out.

I think this shows the very real value of open data, and also shows why we’re very pleased and proud to be the Open Data Institute’s first incubation company.