Guest Post: Finding stories in Stamford Hill's open data

In planning practice, the ability to acquire a true and subtle understanding of place is critical. ‘Open’ and ‘big’ data offer the tantalising prospect of gaining valuable insights into the places where we work. However, large data sets sometimes seem impenetrable. We were therefore drawn to Finding Stories in Open Data, a course run by the Open Data Institute and led by Ulrich Atz, to equip ourselves to address this problem.

Considering the role of data, professionals will be reassured by Atz’s affirmation of the continuing need for ‘theory’, based on empirical research and practical experience, in developing policy responses to real-world problems. He rejected claims made by Google, for example, that big data ‘removes the need’ for theories to explain phenomena such as public health scares. (The company once asserted that it could foresee the timing and location of flu outbreaks from activity on its search engine, thus ‘removing the need’ for theories to explain how and when such outbreaks take place. While apparently compelling, the relationships turned out to be illusory.) Similarly, city planners working with a ‘dashboard’ of indicators populated by big data should keep the academic basis of their professional practice in place and up to date when conceiving policy.

The recently published book ‘The Responsive City’, by Stephen Goldsmith, picks up on some of these themes. In it, he cites the example of New York, which started gathering data online from citizens about damage to public property and maintenance issues. These reports appeared as ‘yellow dots’ on a digital map. A substantial dataset quickly came into existence that alerted city managers not only to individual cases but also to clusters of problems. Based on this data, more efficient maintenance regimes, security and other preventative measures were devised. It would also be tempting to use it to draw conclusions about the relationship between patterns of vandalism and, for example, the location of fast food outlets and liquor stores. But findings such as these should qualify, rather than replace, research-based policy approaches that seek to identify cause rather than just correlation.

In the realm of data, the dataset developed by New York is relatively small. City managers have speculated about the effects of truly big data, for example data on consumer behaviour gathered from supermarket loyalty card schemes, patterns of activity from smart phone use and information stored in increasingly sophisticated computing technology in cars. Atz presented a Venn diagram of the three ‘types’ of data – big, open and personal. In the dashboard envisaged earlier, the overlap between big and open data holds out the opportunity for rich datasets to populate a range of indicators that could help policymakers develop ideas across the policy spectrum.

Following the event, we experimented with some of the methods Atz taught. Applying the guidelines from the course to a project we are currently working on in Stamford Hill, North London, we took a second look at a dataset published by the Office for National Statistics titled ‘Birth and Death Rates, Ward’. We wanted to understand how the population in the area may have changed, or may be changing, and how this compares with other geographic scales. The dataset covers the General Fertility Rates (GFRs) in all the wards across London’s 32 boroughs, as well as the average GFRs within the boroughs and in London as a whole. It covers the years 2002 to 2013, though as a simple exercise we looked at 2003 and 2013 to see if we could tell a story across a span of 10 years.

Our goal was to find out whether the combined wards that roughly make up Stamford Hill (Cazenove, Springfield, Lordship and New River) follow a similar trajectory to the rest of Hackney and London in terms of the number of live births per 1,000 women of reproductive age. After using a pivot table to filter the data to the relevant wards and summarise the 2003 and 2013 value fields by average, we learned that the average GFR for the four wards was 106.8 in 2003 and 109 in 2013. By comparison, Hackney’s average GFR was 75 in 2003 and 71.1 in 2013, and London’s was lower still at 60.6 in 2003 and 64 in 2013.
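
For readers who would rather script this step than use a spreadsheet, the filter-and-average exercise can be sketched in a few lines of Python. This is only an illustration, not the method we used: the row layout (a "ward" key plus one key per year) and the helper names `mean_gfr` and `gfr_change` are our own assumptions and do not reflect the actual column names in the ONS file. The figures in `REPORTED` are the averages quoted above.

```python
from statistics import mean

# Wards the article treats as roughly making up Stamford Hill.
STAMFORD_HILL_WARDS = {"Cazenove", "Springfield", "Lordship", "New River"}


def mean_gfr(rows, wards, year):
    """Average the GFR for the given wards in one year.

    `rows` is assumed to be a list of dicts with a "ward" key and one key
    per year (e.g. "2003"). This layout is hypothetical, not the ONS schema.
    """
    values = [float(row[year]) for row in rows if row["ward"] in wards]
    return mean(values)


def gfr_change(start, end):
    """Return (absolute change, percentage change), each rounded to 1 d.p."""
    absolute = end - start
    return round(absolute, 1), round(100 * absolute / start, 1)


# Average GFRs for 2003 and 2013 as reported in the text.
REPORTED = {
    "Stamford Hill wards": (106.8, 109.0),
    "Hackney": (75.0, 71.1),
    "London": (60.6, 64.0),
}

changes = {area: gfr_change(start, end) for area, (start, end) in REPORTED.items()}

if __name__ == "__main__":
    for area, (absolute, percent) in changes.items():
        print(f"{area}: {absolute:+.1f} births per 1,000 women ({percent:+.1f}%)")
```

Given rows loaded from the dataset (for instance with `csv.DictReader`), `mean_gfr(rows, STAMFORD_HILL_WARDS, "2003")` would play the role of the pivot table's filter and average, while `gfr_change` puts the three areas on a comparable footing.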

While Hackney’s fertility rate has dropped, the GFR in the wards that form Stamford Hill has risen from an already high level, indicating an area with large family sizes and an ever-growing population.

Overall, the day with the ODI provided a framework that will enable us to explore big data and, in respect of our planning work, allow us to produce more subtle and compelling evidence to support policy development.

Ivan Tennant is the founder and principal of Plan Projects.

Our Finding Stories in Open Data course runs throughout the year. More about this and our other courses can be found here.