A year in open data: How Visualising Rail Disruption helped commuters avoid delays

With commuters across Britain complaining of poor morning rail service, the ODI and startup Fasteroute decided to investigate. They found that by making small changes to their travel patterns, people could avoid up to a day of delays per year.

By Stefan Janusz

This is a train station. Even if these people get on a train, it will be expensive and probably late. Credit: CC BY-SA 3.0 Colin/Wikimedia

On 4 December 2015, the train operating companies that run railway services in England, Scotland and Wales announced their annual fare increases. The extra cost for season ticket holders was more modest than in previous years, at around 1% on average, but this above-inflation rise predictably sparked a row over what many felt was substandard service delivery, with delays, cancellations and poor communication to customers cited by commuters across the UK. The trouble is that, like all infrastructure, rail travel is boring and unmemorable when it works well, and only tends to be noticed when it doesn’t. It wasn’t possible to tell quite how much of that grumbling was ‘anecdata’ – an impression of the service level of a year’s worth of travel skewed by subjective experience – and how much it reflected reality.

Travel apps are sometimes thought of as something of a cliché in open data: there’s a lot of open data on travel out there, and commuting by public transport, especially by rail, is how some 13 million people get to a place of work in the morning. Cue a proliferation of apps like Citymapper, which helps travellers in familiar or unfamiliar cities get from one side of town to the other in the most time-efficient way. So when startup Fasteroute applied to the ODI’s summer showcase to build a tool that would help commuters travel smarter, they had to set themselves apart from being ‘just another travel app’. Their idea was to aggregate real-time, open train arrival data to provide an indication of a given service’s punctuality based on its historical performance. The project was carried out under the name Visualising Rail Disruption, and the team behind the project was given communications support and guidance from teams at the ODI.

Apps like Citymapper provide users with travel routes calculated from timetable data, open map data and other factors, such as an average walking speed. What they don’t do is tell you how reliable a service is. The DelayExplorer that Fasteroute developed gives an indication of reliability based on past performance over a rolling period of the previous eight weeks. This is useful both in the case of a regular commute, where the impact in terms of time wasted on delayed trains accrued over a year can be significant, and for a journey that is unfamiliar to the traveller, especially if the suggested route involves multiple stages with short transfer windows between public transport services.

How does DelayExplorer work?

To explore whether historical train arrival data supports commuters’ anecdotal grumbles, the ODI worked with Fasteroute to analyse the punctuality performance of trains arriving into major regional commuter hubs in England, Scotland and Wales. Fasteroute’s visual DelayExplorer gives an indication of punctuality performance by first categorising each train’s degree of lateness using a system of ‘time-bins’. These run from ‘on time’ (less than a minute late) up to 30 minutes late or more, which for the purposes of this analysis was considered the same as being cancelled. The data is plotted as a simple bar chart, in which a traveller can see how delayed their chosen service has been over the past eight weeks.
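As a rough illustration of the time-bin idea, the sketch below sorts arrival delays into bins and counts them, ready for a bar chart. It is not Fasteroute’s code. Only the ‘under a minute’ bin and the 30-minute cut-off come from the description above; the intermediate bin edges, the labels and the function names are assumptions made for the example.

```python
# Hypothetical bin edges in seconds. Only the first bin (under 1 minute) and
# the 30-minute "treated as cancelled" cut-off come from the article; the
# edges in between are assumptions for illustration.
BIN_EDGES = [
    (60, "on time (under 1 min)"),
    (5 * 60, "1-5 min late"),
    (10 * 60, "5-10 min late"),
    (30 * 60, "10-30 min late"),
]
CANCELLED = "30+ min late or cancelled"


def time_bin(delay_seconds):
    """Return the time-bin label for one arrival; None means cancelled."""
    if delay_seconds is None:
        return CANCELLED
    for upper, label in BIN_EDGES:
        if delay_seconds < upper:
            return label
    return CANCELLED


def bin_counts(delays):
    """Count arrivals per time-bin, in bin order."""
    counts = {label: 0 for _, label in BIN_EDGES}
    counts[CANCELLED] = 0
    for delay in delays:
        counts[time_bin(delay)] += 1
    return counts
```

Early arrivals (negative delays) fall into the first bin here, which is also a simplifying assumption.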

However, to find out the annual accrued delays an individual traveller could incur by travelling to commuter hubs at different times in the morning, we needed to find out not only how many trains fell into each time-bin, but also how big a contribution those services made to overall lateness. This is important because a train that is 4 minutes 59 seconds late may have a bigger impact, through its knock-on effects on a multi-stage commute, than a train that is just 1 minute late.

We first split the morning commuter period of 7–10am into 30-minute windows: 7–7:30am, 7:30–8am, and so on. Knowing the total lateness in seconds and the total number of trains counted in each lateness time-bin, we could calculate both the mean lateness of trains in that category and the average accrued hours of lateness that a traveller using services arriving during that half-hour window would incur over a year. It was possible to see, for each city included in the analysis, how the pattern of lateness shifted throughout the morning, with degrees of lateness shifting between the time-bins. We found that, as the morning progressed, services were more likely to be late and prone to being cancelled – very likely because earlier delays have a knock-on effect throughout the morning. By travelling half an hour earlier, commuters could save half a working day’s worth of delays per year; up to a full working day if they were prepared to get into work before 8am, instead of for 9am.
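The arithmetic behind the half-hour windows can be sketched in a few lines. This is an illustration rather than the spreadsheet used in the analysis: the record format (arrival time plus delay in seconds), the function names and the figure of 230 commuting days per year are all assumptions.

```python
from collections import defaultdict
from datetime import time

COMMUTING_DAYS_PER_YEAR = 230  # assumption, not a figure from the analysis


def window_start(arrival):
    """Map an arrival time to the start of its 30-minute window."""
    return time(arrival.hour, 0 if arrival.minute < 30 else 30)


def mean_delay_by_window(records):
    """records: iterable of (arrival time, delay in seconds).

    Returns {window start: mean delay in seconds} for arrivals
    from 7am up to, but not including, 10am.
    """
    totals = defaultdict(lambda: [0, 0])  # window -> [total seconds, train count]
    for arrival, delay in records:
        if time(7, 0) <= arrival < time(10, 0):
            entry = totals[window_start(arrival)]
            entry[0] += delay
            entry[1] += 1
    return {w: total / count for w, (total, count) in totals.items()}


def annual_hours(mean_delay_seconds, days=COMMUTING_DAYS_PER_YEAR):
    """Accrued delay, in hours, over a year of one-way morning commutes."""
    return mean_delay_seconds * days / 3600
```

Comparing `annual_hours` for two adjacent windows gives the kind of ‘travel half an hour earlier’ saving described above, under whatever assumption is made about commuting days.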

The pattern was different for different cities; so much so that averaging the lateness trend throughout commuting hours across them all tended to obscure any meaningful result. We decided to focus on what travelling into commuter hubs meant for people across the country.

The analysis was carried out using a Google sheet, which was shared with journalists in the national and regional press. This allowed the journalists to not only verify the headline figures and trends highlighted in the press release prepared for the purpose, but to also investigate the data for themselves.

Who did the story reach?

The story was reported in 11 regional newspapers, the national Daily Mirror, websites including ITV News, and BBC Radio Oxford.

The analysis did support many commuters’ experience that delayed trains across a year of travel were having a significant impact; this despite the claims from many train operating companies that their trains were ‘on time’ more than 92.5% of the time. The reason train companies’ punctuality figures were so high is that services are normally counted as being ‘on time’ if they are no more than 5 minutes late (or 10 minutes late for long distance services), whereas our own analysis considered any delay of over 1 minute to be a delay – 5 or 10 minutes could mean a missed connection by rail or bus, and could mean late arrival to a meeting.
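To see how much the choice of threshold matters, the sketch below computes an ‘on time’ share for the same set of arrivals under a 5-minute and a 1-minute cut-off. The delay values are made up for illustration and are not taken from the analysis; the function name and the choice to treat the threshold itself as late are likewise assumptions.

```python
def on_time_share(delays_seconds, threshold_seconds):
    """Fraction of arrivals whose delay is below the threshold."""
    if not delays_seconds:
        raise ValueError("no arrivals to assess")
    on_time = sum(1 for d in delays_seconds if d < threshold_seconds)
    return on_time / len(delays_seconds)


# Made-up delays in seconds, for illustration only.
sample = [0, 30, 90, 150, 240, 290, 400, 45, 10, 700]

five_minute_share = on_time_share(sample, 5 * 60)  # operator-style threshold
one_minute_share = on_time_share(sample, 60)       # stricter one-minute threshold
```

The same trains produce a much lower ‘on time’ share once the threshold drops to one minute, which is the effect described above.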

On 9 January, Network Rail released their own ‘right-time performance’ analysis, which was reported in the Guardian. While their analysis focused on train operating companies rather than cities, their findings supported our own: service punctuality was much lower than previous official figures had indicated, because those figures had counted a service anywhere up to ten minutes late as being ‘on time’.

On 14 January, BBC television’s local news for London reported that Hertfordshire council intended to take Govia Thameslink, who run services between St Albans and London, to task for a lack of improvement since they took over the franchise a year ago. With renewed calls to bring the network back under public ownership, one thing commuters can be sure of is that open travel data can help them travel smarter, choosing services that historical performance tells us are less likely to be delayed, and find more time to do the things they want to do, instead of being stuck on a train.

If you have ideas or experience in open data that you'd like to share, pitch us a blog or tweet us at @ODIHQ.