COVID-19: Borough Challenges with PHE and NHS datasets


Data plays a huge role in boroughs’ work to support their residents during the pandemic.

London borough data teams are supporting public health and frontline colleagues to understand the impact of COVID-19 in their area and assisting the national contact tracing system through local contact tracing efforts.

There are three different pools of data that play a role in this work.

  1. Councils access the Public Health England (PHE) line list data for the latest infection and outbreak details in their area, helping them to understand hotspots and clusters and to deliver interventions for specific communities to reduce transmission.
  2. NHS Test and Trace data is provided to councils so that they can contact people who may have been exposed to the virus and whom the national team has not been able to reach within 24 hours of reporting.
  3. Meanwhile, the Shielding Patients List is now published by the NHS. Boroughs access this data to understand which residents might need additional support.

Snapshot from the public Coronavirus Dashboard – https://coronavirus.data.gov.uk/

Data Issues

Unfortunately, our networks of LOTI data analysts and managers have reported a number of serious challenges they face when trying to access and interpret these data sets, which are inhibiting the speed at which they can create insights and support residents in need.

LOTI has collated all these concerns into a document which we have shared with PHE and NHS Digital. While that document lists specific challenges with each data set, there are a number of challenges that are common across all of them, highlighted below.

Lack of Automation

Access to each data set currently requires someone to log in manually and pull new data when it is published. This increases the time required to get data ready for analysis in what is an extremely time-critical activity. The manual nature of the process also increases the risk of error, as data must be downloaded into Excel or CSV format by hand.
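To illustrate what an automated pull could look like, here is a minimal Python sketch. It assumes the publisher offered a machine-readable CSV download, which is not the case today. The function name `pull_if_changed`, the stand-in `sample_fetch` and the column names are all hypothetical and are not taken from any PHE or NHS system.

```python
import csv
import hashlib
import io
from typing import Callable, Dict, List, Optional, Tuple


def pull_if_changed(
    fetch: Callable[[], bytes],
    last_digest: Optional[str],
) -> Tuple[Optional[List[Dict[str, str]]], str]:
    """Fetch the latest extract and parse it only if its content has changed.

    `fetch` is any function returning the raw CSV bytes (hypothetical: in
    practice it would wrap an authenticated download from the publisher).
    Returns (rows, digest); rows is None when nothing has changed.
    """
    raw = fetch()
    digest = hashlib.sha256(raw).hexdigest()
    if digest == last_digest:
        return None, digest  # same file as last time, nothing to reprocess
    text = raw.decode("utf-8-sig")  # also tolerates a byte-order mark from Excel exports
    rows = list(csv.DictReader(io.StringIO(text)))
    return rows, digest


def sample_fetch() -> bytes:
    # Stand-in for a real download; the columns here are invented for illustration.
    return b"area_code,new_cases\nE09000001,3\nE09000002,5\n"


rows, digest = pull_if_changed(sample_fetch, last_digest=None)
unchanged, _ = pull_if_changed(sample_fetch, last_digest=digest)
```

Run on a schedule, something like this would remove the manual download step and would skip reprocessing when the published file has not changed since the last pull.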

Lack of Metadata

Published data is not always accompanied by good documentation and metadata to support analysts in interpreting the data sets and combining them with other data sources. Missing or inadequate metadata reduces trust and can lead to errors.
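As a rough illustration of what machine-readable metadata would make possible, the Python sketch below compares the columns in a downloaded file with a data dictionary. The dictionary contents, the column names and the `check_columns` helper are invented for this example and do not describe any actual PHE or NHS data set.

```python
from typing import Dict, Iterable, List

# Hypothetical data dictionary: column name -> plain-English description.
DATA_DICTIONARY: Dict[str, str] = {
    "area_code": "Code identifying the borough",
    "specimen_date": "Date the test sample was taken",
    "new_cases": "Count of newly confirmed cases",
}


def check_columns(
    columns: Iterable[str], dictionary: Dict[str, str]
) -> Dict[str, List[str]]:
    """Report columns with no documentation and documented columns that are absent."""
    cols = list(columns)
    return {
        "undocumented": [c for c in cols if c not in dictionary],
        "missing": [c for c in dictionary if c not in cols],
    }


report = check_columns(["area_code", "new_cases", "lab_id"], DATA_DICTIONARY)
```

A check like this flags a changed or undocumented column at the point of download, before it can cause an error in later analysis.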

Data Quality

Issues with data quality are apparent in all the published data sets. While we recognise that there are numerous upstream challenges affecting quality for the data publishers, incorrect and inconsistent records combined with inadequate metadata make it difficult to interpret the data and could lead to individual cases or contacts being missed.

Where to start

With infection rates continuing to rise in London, addressing these challenges quickly would significantly aid boroughs in their work to tackle the virus in their areas.

We recommend that this starts with enabling the automated pulling of data to save time and reduce the risk of error. This should then be supported by improved metadata.

If you’ve experienced any additional challenges when accessing any of the COVID-19 related data sets, please add these as comments to the Google Doc.

And if any colleagues working at Public Health England or NHS Digital read this blog, we’d love to discuss these issues with you.

Covid-19 Data
by Jay Saggar
16 October 2020