What does the Data for London Library teach us about data as infrastructure?
In July this year, the Greater London Authority (GLA) launched the Data for London Library, the new front door to the city’s growing and diverse data ecosystem. It lets users search city data in one place, making it quicker, easier and more collaborative to find, share and use London’s data to improve the city and benefit Londoners. We have released it as a beta so we can continue to improve and expand it as you use it.
The Data for London Library is a significant step forward in the Data for London programme, which aims to fix the plumbing for data sharing. At launch, four boroughs, two central government departments, and two local delivery partners added their data catalogues to that of the London Datastore. This more than tripled the number of datasets available overnight and raised the visibility of data across London.
The Data for London Library allows us to make better use of data to solve city problems, ranging from education, employment services and social care through to energy planning and housing delivery. It reduces the money wasted on re-collecting data that already exists, and improves our ability to consider all Londoners when we make decisions.
The process of researching, building, and releasing the Library has taught us a lot about the importance of data as infrastructure, and what value it brings to a modern city.
Data gains value when shared
Data as “the new oil” is a bad metaphor. Oil is used up as it is divided: half a barrel is worth half the full barrel. Data is not depleted by sharing, and can gain value when shared. The Data for London Library isn’t about technology for its own sake. It’s about practical value. London’s strength lies in collaboration. Across 32 boroughs and the City of London, world-class research institutions, and a vibrant voluntary and community sector, we see innovation every day. There are many examples where members of the London Office of Technology and Innovation (LOTI) have shown that Londoners’ lives can be improved by working together, often enabled by sharing data between us.
- In one past project, boroughs, central government, and civil society shared data to fuel the Strategic Insights Tool for Rough Sleeping. With this tool, we can help people at risk of sleeping rough at a much earlier stage and look for prevention strategies.
- The London Innovation and Improvement Alliance shares data on children’s services and special educational needs to make sure children get the best possible education and care. This allows local government to commission and design the right services for their area.
- Sub-regional partnerships all over London are gathering data for effective Local Area Energy Planning. Having the right energy infrastructure is critical for planning the future of housing and commercial developments locally, and makes sure we make the right investments at the right time.
London is a vibrant ecosystem of partners, many of which already collaborate effectively. The new Library connects existing efforts and helps people work together more easily. Using the Data for London Library, users are finding new partnerships that benefit Londoners day-to-day.
Data discovery matters
London clearly has an open, collaborative spirit that proves how data gains value when shared. However, London is also a complex data ecosystem, with a lot of data owners and existing agreements. This city can be daunting and opaque for new entrants and innovators. The Library allows anyone, including those new to the city’s data ecosystem, to search and find existing datasets. This is useful in a few ways.
Firstly, it avoids duplication: we know of many occasions when data was re-collected by different agencies because they did not know it already existed. That is a waste of money and effort which can be avoided. The more we share our metadata, the more likely it is that others can find what we have. Using the Data for London Library, analysts can establish whether a dataset exists before spending money to collect it again. This means we spend money more efficiently and can ensure more benefits make it to Londoners.
Secondly, it enables new partnerships to be built. Each problem is unique, and requires its own combination of data to solve. In London, there are some very large data owners – many will think of boroughs, transport, police, health. However, local problems often require local partnerships and collaboration with delivery partners. Those partners may be small organisations, civil society, or local businesses. Making it easier to find and work with these partners benefits Londoners: more voices are represented in projects, and groups who don’t show up in traditional datasets – because they are digitally excluded, or don’t engage with public services, or for any other reason – are more effectively considered.
Open data is more than spreadsheets
London has a proud history of open data. The Mayor set a high standard for clear, transparent and intelligent uses of public data by encouraging publication of data from across the public sector, and many of our delivery partners have followed suit. There are now many open data portals across the boroughs, health and care, transport, and policing. We have seen that openness has enabled innovators to build solutions for London’s problems, set up businesses, and deliver value to the city.
However, there is a lot of data that, for good reasons, cannot be open. Examples include sensitive data, commercial information, and personal data. Even so, the fact that the data exists is only rarely sensitive. For example, we should not publish the names and addresses of everybody living in social housing, as that would be a breach of privacy. But it is no secret that social housing providers hold such data: it is integral to their operations.
In other words, even when the data itself is sensitive, the metadata (a description of the data, its update frequency, geography, etc.) likely is not. The Data for London Library, therefore, can contain metadata on datasets that cannot be published as open data. If a user finds a dataset they believe they have a good reason to use, the Library can point them to the process or team through which they can safely and legally gain access, if appropriate.
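To make the split between data and metadata concrete, here is a minimal sketch of what a catalogue entry for a restricted dataset might look like. It is an illustration only: the `DatasetMetadata` class, its field names, the `how_to_access` helper and the example record are all hypothetical, and do not describe the Library's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical catalogue entry: describes a dataset without containing it."""
    title: str
    publisher: str
    description: str
    update_frequency: str
    geography: str
    is_open: bool
    download_url: Optional[str] = None    # only set for open data
    access_contact: Optional[str] = None  # team or process for restricted data


def how_to_access(record: DatasetMetadata) -> str:
    """Tell a user where to go next, without exposing any restricted data."""
    if record.is_open and record.download_url:
        return f"Download from {record.download_url}"
    if record.access_contact:
        return f"Request access via {record.access_contact}"
    return "Contact the publisher to ask about access"


# Made-up example: the record holds no names or addresses, only a description.
tenancy_records = DatasetMetadata(
    title="Social housing tenancy records",
    publisher="Example Housing Provider",
    description="Tenancy records held for day-to-day operations.",
    update_frequency="monthly",
    geography="borough",
    is_open=False,
    access_contact="the provider's information governance team",
)

print(how_to_access(tenancy_records))
```

The record can be published and searched because it only says that the data exists and who to ask about it. The sensitive rows themselves never enter the catalogue.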
This openness is a continuation of London’s culture of transparency and partnership working. It reduces the barrier to entry for those looking to solve real city problems, while safeguarding sensitive data from those who do not have a legal basis to access it. It ensures that data access requests are grounded in the reality of what exists, and helps users to judge the feasibility of answering their questions.
The future of London’s data infrastructure
The Data for London team will continue its work to make it easier to share and use data to improve the city and benefit Londoners. This is the foundation we need if we want new technologies to work for everyone. AI systems are only as good as the data they are trained on. If we want to ensure these systems are fair, transparent and effective, then we must focus on the plumbing — making sure that high-quality, representative data is available and discoverable in a way that protects public trust.
We hope you will join us in that journey, and start by visiting the Data for London Library. To use it, you can enter your search terms and press ‘enter’. The Library will promptly search across all eight data catalogues and return anything relevant, no matter where it is currently located – from the London Datastore and boroughs like Barnet, through to Transport for London and the Office for Health Improvement and Disparities. You can click individual datasets to find out more, and to be taken straight to the data itself.
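For readers curious about the idea behind searching several catalogues at once, the following is a rough sketch under simplifying assumptions. The catalogue names and dataset titles are made up, the catalogues are plain in-memory lists, and `search_all` is a hypothetical helper. It does not reflect how the Library is actually built.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory catalogues: catalogue name -> list of dataset titles.
CATALOGUES: Dict[str, List[str]] = {
    "Example city catalogue": ["Rough sleeping counts by borough", "Housing completions"],
    "Example borough catalogue": ["Local energy use by ward", "Housing waiting list summary"],
    "Example transport catalogue": ["Bus journey times"],
}


def search_all(query: str, catalogues: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (catalogue, title) pairs whose title contains every search term."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    for name, titles in catalogues.items():
        for title in titles:
            lowered = title.lower()
            if all(term in lowered for term in terms):
                hits.append((name, title))
    return hits


# One query runs against every catalogue; each hit says where the dataset lives.
for catalogue, title in search_all("housing", CATALOGUES):
    print(f"{title} ({catalogue})")
```

An analyst could run a check like this before commissioning new data collection: an empty result suggests the data may not yet be catalogued, while a hit points to the organisation that already holds it.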
Martine Wauben