Getting better at data ethics with ODI
As part of LOTI’s work to help London boroughs develop their data ethics capabilities, Sam and Jay recently completed the Open Data Institute’s (ODI) Data Ethics training for Professionals and Facilitators.
The course is aimed at professionals who work with data. It develops the skills they need to think critically about data ethics in a project context and to work with their organisations to build long-term ethics capabilities and processes.
Having recently published a set of Data Ethics recommendations for Local Authorities, we used the course as an opportunity to reflect on the scope, detail and relevance of our guidance.
The course helped us to confirm that our overall focus on building institutional capability and capacity to deal with data ethics consistently is the right one. Where we recognised a gap in our work was in the recommendations for those working closer to the data. Data Scientists and Analysts building models are on the front line in spotting where bias may arise, and they hold an important responsibility to flag the possibility of biased or inaccurate results. Our guidance currently does not offer anything at this level, which is something we will address in future output.
To address this, we are considering how to include guidance for data practitioners on how best to identify and flag ethical risks in their work. This could include using elements of the UKSA’s Ethics framework. We will also compile a list of courses or training on bias in data and data models.
The course introduced us to the ODI Data Ethics Maturity model, a new framework that organisations can use to assess their progress towards building a consistent, embedded and effective data ethics capability. LOTI will explore whether this maturity model can be used alongside our recommendations.
Meeting a range of practitioners working on data ethics from across industry and government, in the UK and globally, provided a valuable insight into the many different frames for approaching data ethics. Large technology companies, museums, universities and media organisations are all working towards a robust data ethics capability, which indicates a growing awareness that without ethics you can’t do data.
Data Ethics Case Studies
As part of the assessment for the course, Jay and Sam were required to apply their learning to a chosen case study. We have summarised their case studies below. If you would like more information, please contact email@example.com and firstname.lastname@example.org.
Jay’s case study
Jay chose to explore the ethical implications of a project that LOTI are working on with London Councils and Bloomberg Philanthropy, which aims to prevent rough sleeping in London. The project intends to bring together data from a number of sources to better understand how people enter rough sleeping, with the intention of using this insight to develop preventive interventions.
Pseudonymisation is one of the technical measures suggested to reduce the impact on individuals’ privacy. Even with strong legal and technical measures in place, however, poor quality, missing or omitted data needs to be assessed for its potential effect on outcomes and trust. Data minimisation can also help to build trust in the system: for example, data on immigration status is not shared.
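To make these ideas more concrete, here is a minimal Python sketch of pseudonymisation and data minimisation. It is an illustration only and not the project's actual design: the field names, the allow-list and the key are all hypothetical. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation technique, which is one common option among several.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"example-key-held-by-the-data-controller"

# Hypothetical allow-list: only fields needed for the analysis are shared.
SHARED_FIELDS = {"borough", "age_band", "first_contact_service"}


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict) -> dict:
    """Keep only allow-listed fields and swap the identifier for a pseudonym."""
    shared = {k: v for k, v in record.items() if k in SHARED_FIELDS}
    shared["person_id"] = pseudonymise(record["national_id"])
    return shared


def missing_fields(record: dict) -> list:
    """List allow-listed fields that are absent or empty, for data quality review."""
    return sorted(f for f in SHARED_FIELDS if record.get(f) in (None, ""))


# Made-up example record (all values are illustrative).
record = {
    "national_id": "AB123456C",
    "borough": "Camden",
    "age_band": "25-34",
    "first_contact_service": "",
    "immigration_status": "not for sharing",
}
shared = minimise(record)
gaps = missing_fields(record)
```

In this sketch, `shared` contains no immigration status and no raw identifier, while `gaps` lists the empty fields so that missing data can be reviewed rather than silently ignored. Note that pseudonymised data can still be re-identified by anyone holding the key, which is why the legal and organisational measures remain important.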
Using ethical frameworks in the initial project design helps to identify risks early on, so that mitigations can be developed. Thinking about the full project lifecycle is essential, as technology and ethical perceptions of data use shift over time. Designing in review points can help to ensure that a project remains ethical in the future.
Sam’s case studies
Sam chose to write about the World Health Organisation’s social listening platform, the Early AI-supported Response with Social Listening, or EARS. EARS was developed to help monitor the spread of disinformation surrounding Covid-19. It uses social listening technology, which applies natural language processing to public posts on social media platforms in countries around the world. The hope is that the algorithm can help policy-makers respond better to particular strands of misinformation that might be gaining in popularity, by tracking the use of certain key words pertaining to different misinformation-related topics. Sam’s analysis focused on three areas: the limits of the algorithm in analysing posts from women or from those who don’t speak the country’s primary language well, such as migrants; the choice of data, which diverges massively in quantity between countries; and whether policy-makers would treat the outputs of the model with sufficient scepticism. The main takeaway from the feedback on Sam’s work was to not be afraid to really get his hands dirty analysing the biases of the model, even if the algorithms aren’t available publicly.
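To illustrate why differences in data quantity matter, here is a rough Python sketch of key word tracking. It is not how EARS works: it swaps natural language processing for plain key word matching, and the topics, key words, country labels and posts are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical topics and key words, for illustration only.
TOPIC_KEYWORDS = {
    "vaccines": {"vaccine", "jab"},
    "treatments": {"cure", "remedy"},
}


def topic_counts(posts):
    """Count posts per country, and per (country, topic) key word match.

    `posts` is an iterable of (country, text) pairs.
    """
    totals = Counter()
    matches = defaultdict(Counter)
    for country, text in posts:
        totals[country] += 1
        words = set(text.lower().split())  # naive whitespace tokenisation
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                matches[country][topic] += 1
    return totals, matches


def topic_shares(posts):
    """Return {country: (total_posts, {topic: share_of_posts})}."""
    totals, matches = topic_counts(posts)
    return {
        country: (n, {t: matches[country][t] / n for t in TOPIC_KEYWORDS})
        for country, n in totals.items()
    }


# Made-up posts: country "A" has four posts, country "B" has only one.
posts = [
    ("A", "is the vaccine safe"),
    ("A", "new cure found"),
    ("A", "weather today"),
    ("A", "got my jab"),
    ("B", "miracle remedy works"),
]
shares = topic_shares(posts)
```

Here country "B" shows every post matching the "treatments" topic, but that figure rests on a single post. Reporting the total alongside each share is one simple way to prompt the scepticism discussed above. The naive matching also misses misspellings, slang and other languages, which hints at how such a tool could under-represent some groups.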
Choosing a different case study for the assessment of the Facilitator training, Sam focused on the International Committee of the Red Cross (ICRC)’s use of biometric data in humanitarian aid, having written about it as a case study in his previous job. It was a great chance to put the facilitation and data ethics skills that Sam had picked up over the length of the course to the test, and to see what ethics facilitation will look like in practice.