class: title-slide, left, bottom

# CDU Data Science Team — Data Hazards
----
## **Structuring ethical questions for research/analysis**
### Zoë Turner | September 2022

---
class: inverse, middle, center

# Ethics

### What's it got to do with data analysis?

---
class: inverse-white, middle

# Why ethics?

--

* Ethics is just for research/Artificial Intelligence

--

* It's never been a part of my work

--

* Ethics committees are for medical cases

--

* There are procedures that cover this, surely

--

* Are there even any consequences if something is unethical?

---
class: center

# I'm a good person

--

I know what I'm doing is ok

--

![A picture of grey jigsaw pieces with a light blue coloured one in the middle with a gold heart](data:image/png;base64,#img/pexels-tara-winstead-8386126.jpg)

---
class: inverse

# Scenario 1

## Freedom of Information request

.pull-left[
![Hands of a person, one with a pen in, with cut out question marks around the central page](data:image/png;base64,#img/pexels-olya-kobruseva-5428833.jpg)
]

.pull-right[
A question was asked last year about waiting times for a service and a figure was provided.

A similar question was asked this year and a different analyst looked at the code. They found it was slightly wrong, so the number produced this year is very different.

Do you:

a) use the old code

b) use the new code
]

---
class: hide-logo

# Right answer?

.pull-left[
![Picture of three boxes with a red tick at the uppermost](data:image/png;base64,#img/pexels-tara-winstead-8850709.jpg)
]

--

.pull-right[
<br>

What do we do when there are no clear right or wrong answers?

Do we only find out the answer was wrong after it's caused damage?

When do we even find out if we've caused damage?
]

---
class: hide-logo

# COSHH Pictograms

.pull-left[
![Listed COSHH hazard labels from the Very Good Science Data Hazards GitHub repository](data:image/png;base64,#img/What-are-the-COSHH-Symbols.jpg)
]

.pull-right[
<br>

Pictograms alert us to the presence of hazardous chemicals.

These are not saying "Don't use".

They give a warning to take precautions.
]

---
class: inverse, center, middle

# Wouldn't that be handy for data!

![Data Hazards banner](data:image/png;base64,#img/data-hazards-banner.png)

---

# Introducing Data Hazards

--

* A starting point

--

* Open source - open contribution

--

* Invite other views on a project

![Photo of a desk from above with computers and the tops of some people's heads](data:image/png;base64,#img/pexels-fauxels-3183183.jpg)

---

# When to use them

--

* Thinking about worst case scenarios

--

* Might use all or most of the hazard labels

--

* Not just for the project but ways in which others might use it

---
class: inverse, middle, center

# Scenario 1 - revisited

Links:

[General hazard](#general)

[Reinforces bias](#bias)

[Ranks or classifies people](#rank)

[High environmental cost](#environment)

[Lacks community involvement](#community)

[Danger of misuse](#misuse)

[Difficult to understand](#understand)

[May cause direct harm](#harm)

[Risk to privacy](#privacy)

[Automates decision making](#decisions)

[Lacks informed consent](#consent)

[More resources](#resources)

---
name: general
class: hide-logo

# [General hazard](https://datahazards.com/contents/hazards/general-hazard.html)

.pull-left[
![General Hazard logo](data:image/png;base64,#img/hazards/general-hazard.png)
]

--

.pull-right[
Issues:

Misunderstanding/misinterpretation

Errors, changes in methodology

We don't know what happens to the request after it's made.

--

Precautions:

Mitigate with a caveat of data changing?

Issue corrections publicly?

Share methodology and update it if it changes?
]

---
name: bias
class: hide-logo

# [Reinforces existing bias](https://datahazards.com/contents/hazards/reinforces-biases.html)

.pull-left[
![Bias logo](data:image/png;base64,#img/hazards/reinforce-bias.png)
]

--

.pull-right[
Issues:

Hard to say as the service wasn't mentioned in the scenario, but could an incorrect number have been accepted because of bias?

Could reusing it reinforce that bias?

--

Precautions:

Although the Freedom of Information request related to one service, could this be extended to other services to see if there is bias between services and/or certain groups?
]

---
name: rank
class: hide-logo

# [Ranks or classifies people](https://datahazards.com/contents/hazards/ranks-classifies.html)

.pull-left[
![Ranks people logo](data:image/png;base64,#img/hazards/classifies-people.png)
]

--

.pull-right[
Issues:

Could the same data also be requested from multiple Trusts, to be compared now or in the future?

--

Precautions:

Publish the methodology and/or code for other Trusts to replicate or audit if required.
]

---
name: environment
class: hide-logo

# [High environmental cost](https://datahazards.com/contents/hazards/high-environmental-cost.html)

.pull-left[
![Environmental cost logo](data:image/png;base64,#img/hazards/environment.png)
]

--

.pull-right[
Issues:

Was it quick code to run or resource hungry?

Is the data part of a larger data set that is replicated across Trusts (HES data for example) and needs big storage?

--

Precautions:

Code reviews to ensure code is not unnecessarily resource hungry.
]

---
name: community
class: hide-logo

# [Lacks community involvement](https://datahazards.com/contents/hazards/lacks-community-involvement.html)

.pull-left[
![Lacks Community logo](data:image/png;base64,#img/hazards/lacks-community.png)
]

--

.pull-right[
Issues:

Generally these data requests are not checked with the community. But is that necessary?

Freedom of Information requests are also published by Trusts or public sites like [WhatDoTheyKnow](https://www.whatdotheyknow.com/)

--

Precautions:

Could some Freedom of Information requests become analyses in their own right?

For example, a question about smoking cessation for patients under 16 in mental health services highlighted that smoking status is poorly recorded. Could we escalate the poor data?
]

---
name: misuse
class: hide-logo

# [Danger of misuse](https://datahazards.com/contents/hazards/danger-of-misuse.html)

.pull-left[
![Danger of misuse logo](data:image/png;base64,#img/hazards/misuse.png)
]

--

.pull-right[
Issues:

Perhaps a hazard for all Freedom of Information requests, as we don't know what happens to a request after it's made.

--

Precautions:

Could we mitigate by sharing and publishing methodology/code?
]

---
name: understand
class: hide-logo

# [Difficult to understand](https://datahazards.com/contents/hazards/difficult-to-understand.html)

.pull-left[
![Difficult to understand logo](data:image/png;base64,#img/hazards/difficult-to-understand.png)
]

--

.pull-right[
Issues:

Was the question itself difficult to understand?

The code was wrong so does that introduce difficulty?

--

Precautions:

Providing data with a caveat that it could change.

Sharing code/methodology.
]

---
name: harm
class: hide-logo

# [May cause direct harm](https://datahazards.com/contents/hazards/direct-harm.html)

.pull-left[
![Direct harm logo](data:image/png;base64,#img/hazards/direct-harm.png)
]

--

.pull-right[
Issues:

Difficult to say what this could be.
]

---
name: privacy
class: hide-logo

# [Risk to privacy](https://datahazards.com/contents/hazards/risk-to-privacy.html)

.pull-left[
![Privacy logo](data:image/png;base64,#img/hazards/privacy.png)
]

--

.pull-right[
Issues:

Possibly not, as data is often scrutinised by Information Governance for identifiable data.
]

---
name: decisions
class: hide-logo

# [Automates decision making](https://datahazards.com/contents/hazards/automates-decision-making.html)

.pull-left[
![Automates decision making logo](data:image/png;base64,#img/hazards/automates-decision-making.png)
]

--

.pull-right[
Issues:

Possibly not, as the data is re-run when requested and an error was identified, which doesn't suggest any automation.
]

---
name: consent
class: hide-logo

# [Lacks informed consent](https://datahazards.com/contents/hazards/lacks-informed-consent.html)

.pull-left[
![Lacks informed consent logo](data:image/png;base64,#img/hazards/lacks-informed-consent.png)
]

--

.pull-right[
Issues:

Data is often about the service, not the patients, so these are service audits rather than research.
]

---
name: resources

# More Ethics Resources

--

* Bristol University have a virtual [Data Ethics Club](https://dataethicsclub.com/)

--

* Civil Service have an [awareness in data ethics online course](https://analysisfunction.civilservice.gov.uk/training/awareness-in-data-ethics/)

--

* UK Statistics Authority have an [Ethics Self-Assessment Tool](https://uksa.statisticsauthority.gov.uk/the-authority-board/committees/national-statisticians-advisory-committees-and-panels/national-statisticians-data-ethics-advisory-committee/ethics-self-assessment-tool/)

--

* [Government Data Science Slack](https://govdatascience.slack.com) have an Ethics channel

--

* [Data ethics and society reading group](https://github.com/alphagov/data-ethics-and-society-reading-group) for cross-government sessions on books and articles relating to ethics in data science

---
class: inverse

# Scenario 2

## Did not attend analysis

.pull-left[
Use [templates](#templates) to go through the Hazards

![Photo of a calendar with dates](data:image/png;base64,#img/pexels-pixabay-273153.jpg)
]

.pull-right[
Missed appointments (did not attends) cost the NHS money, lost opportunity and staff time.

It's an analysis that is asked of analysts/data scientists, particularly in view of demographics and population health management.

The analysis question posed is:

Are some groups more likely to not attend than others?
]

---
class: inverse

# Scenario 3

## Dashboard analysis

.pull-left[
Use [templates](#templates) to go through the Hazards

![Photo of a laptop with graphs on the screen](data:image/png;base64,#img/pexels-lukas-577210.jpg)
]

.pull-right[
A dashboard is requested which needs to include demographics, service information and some time-related information.

It needs to be high level, with the ability to drill down to lower levels but not to actual patient data.
]

---
name: templates
class: inverse, middle, center

The following slides are the individual Data Hazard labels to prompt discussion, or use the [Word template](https://very-good-science.github.io/data-hazards/_static/data_hazards_template.docx)

---
class: hide-logo

# [General hazard](https://datahazards.com/contents/hazards/general-hazard.html)

.pull-left[
![General Hazard logo](data:image/png;base64,#img/hazards/general-hazard.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Reinforces existing bias](https://datahazards.com/contents/hazards/reinforces-biases.html)

.pull-left[
![Bias logo](data:image/png;base64,#img/hazards/reinforce-bias.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Ranks or classifies people](https://datahazards.com/contents/hazards/ranks-classifies.html)

.pull-left[
![Ranks people logo](data:image/png;base64,#img/hazards/classifies-people.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [High environmental cost](https://datahazards.com/contents/hazards/high-environmental-cost.html)

.pull-left[
![Environmental cost logo](data:image/png;base64,#img/hazards/environment.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Lacks community involvement](https://datahazards.com/contents/hazards/lacks-community-involvement.html)

.pull-left[
![Lacks Community logo](data:image/png;base64,#img/hazards/lacks-community.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Danger of misuse](https://datahazards.com/contents/hazards/danger-of-misuse.html)

.pull-left[
![Danger of misuse logo](data:image/png;base64,#img/hazards/misuse.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Difficult to understand](https://datahazards.com/contents/hazards/difficult-to-understand.html)

.pull-left[
![Difficult to understand logo](data:image/png;base64,#img/hazards/difficult-to-understand.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [May cause direct harm](https://datahazards.com/contents/hazards/direct-harm.html)

.pull-left[
![Direct harm logo](data:image/png;base64,#img/hazards/direct-harm.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Risk to privacy](https://datahazards.com/contents/hazards/risk-to-privacy.html)

.pull-left[
![Privacy logo](data:image/png;base64,#img/hazards/privacy.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Automates decision making](https://datahazards.com/contents/hazards/automates-decision-making.html)

.pull-left[
![Automates decision making logo](data:image/png;base64,#img/hazards/automates-decision-making.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: hide-logo

# [Lacks informed consent](https://datahazards.com/contents/hazards/lacks-informed-consent.html)

.pull-left[
![Lacks informed consent logo](data:image/png;base64,#img/hazards/lacks-informed-consent.png)
]

.pull-right[
Issues:

Precautions:
]

---
class: inverse
name: acknowledgement

# Acknowledgements

Special thanks to:

Zelenka, N., & Di Cara, N. H. Data Hazards (Version 0.1) [Computer software].
https://github.com/very-good-science/data-hazards

Acknowledgements: the professional look of this presentation, using NHS and Nottinghamshire Healthcare NHS Foundation Trust colour branding, exists because of the amazing work of Silvia Canelón (details of the workshops she ran are at the [NHS-R Community conference](https://spcanelon.github.io/xaringan-basics-and-beyond/index.html)), Milan Wiedemann, who created the CDU Data Science logo with the help of the team, and Zoë Turner, who put together the slides.

[<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg> @DataScienceNott](https://twitter.com/DataScienceNott)

[<svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> Clinical Development Unit Data Science Team](https://github.com/CDU-data-science-team)

[<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M440 6.5L24 246.4c-34.4 19.9-31.1 70.8 5.7 85.9L144 379.6V464c0 46.4 59.2 65.5 86.6 28.6l43.8-59.1 111.9 46.2c5.9 2.4 12.1 3.6 18.3 3.6 8.2 0 16.3-2.1 23.6-6.2 12.8-7.2 21.6-20 23.9-34.5l59.4-387.2c6.1-40.1-36.9-68.8-71.5-48.9zM192 464v-64.6l36.6 15.1L192 464zm212.6-28.7l-153.8-63.5L391 169.5c10.7-15.5-9.5-33.5-23.7-21.2L155.8 332.6 48 288 464 48l-59.4 387.3z"></path></svg> cdudatascience@nottshc.nhs.uk](mailto:cdudatascience@nottshc.nhs.uk)

Photos (in order of appearance):

Photo by Tara Winstead: https://www.pexels.com/photo/blue-puzzle-piece-on-white-jigsaw-puzzle-8386126/

Photo by Olya Kobruseva: https://www.pexels.com/photo/question-marks-on-paper-crafts-5428833/

Photo by Tara Winstead: https://www.pexels.com/photo/red-check-mark-on-box-in-close-up-view-8850709/

Photo by fauxels:
https://www.pexels.com/photo/people-having-business-meeting-together-3183183/

Photo by Pixabay: https://www.pexels.com/photo/calendar-dates-paper-schedule-273153/

Photo by Lukas: https://www.pexels.com/photo/close-up-photo-of-gray-laptop-577210/