Machine Learning for MDs Weekly Digest
The mission of ML for MDs is to connect physicians interested in machine learning. This newsletter provides learnings at the intersection of medicine and machine learning.
Fun Fact
- The first message sent over ARPANET was transmitted on Oct. 29, 1969. Charley Kline, a student at UCLA, tried to log in to the mainframe at the Stanford Research Institute. He successfully typed the characters L and O, but the system crashed when he typed the G of the command LOGIN.
News and Stories of the Week
- From Eric Topol’s Substack: “In this week’s JAMA, Kanjee, Crowe, and Rodman published a comparison of 70 NEJM CPCs [clinicopathological conferences/grand rounds] for the medical expert diagnosis compared with GPT-4.”
- A review of top-down vs. bottom-up approaches to working with EHR data
- Nice summary by the team at AI Checkup of the current sprawling health AI landscape with really great graphics.
- A research group created ClinicalGPT using “medical records, domain-specific knowledge, and multi-round dialogue consultations in the training process” and reports that it significantly outperforms other models in “medical knowledge question-answering, medical exams, patient consultations, and diagnostic analysis of medical records.”
- This perspective paper on foundation models for medical images gives a nice framework of the “spectrum” of medical foundation models, “ranging from general vision models, modality-specific models, to organ/task-specific models.”
Weekly Summary
Bias in AI has long been a concern, and its effects were first documented in non-medical settings: recidivism algorithms that scored Black defendants as higher risk, and ads for highly paid jobs shown primarily to men.
Bias studies in healthcare are still developing, but below are some representative examples. They illustrate both the scarcity of representative, publicly available data and the bias introduced when algorithms rely on proxy variables.
Obermeyer and colleagues published one of the first studies of AI bias in healthcare in Science:
| Setting | Large academic center |
|---|---|
| Timeframe | 2013–2015 |
| Patients | Primary care patients enrolled in risk-based insurance contracts: about 6,000 Black patients and 43,000 White patients |
| What the investigators did | Analyzed racial differences in a real algorithm, used by many insurers and healthcare systems, designed to identify high-risk, complex patients who would benefit from additional healthcare resources |
| What the investigators found | The algorithm used healthcare costs as a proxy for healthcare needs. At the same risk score, however, Black patients were sicker, because Black patients generate lower healthcare costs than White patients at the same level of illness. As a result, they were less likely to qualify for the program offering additional support |
| Key takeaway | Proxies used in machine learning algorithms can introduce hidden bias |
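The cost-as-proxy failure mode is easy to reproduce in a toy simulation. The sketch below (all numbers are assumptions chosen for illustration, not figures from the study) gives two groups identical illness levels, makes one group incur lower costs at the same level of illness, and then flags the top 10% of patients by cost:

```python
import random

random.seed(0)

# Toy simulation (illustrative only, not the study's actual data or model):
# every patient has a true illness level; at the same illness level,
# group B incurs lower healthcare costs than group A (assumed utilization gap).
def simulate(group, n=10_000):
    patients = []
    for _ in range(n):
        illness = random.uniform(0, 10)                # true health need
        access = 1.0 if group == "A" else 0.7          # assumed cost gap
        cost = illness * access + random.gauss(0, 0.5)
        patients.append((illness, cost))
    return patients

group_a = simulate("A")
group_b = simulate("B")

# The "algorithm" flags the highest-cost 10% of all patients for extra resources.
threshold = sorted(c for _, c in group_a + group_b)[-2000]

def flagged_rate(patients, min_illness=8):
    """Share of genuinely sick patients (illness >= min_illness) who get flagged."""
    sick = [(i, c) for i, c in patients if i >= min_illness]
    return sum(c >= threshold for _, c in sick) / len(sick)

print(f"Group A sick patients flagged: {flagged_rate(group_a):.0%}")
print(f"Group B sick patients flagged: {flagged_rate(group_b):.0%}")
```

Although the two groups are equally sick by construction, the cost-based cutoff flags far fewer patients from the lower-cost group, which is the mechanism the study describes.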
Also in 2019, Tomašev et al published in Nature a model trained to predict acute kidney injury, and found that it performed significantly worse for women:
| Setting | US Department of Veterans Affairs hospitals |
|---|---|
| Timeframe | 2011–2015 |
| Patients | 703,782 inpatients and outpatients; 94% men |
| What the investigators did | Developed a deep learning algorithm (a recurrent neural network) to predict future acute kidney injury |
| What the investigators found | The model could predict 90% of cases requiring future dialysis, but performed significantly worse for women |
| Key takeaway | Models trained on datasets that under-represent specific populations will likely perform worse on those populations |
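A minimal sketch of why a skewed training set hurts the under-represented group (all numbers are assumptions for illustration; this is not the study's model): a single decision cutoff is fit to a 94%-male training set in which a disease marker has a different baseline by sex, so the cutoff that works for men misclassifies women:

```python
import random

random.seed(1)

# Toy sketch (assumed numbers): a disease marker has a different baseline in
# men and women; with a 94%-male training set, one learned cutoff fits men.
def make_patients(sex, n):
    baseline = 5.0 if sex == "M" else 3.5   # assumed sex-specific baseline
    out = []
    for _ in range(n):
        has_disease = random.random() < 0.3
        marker = baseline + (1.5 if has_disease else 0.0) + random.gauss(0, 0.5)
        out.append((marker, has_disease))
    return out

train = make_patients("M", 9_400) + make_patients("F", 600)   # 94% men

def accuracy(patients, cutoff):
    """Fraction correctly classified by 'predict disease if marker >= cutoff'."""
    return sum((m >= cutoff) == y for m, y in patients) / len(patients)

# "Train" by picking the cutoff that maximizes accuracy on the skewed set.
cutoff = max((c / 10 for c in range(20, 80)), key=lambda c: accuracy(train, c))

men_test = make_patients("M", 2_000)
women_test = make_patients("F", 2_000)
print(f"Accuracy on men:   {accuracy(men_test, cutoff):.0%}")
print(f"Accuracy on women: {accuracy(women_test, cutoff):.0%}")
```

The learned cutoff sits near the male baseline, so diseased women rarely cross it; accuracy on the held-out female group drops well below accuracy on men, mirroring the representativeness problem.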
In 2022, Celi et al described the bias present in data availability in PLOS Digital Health:
| Setting | Clinical papers indexed in PubMed |
|---|---|
| Timeframe | 2019 |
| Patients | n/a |
| What the investigators did | Used a machine learning algorithm to determine where the databases used in clinical machine learning studies originated |
| What the investigators found | 40% of the databases were from the US and 13% from China, and 40% of authors were American or Chinese |
| Key takeaway | Current datasets and authors are overwhelmingly from the US and China, and countries with better datasets are likely to benefit more from AI |
In 2022, Wen et al published a paper in Lancet Digital Health showing that publicly available datasets are not representative of many parts of the world:
| Setting | Datasets identified through MEDLINE, Google, and Google Dataset Search |
|---|---|
| Timeframe | 2020–2021 |
| Patients | More than 100,000 images |
| What the investigators did | Systematically searched all publicly available dermatology image datasets of skin cancer |
| What the investigators found | 79–88% of image datasets were from Europe, Oceania, and North America; only one dataset originated from Asia, two from South America, and none from Africa |
| Key takeaway | The images demonstrated a “massive under-representation of skin lesion images from darker skinned populations” |
In 2021, the WHO published “Ethics and governance of artificial intelligence for health,” a 165-page document detailing:
- Laws and policies related to AI in healthcare
- Key ethical principles for use of AI in healthcare
  - Protect autonomy
  - Promote human well-being, safety, and the public good
  - Ensure transparency, explainability, and intelligibility
  - Foster responsibility and accountability
  - Ensure inclusiveness and equity
  - Promote AI that is responsive and sustainable
- Ethical challenges to the use of artificial intelligence for healthcare
  - Bias, cybersecurity, data collection, accountability, etc.
- Building an ethical approach to the use of AI for health
  - Transparent design, impact assessment, public engagement
- Liability regimes for AI in health
  - Liability, compensation for errors, regulatory agencies
- Elements of a framework for governance of AI for health
  - Data governance, regulatory considerations, model legislation
Community News
- If you haven’t introduced yourself, please do so under the #intros channel.
Thanks for being a part of this community! As always, please let me know if you have questions/ideas/feedback.
Sarah
Sarah Gebauer, MD