June 12, 2023

Machine Learning for MDs Weekly Digest

What’s New in ML for MDs

Welcome to the ML for MDs Newsletter. The mission of ML for MDs is to connect physicians interested in machine learning. This newsletter provides the most relevant news, journal articles, and jobs at the intersection of medicine and machine learning. 

Fun Facts

  • The first known use of the word “robot” was in a 1921 Czech science fiction play about “artificial people” by Karel Čapek titled “Rossum’s Universal Robots.”
  • Arthur Samuel developed a checkers-playing program, the first time a computer played a game independently, and in 1959 coined the term “machine learning” while describing how to teach machines to play checkers better than humans

This Week’s Top Stories

  • Google can now predict flooding worldwide, which has the potential to save thousands of lives per year
  • Flux AI shows how AI can help design hardware

Weekly summary

A summary of A Surgeon’s Guide to Machine Learning, which nicely covers some fundamentals (co-authored by our own Dr. Eckert)

Step 1: Define the Use Case – do we need ML to solve this problem?

  • “ML algorithms can be applied to most data in medicine…may result in an overly complicated model for the clinical situation creating a scenario that would be better suited by more traditional methods”
  • “ML models find associations between covariates and outcomes. Clinicians need to understand that, unless specifically defined, ML models are not determining causal relationships”

Step 2: Deal with the data

  • The authors suggest meticulous specificity: define where the data came from, the (potentially thousands of) covariates the ML model uses, the inclusion and exclusion criteria for the data, and the outcomes of interest
  • Because the outcome of interest in medicine is often rare, the ML model may not have sufficient data to learn from unless appropriate techniques such as oversampling or cost-sensitive learning are used.
  • The amount of missing data and its randomness should be described, and projects should describe how this missing data is handled
    • Complete case analysis (remove all rows with missing data)
    • Imputation (using similar data to make an educated guess)
      • K-nearest neighbors – looks at similar data rows to make a guess
      • Multiple Imputation by Chained Equations (MICE) – generates, then pools together multiple complete data sets to replace the incomplete row
      • Informed missingness using binary indicators – good for data that is missing for a reason, like the test was not ordered because it wasn’t clinically appropriate
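The four missing-data strategies above can be sketched with scikit-learn on hypothetical toy data (the matrix and all values here are made up for illustration; assumes scikit-learn is installed):

```python
# Sketch of the missing-data strategies above (hypothetical toy data).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

# Toy lab-value matrix with missing entries encoded as np.nan.
X = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0], [np.nan, 8.0]])

# Complete case analysis: drop any row containing a missing value.
complete_cases = X[~np.isnan(X).any(axis=1)]

# K-nearest neighbors: fill each gap from the most similar rows.
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)

# MICE-style iterative imputation: model each column from the others.
mice_filled = IterativeImputer(random_state=0).fit_transform(X)

# Informed missingness: mean-fill plus binary "was missing" indicator columns.
indicator_filled = SimpleImputer(strategy="mean", add_indicator=True).fit_transform(X)
```

Note that `add_indicator=True` appends one extra column per feature that had missing values, so the model can learn from the fact that a value was absent (e.g., a test was never ordered).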

Step 3: Model selection

  • “Most ML methods fall into one of four groups: supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning.”
    • Supervised
      • Outcome is determined before the model is run
      • The outcome variable is available as a labeled field in the dataset
        • Death, infection, tumor, etc
      • With traditional ML, performance will plateau after a certain amount of data. With deep learning, more data is usually better
    • Unsupervised
      • “Determines associations in the data or groups instances in the data into clusters based on like characteristics.”
      • “Identify otherwise unknown or unsuspected patterns”
    • Semisupervised
      • “In most cases of semisupervised learning, the labeled data constitutes a minuscule percentage of all data instances”
      • “a combination of supervised and unsupervised methods are used to uncover patterns and develop prediction models”
    • Reinforcement learning
      • “algorithm gets feedback based on its actions—in the form of rewards or penalties”
      • “useful in contexts where significant heterogeneity in patient characteristics is observed”
  • Deep learning uses neural networks and can use any of the approaches above
    • Requires less pre-processing of data but is harder to explain
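To make the supervised/unsupervised contrast concrete, here is a minimal sketch on hypothetical simulated data (the variables and threshold are invented for illustration; assumes scikit-learn is installed):

```python
# Supervised vs. unsupervised learning on hypothetical toy data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # covariates (e.g., labs, vitals)
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # labeled outcome (e.g., infection: yes/no)

# Supervised: the labeled outcome y steers the fit.
clf = LogisticRegression().fit(X, y)
predictions = clf.predict(X)

# Unsupervised: no labels; the model groups similar rows into clusters.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

The supervised model is judged against the known labels; the clusters have no ground truth and must be interpreted clinically after the fact.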

Step 4: Validate the model

  • Two main approaches: hold-out testing and cross validation
    • Hold out testing breaks the data up into a training set (usually 80%) and test set (usually 20%)
      • “hold-out testing is traditionally utilized on larger datasets due to the decreased processing power required and is more computationally efficient”
    • Cross validation breaks the data up into a pre-defined number of groups; each group in turn is held out of the training data for testing, repeating until every group has been tested
      • “better indication of model performance due to its ability to analyze numerous train-test groupings”
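Both validation approaches can be sketched in a few lines on hypothetical simulated data (all values here are invented for illustration; assumes scikit-learn is installed):

```python
# Hold-out (80/20) vs. 5-fold cross-validation on hypothetical toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)

# Hold-out: train on 80%, report performance on the untouched 20%.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
holdout_acc = LogisticRegression().fit(X_train, y_train).score(X_test, y_test)

# Cross-validation: 5 rotating train/test splits; every row is tested once.
cv_scores = cross_val_score(LogisticRegression(), X, y, cv=5)
```

The hold-out split yields one performance number; cross-validation yields one per fold, whose spread hints at how stable the estimate is.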

Step 5: Determine performance metrics

  • Authors provide the following metrics for model performance:
    • “How well does the model discern the true positives from the false positives (precision)?
    • How well does it discern the true positives from the false negatives (recall)?
    • How well does it risk stratify the scored cohort (calibration)?
    • Is the model explainable?
    • Is it fair and free of bias?”
  • Performance should also be assessed on a validation data set (often from a source unrelated to the original data)
    • Poor performance on validation dataset suggests lack of generalizability
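The first three metrics in the authors’ list can be computed directly; here is a sketch on a hypothetical set of predictions (labels and risk scores are invented for illustration; assumes scikit-learn, with calibration shown via the Brier score):

```python
# Precision, recall, and calibration on hypothetical predictions.
import numpy as np
from sklearn.metrics import brier_score_loss, precision_score, recall_score

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])                    # actual outcomes
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])                    # predicted labels
y_prob = np.array([0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.7, 0.3])   # predicted risks

precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
brier = brier_score_loss(y_true, y_prob)     # lower = better calibrated
```

Explainability and fairness, the last two items on the list, are not single numbers and typically require inspecting the model and its performance across patient subgroups.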

Community News

  • If you haven’t introduced yourself, please do so under the #intros channel. 

Thanks for being a part of this community! As always, please let me know if you have questions/ideas/feedback.

Sarah

Sarah Gebauer, MD

MlforMDs.com
