Researchers have tested the performance of machine learning and artificial intelligence (AI) algorithms used in medical image recognition and found that they can be highly unstable and may lead to false negatives and false positives.
The team, led by the University of Cambridge and Simon Fraser University, designed a series of tests to find the flaws in AI-based medical imaging systems, including MRI, CT and NMR.
They analysed several forms of instability that result in unwanted alterations and other major errors in the final images:
1) Tiny perturbations (movements), both in the image and sampling domain, which may lead to severe artefacts in the reconstruction.
2) A small structural change (e.g. a tumour), which may not be captured in the reconstructed image.
3) Number of samples: more samples may yield poorer performance.
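The first failure mode can be probed empirically. The sketch below is a minimal, hypothetical stability check (not the study's actual test suite): it estimates how much a reconstruction map amplifies a tiny input perturbation by measuring the worst-case output change over random perturbations of fixed size. The `reconstruct` functions here are illustrative stand-ins; a real test would use a trained network or a classical solver.

```python
import numpy as np

def perturbation_stability(reconstruct, x, eps=1e-3, trials=20, seed=0):
    """Estimate how strongly `reconstruct` amplifies tiny perturbations:
    max ||f(x + d) - f(x)|| / ||d|| over random d with ||d|| = eps.
    Large ratios indicate the kind of instability described above."""
    rng = np.random.default_rng(seed)
    base = reconstruct(x)
    worst = 0.0
    for _ in range(trials):
        d = rng.standard_normal(x.shape)
        d *= eps / np.linalg.norm(d)  # scale perturbation to ||d|| = eps
        ratio = np.linalg.norm(reconstruct(x + d) - base) / eps
        worst = max(worst, ratio)
    return worst

# Hypothetical stand-ins for a reconstruction method:
stable = lambda x: 0.5 * x       # Lipschitz with constant 0.5
unstable = lambda x: np.sign(x)  # discontinuous near zero

x = np.zeros(64)
print(perturbation_stability(stable, x))    # stays at 0.5
print(perturbation_stability(unstable, x))  # blows up (thousands)
```

A stable method keeps this ratio bounded for all inputs; the study's point is that many trained reconstruction networks behave like the second case near clinically relevant images.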
These instabilities proved to be widespread across different types of artificial
neural networks, but were typically not present in non-AI-based imaging
techniques. The researchers therefore warn against relying solely on AI-based
image reconstruction techniques to make diagnoses and determine treatment.
Any AI algorithm needs to provide results that are accurate and stable. But
with medical image reconstruction, details such as a tumour may be removed,
added, distorted or obscured, and unwanted artefacts may appear in the image. Moreover,
the quality of the reconstruction can deteriorate with repeated subsampling,
so networks must be retrained for each subsampling pattern. These errors were found
to be widespread across the different types of neural networks regardless of
the underlying mathematical model, which means that these algorithms lack the
stability they need.
The authors conclude that instabilities are not necessarily rare events, and
that the instability phenomenon is not easy to remedy due to its presence in
various types of algorithms.
All of the code is available from GitHub.
Source: University of Cambridge