NCI Researchers Test Generalizability of Artificial Intelligence (AI) Model
Can you trust your AI model to perform well, even across diverse data sets? That’s the question NCI researcher Dr. Baris Turkbey (of the Molecular Imaging Branch in NCI’s Center for Cancer Research) and his colleagues examined in a recent retrospective study.
The researchers looked at how well their model for detecting prostate cancer performed on both in-house and external scans. The study was possible because each of the 210 patients they examined came to NIH with scans already performed elsewhere, at community hospitals, private practices, or academic centers. By pairing the images, the researchers found a 16–20% difference between in-house and external scans in how well the model detected lesions and identified cancer.
The results were particularly significant because the researchers applied the model to a wide range of images of varying quality, acquired on more than 20 different scanner types from multiple vendors.
The study validates the overall generalizability of the model and sets the stage for prospective testing across additional institutions.
As senior author Dr. Turkbey noted, “Our study shows the importance of using external testing to evaluate the generalizability and reproducibility of AI models. Unfortunately, many researchers in academia test their models only on in-house data, and thus may miss images from other institutions, particularly community practices, where patients often initially seek care.”
He added, “By ensuring widespread generalizability, a model such as ours could be an important support tool for radiologists, reducing the need for additional imaging in specialized centers and making equitable and high-quality care available for all patients with prostate cancer.”