Artificial Intelligence for Classification of Lung Nodules: A Review of Clinical Utility, Diagnostic Accuracy, Cost-Effectiveness, and Guidelines


(Last Updated: February 18, 2020)
Project Line: Health Technology Review
Project Sub Line: Summary with Critical Appraisal
Project Number: RC1228-000

Details


Question


  1.   What is the clinical utility of artificial intelligence for nodule classification in screening, incidental identification, or known or suspected malignancies for lung cancer?

  2.   What is the diagnostic accuracy of artificial intelligence for nodule classification in screening, incidental identification, or known or suspected malignancies for lung cancer?

  3.   What is the cost-effectiveness of artificial intelligence for nodule classification in screening, incidental identification, or known or suspected malignancies for lung cancer?

  4.   What are the evidence-based guidelines regarding artificial intelligence for nodule classification in screening, incidental identification, or known or suspected malignancies for lung cancer?


Key Message

Seven diagnostic case-control studies were identified regarding the diagnostic accuracy of artificial intelligence for nodule classification in screening, incidental identification, or known or suspected malignancies for lung cancer. No evidence on clinical utility or cost-effectiveness, and no evidence-based guidelines, were identified for this use of artificial intelligence.



Results from the case-control studies were mixed. Two studies reported that artificial intelligence models were significantly more accurate at nodule classification than radiologists classifying lung nodules using the American College of Radiology Lung CT [computed tomography] Screening Reporting and Data System. Two studies descriptively reported that artificial intelligence models were more accurate at nodule classification than human observers. However, three studies reported that artificial intelligence models had comparable or reduced accuracy compared with human observers (statistical testing was performed in one study; descriptive results were provided in two studies). Three studies descriptively reported sensitivity and specificity and found that artificial intelligence models had higher values for both outcomes than their respective comparators for the diagnosis of lung malignancy.
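
For readers less familiar with the outcomes named above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from a 2x2 table of classification results against a reference standard. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 table of classifier output versus reference standard."""
    tp: int  # malignant nodules classified as malignant
    fp: int  # benign nodules classified as malignant
    fn: int  # malignant nodules classified as benign
    tn: int  # benign nodules classified as benign


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of malignant nodules correctly classified as malignant."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of benign nodules correctly classified as benign."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Proportion of all nodules classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


# Hypothetical counts for illustration only (not study data):
# 50 malignant nodules and 50 benign nodules.
example = ConfusionCounts(tp=45, fp=10, fn=5, tn=40)

print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
print(f"accuracy    = {accuracy(example):.2f}")
```

A model can have higher accuracy than a comparator while having lower sensitivity or specificity, which is one reason the three outcomes are reported separately.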



It may be premature to draw conclusions about artificial intelligence for lung nodule classification given the lack of evidence on clinical utility and cost-effectiveness, the absence of evidence-based guidelines, and the mixed results and inherent methodological flaws noted within the included diagnostic accuracy studies.