Article
Author: Bladou, Franck ; Debeer, Sabine ; Walz, Jochen ; Cornud, François ; Fiard, Gaelle ; Rabilloud, Muriel ; Mottet, Nicolas ; Badet, Lionel ; Malavaud, Bernard ; Lefèvre, Frédéric ; Aziza, Richard ; Crouzet, Sébastien ; Mège-Lechevallier, Florence ; Timsit, Marc-Olivier ; Colombel, Marc ; Grange, Rémi ; Couchoux, Thibaut ; Correas, Jean-Michel ; Barry Delongchamps, Nicolas ; Portalez, Daniel ; Bratan, Flavie ; Villers, Arnauld ; Arfi, Nicolas ; Roy, Catherine ; Mozer, Pierre ; Branchu, Arthur ; Baseilhac, Pierre ; Jaouen, Tristan ; Renard-Penna, Raphaele ; Melodelima-Gonindard, Christelle ; Grenier, Nicolas ; Moldovan, Paul C ; Roumiguié, Matthieu ; Ruffion, Alain ; Rouvière, Olivier ; Souchon, Rémi ; Colin, Pierre ; Brunelle, Serge ; Descotes, Jean-Luc ; Lang, Hervé ; Puech, Philippe ; Tricard, Thibault ; Potiron, Eric ; Eschwege, Pascal ; Mansuy, Adeline ; Guillaume, Bénédicte ; Decaussin-Petrucci, Myriam ; Marcelin, Clément
BACKGROUND AND OBJECTIVE: Prostate multiparametric magnetic resonance imaging (MRI) shows high sensitivity for International Society of Urological Pathology grade group (GG) ≥2 cancers. Many artificial intelligence algorithms have shown promising results in diagnosing clinically significant prostate cancer on MRI. We aimed to assess a region-of-interest-based machine-learning algorithm designed to characterise GG ≥2 prostate cancer on multiparametric MRI.
METHODS: The lesions targeted at biopsy in the MRI-FIRST dataset were retrospectively delineated and assessed using a previously developed algorithm. The Prostate Imaging-Reporting and Data System version 2 (PI-RADSv2) score assigned prospectively before biopsy and the algorithm score calculated retrospectively in the regions of interest were compared for diagnosing GG ≥2 cancer, using the areas under the curve (AUCs) and the sensitivities and specificities calculated with predefined thresholds (PI-RADSv2 scores ≥3 and ≥4; algorithm scores yielding 90% sensitivity in the training database). Ten predefined biopsy strategies were assessed retrospectively.
KEY FINDINGS AND LIMITATIONS: After excluding 19 patients, we analysed 232 patients imaged on 16 different scanners; 85 had GG ≥2 cancer at biopsy. At patient level, AUCs of the algorithm and PI-RADSv2 were 77% (95% confidence interval [CI]: 70-82) and 80% (CI: 74-85; p = 0.36), respectively. The algorithm's sensitivity and specificity were 86% (CI: 76-93) and 65% (CI: 54-73), respectively. PI-RADSv2 sensitivities and specificities were 95% (CI: 89-100) and 38% (CI: 26-47), and 89% (CI: 79-96) and 47% (CI: 35-57) for thresholds of ≥3 and ≥4, respectively. Using the PI-RADSv2 score to trigger a biopsy would have avoided 26-34% of biopsies while missing 5-11% of GG ≥2 cancers. Combining prostate-specific antigen density with the PI-RADSv2 and algorithm scores would have avoided 44-47% of biopsies while missing 6-9% of GG ≥2 cancers.
Limitations include the retrospective nature of the study and a lack of PI-RADS version 2.1 assessment.
CONCLUSIONS AND CLINICAL IMPLICATIONS: The algorithm provided robust results in the multicentre multiscanner MRI-FIRST database and could help select patients for biopsy.
PATIENT SUMMARY: An artificial intelligence-based algorithm aimed at diagnosing aggressive cancers on prostate magnetic resonance imaging showed results similar to expert human assessment in a prospectively acquired multicentre test database.
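
The reported trade-off between biopsies avoided and cancers missed can be approximately recovered from the patient counts and the PI-RADSv2 sensitivities and specificities given above. The following is a minimal sketch, not the authors' analysis code. It assumes that a biopsy is performed only when the score reaches the threshold, so that avoided biopsies are the true negatives plus the false negatives. The function name `biopsy_tradeoff` is hypothetical, and the inputs are the rounded percentages from the abstract, so the outputs are approximations.

```python
def biopsy_tradeoff(n_patients, n_cancer, sensitivity, specificity):
    """Return (fraction of biopsies avoided, fraction of cancers missed).

    Assumption: patients with a negative test are not biopsied, so the
    avoided biopsies are the true negatives plus the false negatives.
    """
    prevalence = n_cancer / n_patients
    missed = 1.0 - sensitivity  # share of GG >=2 cancers with a negative test
    avoided = (1.0 - prevalence) * specificity + prevalence * missed
    return avoided, missed


# Patient-level figures quoted in the abstract: 232 patients, 85 with GG >=2 cancer.
for label, sens, spec in [("PI-RADSv2 >= 3", 0.95, 0.38),
                          ("PI-RADSv2 >= 4", 0.89, 0.47)]:
    avoided, missed = biopsy_tradeoff(232, 85, sens, spec)
    print(f"{label}: avoided {avoided:.0%} of biopsies, missed {missed:.0%} of cancers")
```

Under these assumptions the two thresholds give roughly 26% and 34% of biopsies avoided with 5% and 11% of GG ≥2 cancers missed, which is consistent with the 26-34% and 5-11% ranges reported for PI-RADSv2.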