Swiss study compares two bone age algorithms

The accuracy of bone age determination by BoneXpert and Panda in 188 images was reported at ECR.

Federica Zanca from Leuven, together with four co-authors from Switzerland, presented a study comparing two bone age algorithms: BoneXpert from Visiana and Panda, a deep-learning-based method from ImageBiopsyLab. The study included 188 images taken under real-world conditions across 11 centres in Switzerland. The ground truth was provided by an exceptionally reliable manual rater. BoneXpert's intended use includes autonomous operation, whereas Panda is intended only to assist the radiologist. In this study, however, both algorithms were run without human intervention.

The mean absolute deviation between the algorithm and the ground truth was 0.36 y for BoneXpert and 0.42 y for Panda, and this difference was significant with p=0.01.

In the Bland-Altman plots below, the agreement is visibly better with BoneXpert.

Figure 1: BoneXpert versus radiologist


Figure 2: Panda versus radiologist

The plots reveal that there are markedly fewer large deviations with BoneXpert.
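For readers unfamiliar with Bland-Altman analysis, the sketch below shows how the two quantities usually drawn on such a plot, the bias (mean difference) and the 95% limits of agreement (bias ± 1.96 standard deviations), can be computed from paired ratings. The function name and the sample values are hypothetical and chosen only for illustration; they are not data from the study, and the poster may have used a different convention.

```python
from statistics import mean, stdev

def bland_altman(algorithm_ages, reference_ages):
    """Return (bias, lower limit, upper limit) for paired bone ages in years.

    Hypothetical helper: uses the common convention of
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    diffs = [a - r for a, r in zip(algorithm_ages, reference_ages)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up illustrative values, NOT data from the study.
reference = [8.0, 10.5, 12.0, 13.5, 15.0]   # manual rater
algorithm = [8.3, 10.1, 12.4, 13.2, 15.5]   # automated method

bias, lower, upper = bland_altman(algorithm, reference)
print(f"bias = {bias:.2f} y, limits of agreement = [{lower:.2f}, {upper:.2f}] y")
```

Narrower limits of agreement and fewer points far from the bias line are what "better agreement" means in such a plot.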

So what is the clinical significance of the difference in performance? The authors addressed this by defining the clinically acceptable limit of agreement as ±1 year, and they found twice as many deviations beyond this limit for Panda (14) as for BoneXpert (7).

The table below summarises all the findings.

The poster is available through myESR (login required).

Table: The deviation between the bone age algorithm and the radiologist

Metric                        BoneXpert   Panda
Mean Absolute Deviation       0.36 y      0.42 y
Root Mean Square Deviation    0.47 y      0.55 y
Number of deviations > 1 y    7           14
Largest deviation             1.2 y       1.9 y
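As a rough guide to how the four rows of the table relate to the underlying ratings, the sketch below computes them from paired bone ages. It assumes each deviation is the algorithm's bone age minus the radiologist's and that "> 1 y" refers to the absolute deviation; the function name and the sample values are hypothetical and are not taken from the study.

```python
from math import sqrt

def deviation_summary(algorithm_ages, reference_ages, limit=1.0):
    """Summarise algorithm-vs-reference deviations (all values in years)."""
    deviations = [a - r for a, r in zip(algorithm_ages, reference_ages)]
    abs_dev = [abs(d) for d in deviations]
    n = len(deviations)
    return {
        "mad": sum(abs_dev) / n,                            # mean absolute deviation
        "rmsd": sqrt(sum(d * d for d in deviations) / n),   # root mean square deviation
        "n_over_limit": sum(1 for d in abs_dev if d > limit),
        "largest": max(abs_dev),
    }

# Made-up illustrative values, NOT data from the study.
reference_ages = [8.0, 10.0, 12.0, 14.0]
algorithm_ages = [8.5, 9.5, 13.5, 14.0]

summary = deviation_summary(algorithm_ages, reference_ages)
print(summary)
```

The root mean square deviation weights large errors more heavily than the mean absolute deviation, so a method with a few large outliers shows a wider gap between the two figures.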