|Mean Absolute Deviation (MAD) (months)||Root Mean Square Error (RMSE) (years)|
You may wonder how best to define the true bone age rating of a radiograph. The only meaningful way we know of is to define the true rating as the average of a very large number of human ratings of the image.
The reference ratings in this bone age rating test are formed as the average of the ratings by 46 radiologists. This number is large enough that the reference rating is very close to the true rating: the mean absolute difference between the reference ratings and the true ratings is only about 0.9 months (5.8 months divided by the square root of 46).
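The arithmetic behind the 0.9 months figure can be checked in a few lines. This is only a sketch: it assumes the raters make independent errors, so that the error of their average shrinks with the square root of the number of raters. The helper name `error_of_average` is ours, and 5.8 months is the single-rater figure quoted in the text.

```python
import math

def error_of_average(single_rater_error, n_raters):
    """Error of the mean of n independent ratings (hypothetical helper).

    Assumes independent rater errors, so the error of the average
    falls as 1 / sqrt(n_raters).
    """
    return single_rater_error / math.sqrt(n_raters)

# 5.8 months and 46 raters are the numbers quoted in the text.
reference_error = error_of_average(5.8, 46)
print(round(reference_error, 1))  # 0.9
```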
The above estimate of your accuracy is based on only 10 observations, so please don’t take it as a scientific evaluation of your actual rating accuracy. But we hope that this test has given you a feeling for how difficult it is for a human to rate bone age accurately.
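For readers who want to see how the two accuracy measures in the table header are computed, here is a minimal sketch. It assumes the usual definitions of mean absolute deviation and root mean square error; the function names and the four bone ages are made up for illustration and are not data from this test.

```python
import math

def mean_absolute_deviation(ratings, references):
    """Average of the absolute differences between ratings and references."""
    diffs = [abs(r - t) for r, t in zip(ratings, references)]
    return sum(diffs) / len(diffs)

def root_mean_square_error(ratings, references):
    """Square root of the average squared difference."""
    squared = [(r - t) ** 2 for r, t in zip(ratings, references)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical bone ages in years (illustrative values only).
reference = [10.0, 12.0, 8.0, 14.0]
yours = [10.5, 11.0, 8.0, 14.5]

mad_months = 12 * mean_absolute_deviation(yours, reference)  # years to months
rmse_years = root_mean_square_error(yours, reference)

print(f"MAD:  {mad_months:.1f} months")
print(f"RMSE: {rmse_years:.2f} years")
```

With so few images, one large miss moves both numbers a lot, and the RMSE more than the MAD because it squares each difference.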
One way to reduce rater variability is to ask four people to rate each image and then take the average. This brings the error down by a factor of about 2 (the square root of 4). Another solution is to use BoneXpert, which has approximately the same accuracy as the average of four ratings.
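The factor of about 2 can be illustrated with a small simulation. This sketch assumes that rater errors are independent and normally distributed, which is a simplification. The spread `SIGMA`, the number of simulated images, and the function name are arbitrary choices made for the example.

```python
import random
import statistics

random.seed(0)      # fixed seed so the run is repeatable
SIGMA = 7.0         # assumed spread of a single rater's error, in months
N_IMAGES = 20000    # number of simulated images

def mad_of_averaged_ratings(n_raters):
    """Mean absolute error when n_raters ratings per image are averaged."""
    abs_errors = []
    for _ in range(N_IMAGES):
        rater_errors = [random.gauss(0, SIGMA) for _ in range(n_raters)]
        abs_errors.append(abs(statistics.fmean(rater_errors)))
    return statistics.fmean(abs_errors)

single = mad_of_averaged_ratings(1)
four = mad_of_averaged_ratings(4)
ratio = single / four

print(f"one rater:   {single:.2f} months")
print(f"four raters: {four:.2f} months")
print(f"ratio:       {ratio:.2f}")
```

Under these assumptions the ratio comes out close to 2. Real raters may share biases, in which case averaging would help less than the simulation suggests.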
Table of the ten images with reference ratings (average of 46 Dutch radiologists), BoneXpert version 3 ratings, and your ratings.