Langenbucher, Achim; Szentmáry, Nóra; Cayless, Alan; Wendelstein, Jascha and Hoffmann, Peter
(2023).
DOI: https://doi.org/10.1080/02713683.2022.2108457
Abstract
Purpose
Bootstrapping is a modern technique widely used in statistics to evaluate the performance of model parameters. The purpose of this study was to develop a strategy to identify and eliminate outliers in a dataset used for optimizing formula constants for lens power calculation.
Methods
In a dataset of N = 888 clinical cases treated with a monofocal aspherical intraocular lens (XC1/XY1, Hoya), the constants for the SRKT, Haigis and Castrop formulae were optimized and the prediction error (PE) was calculated. The PE was bootstrapped NB = 1000 times, and the mean and trimmed mean of the bootstrapped PE were derived to generate the Bootlier plot, which shows the probability density function of the mean minus the trimmed mean. In the presence of outliers this Bootlier plot shows some multimodality, and a Bootlier Index was extracted as a measure of multimodality. Outliers were removed from the tails of the PE distribution in a stepwise fashion until the Bootlier Index fell below a threshold of 0.001.
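The bootstrapped statistic behind the Bootlier plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `bootlier_statistic`, the default trimming fraction, the synthetic prediction errors and the three artificial outliers are all hypothetical. The abstract does not define how the Bootlier Index is computed, so the sketch stops at the distribution of the mean minus the trimmed mean; a histogram or kernel density estimate of the returned values would give the Bootlier plot.

```python
import numpy as np

def bootlier_statistic(pe, n_boot=1000, trim=0.05, seed=0):
    """Bootstrap the 'mean minus trimmed mean' of the prediction errors.

    trim is the fraction of cases cut from EACH tail (assumed parameter).
    """
    rng = np.random.default_rng(seed)
    pe = np.asarray(pe, dtype=float)
    n = pe.size
    k = int(np.floor(trim * n))  # number of cases trimmed per tail
    # n_boot resamples with replacement, each of the original size n
    idx = rng.integers(0, n, size=(n_boot, n))
    samples = np.sort(pe[idx], axis=1)
    mean = samples.mean(axis=1)
    trimmed_mean = samples[:, k:n - k].mean(axis=1)
    return mean - trimmed_mean

# Synthetic example data (NOT from the study): 888 well-behaved errors,
# then the same data with three artificial gross outliers appended.
rng = np.random.default_rng(1)
clean = rng.normal(0.0, 0.4, size=888)
contaminated = np.concatenate([clean, [4.0, 4.5, 5.0]])

stat_clean = bootlier_statistic(clean)
stat_out = bootlier_statistic(contaminated)
print(stat_clean.std(), stat_out.std())
```

Each bootstrap sample contains a random number of the gross outliers, which shift the mean far more than the trimmed mean. The statistic therefore clusters by outlier count, and this is the multimodality that the Bootlier plot makes visible.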
Results
With the entire dataset, the mean/SD/median/mean absolute/root mean squared PE using the optimized formula constants were -0.0045/0.4415/0.0134/0.3406/0.4412 dpt with SRKT, 0.0065/0.3711/-0.0056/0.2830/0.3710 dpt with Haigis, and 0.0034/0.3452/0.0023/0.2683/0.3451 dpt with Castrop. After identifying and removing outliers, the respective metrics for the PE were -0.0036/0.4028/0.0134/0.3205/0.4026 dpt with SRKT (13 cases removed), 0.0050/0.3375/-0.0056/0.2656/0.3373 dpt with Haigis (11 cases removed), and 0.0035/0.3168/0.0023/0.2531/0.3166 dpt with Castrop (11 cases removed). For trimming both tails of the PE distribution by ⅛, ¼, ½, 1, 2.5, and 5%, respectively, the multimodality in the Bootlier plots was 0/0.1567/0.0587/0.0258/0.0007/0 with SRKT, 0/0.0981/0.0261/0.0202/0.0003/0 with Haigis, and 0.0006/0.0006/0.0161/0.0191/0.0005/0 with Castrop for the entire dataset, and was reduced to values below 0.001 after outlier removal.
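For reference, the five summary metrics reported for the PE can be computed as in the following minimal sketch. The function name `pe_metrics` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the abstract does not say which SD estimator was used.

```python
import numpy as np

def pe_metrics(pe):
    """Mean, SD, median, mean absolute and root mean squared prediction error."""
    pe = np.asarray(pe, dtype=float)
    return {
        "mean": pe.mean(),
        "sd": pe.std(ddof=1),            # sample SD (assumed estimator)
        "median": np.median(pe),
        "mae": np.abs(pe).mean(),        # mean absolute PE
        "rms": np.sqrt(np.mean(pe**2)),  # root mean squared PE
    }

# Toy input (not study data), in dioptres
metrics = pe_metrics([-1.0, 0.0, 1.0])
print(metrics)
```

When the mean PE is close to zero, as it is after constant optimization, the SD and the root mean squared PE nearly coincide, which matches the pattern in the values reported above.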
Conclusion
We were able to prove that bootstrapping with outlier identification based on Bootlier plots and the Bootlier Index is a powerful tool to clean a dataset of outliers for formula constant optimization.