Automated identification of miscoded and misclassified cases of diabetes from computer records
Sadek AR., Van Vlymen J., Khunti K., de Lusignan S.
Aims To develop a computer-processable algorithm, capable of running automated searches of routine data, that flags miscoded and misclassified cases of diabetes for subsequent clinical review.
Method Anonymized computer data from the Quality Improvement in Chronic Kidney Disease (QICKD) trial (n=942031) were analysed using a binary method to assess the accuracy of data on diabetes diagnosis. Diagnostic codes were processed and stratified into definite, probable and possible diagnoses of Type 1 or Type 2 diabetes. Diagnostic accuracy was improved by using prescription compatibility and temporally sequenced anthropometric and biochemical data. Bayesian false detection rate analysis was used to compare the findings with those of an entirely independent and more complex manual sort of the first-round QICKD study data (n=760588).
Results The prevalences of a definite diagnosis of Type 1 diabetes and Type 2 diabetes were 0.32% and 3.27%, respectively, when using the binary search method. Up to 35% of Type 1 diabetes cases and 0.1% of Type 2 diabetes cases were miscoded or misclassified on the basis of age/BMI and coding. False detection rate analysis demonstrated a close correlation between the new method and the published hand-crafted sort. Both methods had their highest false detection rate values when coding, therapeutic, anthropometric and biochemical filters were used (up to 90% for the new and 75% for the hand-crafted search method).
Conclusions A simple computerized algorithm achieves results very similar to those of more complex search strategies in identifying miscoded and misclassified cases of both Type 1 diabetes and Type 2 diabetes. It has the potential to be used as an automated audit instrument to improve the quality of diabetes diagnosis.
© 2011 The Authors. Diabetic Medicine © 2011 Diabetes UK.
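The kind of rule-based check the abstract describes, where a coded diabetes type is compared against prescription and age/BMI data, might be sketched as follows in Python. This is an illustration only: the record fields, the `flag_for_review` function and every threshold (age 35 at diagnosis, BMI 30) are hypothetical assumptions and are not taken from the QICKD algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's routine data."""
    coded_type: str          # "T1" or "T2", as recorded in the diagnostic code
    age_at_diagnosis: float  # years
    bmi: float               # kg/m^2, most recent value
    on_insulin: bool         # any insulin prescription on record
    on_oral_agents: bool     # any oral antidiabetic prescription on record


# Illustrative cut-offs only (assumptions, not values from the study).
ADULT_ONSET_AGE = 35.0
OBESE_BMI = 30.0


def flag_for_review(rec: PatientRecord) -> List[str]:
    """Return the reasons, if any, why a record deserves clinical review."""
    reasons: List[str] = []
    if rec.coded_type == "T1":
        # Type 1 diabetes is treated with insulin; no insulin is incompatible.
        if not rec.on_insulin:
            reasons.append("Type 1 code without insulin prescription")
        # Late onset combined with obesity is atypical for Type 1.
        if rec.age_at_diagnosis >= ADULT_ONSET_AGE and rec.bmi >= OBESE_BMI:
            reasons.append("age/BMI atypical for Type 1")
    elif rec.coded_type == "T2":
        # Early onset on insulin alone may indicate misclassified Type 1.
        if (rec.age_at_diagnosis < ADULT_ONSET_AGE
                and rec.on_insulin and not rec.on_oral_agents):
            reasons.append("young, insulin-only patient coded as Type 2")
    else:
        reasons.append("unrecognised diabetes type code")
    return reasons
```

A record returning an empty list passes every check; any non-empty list would be queued for the clinical review the abstract mentions, rather than being corrected automatically.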