Can Researchers Assess the Suitability of Datasets to Answer Their Research Questions, with Access to Metadata Only?
Tilston G., Williams R., Griffiths E., Al-Adely S., Lawson-Tovey S., Hulme W., Short A., Davies J., Welch J., Peek N.
Health research increasingly requires effective ways to identify existing datasets and assess their suitability for research. We sought to test whether researchers could use an existing metadata catalogue to assess the suitability of datasets for addressing specified research questions. Five datasets were described in the National Institute for Health Research Health Informatics Collaborative metadata catalogue, and five research questions were formulated for each dataset, some answerable with that dataset and others not. Each of thirteen researchers was assigned two randomly selected datasets and assessed whether the ten associated questions were answerable with those datasets. After removing instances where participants misunderstood the question or lacked the subject-matter knowledge needed to make the assessment, we found that 87 out of 109 assessments (80%) were correct. Participants particularly struggled with one dataset, which consisted of electronic health record (EHR) data. The most common reason for incorrect assessments was an inability to find the relevant information in the metadata catalogue.