Cohen et al. [Cohen 2010] adopted the test suite methodology from software engineering and proposed a stratified approach to data sampling based on several criteria. Each criterion focuses on a set of concepts that share a particular property, such as length in tokens, presence of punctuation, or coordination. The result is a framework that characterizes the strengths of the linguistic patterns used within each concept recognition system and, moreover, a platform that can be applied and shared to perform standardized error analysis.
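To make the stratification idea concrete, the following minimal sketch groups concept labels into test suites according to the three properties named above. It is an illustration under stated assumptions, not the procedure of Cohen et al.: the function names, the criterion names, the length threshold, and the rules used to detect punctuation and coordination are all hypothetical, and the example labels were chosen for illustration rather than taken from any test suite.

```python
import re
from collections import defaultdict

# Hypothetical coordination markers; a real criterion may be defined differently.
COORDINATORS = {"and", "or"}


def length_bucket(tokens):
    """Name a length criterion; labels of 5 or more tokens share one bucket (assumed threshold)."""
    n = len(tokens)
    return f"length_{n}" if n < 5 else "length_5_plus"


def criteria_for(label):
    """Return the illustrative criteria that a concept label satisfies."""
    tokens = label.split()
    matched = [length_bucket(tokens)]
    # Any character that is neither a word character nor whitespace counts as punctuation.
    if re.search(r"[^\w\s]", label):
        matched.append("contains_punctuation")
    if any(token.lower() in COORDINATORS for token in tokens):
        matched.append("contains_coordination")
    return matched


def build_test_suites(labels):
    """Group labels by criterion; a label may appear in several suites."""
    suites = defaultdict(list)
    for label in labels:
        for criterion in criteria_for(label):
            suites[criterion].append(label)
    return dict(suites)


# Illustrative labels only.
labels = [
    "Short stature",
    "Abnormality of the head and neck",
    "Atrial septal defect",
    "2-3 toe syndactyly",
]

suites = build_test_suites(labels)
for criterion in sorted(suites):
    print(criterion, suites[criterion])
```

Because a label can satisfy several criteria at once, the suites overlap by design: each suite isolates one property, so errors made by a concept recognition system can be attributed to that property.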
This framework has been applied to HPO, resulting in 32 manually crafted criteria (or types of test suites) comprising 2,164 entries, where each entry corresponds to the label of an HPO concept. In addition to being structured by type, the test suites are also structured according to the 21 top-level abnormalities present in HPO. The complete list of criteria is given below. The archive comprising all test suites can be downloaded from:
Test suite list: