- In the following examples, FCS 1.0 and 2.0 list mode files as well as a dBASE III database can be classified online to give an impression of how quickly the CLASSIF1 algorithm runs on standard personal computers. The examples concern prognostic risk assessment for individual patients from immunophenotype and clinical measurements:
- In brief, the CLASSIF1 classification algorithm works as follows. Pairs of percentiles, e.g. 5/95%, 10/90%, 15/85% etc., are calculated from the reference samples for each database column of the learning set. The values of all database columns are then transformed into a triple matrix by assigning 0 to values between the percentiles, + to values above the upper percentile and - to values below the lower percentile. A confusion matrix is then established with the known (e.g. clinical) classification of the reference and abnormal samples on the ordinate and the classification by the triple matrix classifier on the abscissa. The classification result is ideal when a 100% recognition value is obtained in each diagonal box of the confusion matrix, which is usually not the case when all database columns are used for classification.
- The optimization process aims to increase the sum of the diagonal box values. Single database columns and combinations of columns are temporarily excluded from the calculations to see whether their absence improves or worsens the classification result. Once no further improvement is achieved, all database columns that improved the result neither alone nor in combination with other columns are permanently excluded from the final classification.
- Triple matrix classifiers are inherently standardized on the group of reference samples during the classification process. Classifiers from different flow cytometers or laboratories can therefore be compared in an instrument- and laboratory-independent way, provided the program detects no differences between the respective reference groups.
- The advantage of this classification strategy is that triple matrix classifiers can be compared numerically during interlaboratory consensus trials, e.g. on leukemia, HIV and thrombocyte classification by immunophenotyping.
- In immunophenotyping or other ring trials, the performance of triple matrix classifiers depends primarily on the precision of the individual laboratory rather than on interlaboratory accuracy. Instrument accuracy cancels out through percentile normalization on the mean values of the reference samples. In many instances, reference samples can be obtained from age- and sex-matched blood donors, who represent a comparatively homogeneous group of persons.
- The relative independence of the classification results from overall accuracy is an advantageous feature of triple matrix classifiers. It is a frequent experience in ring trials that good accuracy is more difficult to achieve than good precision, especially when participants use different instrument types. The reason lies in the high technical complexity of flow cytometers and cell sorters with regard to, e.g., optical signal detection, amplification, thresholding, filter characteristics and data processing. The recent development of various standard bead preparations improves but does not yet generally solve this problem.
- One practical consequence is that rare diseases can be classified at sites where sufficient learning sets cannot be generated in a reasonable time or where costly investigations would be necessary to establish appropriate learning sets.
- Furthermore, the biochemical properties of many body cell systems during disease can be compared directly when the classifiers are standardized on peripheral blood cells. This standardization additionally allows disease-induced changes in blood cells localized in tissue or effusions to be used for classification.
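
The steps described above (percentile thresholds from the reference samples, coding into a triple matrix, scoring by the confusion-matrix diagonal, and temporary exclusion of database columns) can be illustrated with a minimal Python sketch. It is not the CLASSIF1 implementation. All function names, the toy data, and in particular the decision rule in `predict` (a sample counts as abnormal if any selected column lies outside its percentile band) are assumptions made for illustration, and the column search is a simplified backward elimination restricted to single columns and pairs. The last lines also check the standardization property: if a hypothetical second instrument reports every value with a different gain and offset, thresholds refitted on its own reference samples yield the same triple codes.

```python
from itertools import combinations


def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty sequence."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def fit_thresholds(reference_rows, lower=10, upper=90):
    """Per-column (lower, upper) percentile thresholds from the reference samples."""
    return [(percentile(col, lower), percentile(col, upper))
            for col in zip(*reference_rows)]


def triple_code(row, thresholds):
    """Code each value as '-', '0' or '+' relative to its column thresholds."""
    return "".join("-" if v < lo else "+" if v > hi else "0"
                   for v, (lo, hi) in zip(row, thresholds))


def predict(code, columns):
    """ASSUMED rule: 'abnormal' if any selected column is coded '+' or '-'."""
    return "abnormal" if any(code[i] != "0" for i in columns) else "reference"


def diagonal_sum(codes, labels, columns):
    """Sum of per-class recognition rates (%), i.e. the confusion-matrix diagonal."""
    total = 0.0
    for cls in ("reference", "abnormal"):
        members = [c for c, l in zip(codes, labels) if l == cls]
        hits = sum(predict(c, columns) == cls for c in members)
        total += 100.0 * hits / len(members)
    return total


def select_columns(codes, labels, n_columns, max_group=2):
    """Simplified search: drop single columns or small groups of columns
    for as long as doing so increases the diagonal sum."""
    kept = list(range(n_columns))
    best = diagonal_sum(codes, labels, kept)
    improved = True
    while improved and len(kept) > 1:
        improved = False
        for size in range(1, max_group + 1):
            for group in combinations(kept, size):
                trial = [c for c in kept if c not in group]
                if trial and diagonal_sum(codes, labels, trial) > best:
                    kept, best = trial, diagonal_sum(codes, labels, trial)
                    improved = True
                    break
            if improved:
                break
    return kept, best


# Toy learning set (invented): column 0 separates the groups, column 1 does not.
reference = [(10, 5), (11, 0), (12, 1), (13, 2), (14, 3), (15, 4),
             (16, 6), (17, 7), (18, 8), (19, 10), (20, 9)]
abnormal = [(25, 5), (30, 4), (28, 6), (26, 3)]
rows = reference + abnormal
labels = ["reference"] * len(reference) + ["abnormal"] * len(abnormal)

thresholds = fit_thresholds(reference)            # 10/90% percentile pair
codes = [triple_code(r, thresholds) for r in rows]
kept, best = select_columns(codes, labels, n_columns=2)
print(kept, round(best, 2))                       # -> [0] 181.82

# Hypothetical second instrument: same samples, different gain and offset.
rows_b = [tuple(2.0 * v + 3.0 for v in r) for r in rows]
thresholds_b = fit_thresholds(rows_b[:len(reference)])
codes_b = [triple_code(r, thresholds_b) for r in rows_b]
print(codes_b == codes)                           # -> True
```

In this toy data, using both columns flags four of the eleven reference samples as abnormal, while dropping the uninformative column 1 reduces this to two without losing any abnormal sample, so the diagonal sum rises from about 163.6 to about 181.8. A real learning set would normally also try several percentile pairs and keep the one giving the highest diagonal sum.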