This work aims to improve visual understanding of the performance of machine learning models that operate over a corpus of proteins. These classifiers typically predict where certain ligands bind, which aids protein characterization (determining the function of a protein). The visualization combines summary judgments of performance over the corpus with a detailed view that shows performance on the three-dimensional surface of each protein.

Understanding how the performance of a trained classifier varies over a labeled test data set (corpus) can be difficult. The typical strategy of using summary statistics (accuracy, recall, F1 score, etc.) to judge a classifier leaves out critical data needed for a comprehensive analysis of its output. Which proteins does the classifier fail to classify accurately? What performance trends can be seen across the corpus? How does performance relate to protein metadata (e.g., size)? How does classification performance manifest itself on the three-dimensional structure of the proteins in the corpus? Are there any spatial trends that affect classification performance?
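To make the limitation of summary statistics concrete, the following is a minimal sketch, not the platform's implementation. It assumes binary per-residue labels (1 = binding site, 0 = not) stored as hypothetical `(protein_id, true_label, predicted_label)` records; the function names and the toy data are illustrative only. It compares a single corpus-wide F1 score with the F1 score of each protein.

```python
from collections import defaultdict

def f1_score(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when precision and recall are both 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def corpus_f1(records):
    """One summary number pooled over every residue in the corpus."""
    y_true = [t for _, t, _ in records]
    y_pred = [p for _, _, p in records]
    return f1_score(y_true, y_pred)

def per_protein_f1(records):
    """F1 computed separately for each protein in the corpus."""
    grouped = defaultdict(lambda: ([], []))
    for protein_id, t, p in records:
        grouped[protein_id][0].append(t)
        grouped[protein_id][1].append(p)
    return {pid: f1_score(ts, ps) for pid, (ts, ps) in grouped.items()}

# Hypothetical toy corpus: two proteins with four residues each.
TOY_RECORDS = [
    ("protein_A", 1, 1), ("protein_A", 1, 1),
    ("protein_A", 0, 0), ("protein_A", 0, 0),
    ("protein_B", 1, 0), ("protein_B", 1, 0),
    ("protein_B", 0, 1), ("protein_B", 0, 0),
]

if __name__ == "__main__":
    print("corpus F1:", round(corpus_f1(TOY_RECORDS), 3))
    for pid, score in per_protein_f1(TOY_RECORDS).items():
        print(pid, "F1:", round(score, 3))
```

In this toy corpus the pooled F1 is a middling value, while the per-protein view shows that one protein is classified perfectly and the other has no correctly predicted binding residues at all. The pooled number alone would not reveal that split.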

We have developed a visualization platform for viewing the results of protein structural classifiers. It shows performance both across the entire test corpus (200+ proteins) and on the structure of individual proteins.