Tags: santosardr/non-CSPs

v2

I used only fourteen positive validation proteins and 92 negative proteins to produce all the results so far. These fourteen are what remained of the PengaRoo training dataset (157 proteins) after setting aside 141 for training and excluding two more that were too similar (>90%) to those 141. Because this validation set contains only fourteen positive proteins, I will stop training in search of better results. From now on, I will work on the independent dataset used by PengaRoo and ASPIRER to better compare my results with theirs. However, I will note in the manuscript that PengaRoo scored rather poorly on my own independent dataset: it misclassifies 58 of my 92 negative proteins as non-CSPs.
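
As a quick sanity check on the counts in this note, the short sketch below recomputes the size of the positive validation set and the false-positive rate implied for PengaRoo. It assumes that the 157 proteins are all positives and that every negative protein labelled as a non-CSP counts as one false positive. The function and constant names are illustrative only and do not come from the repository.

```python
# Counts quoted in the note; the names are hypothetical, chosen for this sketch.
TOTAL_POSITIVES = 157          # proteins in the PengaRoo training dataset
USED_FOR_TRAINING = 141        # positives kept for training
EXCLUDED_FOR_SIMILARITY = 2    # >90% similar to the training positives
NEGATIVES = 92                 # negative proteins in the personal dataset
PENGAROO_FALSE_POSITIVES = 58  # negatives PengaRoo labelled as non-CSPs


def validation_positives(total: int, training: int, excluded: int) -> int:
    """Positives left for validation after training and similarity filtering."""
    return total - training - excluded


def false_positive_rate(false_positives: int, negatives: int) -> float:
    """Fraction of negative proteins wrongly predicted as positive."""
    return false_positives / negatives


n_validation = validation_positives(
    TOTAL_POSITIVES, USED_FOR_TRAINING, EXCLUDED_FOR_SIMILARITY
)
fpr = false_positive_rate(PENGAROO_FALSE_POSITIVES, NEGATIVES)
specificity = 1.0 - fpr  # assumption: specificity defined as 1 - FPR

print(f"positive validation proteins: {n_validation}")
print(f"false-positive rate on negatives: {fpr:.1%}")
print(f"specificity on negatives: {specificity:.1%}")
```

Under these assumptions the script reports fourteen positive validation proteins, a false-positive rate of about 63% and a specificity of about 37% on the 92 negatives.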