MAESTRO Data sets

DATA SETS

Data sets used in the study and detailed validation results. The respective PDB files are accessible as zip archives in the “Structures” column.

Dataset | Size                           | Structures | Prediction | Validation               | Source/Ref.
SP1     | 2648 single point mutations    | 131        | ΔΔG        | 5-fold cross validation  | Dehouck et al., 2009
SP2     | 350 single point mutations     | 67         | ΔΔG        | performance test         | Dehouck et al., 2009
SP3     | 1925 single point mutations    | 55         | ΔΔG        | 20-fold cross validation | Masso et al., 2008; Capriotti et al., 2005
SP4     | 1765 single point mutations    | 98         | ΔΔG        | 10-fold cross validation | ProTherm DB
MP      | 479 multi point mutations      | 57         | ΔΔG        | 10-fold cross validation | ProTherm DB
SS1     | 75 disulfide bonds             | 75         | S-S bond   | performance test         | Salam et al., 2014
        | minimized structures           | 75         | S-S bond   | performance test         | (Salam et al., 2014)*
SS2     | 15 engineered disulfide bonds  | 13         | S-S bond   | performance test         | Salam et al., 2014

The meaning of the sign of a ΔΔG varies from data set to data set. We define a negative ΔΔG as an increase in the stability of a protein and a positive ΔΔG as a destabilization, and we adapted all data sets to this definition.
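The sign adjustment described above can be illustrated with a minimal sketch. The function name `to_negative_stabilizing` and its flag are hypothetical and not part of MAESTRO; the sketch only assumes that a source data set uses one of the two possible sign conventions and that values following the opposite convention are negated.

```python
def to_negative_stabilizing(ddg_values, source_negative_is_stabilizing):
    """Return ddG values in the convention: negative = stabilizing.

    ddg_values: list of ddG values from one source data set.
    source_negative_is_stabilizing: True if the source already treats
        negative ddG as stabilizing, False if it treats positive ddG
        as stabilizing (hypothetical flag, set per data set by the user).
    """
    # Values in the opposite convention are negated; others pass through.
    sign = 1.0 if source_negative_is_stabilizing else -1.0
    return [sign * value for value in ddg_values]


# Example: a source that reports stabilizing mutations as positive ddG.
converted = to_negative_stabilizing([1.5, -0.5], source_negative_is_stabilizing=False)
```

Whether a given source needs the flip has to be checked against that source's own documentation; the sketch does not infer it from the data.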

*) The models (minimized structures) used in Salam et al., 2014 were not available. Therefore, we applied a comparable procedure and provide the resulting structures here.