The INPS and INPS-3D methods are described in the following articles:
Both INPS and INPS-3D adopt a Support Vector Regression (SVR) approach. In INPS, the SVR is trained on features extracted from the protein primary sequence, including:
INPS-3D extends the feature set by also including information extracted from the protein 3D structure:
INPS and INPS-3D were trained and tested on a dataset of 2648 variants occurring in 132 proteins (see Datasets). This dataset (S2648) was originally extracted from the ProTherm database and curated by the authors of the PoPMuSiC algorithm (Dehouck et al., 2009).
Five-fold cross-validation was performed on the S2648 dataset. The cross-validation split was computed at the protein level: all variations occurring in the same protein were placed in the same testing set, so that no protein contributes to both the training and the testing data of a given fold, avoiding bias between the two.
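The protein-level split can be illustrated with a minimal Python sketch. This is not the procedure used by the INPS authors, whose exact fold assignment is not described here. It assumes a simple greedy balancing strategy, and the function names (`protein_level_folds`, `train_test_indices`) and the protein identifiers are hypothetical.

```python
from collections import Counter


def protein_level_folds(protein_ids, n_folds=5):
    """Assign every variant to a fold so that all variants of one protein
    share the same fold.

    protein_ids: list with one protein identifier per variant.
    Returns a list of fold indices (0 .. n_folds - 1) aligned with protein_ids.
    """
    counts = Counter(protein_ids)
    fold_sizes = [0] * n_folds
    fold_of_protein = {}
    # Assumption: greedy balancing. Proteins with the most variants are placed
    # first, each into the currently smallest fold (ties broken by protein id
    # so the result is deterministic).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for protein, n_variants in ordered:
        target = fold_sizes.index(min(fold_sizes))
        fold_of_protein[protein] = target
        fold_sizes[target] += n_variants
    return [fold_of_protein[p] for p in protein_ids]


def train_test_indices(folds, test_fold):
    """Return (train, test) variant indices for the given test fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


# Toy data: one (hypothetical) protein identifier per variant.
proteins = ["P1", "P1", "P2", "P3", "P3", "P3", "P4", "P5", "P5", "P6"]
folds = protein_level_folds(proteins, n_folds=5)

for k in range(5):
    train, test = train_test_indices(folds, k)
    train_proteins = {proteins[i] for i in train}
    test_proteins = {proteins[i] for i in test}
    # No protein appears on both sides of the split.
    assert train_proteins.isdisjoint(test_proteins)
```

In each round, a regressor would be fitted on the `train` indices and evaluated on the `test` indices. Because the split is made over proteins rather than over individual variants, the model is always evaluated on proteins it has not seen during training.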
For the sake of comparison with other approaches, an additional dataset was used, comprising 42 variations within the DNA-binding domain of the tumor suppressor protein p53 whose thermodynamic effects have previously been characterized experimentally (P53 dataset).
References