ISPRED-SEQ
A deep-learning-based method for predicting interaction sites from protein sequence.
Part of the Bioinformatics Sweeties collection.
Datasets
Both the training set and the blind test sets are provided in a FASTA-like format in which each protein occupies three lines: i) the '>' character followed by the protein ID; ii) the residue sequence; iii) the class of each residue (1 for interaction site, 0 otherwise).
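The three-line layout described above can be read with a few lines of Python. This is a minimal sketch, not code shipped with ISPRED-SEQ: the function name `parse_records` is hypothetical, and it assumes the class line holds one 0/1 character per residue (optionally separated by whitespace) and that blank lines can be ignored. The protein IDs and sequences in the usage lines are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def parse_records(lines: Iterable[str]) -> Dict[str, Tuple[str, List[int]]]:
    """Parse three-line records: '>ID', residue sequence, per-residue classes.

    Assumption (not confirmed by the dataset description): classes are
    written as 0/1 characters, one per residue, possibly space-separated.
    """
    # Ignore blank lines and strip trailing newlines.
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError("expected a multiple of three non-empty lines")

    records: Dict[str, Tuple[str, List[int]]] = {}
    for i in range(0, len(cleaned), 3):
        header, sequence, classes = cleaned[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"record header must start with '>': {header!r}")
        # Remove any whitespace between class characters.
        classes = "".join(classes.split())
        if set(classes) - {"0", "1"}:
            raise ValueError(f"classes must be 0 or 1: {header!r}")
        if len(classes) != len(sequence):
            raise ValueError(f"sequence/class length mismatch: {header!r}")
        records[header[1:].strip()] = (sequence, [int(c) for c in classes])
    return records


# Usage with made-up records (IDs and sequences are illustrative only).
example = [">protA", "MKTAY", "00110", "", ">protB", "ACD", "0 1 0"]
parsed = parse_records(example)
sequence, classes = parsed["protA"]
n_sites = sum(classes)  # number of residues annotated as interaction sites
```

To read a real file, pass an open file handle instead of the list, for example `parse_records(open(path))`, where `path` points to one of the dataset files.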
Inside the ISPRED_Blind_Test_Sets folder you will find all the blind sets used for benchmarking (Dset335, Dset448, Hetero_TE, Homo_TE).
Inside the cross_validation folder you will find 10 text files (split0-split9), each containing the protein IDs belonging to the corresponding subset used in cross-validation.
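The split files can be turned into train/validation folds along the following lines. This is a sketch under stated assumptions, not the authors' training code: the helper names (`read_ids`, `load_splits`, `iter_folds`) are hypothetical, each file is assumed to list one protein ID per line, the files are assumed to be named exactly split0 to split9 (adjust the name pattern if they carry an extension), and each split is assumed to be held out once while the other nine are used for training.

```python
from pathlib import Path
from typing import Iterator, List, Sequence, Tuple


def read_ids(path: Path) -> List[str]:
    """Read protein IDs from a split file (assumed: one ID per line)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def load_splits(folder: str, n_splits: int = 10) -> List[List[str]]:
    """Load split0 .. split{n_splits - 1} from the given folder.

    The bare file names are an assumption; change the f-string below
    if the files have an extension such as '.txt'.
    """
    base = Path(folder)
    return [read_ids(base / f"split{i}") for i in range(n_splits)]


def iter_folds(
    splits: Sequence[Sequence[str]],
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train_ids, held_out_ids), holding out each split in turn."""
    for k, held_out in enumerate(splits):
        train = [pid for j, split in enumerate(splits) if j != k for pid in split]
        yield train, list(held_out)


# Usage with made-up IDs standing in for the contents of three split files.
toy_splits = [["p1", "p2"], ["p3"], ["p4", "p5"]]
folds = list(iter_folds(toy_splits))
```

With the real data, `iter_folds(load_splits("cross_validation"))` would yield ten folds, and the IDs in each fold can be used to select records from the parsed training set.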