protein classification algorithm
Go to external page http://scai.fraunhofer.de/HuPSON#SCAIVPH_00000567
Term information
Many approaches have been presented for the protein classification problem, including methods based on pairwise similarity of sequences [1, 2, 3], profiles for protein families [4], consensus patterns using motifs [5, 6] and hidden Markov models [7, 8, 9]. Most of these methods are generative approaches: the methodology involves building a model for a single protein family and then evaluating each candidate sequence to see how well it fits the model. If the "fit" is above some threshold, then the protein is classified as belonging to the family. Discriminative approaches [10, 11, 12] take a different point of view: protein sequences are seen as a set of labeled examples (positive if they are in the family and negative otherwise), and a learning algorithm attempts to learn the distinction between the different classes. Both positive and negative examples are used in training for a discriminative approach, while generative approaches can only make use of positive training examples.
Source: Leslie C, Eskin E, Noble WS. The spectrum kernel: a string kernel for SVM protein classification. Pac Symp Biocomput. 2002:564-75. PMID: 11928508. http://psb.stanford.edu/psb-online/proceedings/psb02/leslie.pdf
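The contrast between the two families of methods can be illustrated with a minimal Python sketch. It is not the method of the cited paper: a smoothed k-mer frequency model stands in for a generative family model (in place of a profile or hidden Markov model), and a simple perceptron over k-mer counts stands in for a discriminative learner (in place of an SVM). All function names, the toy sequences, the choice k = 2, and the threshold of -5.0 are assumptions made for illustration only.

```python
import math
from collections import Counter


def kmers(seq, k=2):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Generative sketch: the model is built from positive examples only.
def fit_family_model(positives, k=2, alphabet_size=20):
    """Return a function giving the smoothed log-probability of a k-mer."""
    counts = Counter(km for s in positives for km in kmers(s, k))
    total = sum(counts.values())
    vocab = alphabet_size ** k  # number of possible k-mers (assumed alphabet)

    def log_prob(km):
        # Add-one smoothing so unseen k-mers get a small nonzero probability.
        return math.log((counts[km] + 1) / (total + vocab))

    return log_prob


def fit_score(log_prob, seq, k=2):
    """Average log-probability of the sequence's k-mers under the model."""
    kms = kmers(seq, k)
    if not kms:
        return float("-inf")  # sequence shorter than k: no evidence of fit
    return sum(log_prob(km) for km in kms) / len(kms)


def generative_classify(log_prob, seq, threshold, k=2):
    """Assign the sequence to the family if its fit exceeds the threshold."""
    return fit_score(log_prob, seq, k) > threshold


# Discriminative sketch: training uses positive and negative examples.
def train_perceptron(examples, k=2, epochs=20):
    """examples: list of (sequence, label) pairs with label +1 or -1."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for seq, label in examples:
            feats = Counter(kmers(seq, k))
            score = bias + sum(weights[f] * c for f, c in feats.items())
            if label * score <= 0:  # misclassified: move the boundary
                for f, c in feats.items():
                    weights[f] += label * c
                bias += label
    return weights, bias


def discriminative_classify(model, seq, k=2):
    """Predict family membership from the learned linear decision function."""
    weights, bias = model
    feats = Counter(kmers(seq, k))
    return bias + sum(weights[f] * c for f, c in feats.items()) > 0


# Toy data (hypothetical sequences, not real protein families).
positives = ["ACDACDACD", "CDACDACDA"]
negatives = ["WYWYWYWY", "YWYWYWYW"]

family_model = fit_family_model(positives)
linear_model = train_perceptron(
    [(s, +1) for s in positives] + [(s, -1) for s in negatives]
)
```

In this sketch, `generative_classify(family_model, "ACDACD", threshold=-5.0)` accepts the candidate because its k-mers are frequent in the family model, while `discriminative_classify(linear_model, "WYWYWY")` rejects a candidate because its k-mers received negative weights from the negative training examples. The generative model never sees the negative sequences at all.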