Chapter title | Homology-Based Annotation of Large Protein Datasets. |
---|---|
Chapter number | 8 |
Book title | Data Mining Techniques for the Life Sciences |
Published in | Methods in molecular biology, January 2016 |
DOI | 10.1007/978-1-4939-3572-7_8 |
Pubmed ID | |
Book ISBNs | 978-1-4939-3570-3, 978-1-4939-3572-7 |
Authors |
Marco Punta, Jaina Mistry |
Editors |
Oliviero Carugo, Frank Eisenhaber |
Abstract | Advances in DNA sequencing technologies have led to an increasing amount of protein sequence data being generated. Only a small fraction of this protein sequence data will have experimental annotation associated with it. Here, we describe a protocol for in silico homology-based annotation of large protein datasets that makes extensive use of manually curated collections of protein families. We focus on annotations provided by the Pfam database and suggest ways to identify family outliers and family variations. This protocol may be useful to people who are new to protein data analysis, or who are unfamiliar with the computational tools that are currently available. |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 7 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Bachelor | 2 | 29% |
Researcher | 2 | 29% |
Student > Doctoral Student | 1 | 14% |
Student > Ph. D. Student | 1 | 14% |
Lecturer | 1 | 14% |
Other | 0 | 0% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 3 | 43% |
Computer Science | 2 | 29% |
Agricultural and Biological Sciences | 2 | 29% |