Systematic Exploration of an Efficient Amino Acid Substitution Matrix: MIQS.
Data Mining Techniques for the Life Sciences
Methods in molecular biology, January 2016
Kentaro Tomii, Kazunori Yamada
Oliviero Carugo, Frank Eisenhaber
Comparing amino acid sequences to find similarities between proteins is a fundamental sequence analysis for inferring protein structure and function. In this study, we improve amino acid substitution matrices to identify distantly related proteins. We systematically sampled and benchmarked substitution matrices generated from a principal component analysis (PCA) subspace derived from a set of typical existing matrices. Based on the benchmark results, we identified a region of highly sensitive matrices in the PCA subspace using kernel density estimation (KDE). Using the PCA subspace, we were able to deduce a novel sensitive matrix, called MIQS, which detects distantly related proteins better than existing matrices. This approach to deriving an efficient amino acid substitution matrix might influence many fields of protein sequence analysis. MIQS is available at http://csas.cbrc.jp/Ssearch/.