
Data Mining for Systems Biology

Overview of attention for book

Table of Contents

  Book Overview
  Chapter 1 Identifying Bacterial Strains from Sequencing Data
  Chapter 2 MetaVW: Large-Scale Machine Learning for Metagenomics Sequence Classification
  Chapter 3 Online Interactive Microbial Classification and Geospatial Distributional Analysis Using BioAtlas
  Chapter 4 Generative Models for Quantification of DNA Modifications
  Chapter 5 DiMmer: Discovery of Differentially Methylated Regions in Epigenome-Wide Association Study (EWAS) Data
  Chapter 6 Implementing a Transcription Factor Interaction Prediction System Using the GenoMetric Query Language
  Chapter 7 Multiple Testing Tool to Detect Combinatorial Effects in Biology
  Chapter 8 SiBIC: A Tool for Generating a Network of Biclusters Captured by Maximal Frequent Itemset Mining
  Chapter 9 Computing and Visualizing Gene Function Similarity and Coherence with NaviGO
  Chapter 10 Analyzing Glycan-Binding Profiles Using Weighted Multiple Alignment of Trees
  Chapter 11 Analysis of Fluxomic Experiments with Principal Metabolic Flux Mode Analysis
  Chapter 12 Analyzing Tandem Mass Spectra Using the DRIP Toolkit: Training, Searching, and Post-Processing
  Chapter 13 Sparse Modeling to Analyze Drug–Target Interaction Networks
  Chapter 14 DrugE-Rank: Predicting Drug-Target Interactions by Learning to Rank
  Chapter 15 MeSHLabeler and DeepMeSH: Recent Progress in Large-Scale MeSH Indexing
  Chapter 16 Disease Gene Classification with Metagraph Representations
  Chapter 17 Inferring Antimicrobial Resistance from Pathogen Genomes in KEGG
Attention for Chapter 16: Disease Gene Classification with Metagraph Representations

Citations

1 Dimensions

Readers on

9 Mendeley
Chapter title: Disease Gene Classification with Metagraph Representations
Chapter number: 16
Book title: Data Mining for Systems Biology
Published in: Methods in Molecular Biology, January 2018
DOI: 10.1007/978-1-4939-8561-6_16
Pubmed ID
Book ISBNs: 978-1-4939-8560-9, 978-1-4939-8561-6
Authors

Sezin Kircali Ata, Yuan Fang, Min Wu, Xiao-Li Li, Xiaokui Xiao

Abstract

This chapter exploits network-based representations of proteins, called metagraphs, in protein-protein interaction networks to identify candidate disease-causing proteins. Protein-protein interaction (PPI) networks are effective tools for studying the functional roles of proteins in the development of various diseases. However, they are insufficient without additional biological knowledge about proteins, such as their molecular functions and biological processes. To enhance PPI networks, we also utilize the biological properties of individual proteins. More specifically, we integrate keywords from the UniProt database that describe protein properties into the PPI network and construct a novel heterogeneous PPI-Keyword (PPIK) network consisting of both proteins and keywords. As proteins with similar functional duties or involved in the same metabolic pathway tend to have similar topological characteristics, we propose to represent them with metagraphs. Compared to the traditional network motif or subgraph, a metagraph can capture the topological arrangements through not only protein-protein interactions but also protein-keyword associations. We feed these novel metagraph representations into classifiers for disease protein prediction and conduct our experiments on three different PPI databases. The experiments show that the proposed method consistently increases disease protein prediction performance across various classifiers, by 15.3% in AUC on average. It outperforms the diffusion-based (e.g., RWR) and the module-based baselines by 13.8-32.9% in overall disease protein prediction. In breast cancer protein prediction, it outperforms RWR, PRINCE, and the module-based baselines by 6.6-14.2%. Finally, our predictions also exhibit better correlations with literature findings from the PubMed database.
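
As a rough illustration of the idea in the abstract, the sketch below builds a tiny PPI-Keyword network and turns each protein into a vector of metagraph counts. It is not the authors' implementation: the toy proteins and keywords, the function names `build_ppik` and `metagraph_features`, and the three patterns M1 to M3 are all assumptions made up for this example, and the chapter's actual metagraph set is not specified here.

```python
from collections import defaultdict
from itertools import combinations

# Toy data, invented for illustration only (not taken from the chapter).
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
protein_keywords = {
    "P1": {"Kinase", "Apoptosis"},
    "P2": {"Kinase"},
    "P3": {"Apoptosis", "Kinase"},
    "P4": {"Transport"},
}


def build_ppik(ppi_edges, protein_keywords):
    """Return adjacency sets for protein-protein and protein-keyword links."""
    pp = defaultdict(set)
    for a, b in ppi_edges:
        pp[a].add(b)
        pp[b].add(a)
    pk = {p: set(kws) for p, kws in protein_keywords.items()}
    return pp, pk


def metagraph_features(protein, pp, pk):
    """Count instances of three hypothetical metagraph patterns around `protein`."""
    neighbors = pp.get(protein, set())
    own_kws = pk.get(protein, set())
    # M1: the protein interacts with another protein.
    m1 = len(neighbors)
    # M2: two interacting proteins that also share a keyword.
    m2 = sum(len(own_kws & pk.get(n, set())) for n in neighbors)
    # M3: three mutually interacting proteins that all share one keyword.
    m3 = 0
    for a, b in combinations(sorted(neighbors), 2):
        if b in pp[a]:
            m3 += len(own_kws & pk.get(a, set()) & pk.get(b, set()))
    return [m1, m2, m3]


pp, pk = build_ppik(ppi_edges, protein_keywords)
features = {p: metagraph_features(p, pp, pk) for p in protein_keywords}
```

Each entry of `features` is a fixed-length vector, so in principle it could be passed to any standard classifier together with known disease labels, which mirrors the "feed metagraph representations into classifiers" step at a very small scale.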

Mendeley readers

The data shown below were compiled from readership statistics for 9 Mendeley readers of this research output.

Geographical breakdown

| Country | Count | As % |
|---|---|---|
| Unknown | 9 | 100% |

Demographic breakdown

| Readers by professional status | Count | As % |
|---|---|---|
| Researcher | 2 | 22% |
| Student > Ph. D. Student | 1 | 11% |
| Professor | 1 | 11% |
| Student > Master | 1 | 11% |
| Student > Postgraduate | 1 | 11% |
| Other | 0 | 0% |
| Unknown | 3 | 33% |

| Readers by discipline | Count | As % |
|---|---|---|
| Computer Science | 2 | 22% |
| Mathematics | 1 | 11% |
| Biochemistry, Genetics and Molecular Biology | 1 | 11% |
| Agricultural and Biological Sciences | 1 | 11% |
| Materials Science | 1 | 11% |
| Other | 0 | 0% |
| Unknown | 3 | 33% |