
Research in Computational Molecular Biology

Overview of attention for book

Table of Contents

  1. Book Overview
  2. Chapter 1 Boosting Alignment Accuracy by Adaptive Local Realignment
  3. Chapter 2 A Concurrent Subtractive Assembly Approach for Identification of Disease Associated Sub-metagenomes
  4. Chapter 3 A Flow Procedure for the Linearization of Genome Sequence Graphs
  5. Chapter 4 Dynamic Alignment-Free and Reference-Free Read Compression
  6. Chapter 5 A Fast Approximate Algorithm for Mapping Long Reads to Large Reference Databases
  7. Chapter 6 Determining the Consistency of Resolved Triplets and Fan Triplets
  8. Chapter 7 Progressive Calibration and Averaging for Tandem Mass Spectrometry Statistical Confidence Estimation: Why Settle for a Single Decoy?
  9. Chapter 8 Resolving Multicopy Duplications de novo Using Polyploid Phasing
  10. Chapter 9 A Bayesian Active Learning Experimental Design for Inferring Signaling Networks
  11. Chapter 10 $$BBK^*$$ (Branch and Bound over $$K^*$$): A Provable and Efficient Ensemble-Based Algorithm to Optimize Stability and Binding Affinity over Large Sequence Spaces
  12. Chapter 11 Superbubbles, Ultrabubbles and Cacti
  13. Chapter 12 EPR-Dictionaries: A Practical and Fast Data Structure for Constant Time Searches in Unidirectional and Bidirectional FM Indices
  14. Chapter 13 A Bayesian Framework for Estimating Cell Type Composition from DNA Methylation Without the Need for Methylation Reference
  15. Chapter 14 Towards Recovering Allele-Specific Cancer Genome Graphs
  16. Chapter 15 Using Stochastic Approximation Techniques to Efficiently Construct Confidence Intervals for Heritability
  17. Chapter 16 Improved Search of Large Transcriptomic Sequencing Databases Using Split Sequence Bloom Trees
  18. Chapter 17 AllSome Sequence Bloom Trees
  19. Chapter 18 Longitudinal Genotype-Phenotype Association Study via Temporal Structure Auto-learning Predictive Model
  20. Chapter 19 Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies
  21. Chapter 20 The Copy-Number Tree Mixture Deconvolution Problem and Applications to Multi-sample Bulk Sequencing Tumor Data
  22. Chapter 21 Quantifying the Impact of Non-coding Variants on Transcription Factor-DNA Binding
  23. Chapter 22 aBayesQR: A Bayesian Method for Reconstruction of Viral Populations Characterized by Low Diversity

Attention for Chapter 8: Resolving Multicopy Duplications de novo Using Polyploid Phasing

Mentioned by
  2 X users
  1 Dimensions

Readers on
  16 Mendeley
Chapter title: Resolving Multicopy Duplications de novo Using Polyploid Phasing
Chapter number: 8
Book title: Research in Computational Molecular Biology
Published in: Research in computational molecular biology : ... Annual International Conference, RECOMB ... : proceedings. RECOMB (Conference : 2005-), May 2017
DOI: 10.1007/978-3-319-56970-3_8
Book ISBNs: 978-3-31-956969-7, 978-3-31-956970-3
Authors: Mark J. Chaisson, Sudipto Mukherjee, Sreeram Kannan, Evan E. Eichler


Although single-molecule sequencing has greatly improved our ability to assemble complex regions of the genome, long segmental duplications remain a challenging frontier in assembly. Segmental duplications are both gene rich and prone to large structural rearrangements, making the resolution of their sequences important in medical and evolutionary studies. Duplicated sequences that are collapsed in mammalian de novo assemblies are rarely identical; after a sequence is duplicated, it begins to acquire paralog-specific variants. In this paper, we study the problem of resolving the variation in multicopy long segmental duplications by developing and applying algorithms for polyploid phasing. We develop two algorithms. The first uses discrete matrix completion to maximize the likelihood of observing the reads given the underlying haplotypes. The second is based on correlation clustering and exploits an assumption, often satisfied in these duplications, that each paralog has a sizable number of paralog-specific variants. We develop a detailed simulation methodology and demonstrate the superior performance of the proposed algorithms on an array of simulated datasets. We measure both the likelihood score and the reconstruction accuracy, i.e., the fraction of reads that are clustered correctly. On both performance metrics, our algorithms outperform existing algorithms on more than 93% of the datasets. While discrete matrix completion performs better on likelihood score, the correlation clustering algorithm performs better on reconstruction accuracy because of the stronger regularization inherent in the algorithm. We also show that our correlation clustering algorithm reconstructs on average 7.0 haplotypes in 10-copy duplication datasets, whereas existing algorithms reconstruct fewer than 1 copy on average.
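To make the clustering idea and the reconstruction-accuracy metric concrete, here is a minimal Python sketch. It is not the chapter's algorithm: it uses a generic pivot heuristic for correlation clustering, and the read representation (a dict from variant site to observed allele), the scoring rule, and all function names are assumptions made for illustration. The accuracy function also assumes that cluster names are arbitrary, so it scores the best one-to-one relabeling of predicted clusters against the true paralogs.

```python
from itertools import permutations


def pair_score(read_a, read_b):
    """Agreements minus disagreements over variant sites covered by both reads.

    Each read is a dict {site: allele}; this encoding is an illustrative assumption.
    """
    shared = read_a.keys() & read_b.keys()
    agree = sum(read_a[s] == read_b[s] for s in shared)
    return agree - (len(shared) - agree)


def pivot_cluster(reads):
    """Generic pivot heuristic for correlation clustering (not the chapter's method).

    Take the first unclustered read as a pivot and group it with every
    still-unclustered read whose pair score with the pivot is positive.
    """
    labels = [None] * len(reads)
    next_label = 0
    for pivot in range(len(reads)):
        if labels[pivot] is not None:
            continue
        labels[pivot] = next_label
        for other in range(pivot + 1, len(reads)):
            if labels[other] is None and pair_score(reads[pivot], reads[other]) > 0:
                labels[other] = next_label
        next_label += 1
    return labels


def reconstruction_accuracy(true_labels, predicted_labels):
    """Fraction of reads clustered correctly under the best cluster relabeling.

    Brute-force over relabelings, so only suitable for a small number of clusters.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        return 0.0
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(predicted_labels))
    unmatched = object()  # placeholder for predicted clusters with no true partner
    slots = true_ids + [unmatched] * max(0, len(pred_ids) - len(true_ids))
    best = 0
    for assignment in permutations(slots, len(pred_ids)):
        mapping = dict(zip(pred_ids, assignment))
        correct = sum(mapping[p] == t for t, p in zip(true_labels, predicted_labels))
        best = max(best, correct)
    return best / len(true_labels)


# Toy example: four reads drawn from two hypothetical paralogs "A" and "B".
reads = [
    {0: 1, 1: 1, 2: 0},
    {1: 1, 2: 0, 3: 0},
    {0: 0, 1: 0, 2: 1},
    {2: 1, 3: 1, 4: 1},
]
predicted = pivot_cluster(reads)
accuracy = reconstruction_accuracy(["A", "A", "B", "B"], predicted)
```

In this toy input the first two reads agree on every site they share and disagree with the last two, which is the situation the paralog-specific-variant assumption describes. Real reads carry sequencing errors and uneven coverage, so a pivot heuristic this simple should be read as an illustration of the objective rather than a usable phasing method.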

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 16 Mendeley readers of this research output.

Geographical breakdown

Country       Count   As %
Netherlands   1       6%
Unknown       15      94%

Demographic breakdown

Readers by professional status                 Count   As %
Researcher                                     5       31%
Student > Bachelor                             4       25%
Student > Ph. D. Student                       3       19%
Professor > Associate Professor                1       6%
Unknown                                        3       19%

Readers by discipline                          Count   As %
Biochemistry, Genetics and Molecular Biology   5       31%
Agricultural and Biological Sciences           4       25%
Computer Science                               2       13%
Veterinary Science and Veterinary Medicine     1       6%
Unknown                                        4       25%