Chapter title | NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads |
---|---|
Chapter number | 4 |
Book title | Infectious Diseases and Nanomedicine III |
Published in | Advances in Experimental Medicine and Biology, January 2018 |
DOI | 10.1007/978-981-10-7572-8_4 |
Pubmed ID | |
Book ISBNs | 978-9-81-107571-1, 978-9-81-107572-8 |
Authors | Umay Kulsum, Arti Kapil, Harpreet Singh, Punit Kaur |
Abstract | Recent advancements in sequencing technologies have decreased both the time and the cost of sequencing a whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has generated enormous amounts of data on microbial populations, publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into microevolution, global composition, and diversity in the virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods for constructing a pan-genome do not support direct input of raw reads from the sequencer; instead, they require the reads to be preprocessed into an assembled protein/gene sequence file or a binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can identify the pan-genome directly from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline against other methods for developing a pan-genome, i.e. reference-based assembly and de novo assembly, using simulated reads of Mycobacterium tuberculosis. The single-script pipeline (pipeline.pl) is applicable to all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe. |
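
The abstract mentions a binary matrix of orthologous genes/proteins as a common input to pan-genome tools. As a rough illustration of what such a matrix encodes, the Python sketch below classifies genes as core (present in every strain), accessory (present in several strains) or unique (present in one strain). The function name `classify_genes`, the gene labels and the toy matrix are hypothetical and made up for this example; they do not reflect NGSPanPipe's actual output format or its Perl implementation.

```python
from typing import Dict, List


def classify_genes(matrix: Dict[str, List[int]], n_strains: int) -> Dict[str, List[str]]:
    """Group genes by how many strains carry them.

    `matrix` maps a gene name to a row of 0/1 flags, one flag per strain
    (hypothetical layout, assumed for illustration only).
    """
    groups: Dict[str, List[str]] = {"core": [], "accessory": [], "unique": []}
    for gene, row in matrix.items():
        if len(row) != n_strains:
            raise ValueError(f"{gene}: expected {n_strains} flags, got {len(row)}")
        count = sum(row)
        if count == n_strains:
            groups["core"].append(gene)        # present in every strain
        elif count == 1:
            groups["unique"].append(gene)      # present in exactly one strain
        elif count > 1:
            groups["accessory"].append(gene)   # present in some, but not all
        # count == 0: gene absent everywhere, so not part of the pan-genome
    return groups


# Toy presence/absence matrix for three strains (invented data).
toy_matrix = {
    "geneA": [1, 1, 1],
    "geneB": [1, 0, 1],
    "geneC": [0, 0, 1],
    "geneD": [0, 0, 0],
}

groups = classify_genes(toy_matrix, n_strains=3)
# The pan-genome is the union of core, accessory and unique genes.
pan_genome_size = sum(len(genes) for genes in groups.values())
print(groups)
print("pan-genome size:", pan_genome_size)
```

In this toy case the pan-genome contains three genes, since `geneD` is absent from all strains and is therefore excluded.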
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United Kingdom | 2 | 33% |
Spain | 1 | 17% |
Czechia | 1 | 17% |
Unknown | 2 | 33% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 4 | 67% |
Scientists | 2 | 33% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 34 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Master | 8 | 24% |
Researcher | 6 | 18% |
Librarian | 2 | 6% |
Professor > Associate Professor | 2 | 6% |
Student > Ph. D. Student | 2 | 6% |
Other | 3 | 9% |
Unknown | 11 | 32% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 7 | 21% |
Agricultural and Biological Sciences | 6 | 18% |
Medicine and Dentistry | 3 | 9% |
Computer Science | 2 | 6% |
Immunology and Microbiology | 1 | 3% |
Other | 1 | 3% |
Unknown | 14 | 41% |