A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
Balazs Gor, Anett Balla, Edit Tukacs, Istvan Nagy, Zsolt Torok
Pages - 92 - 104     |    Revised - 15-07-2012     |    Published - 10-08-2012
Volume - 6   Issue - 4    |    Publication Date - August 2012
KEYWORDS
Smith-Waterman, Reference Assembly, Colour Space Code, Algorithm, Next Generation Sequencing.
ABSTRACT
Although numerous algorithms exist for genome alignment using Next Generation Sequencing tags, the assembly of colour-coded reads remains a challenge. We present a novel pairwise sequence alignment algorithm derived from the Smith-Waterman method. Its distinguishing feature is that it translates the reference sequence into colour code and performs the alignment in colour space. Operating in colour space, it can prevent most assembly errors that arise from read errors. Because it is based on dynamic programming, it yields the optimal alignment in colour space. Validation on an empirical dataset against capillary sequencing showed high mapping accuracy. The algorithm can be implemented in any reference assembly software, improving mapping accuracy while maintaining high mapping speed.
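The idea described in the abstract, translating the reference into colour code and then aligning in colour space, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the standard SOLiD di-base encoding (the colour of a dinucleotide is the XOR of the 2-bit base codes A=0, C=1, G=2, T=3) and a textbook Smith-Waterman recurrence with a linear gap penalty. The function names and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions chosen only for illustration.

```python
# Illustrative sketch only; names and scoring parameters are assumptions,
# not taken from the paper.

# Standard di-base encoding: each base has a 2-bit code, and the colour of
# two adjacent bases is the XOR of their codes (e.g. AA -> 0, AT -> 3).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colour_space(seq):
    """Translate a nucleotide sequence into a string of colours (0-3).

    A sequence of n bases yields n - 1 colours.
    """
    codes = [BASE_CODE[base] for base in seq.upper()]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def smith_waterman(ref, read, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, (read_end, ref_end)).

    Plain Smith-Waterman with a linear gap penalty; works on any two
    strings, here intended for colour strings. End positions are
    1-based indices into the dynamic programming matrix.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(
                0,                          # start a new local alignment
                score[i - 1][j - 1] + diag, # match or mismatch
                score[i - 1][j] + gap,      # gap in the reference
                score[i][j - 1] + gap,      # gap in the read
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    return best, best_pos


if __name__ == "__main__":
    reference = "TTATGGCAGG"           # hypothetical reference fragment
    ref_colours = to_colour_space(reference)
    read_colours = "31031"             # hypothetical colour-space read
    print(ref_colours, smith_waterman(ref_colours, read_colours))
```

Because the reference is converted once and the read is never translated back to bases, a single miscalled colour costs one mismatch in the alignment rather than corrupting every downstream base of a decoded read.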
Dr. Balazs Gor
- Hungary
balazs.gor@astridbio.com
Mr. Anett Balla
- Hungary
Mr. Edit Tukacs
- Hungary
Mr. Istvan Nagy
- Hungary
Mr. Zsolt Torok
- Hungary

