Nucleotide sequence alignment and compression via shortest unique substring

Adaş, Boran; Bayraktar, Ersin; Faro, Simone; Moustafa, Ibraheem Elsayed; Külekçi, Muhammed Oǧuzhan

Access

info:eu-repo/semantics/openAccess

Date

2015

Author

Adaş, Boran
Bayraktar, Ersin
Faro, Simone
Moustafa, Ibraheem Elsayed
Külekçi, Muhammed Oǧuzhan

Citation

Adaş, B., Bayraktar, E., Faro, S., Moustafa, I. E., & Külekçi, M. O. (2015). Nucleotide sequence alignment and compression via shortest unique substring. In 3rd International Work-Conference on Bioinformatics and Biomedical Engineering (pp. 363-374). Granada, Spain, April 15-17, 2015.

Abstract

Aligning short reads produced by high-throughput sequencing equipment onto a reference genome is the fundamental step of sequence analysis. Since sequencing machinery generates massive volumes of data, it is becoming increasingly vital to keep those data compressed as well. In this study we present the initial results of an ongoing research project that aims to combine the alignment and compression of short reads through a novel preprocessing technique based on shortest unique substring identifiers. We observe that clustering the short reads according to the set of unique identifiers they include provides an opportunity to combine compression and alignment. Thus, we propose an alternative path in the high-throughput sequence analysis pipeline: instead of applying an immediate whole alignment, a preprocessing step that clusters the reads according to the set of shortest unique substring identifiers extracted from the reference genome is performed first. We also present an analysis of the shortest unique substring identifiers on the human reference genome and examine how labeling each short read with those identifiers helps in alignment and compression.
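
The preprocessing idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names (`shortest_unique_substrings`, `label_read`, `cluster_reads`, `candidate_starts`), the `max_len` cutoff, and the per-start-position definition of an identifier are all assumptions made for illustration, and the quadratic scan is only suitable for toy inputs, not a genome.

```python
from collections import Counter, defaultdict


def shortest_unique_substrings(reference, max_len):
    """Return {identifier: start position} for the reference.

    Assumption: an identifier is the shortest substring beginning at a given
    position that occurs exactly once in the reference (lengths up to max_len).
    """
    identifiers = {}
    resolved = set()  # start positions that already have an identifier
    n = len(reference)
    for length in range(1, max_len + 1):
        counts = Counter(reference[i:i + length] for i in range(n - length + 1))
        for i in range(n - length + 1):
            sub = reference[i:i + length]
            if i not in resolved and counts[sub] == 1:
                identifiers[sub] = i
                resolved.add(i)
    return identifiers


def label_read(read, identifiers):
    """Label a read with the set of identifiers it contains."""
    return frozenset(sub for sub in identifiers if sub in read)


def cluster_reads(reads, identifiers):
    """Group reads that carry exactly the same set of identifiers."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[label_read(read, identifiers)].append(read)
    return dict(clusters)


def candidate_starts(read, identifiers):
    """Reference positions at which the read could start, implied by each
    identifier's unique position minus its first offset inside the read."""
    return {pos - read.find(sub) for sub, pos in identifiers.items() if sub in read}


if __name__ == "__main__":
    ref = "ACGTACGGA"  # toy reference, hypothetical data
    ids = shortest_unique_substrings(ref, max_len=4)
    for label, members in cluster_reads(["CGTA", "CGTAC", "CGGA"], ids).items():
        print(sorted(label), members)
    print(candidate_starts("CGTA", ids))
```

Reads that share a label set fall into the same cluster, so they could be compressed together, and because each identifier occurs only once in the reference, it also suggests where the read may align. A realistic version would presumably replace the brute-force counting with an index such as a suffix array.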

Source

3rd International Work-Conference on Bioinformatics and Biomedical Engineering

Link

https://hdl.handle.net/20.500.12511/829

Collections

Conference Paper Collection [30]
Scopus Indexed Publications Collection [6632]
WoS Indexed Publications Collection [6705]