A simple yet time-optimal and linear-space algorithm for shortest unique substring queries

dc.contributor.author: İleri, Atalay Mert
dc.contributor.author: Külekçi, Muhammed Oğuzhan
dc.contributor.author: Xu, Bojian
dc.date.accessioned: 10.07.2019 10:49:13
dc.date.accessioned: 2019-07-10T20:03:15Z
dc.date.available: 10.07.2019 10:49:13
dc.date.available: 2019-07-10T20:03:15Z
dc.date.issued: 2015
dc.department: İstanbul Medipol Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Biyomedikal Mühendisliği Bölümü
dc.description: WOS: 000347602000043
dc.description.abstract: We revisit the problem of finding a shortest unique substring (SUS), proposed recently by Pei et al. (2013) [12]. We propose an optimal O(n) time and space algorithm that can find an SUS for every location of a string of size n, and thus significantly improve on their O(n^2) time complexity. Our method also supports finding all the SUSes covering every location, whereas theirs can find only one SUS for every location. Further, our solution is simpler, easier to implement, and more space efficient in practice, since we use only the inverse suffix array and the longest common prefix array of the string, while their algorithm uses the suffix tree of the string and other auxiliary data structures. Our theoretical results are validated by an empirical study with real-world data, which shows that our method is at least 8 times faster and uses at least 20 times less memory. The speedup of our method over that of Pei et al. can become even more significant as the string size increases, because of their quadratic time complexity. We have also compared our method with the recent proposal of Tsuruta et al. (2014) [14], another independent O(n) time and space algorithm for SUS finding. The empirical study shows that both methods have nearly the same processing speed. However, ours uses at least 4 times less memory for finding one SUS and at least 2 times less memory for finding all SUSes, both covering every string location.
dc.description.sponsorship: EWU's Faculty Grants for Research and Creative Works
dc.description.sponsorship: Supported in part by EWU's Faculty Grants for Research and Creative Works.
dc.identifier.citation: İleri, A. M., Külekçi, M. O. and Xu, B. (2015). A simple yet time-optimal and linear-space algorithm for shortest unique substring queries. Theoretical Computer Science, 562, 621-633. https://dx.doi.org/10.1016/j.tcs.2014.11.004
dc.identifier.doi: 10.1016/j.tcs.2014.11.004
dc.identifier.endpage: 633
dc.identifier.issn: 0304-3975
dc.identifier.issn: 1879-2294
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 621
dc.identifier.uri: https://dx.doi.org/10.1016/j.tcs.2014.11.004
dc.identifier.uri: https://hdl.handle.net/20.500.12511/3838
dc.identifier.volume: 562
dc.identifier.wosquality: Q4
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartof: Theoretical Computer Science
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Unique Substring
dc.subject: Shortest Unique Substring
dc.subject: Repetitiveness
dc.subject: Regularity
dc.title: A simple yet time-optimal and linear-space algorithm for shortest unique substring queries
dc.type: Article
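
The abstract states that the method needs only the inverse suffix array and the longest common prefix (LCP) array. The following Python sketch illustrates how those two arrays can yield an SUS covering a given position. It is not the authors' algorithm and does not reach the O(n) bound: it assumes a naive suffix sort and a linear scan per query, and the function names (suffix_array, lcp_array, shortest_unique_from, sus_covering) are hypothetical, chosen for this illustration only. Ties are broken in favor of the leftmost candidate, which is also an assumption.

```python
def suffix_array(s):
    # Naive construction by sorting suffixes; for demonstration only.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa, rank):
    # Kasai-style computation: lcp[r] is the length of the longest common
    # prefix of the suffixes at sa[r - 1] and sa[r]; lcp[0] is 0.
    n = len(s)
    lcp = [0] * n
    h = 0
    for i in range(n):
        r = rank[i]
        if r == 0:
            h = 0
            continue
        j = sa[r - 1]
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1
    return lcp


def shortest_unique_from(s):
    # For each start i, the length of the shortest unique substring that
    # begins at i, or None if every substring starting at i is repeated.
    n = len(s)
    sa = suffix_array(s)
    rank = [0] * n  # inverse suffix array
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = lcp_array(s, sa, rank)
    result = []
    for i in range(n):
        r = rank[i]
        left = lcp[r]
        right = lcp[r + 1] if r + 1 < n else 0
        length = max(left, right) + 1
        result.append(length if i + length <= n else None)
    return result


def sus_covering(s, k):
    # Returns (start, end) of a shortest unique substring s[start:end]
    # with start <= k < end, or None if s has no unique substring covering k.
    lsus = shortest_unique_from(s)
    best = None
    for i in range(k + 1):
        if lsus[i] is None:
            continue
        length = max(lsus[i], k - i + 1)  # extend to reach position k
        if best is None or length < best[1] - best[0]:
            best = (i, i + length)
    return best
```

For example, sus_covering("banana", 3) returns the half-open range (2, 5), that is, the substring "nan". The sketch relies on the fact that a substring starting at i is unique exactly when it is longer than the LCP that its suffix shares with both neighbors in suffix-array order.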

Files

Original bundle
Name: kulekci, oguzhan-2015.pdf
Size: 513.86 KB
Format: Adobe Portable Document Format
Description: Full Text