Trends in Sux Sorting: A Survey of Low Memory Algorithms

Dhaliwal, J., Puglisi, S.J. and Turpin, A.

    The suffix array is a sorted array of all the suffixes in a string. This remarkably simple data structure is fundamental for string processing and lies at the heart of efficient algorithms for pattern matching, pattern mining, and data compression. In many applications suffix array construction, or equivalently suffix sorting, is a computational bottleneck and so has been the focus of intense research in the last 20 years. This paper outlines several suffix array construction algorithms that have emerged since the survey due to Puglisi, Smyth and Turpin [ACM Computing Surveys 39, 2007]. These algorithms have tended to strive for small working space (RAM), often at the cost of runtime, and make use of compressed data structures or secondary memory (disk) to achieve this goal. We provide a high-level description of each algorithm, avoiding implementation details as much as possible, and outline directions that could benefit from further research.
Cite as: Dhaliwal, J., Puglisi, S.J. and Turpin, A. (2012). Trends in Sux Sorting: A Survey of Low Memory Algorithms. In Proc. Australasian Computer Science Conference (ACSC 2012) Melbourne, Australia. CRPIT, 122. Reynolds, M. and Thomas, B, Eds., ACS. 91-98
pdf (from crpit.com) pdf (local if available) BibTeX EndNote GS