We discuss ways to effectively parallelize the subset construction algorithm, which is used to convert non-deterministic finite automata (NFAs) to deterministic finite automata (DFAs). This conversion is at the heart of string pattern matching based on regular expressions and thus has many applications in text processing, compilers, scripting languages and web browsers, security and more recently also with DNA sequence analysis. We discuss sources of parallelism in the sequential algorithm and their profitability on shared-memory multicore architectures. Our NFA and DFA data-structures are designed to improve scalability and keep communication and synchronization overhead to a minimum. We present three different ways for synchronization; the performance of our non-blocking synchronization based on a compare-and-swap (CAS) primitive compares favorably to a lock-based approach. We consider structural NFA properties and their relationship to scalability on highly-parallel multicore architectures. We demonstrate the efficiency of our parallel subset construction algorithm through several benchmarks run on a 4-CPU (40 cores) node of the Intel Manycore Testing Lab. Achieved speedups are up to a factor of 32x with 40 cores.
|Cite as: Choi, H. and Burgstaller, B. (2013). Non-blocking Parallel Subset Construction on Shared-memory Multicore Architectures. In Proc. Parallel and Distributed Computing 2013 (AusPDC 2013) Adelaide, Australia. CRPIT, 140. Javadi, B. and Garg, S.K. Eds., ACS. 13-20 |
(local if available)