The SET COVER problem is tantalizingly simple to describe: given a collection F of sets, each containing a subset of a universe U of objects, find a smallest subcollection A of F such that every object in U is included in at least one of the sets in A. However, like many such combinatorial problems, SET COVER is NP-hard, meaning that it is unlikely that an efficient algorithm will be found, and that approximation algorithms must be preferred for non-trivial problem instances. One well-known approximation approach for SET COVER is to repeatedly add the set with the most uncovered items to the solution, continuing until every element in the universe is covered; this GREEDY approach has a provable logarithmic approximation ratio, essentially the best feasible ratio. Here we study the implementation of the GREEDY approach to SET COVER, evaluating eager and lazy versions and other implementation options. Experiments with several large datasets demonstrate that lazy “as required priority queue updates should be preferred, rather than eager - as soon as possible - ones; and that when implemented in this way, the GREEDY mechanism can solve some large instances of the SET COVER problem very quickly. This practical superiority contrasts with the lazy version's having a demonstrably higher worst-case operation cost.
Cite as: Lim, C. L., Moffat, A. and Wirth, A. (2014). Lazy and Eager Approaches for the Set Cover Problem. In Proc. Thirty-Seventh Australasian Computer Science Conference (ACSC 2014) Auckland, New Zealand. CRPIT, 147. Thomas, B. and Parry, D. Eds., ACS. 19-27
(from crpit.com)
(local if available)