Data stream computations in domains such as internet
applications are often performed in a highly
distributed fashion in order to save time. An example
is the class of applications that use the Google
MapReduce framework for scalable distributed processing,
as presented by Dean & Ghemawat.
A basic question here is: what kind of data stream
computations admit scalable and efficient distributed
algorithms? We show that data stream computations
that approximate functions of the frequency
vector of the stream can be carried out efficiently
in a distributed manner.
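
To make the role of the frequency vector concrete, the sketch below illustrates one standard reason such computations distribute well: the frequency vector of a concatenated stream is the sum of the per-partition frequency vectors, so any linear summary of it can be built per partition and merged by addition. This is not the construction from the paper. It uses a Count-Min sketch as a stand-in, and the names `WIDTH`, `DEPTH`, `bucket`, `sketch`, `merge`, and `estimate` are hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical sketch dimensions, chosen only for this illustration.
WIDTH, DEPTH = 64, 4

def bucket(item, row):
    """Deterministic hash of (row, item) into [0, WIDTH)."""
    digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % WIDTH

def sketch(stream):
    """Count-Min sketch: a linear function of the stream's frequency vector."""
    table = [[0] * WIDTH for _ in range(DEPTH)]
    for item in stream:
        for row in range(DEPTH):
            table[row][bucket(item, row)] += 1
    return table

def merge(a, b):
    """Entrywise sum; by linearity this is the sketch of the combined stream."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def estimate(table, item):
    """Count-Min point query; it never underestimates the true frequency."""
    return min(table[row][bucket(item, row)] for row in range(DEPTH))

# Two partitions of one logical stream, processed independently.
part_a = ["x", "y", "x", "z"]
part_b = ["x", "w", "y"]

merged = merge(sketch(part_a), sketch(part_b))   # distributed computation
whole = sketch(part_a + part_b)                  # centralized computation
exact = Counter(part_a) + Counter(part_b)        # exact frequency vector
```

Here `merged` and `whole` are identical tables, so a coordinator that only receives the per-partition sketches can answer the same approximate frequency queries as a single machine that saw the entire stream.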
Cite as: Ganguly, S. (2009). Distributing Frequency-Dependent Data Stream Computations. In Proc. Fifteenth Computing: The Australasian Theory Symposium (CATS 2009), Wellington, New Zealand. CRPIT, 94. Downey, R. and Manyem, P., Eds. ACS. 161-167.