Distributing Frequency-Dependent Data Stream Computations

Ganguly, S.

Data stream computations in domains such as internet applications are often performed in a highly distributed fashion in order to save time. One example is the class of applications built on Google's MapReduce framework for scalable distributed processing, presented by Dean and Ghemawat. A basic question is: which data stream computations admit scalable and efficient distributed algorithms? We show that data stream computations that approximate functions of the stream's frequency vector can be computed efficiently in a distributed manner.
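To illustrate why frequency-dependent computations lend themselves to distribution, the following minimal sketch relies on one fact: the frequency vector of a stream is the sum of the frequency vectors of its partitions. It uses exact counters and the second frequency moment as the example function. It is not the paper's algorithm, which concerns approximation; the function names and the sample stream are hypothetical.

```python
from collections import Counter

def local_frequencies(substream):
    """Frequency vector of one partition (what a single worker would compute)."""
    return Counter(substream)

def merge(partials):
    """Sum the per-partition frequency vectors into a global one."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total

def second_moment(freq):
    """Example function of the frequency vector: the sum of squared frequencies."""
    return sum(count * count for count in freq.values())

# Hypothetical stream, split into three partitions as a distributed job might do.
stream = ["a", "b", "a", "c", "b", "a"]
partitions = [stream[0:2], stream[2:4], stream[4:6]]

merged = merge(local_frequencies(p) for p in partitions)
print(second_moment(merged))
```

Because merging is plain addition, the partitions can be processed in any order and on any number of workers. Exact counters need space proportional to the number of distinct items, which is why approximate, small-space summaries are of interest in this setting.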

Cite as: Ganguly, S. (2009). Distributing Frequency-Dependent Data Stream Computations. In Proc. Fifteenth Computing: The Australasian Theory Symposium (CATS 2009), Wellington, New Zealand. CRPIT, 94. Downey, R. and Manyem, P., Eds. ACS. 161-167.
