Minimizing Communication Cost in Distributed Multi-query Processing
Title | Minimizing Communication Cost in Distributed Multi-query Processing |
Publication Type | Conference Papers |
Year of Publication | 2009 |
Authors | Li J, Deshpande A, Khuller S |
Conference Name | IEEE 25th International Conference on Data Engineering, 2009. ICDE '09 |
Date Published | 2009/04/29/March |
Publisher | IEEE |
ISBN Number | 978-1-4244-3422-0 |
Keywords | Approximation algorithms, Communication networks, Computer science, Cost function, Data engineering, distributed communication network, distributed databases, distributed multi-query processing, grid computing, Large-scale systems, NP-hard, optimisation, Polynomials, Publish-subscribe, publish-subscribe systems, Query optimization, Query processing, sensor networks, Steiner tree problem, Tree graphs, trees (mathematics) |
Abstract | Increasing prevalence of large-scale distributed monitoring and computing environments such as sensor networks, scientific federations, Grids etc., has led to a renewed interest in the area of distributed query processing and optimization. In this paper we address a general, distributed multi-query processing problem motivated by the need to minimize the communication cost in these environments. Specifically we address the problem of optimally sharing data movement across the communication edges in a distributed communication network given a set of overlapping queries and query plans for them (specifying the operations to be executed). Most of the problem variations of our general problem can be shown to be NP-Hard by a reduction from the Steiner tree problem. However, we show that the problem can be solved optimally if the communication network is a tree, and present a novel algorithm for finding an optimal data movement plan. For general communication networks, we present efficient approximation algorithms for several variations of the problem. Finally, we present an experimental study over synthetic datasets showing both the need for exploiting the sharing of data movement and the effectiveness of our algorithms at finding such plans. |
DOI | 10.1109/ICDE.2009.85 |