正在加载图片...
Based on the above analysis we note that the optimized k-d Algorithm 1 Constructing k-d Tree based Index Structure with tree structure satisfies the following property:for an optimum Optimized Granularity k-d tree structure,its sub-tree structures are also optimum 1:Initially construct a k-d tree as a full binary tree with Therefore this optimization problem can be solved by dynamic maximum height of h,each leaf node denotes a unit cell. programming that works from the bottom to the top along the 2:for each leaf node associated with region Si do tree. 3: E(S)=fe·ProbE(S)Eevent(Si)+fg·Probo(Sa) C.Constructing Optimized Index Structure based on Dynamic Equery(Si) Programming 4:end for We construct the k-d tree from bottom to top,which is in 5:for all remaining nodes Sij with corresponding children post-order.Assume the maximum height of the k-d tree is nodes S;and Sj,in post-order do h,initially consider a full k-d tree with all leaf nodes at the 6. costl is associated with the replicate storage scheme hth level,each leaf node correspond to a unit cell,thus the for region Si.j} number of nodes over the k-d tree is n=2h+1-1.For each cost1=fe·ProbE(S,)·Eevent(Sii)+fg· two nodes i and j within one sub-tree,we denote Si and S;as ProbQ(S,)·Equery(S,i) cost2 is associated with the split storage scheme for their respective regions,and we denote Si.;as the super-region region Si.j} composed of the two regions,which is corresponding to their parent node.We will try to merge two split nodes along the cost2=E(S)+E(S)】 10: tree if the energy cost for replicate storage scheme is lower E(Si.j)=min(cost1,cost2) 11: if cost1<cost2 then than the sum of each split nodes. 12 Algorithm I depicts the pseudo code of this dynamic Apply the replicate storage scheme to region Sij, programming based algorithm.Assume that the n nodes in merge two split nodes Si and Sj into one node Si.j the k-d tree are labelled using the post-order.A table E[l...n] for the k-d tree 13: end if is used to hold the minimum energy cost of each node which 14:end for corresponds to a specific storage region Si,here for ease of presentation we denote it as E(Si).In the algorithm,line 3 computes the energy cost E(Si)for all leave nodes and lines 6-10 compute the minimum energy cost E(Si.j)for As described in Section IV.when the event message reaches the remaining internal nodes in post-order.As there're O(n) the zone,we utilize the Double Ruling scheme [9][10]to entries of energy cost to compute according to all the nodes distribute the event information to all relevant replicate storage over the initial full k-d tree,and at each step one exact entry is nodes. computed,the time complexity of Algorithm I is O(n),where B.Resolving and Routing Range Queries n is the number of nodes over the initial full k-d tree. For implementation of optimized index structure construc- As a range query corresponds to a region over the tion,first the sink calculates the event data distribution and multi-dimensional data space,thus a conventional range range query distribution according to a training set by lever- query may cover several zones over the k-d tree based aging the methodology proposed in Section III,and further index structure.For example,consider a range query calculates optimized parameters for the optimized k-d tree (0.3,0.7,[0.5,0.6,0.4,0.8,0.7,0.9j)over the example k- d tree depicted in Fig.2(a),the storage regions of zone code based on dynamic programming,then it broadcasts the index structure information to the sensor network,after receiving 110.111,010.0111 will be involved.Therefore how to route this information,those selected storage nodes of each cell these sub-range queries to those specified storage nodes in will confirm their storage roles which includes corresponding an energy efficient approach is the problem we are trying to storage schemes and the multi-dimensional ranges of data they address in this section.As the corresponding query replies can will store,and each sensor node will know where to store/find be sent back to query node along the query routing paths,thus here we only consider the query diffusion cost.for the overall the relevant ranges of data according to the locally stored index cost is actually twice the query diffusion cost. structure. Basically there are three strategies to route these sub- V.EVENT STORAGE AND QUERY PROCESSING queries.The first strategy is to route sub-queries separately MECHANISMS to corresponding zones by selecting nearest storage nodes for A.Inserting Events to Cells the query node.As Fig.4(a)shows,for each relevant storage region Si associated with sub-query gi,the query node selects As an event data is a point value over the multi-dimensional a nearest storage node from those replicate storages,and data space,it corresponds to a unique zone over the de- respectively sends a sub-query message to the corresponding ployment area.When a sensor node generates an event data it will check the locally stored index structure to find the storage node.In this scheme,no routing paths are shared. The second strategy leverages Double Ruling based routing corresponding zone,then it leverages the GPSR[1]routing approach.As Fig.4(b)shows,the query node first imposes scheme to forward the event message to that specified region. a routing path horizontally,and at specific positions of the 170 Authorized licensed use limited to:Nanjing University.Downloaded on July 11,2010 at 07:42:17 UTC from IEEE Xplore.Restrictions apply.Based on the above analysis we note that the optimized k-d tree structure satisfies the following property: for an optimum k-d tree structure, its sub-tree structures are also optimum. Therefore this optimization problem can be solved by dynamic programming that works from the bottom to the top along the tree. C. Constructing Optimized Index Structure based on Dynamic Programming We construct the k-d tree from bottom to top, which is in post-order. Assume the maximum height of the k-d tree is h, initially consider a full k-d tree with all leaf nodes at the hth level, each leaf node correspond to a unit cell, thus the number of nodes over the k-d tree is n = 2h+1 − 1. For each two nodes i and j within one sub-tree, we denote Si and Sj as their respective regions, and we denote Si,j as the super-region composed of the two regions, which is corresponding to their parent node. We will try to merge two split nodes along the tree if the energy cost for replicate storage scheme is lower than the sum of each split nodes. Algorithm 1 depicts the pseudo code of this dynamic programming based algorithm. Assume that the n nodes in the k-d tree are labelled using the post-order. A table E[1...n] is used to hold the minimum energy cost of each node which corresponds to a specific storage region Si , here for ease of presentation we denote it as E(Si). In the algorithm, line 3 computes the energy cost E(Si) for all leave nodes and lines 6-10 compute the minimum energy cost E(Si,j ) for the remaining internal nodes in post-order. As there’re O(n) entries of energy cost to compute according to all the nodes over the initial full k-d tree, and at each step one exact entry is computed, the time complexity of Algorithm 1 is O(n), where n is the number of nodes over the initial full k-d tree. For implementation of optimized index structure construc￾tion, first the sink calculates the event data distribution and range query distribution according to a training set by lever￾aging the methodology proposed in Section III, and further calculates optimized parameters for the optimized k-d tree based on dynamic programming, then it broadcasts the index structure information to the sensor network, after receiving this information, those selected storage nodes of each cell will confirm their storage roles which includes corresponding storage schemes and the multi-dimensional ranges of data they will store, and each sensor node will know where to store/find the relevant ranges of data according to the locally stored index structure. V. EVENT STORAGE AND QUERY PROCESSING MECHANISMS A. Inserting Events to Cells As an event data is a point value over the multi-dimensional data space, it corresponds to a unique zone over the de￾ployment area. When a sensor node generates an event data, it will check the locally stored index structure to find the corresponding zone, then it leverages the GPSR[1] routing scheme to forward the event message to that specified region. Algorithm 1 Constructing k-d Tree based Index Structure with Optimized Granularity 1: Initially construct a k-d tree as a full binary tree with maximum height of h, each leaf node denotes a unit cell. 2: for each leaf node associated with region Si do 3: E(Si) = fe ·P robE(Si)·Eevent(Si)+fq ·P robQ(Si)· Equery(Si) 4: end for 5: for all remaining nodes Si,j with corresponding children nodes Si and Sj , in post-order do 6: {cost1 is associated with the replicate storage scheme for region Si,j} 7: cost1 = fe · P robE(Si,j ) · Eevent(Si,j ) + fq · P robQ(Si,j ) · Equery(Si,j ) 8: {cost2 is associated with the split storage scheme for region Si,j} 9: cost2 = E(Si) + E(Sj ) 10: E(Si,j ) = min(cost1, cost2) 11: if cost1 < cost2 then 12: Apply the replicate storage scheme to region Si,j , merge two split nodes Si and Sj into one node Si,j for the k-d tree. 13: end if 14: end for As described in Section IV, when the event message reaches the zone, we utilize the Double Ruling scheme [9][10] to distribute the event information to all relevant replicate storage nodes. B. Resolving and Routing Range Queries As a range query corresponds to a region over the multi-dimensional data space, thus a conventional range query may cover several zones over the k-d tree based index structure. For example, consider a range query h[0.3, 0.7], [0.5, 0.6], [0.4, 0.8], [0.7, 0.9]i over the example k￾d tree depicted in Fig.2(a), the storage regions of zone code 110,111, 010, 0111 will be involved. Therefore how to route these sub-range queries to those specified storage nodes in an energy efficient approach is the problem we are trying to address in this section. As the corresponding query replies can be sent back to query node along the query routing paths, thus here we only consider the query diffusion cost, for the overall cost is actually twice the query diffusion cost. Basically there are three strategies to route these sub￾queries. The first strategy is to route sub-queries separately to corresponding zones by selecting nearest storage nodes for the query node. As Fig.4(a) shows, for each relevant storage region Si associated with sub-query qi , the query node selects a nearest storage node from those replicate storages, and respectively sends a sub-query message to the corresponding storage node. In this scheme, no routing paths are shared. The second strategy leverages Double Ruling based routing approach. As Fig.4(b) shows, the query node first imposes a routing path horizontally, and at specific positions of the 170 Authorized licensed use limited to: Nanjing University. Downloaded on July 11,2010 at 07:42:17 UTC from IEEE Xplore. Restrictions apply
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有