by leveraging the GPSR routing scheme,thus to reach those corresponding storage regions,then the query results will be sent back along the original routing paths. VI.SIMULATION RESULTS In this section,we use simulation to compare the perfor- mance of our optimized storage scheme against conventional storage schemes for multi-dimensional range queries.We con- duct the simulation based on the grid based system,we divide the sensor network into n x n cells (we set 8 as the default The Optimized K-d Tree Index The corresponding storage regions over the cell based structure value for n),and each cell represents a unit storage area,we Befwork normalize the cell size to 1.For performance comparison we Fig.6.The optimized k-d tree and corresponding storage zones over the separately build DIM's zone tree,the fully replicate storage cell based network. scheme,and our optimized k-d tree over the network.Here we assign a unique zone for each of the cell units in DIM's zone tree,which infers to be a fully split approach over the fully replicate storage scheme always achieves the most energy cell based network,and we construct the fully replicate storage consumption as it requires the event message to update all scheme as another extreme situation,which stores all sensor n x n replicate storage cells.And DIM always achieves the data into each of the cell units,thus all the cells are replicate least energy consumption as it requires the event message storage units inside the unique zone to update one unique corresponding storage cell.For our And we utilize a training set of events and range queries to optimized k-d tree storage scheme,as the zone specified by derive the event and query distributions,thus to construct our the event may include one or more replicate storage cells,thus optimized k-d tree based index structure,as Fig.5 depicts,we its energy consumption is between the above two extreme use the Gaussian-like distribution to generate 5-dimensional situations,and unlike the fully replicate storage scheme,it data and queries for the training set and we normalize all the grows slowly as the network size increases. attribute values into the range [0,1],for range queries,we set Fig.7(b)shows the average query diffusion cost for the the query range for each dimension within range [0,1].And above three storage schemes.Note that DIM always achieves we also generate a similar set of events and range queries to the most energy consumption and the fully replicate storage evaluate the performance of various multi-dimensional storage scheme always achieves the least energy consumption,because schemes,we set the default number of event frequency fe and for one specific range query,DIM may split the query into query frequency fa to 1000. sub-queries and map them to quite many storage zones while the fully replicate storage scheme simply maps the query to one unique zone at the top level with replicate storages,thus its energy cost is nearly negligible.And similar to the former ass case,our optimized k-d tree storage scheme gains a tradeoff between the two extremes,and unlike DIM,it grows slowly as the network size increases. As Fig.7(c)shows,we further compare the overall energy cost for the above three schemes.For the situation fe 1000,fa=2000.DIM achieves the most energy cost as DIM's query diffusion cost is the largest among the three Fig.5.The Gaussian-like distribution of multi-dimensional data schemes,although its event insertion cost is rather small.As the fe/f ratio increases,the fully replicate storage scheme achieves the most energy cost due to its large event insertion A.Comparing performance for various multi-dimensional cost.For all the above three situations,our optimized k-d tree storage schemes storage scheme always achieves the least overall energy cost, In this section,we evaluate the energy efficiency for various as it gains an optimized tradeoff to sufficiently control both multi-dimensional storage schemes.Fig.6 depicts the opti- the event insertion cost and the query delivery cost.For the mized k-d tree and corresponding storage zones over the cell situation fe=1000,f=1000,our optimized k-d tree storage based network,this optimized index structure is constructed scheme reduces 39%energy cost than DIM and further reduces according to the event and query training set generated in terms 58%energy cost than the fully replicate storage scheme. of the Gaussian-like distributions. We now compare the average event insertion cost and B.Comparing performance for various query diffusion ap- average query delivery cost.Fig.7(a)shows the average event proaches insertion cost for DIM,the fully replicate storage scheme, In this section we compare the energy cost for various query and optimized K-d tree.As the network size increases,the diffusion approaches:the normal separate routing approach, 172 Authorized licensed use limited to:Nanjing University.Downloaded on July 11,2010 at 07:42:17 UTC from IEEE Xplore.Restrictions apply.by leveraging the GPSR routing scheme, thus to reach those corresponding storage regions, then the query results will be sent back along the original routing paths. VI. SIMULATION RESULTS In this section, we use simulation to compare the performance of our optimized storage scheme against conventional storage schemes for multi-dimensional range queries. We conduct the simulation based on the grid based system, we divide the sensor network into n × n cells (we set 8 as the default value for n), and each cell represents a unit storage area, we normalize the cell size to 1. For performance comparison we separately build DIM’s zone tree, the fully replicate storage scheme, and our optimized k-d tree over the network. Here we assign a unique zone for each of the cell units in DIM’s zone tree, which infers to be a fully split approach over the cell based network, and we construct the fully replicate storage scheme as another extreme situation, which stores all sensor data into each of the cell units, thus all the cells are replicate storage units inside the unique zone. And we utilize a training set of events and range queries to derive the event and query distributions, thus to construct our optimized k-d tree based index structure, as Fig.5 depicts, we use the Gaussian-like distribution to generate 5-dimensional data and queries for the training set and we normalize all the attribute values into the range [0,1], for range queries, we set the query range for each dimension within range [0,1]. And we also generate a similar set of events and range queries to evaluate the performance of various multi-dimensional storage schemes, we set the default number of event frequency fe and query frequency fq to 1000. -0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 0.00 0.05 0.10 0.15 0.20 0.25 0.30 Probability Normalized Value Fig. 5. The Gaussian-like distribution of multi-dimensional data A. Comparing performance for various multi-dimensional storage schemes In this section, we evaluate the energy efficiency for various multi-dimensional storage schemes. Fig.6 depicts the optimized k-d tree and corresponding storage zones over the cell based network, this optimized index structure is constructed according to the event and query training set generated in terms of the Gaussian-like distributions. We now compare the average event insertion cost and average query delivery cost. Fig.7(a) shows the average event insertion cost for DIM, the fully replicate storage scheme, and optimized K-d tree. As the network size increases, the Fig. 6. The optimized k-d tree and corresponding storage zones over the cell based network. fully replicate storage scheme always achieves the most energy consumption as it requires the event message to update all n × n replicate storage cells. And DIM always achieves the least energy consumption as it requires the event message to update one unique corresponding storage cell. For our optimized k-d tree storage scheme, as the zone specified by the event may include one or more replicate storage cells, thus its energy consumption is between the above two extreme situations, and unlike the fully replicate storage scheme, it grows slowly as the network size increases. Fig.7(b) shows the average query diffusion cost for the above three storage schemes. Note that DIM always achieves the most energy consumption and the fully replicate storage scheme always achieves the least energy consumption, because for one specific range query, DIM may split the query into sub-queries and map them to quite many storage zones while the fully replicate storage scheme simply maps the query to one unique zone at the top level with replicate storages, thus its energy cost is nearly negligible. And similar to the former case, our optimized k-d tree storage scheme gains a tradeoff between the two extremes, and unlike DIM, it grows slowly as the network size increases. As Fig.7(c) shows, we further compare the overall energy cost for the above three schemes. For the situation fe = 1000, fq = 2000, DIM achieves the most energy cost as DIM’s query diffusion cost is the largest among the three schemes, although its event insertion cost is rather small. As the fe/fq ratio increases, the fully replicate storage scheme achieves the most energy cost due to its large event insertion cost. For all the above three situations, our optimized k-d tree storage scheme always achieves the least overall energy cost, as it gains an optimized tradeoff to sufficiently control both the event insertion cost and the query delivery cost. For the situation fe = 1000, fq = 1000, our optimized k-d tree storage scheme reduces 39% energy cost than DIM and further reduces 58% energy cost than the fully replicate storage scheme. B. Comparing performance for various query diffusion approaches In this section we compare the energy cost for various query diffusion approaches: the normal separate routing approach, 172 Authorized licensed use limited to: Nanjing University. Downloaded on July 11,2010 at 07:42:17 UTC from IEEE Xplore. Restrictions apply