10 4.5 510 ES -E3-g 0.5 01 0.15 0.2 0.25 50 Tev 150 200 200 400 B00 100i Scenario (a (b) (c) (d) Fig.2.Simulation results in basic histogram collection:(a)The overall scanning time in various scenarios:(b)The overall scanning time with various e:(c)The overall scanning time with various Te/Ts;(d)The scanning time with various value of m. 10 210 5x10 0 -ES 6 60 40 ES 12 30 14 18 80 100 120 140 40 100 120 1.2 1.4 1.8 18 The value of e The value al a The lotal scanning tme x10 (a) (b) (c) (d) Fig.3.Simulation results in advanced histogram collection:(a)The scanning time with various threshold :(b)The scanning time with various variance o:(c)The scanning time with various value of k;(d)The variation of g with the scanning time. the number of categories is increased,the scanning time of the set u=100,o =20.Note that while the number of categories separate counting (SC)is apparently increased,due to the large increases,the scanning time of each solution grows in a linear inter-cycle overhead and the constant initial frame size in each approach.Still,the ES-g solution always achieves the minimum category.Still,the ES strategy has the longest scanning time.In scanning time. Scenario 3,we set m 100,u=500 and o 100.We observe that the basic identification(BD)has the longest scanning time,B.The Performance in Advanced Histogram Collection as the current overall tag size is rather large.The ES strategy We evaluate the performance of our iceberg query algorithm now requires a fairly short scanning time,as the variance o is We use ES to denote our optimized solution based on ensemble relatively small as compared to u.Note that in all cases,our sampling.In Fig.3(a)we compare the scanning time with optimized solution ES-g always achieves the best performance various values of threshold ratio 0.We set m 200,u in terms of scanning time.In Fig.2(b),we compare the scanning 200,=100,the exact threshold is set to t =0.u.We time with various values of e in the accuracy constraint.We set observe that,as the threshold increases,the scanning time of m =100,u=500,o =100.As the value of e is increasing, the SC strategy and the ES strategy is continuously decreasing, the scanning time of all solutions,except the BI strategy,is while the scanning time for the BI strategy is not affected.In decreasing.Among the four strategies,the ES-g solution always Fig.3(b)we compare the scanning time with various standard achieves the best performance in scanning time. deviation o.We set m =200.u=200 and the threshold ratio In Fig.2(c),we evaluate the impact of the inter-cycle over- 6-1.5.We observe that as the value of o increases,the head in the strategies.We set m =150,u=50,10.It scanning time of the SC strategy and the ES strategy grows is known that,by reducing the transmitted bits in singleton slowly.The reason is as follows:as the standard deviation o slots,the average slot duration can be further reduced,while increases,the number of qualified categories is increasing.thus the inter-cycle overhead is not easy to be greatly reduced due more slots are essential to verify the categories for accuracy; to the necessity to calm down the activated tags.So we test the besides,fewer categories have tag sizes close to the threshold, overall scanning time with various ratios of Te/Ta.We observe thus fewer slots are required to verify the population constraint that the BI strategy and the ES strategy have a fairly stable In all,the overall scanning time is increasing rather slowly. scanning time,as the number of query cycles is relatively small. We evaluate the performance of our PT-Topk:algorithm.In The separate counting (SC)has a relatively short scanning time Fig.3(c),we compare the scanning time with various values when Te/Ts is less than 50.As the value of Te/Ts increases, of k.We observe that as k increases from 20 to 120.the its scanning time linearly increases and surpasses the other scanning time of the ES strategy increases from 1.5 x 105 to strategies.The ES-g strategy always has the shortest scanning 2.5 x 105,and then decreases to 2 x 105.The reason is that, time.In Fig.2(d),we evaluate the scalability of the proposed as the value of k increases,the exact threshold is reduced,and algorithms while varying the overall number of categories.We more categories are identified as qualified,thus more slots are1 2 3 0 5 10 15 x 104 Scenario The overall scanning time (ms) BI SC ES ES−g (a) 0.1 0.15 0.2 0.25 0.3 0 5 10 15 x 104 The value of ε The overall scanning time (ms) BI SC ES ES−g (b) 0 50 100 150 200 0.5 1 1.5 2 2.5 3 3.5 4 4.5 x 104 The value of τ c /τ s The total scanning time BI SC ES ES−g (c) 200 400 600 800 1000 0 0.5 1 1.5 2 2.5 3 3.5 x 105 The value of m The overall scanning time (ms) BI SC ES ES−g (d) Fig. 2. Simulation results in basic histogram collection: (a)The overall scanning time in various scenarios;(b)The overall scanning time with various 𝜖;(c)The overall scanning time with various 𝜏𝑐/𝜏𝑠;(d)The scanning time with various value of 𝑚. 1 1.2 1.4 1.6 1.8 2 0 2 4 6 8 10 12 x 104 The value of θ The total scanning time BI SC ES (a) 40 60 80 100 120 140 160 0 2 4 6 8 10 12 x 104 The value of σ The total scanning time BI SC ES (b) 20 40 60 80 100 120 0 0.5 1 1.5 2 2.5 3 3.5 x 105 The value of K The total scanning time BI ES (c) 1 1.2 1.4 1.6 1.8 x 10 5 20 40 60 80 100 The total scanning time The value of g (d) Fig. 3. Simulation results in advanced histogram collection: (a)The scanning time with various threshold 𝜃;(b)The scanning time with various variance 𝜎;(c)The scanning time with various value of 𝑘;(d)The variation of 𝑔 with the scanning time. the number of categories is increased, the scanning time of the separate counting (SC) is apparently increased, due to the large inter-cycle overhead and the constant initial frame size in each category. Still, the ES strategy has the longest scanning time. In Scenario 3, we set 𝑚 = 100, 𝜇 = 500 and 𝜎 = 100. We observe that the basic identification (BI) has the longest scanning time, as the current overall tag size is rather large. The ES strategy now requires a fairly short scanning time, as the variance 𝜎 is relatively small as compared to 𝜇. Note that in all cases, our optimized solution ES-g always achieves the best performance in terms of scanning time. In Fig.2(b), we compare the scanning time with various values of 𝜖 in the accuracy constraint. We set 𝑚 = 100, 𝜇 = 500, 𝜎 = 100. As the value of 𝜖 is increasing, the scanning time of all solutions, except the BI strategy, is decreasing. Among the four strategies, the ES-g solution always achieves the best performance in scanning time. In Fig.2(c), we evaluate the impact of the inter-cycle overhead in the strategies. We set 𝑚 = 150, 𝜇 = 50, 𝜎 = 10. It is known that, by reducing the transmitted bits in singleton slots, the average slot duration can be further reduced, while the inter-cycle overhead is not easy to be greatly reduced due to the necessity to calm down the activated tags. So we test the overall scanning time with various ratios of 𝜏𝑐/𝜏𝑠. We observe that the BI strategy and the ES strategy have a fairly stable scanning time, as the number of query cycles is relatively small. The separate counting (SC) has a relatively short scanning time when 𝜏𝑐/𝜏𝑠 is less than 50. As the value of 𝜏𝑐/𝜏𝑠 increases, its scanning time linearly increases and surpasses the other strategies. The ES-g strategy always has the shortest scanning time. In Fig.2(d), we evaluate the scalability of the proposed algorithms while varying the overall number of categories. We set 𝜇 = 100, 𝜎 = 20. Note that while the number of categories increases, the scanning time of each solution grows in a linear approach. Still, the ES-g solution always achieves the minimum scanning time. B. The Performance in Advanced Histogram Collection We evaluate the performance of our iceberg query algorithm. We use ES to denote our optimized solution based on ensemble sampling. In Fig.3(a) we compare the scanning time with various values of threshold ratio 𝜃. We set 𝑚 = 200, 𝜇 = 200, 𝜎 = 100, the exact threshold is set to 𝑡 = 𝜃 ⋅ 𝜇. We observe that, as the threshold increases, the scanning time of the SC strategy and the ES strategy is continuously decreasing, while the scanning time for the BI strategy is not affected. In Fig.3(b) we compare the scanning time with various standard deviation 𝜎. We set 𝑚 = 200, 𝜇 = 200 and the threshold ratio 𝜃 = 1.5. We observe that as the value of 𝜎 increases, the scanning time of the SC strategy and the ES strategy grows slowly. The reason is as follows: as the standard deviation 𝜎 increases, the number of qualified categories is increasing, thus more slots are essential to verify the categories for accuracy; besides, fewer categories have tag sizes close to the threshold, thus fewer slots are required to verify the population constraint. In all, the overall scanning time is increasing rather slowly. We evaluate the performance of our PT-Top𝑘 algorithm. In Fig.3(c), we compare the scanning time with various values of 𝑘. We observe that as 𝑘 increases from 20 to 120, the scanning time of the ES strategy increases from 1.5 × 105 to 2.5 × 105, and then decreases to 2 × 105. The reason is that, as the value of 𝑘 increases, the exact threshold is reduced, and more categories are identified as qualified, thus more slots are