ๆญฃๅœจๅŠ ่ฝฝๅ›พ็‰‡...
ni=100,the estimated tag size n should be within the range categories m can be fairly large,e.g.,in the order of hundreds [90,110]with a probability greater than 95%. the Select command and the fixed initial frame size for each In the Iceberg Ouery Problem,only those categories with a category,as well as the inter-cycle overhead among a large tag size over a specified threshold t are essential to be illustrated number of query cycles,make the overall scanning time rather in the histogram,while the accuracy requirement is satisfied. large. As the exact tag size ni for category Ci is unknown,then, Therefore,we consider an ensemble sampling-based estima- given the estimated value of tag size ni,it is possible to have tion scheme as follows:select a certain number of categories false negative error and false positive error in verifying the and issue a query cycle,obtain the empty/singleton/collision population constraint.Therefore,it is essential to guarantee that slots,and then estimate the tag size for each of the categories the false negative/positive rate is below B,that is: according to the sampling in the singleton slots.In this way,the Pr[m<tn:โ‰ฅt<B, ensemble sampling is more preferred than the separate counting (2) in terms of reading performance.Since more tags are involved Prmโ‰ฅtn:<t<B. (3) in one query cycle,more slots amortize the cost of inter-cycle In the Top-k Query Problem,we use the definition of the overhead,the Select command,as well as the fixed initial frame probabilistic threshold top-k query (PT-Topk query),i.e.,in size.Thus,the overall scanning time can be greatly reduced. regard to the tag size,only the set of categories where each A.The Estimator SE takes a probability of at least 1-B to be in the top-k list are In the slotted ALOHA-based protocol,besides the empty illustrated in the histogram,while the accuracy requirement is slots and the collision slots,the singleton slots can be obtained. satisfied.Much like the iceberg query problem,as the exact tag size ni for category Ci is unknown,then,given the estimated In the ensemble sampling-based estimation,according to the observed statistics of the empty/singleton/collision slots,we can value of tag size ni,it is possible to have false negative error use estimators in [4]12]15]etc.to estimate the overall tag and false positive error in verifying the population constraint, the following constraint must be satisfied: size.Then,according to the response in each singleton slot, the category ID is obtained from the first s bits in the tag ID. Pr[Ciis regarded out of top-k list|Ci E top-k list]<B,(4) Based on the sampling from the singleton slots,the tag size for Pr[C;is regarded in top-k list Ci top-k list]<B.(5) each category can be estimated.The reason is as follows: Assume that there exists m categories C1,C2,...,Cm,the In this paper,we aim to propose a series of novel solutions overall tag size is n,and the tag size for each category is to tackle the above problems,while satisfying the following n,n2,...,nm.We define an indicator variable Xij to denote properties:(1)Time-efficient.(2)Simple for the tag side in the whether one tag of category Ci selects a slot j inside the frame protocol.(3)Complies with the EPCglobal C1G2 standards. with the size f.We set Xi.j=1 if only one tag of category V.USE ENSEMBLE SAMPLING TO COLLECT HISTOGRAMS Ci selects the slot j,and Xi.j=0 otherwise.Moreover,we use Pr[Xi.j=1]to denote the probability that only one tag When collecting the histograms over a large number of of category Ci selects the slot j,then, categories,the objective is to minimize the overall scanning time while the corresponding accuracy/population constraints Pr[X=1ๅˆ‚= are satisfied.Two straightforward solutions are summarized as follows:(1)Basic Tag Identification:The histogram is collected If we use ns.i to denote the number of singleton slots selected by uniquely identifying each tag from the massive tag set and by tags of category Ci,thus ns.i= โˆ‘=lXi,then,the putting it into the corresponding categories,thus the accuracy expected value is 100%,and(2)Separate Counting:The histogram is collected by estimating the number of tags in each category one by E(nsi)= Pr[Xi=1=(1-F)n-1.n one.Specifically,for each category,the reader broadcasts a i= Select command to activate the tags in the specified category. According to the replies from the specified tags,the estimators Furthermore,let ns denote the number of singleton slots,the like [4][12][15]can be used to estimate the tag size for each expected value E(ns)=(1-1)n-1.n.Then, =ๅ…ด E(nใ€‚) category.As the rough tag size for each category cannot be Thus we can approximate the tag size of category C as follows: predicted,a fixed initial frame size is used for each category. si.ไบข ๅ…ƒ= (6 Both of the above two solutions are not time-efficient.In Ta regard to the basic tag identification,uniquely identifying each Here,is the estimated value of the overall tag size.Let= tag in the massive set is not scalable,for as the tag size grows ,henๅบ”=ๅ…ƒ into a huge number,the scanning time can be an unacceptable value.In regard to the separated counting,the reader needs to B.Accuracy Analysis scan each category with at least one query cycle,even if the 1)Accuracy of the SE Estimator:In the ensemble sampling- category is a minor category,which is not necessarily addressed based estimation,since the estimators such as [4][12][15]can in the iceberg query or the top-k query.As the number of be utilized for estimating the overall number of tags,we use 6๐‘›๐‘– = 100, the estimated tag size ๐‘›ห†๐‘– should be within the range [90, 110] with a probability greater than 95%. In the Iceberg Query Problem, only those categories with a tag size over a specified threshold ๐‘ก are essential to be illustrated in the histogram, while the accuracy requirement is satisfied. As the exact tag size ๐‘›๐‘– for category ๐ถ๐‘– is unknown, then, given the estimated value of tag size ๐‘›ห†๐‘–, it is possible to have false negative error and false positive error in verifying the population constraint. Therefore, it is essential to guarantee that the false negative/positive rate is below ๐›ฝ, that is: ๐‘ƒ ๐‘Ÿ[๐‘›ห†๐‘– < ๐‘กโˆฃ๐‘›๐‘– โ‰ฅ ๐‘ก] < ๐›ฝ, (2) ๐‘ƒ ๐‘Ÿ[๐‘›ห†๐‘– โ‰ฅ ๐‘กโˆฃ๐‘›๐‘– < ๐‘ก] < ๐›ฝ. (3) In the Top-k Query Problem, we use the definition of the probabilistic threshold top-๐‘˜ query (PT-Topk query), i.e., in regard to the tag size, only the set of categories where each takes a probability of at least 1 โˆ’ ๐›ฝ to be in the top-๐‘˜ list are illustrated in the histogram, while the accuracy requirement is satisfied. Much like the iceberg query problem, as the exact tag size ๐‘›๐‘– for category ๐ถ๐‘– is unknown, then, given the estimated value of tag size ๐‘›ห†๐‘–, it is possible to have false negative error and false positive error in verifying the population constraint, the following constraint must be satisfied: ๐‘ƒ ๐‘Ÿ[๐ถ๐‘–is regarded out of top-๐‘˜ listโˆฃ๐ถ๐‘– โˆˆ top-๐‘˜ list] < ๐›ฝ, (4) ๐‘ƒ ๐‘Ÿ[๐ถ๐‘–is regarded in top-๐‘˜ listโˆฃ๐ถ๐‘– โˆˆ/ top-๐‘˜ list] < ๐›ฝ. (5) In this paper, we aim to propose a series of novel solutions to tackle the above problems, while satisfying the following properties: (1) Time-efficient. (2) Simple for the tag side in the protocol. (3) Complies with the EPCglobal C1G2 standards. V. USE ENSEMBLE SAMPLING TO COLLECT HISTOGRAMS When collecting the histograms over a large number of categories, the objective is to minimize the overall scanning time while the corresponding accuracy/population constraints are satisfied. Two straightforward solutions are summarized as follows: (1) Basic Tag Identification: The histogram is collected by uniquely identifying each tag from the massive tag set and putting it into the corresponding categories, thus the accuracy is 100%, and (2) Separate Counting: The histogram is collected by estimating the number of tags in each category one by one. Specifically, for each category, the reader broadcasts a Select command to activate the tags in the specified category. According to the replies from the specified tags, the estimators like [4][12][15] can be used to estimate the tag size for each category. As the rough tag size for each category cannot be predicted, a fixed initial frame size is used for each category. Both of the above two solutions are not time-efficient. In regard to the basic tag identification, uniquely identifying each tag in the massive set is not scalable, for as the tag size grows into a huge number, the scanning time can be an unacceptable value. In regard to the separated counting, the reader needs to scan each category with at least one query cycle, even if the category is a minor category, which is not necessarily addressed in the iceberg query or the top-๐‘˜ query. As the number of categories ๐‘š can be fairly large, e.g., in the order of hundreds, the Select command and the fixed initial frame size for each category, as well as the inter-cycle overhead among a large number of query cycles, make the overall scanning time rather large. Therefore, we consider an ensemble sampling-based estima๏ฟพtion scheme as follows: select a certain number of categories and issue a query cycle, obtain the empty/singleton/collision slots, and then estimate the tag size for each of the categories according to the sampling in the singleton slots. In this way, the ensemble sampling is more preferred than the separate counting in terms of reading performance. Since more tags are involved in one query cycle, more slots amortize the cost of inter-cycle overhead, the Select command, as well as the fixed initial frame size. Thus, the overall scanning time can be greatly reduced. A. The Estimator SE In the slotted ALOHA-based protocol, besides the empty slots and the collision slots, the singleton slots can be obtained. In the ensemble sampling-based estimation, according to the observed statistics of the empty/singleton/collision slots, we can use estimators in [4][12][15] etc. to estimate the overall tag size. Then, according to the response in each singleton slot, the category ID is obtained from the first ๐‘  bits in the tag ID. Based on the sampling from the singleton slots, the tag size for each category can be estimated. The reason is as follows: Assume that there exists ๐‘š categories ๐ถ1, ๐ถ2, ..., ๐ถ๐‘š, the overall tag size is ๐‘›, and the tag size for each category is ๐‘›1, ๐‘›2, ..., ๐‘›๐‘š. We define an indicator variable ๐‘‹๐‘–,๐‘— to denote whether one tag of category ๐ถ๐‘– selects a slot ๐‘— inside the frame with the size ๐‘“. We set ๐‘‹๐‘–,๐‘— = 1 if only one tag of category ๐ถ๐‘– selects the slot ๐‘—, and ๐‘‹๐‘–,๐‘— = 0 otherwise. Moreover, we use ๐‘ƒ ๐‘Ÿ[๐‘‹๐‘–,๐‘— = 1] to denote the probability that only one tag of category ๐ถ๐‘– selects the slot ๐‘—, then, ๐‘ƒ ๐‘Ÿ[๐‘‹๐‘–,๐‘— = 1] = 1 ๐‘“ โ‹… (1 โˆ’ 1 ๐‘“ ) ๐‘›โˆ’1 โ‹… ๐‘›๐‘–. If we use ๐‘›๐‘ ,๐‘– to denote the number of singleton slots selected by tags of category ๐ถ๐‘–, thus ๐‘›๐‘ ,๐‘– = โˆ‘๐‘“ ๐‘—=1 ๐‘‹๐‘–,๐‘— , then, the expected value ๐ธ(๐‘›๐‘ ,๐‘–) = โˆ‘ ๐‘“ ๐‘—=1 ๐‘ƒ ๐‘Ÿ[๐‘‹๐‘–,๐‘— = 1] = (1 โˆ’ 1 ๐‘“ ) ๐‘›โˆ’1 โ‹… ๐‘›๐‘–. Furthermore, let ๐‘›๐‘  denote the number of singleton slots, the expected value ๐ธ(๐‘›๐‘ ) = (1 โˆ’ 1 ๐‘“ )๐‘›โˆ’1 โ‹… ๐‘›. Then, ๐ธ(๐‘›๐‘ ,๐‘–) ๐ธ(๐‘›๐‘ ) = ๐‘›๐‘– ๐‘› . Thus we can approximate the tag size of category ๐ถ๐‘– as follows: ๐‘›ห†๐‘– = ๐‘›๐‘ ,๐‘– ๐‘›๐‘  โ‹… ๐‘›. ห† (6) Here, ๐‘›ห† is the estimated value of the overall tag size. Let ๐›ผห†๐‘– = ๐‘›๐‘ ,๐‘– ๐‘›๐‘  , then ๐‘›ห†๐‘– = ๐›ผห†๐‘– โ‹… ๐‘›ห†. B. Accuracy Analysis 1) Accuracy of the SE Estimator: In the ensemble sampling๏ฟพbased estimation, since the estimators such as [4][12][15] can be utilized for estimating the overall number of tags, we use ๐›ฟ
<<ๅ‘ไธŠ็ฟป้กตๅ‘ไธ‹็ฟป้กต>>
©2008-็Žฐๅœจ cucdc.com ้ซ˜็ญ‰ๆ•™่‚ฒ่ต„่ฎฏ็ฝ‘ ็‰ˆๆƒๆ‰€ๆœ‰