Probabilistic Optimal Tree Hopping for RFID Identification Muhammad Shahzad and Alex X.Liu Department of Computer Science and Engineering Michigan State University East Lansing,Michigan,USA [shahzadm,alexliu@cse.msu.edu ABSTRACT 1.INTRODUCTION Radio Frequency Identification (RFID)systems are widely used in various applications such as supply chain manage- 1.1 Background and Problem Statement ment,inventory control,and object tracking.Identifying As the cost of commercial RFID tags,which is as low as RFID tags in a given tag population is the most fundamental 5 cents per tag 21,has become negligible compared to the operation in RFID systems.While the Tree Walking(TW) prices of the products to which they are attached,RFID sys- protocol has become the industrial standard for identifying tems are being increasingly used in various applications such RFID tags,little is known about the mathematical nature as supply chain management [13],indoor localization [18], of this protocol and only some ad-hoc heuristics exist for 3D positioning [27],object tracking [17],inventory control, optimizing it.In this paper,first,we analytically model the electronic toll collection,and access control [5,16].For ex- TW protocol,and then using that model,propose the Tree ample,Walmart has started to use RFID tags to track jeans Hopping(TH)protocol that optimizes TW both theoreti- and underwear for better inventory control.An RFID sys- cally and practically.The key novelty of TH is to formulate tem consists of tags and readers.A tag is a microchip com- tag identification as an optimization problem and find the bined with an antenna in a compact package that has limited optimal solution that ensures the minimal average number computing power and communication range.There are two of queries.With this solid theoretical underpinning,for dif- types of tags:(1)passive tags,which do not have their own ferent tag population sizes ranging from 100 to 100K tags, power source,are powered up by harvesting the radio fre- TH significantly outperforms the best prior tag identifica- quency energy from readers,and have communication ranges tion protocols on the metrics of the total number of queries often less than 20 feet;(2)active tags.which come with their per tag,the total identification time per tag,and the average own power sources and have relatively longer communica- number of responses per tag by an average of 50%,10%,and tion ranges.A reader has a dedicated power source with 30%,respectively,when tag IDs are uniformly distributed in significant computing power.RFID systems mostly work in the ID space,and of 26%,37%,and 26%,respectively,when a query-response fashion where a reader transmits queries tag IDs are non-uniformly distributed. to a set of tags and the tags respond with their IDs over a shared wireless medium. Categories and Subject Descriptors This paper addresses the fundamental RFID tag identifi- cation problem,namely reading all IDs of a given set of tags. C.2.1 [Computer-Communication Networks):Network which is needed in almost all RFID systems.Because tags Architecture and Design -Wireless communication;C.2.8 respond over a shared wireless medium,tag identification Mobile Computing]:Algorithm Design and Analysis protocols are also called collision arbitration,tag singula- tion,or tag anti-collision protocols.Tag identification pro- tocols need to be scalable as the number of tags that need to General Terms be identified could be as large as tens of thousands with the Algorithms.Design,Performance,Experimentation increasing adoption of RFID tags.An RFID system with a large number of tags may require multiple readers with overlapping regions.In this paper,we first focus on the sin- Keywords gle reader version of the tag identification problem and then RFID:Tags:Identification extend our solution to the multiple reader problem. 1.2 Summary and Limitations of Prior Art The industrial standard.EPCGlobal Class 1 Generation 2 Permission to make digital or hard copies of all or part of this work for (C1G2)RFID 9,adopted two tag identification protocols personal or classroom use is granted without fee provided that copies are namely framed slotted Aloha and Tree Walking (TW).In not made or distributed for profit or commercial advantage and that copies framed slotted Aloha.a reader first broadcasts a value f to bear this notice and the full citation on the first page.To copy otherwise,to the tags in its vicinity where f represents the number of time republish,to post on servers or to redistribute to lists,requires prior specific permission and/or a fee. slots present in a forthcoming frame.Then each tag whose S/GMETR/CS'/3.June 17-21,2013.Pittsburgh,PA,USA inventory bit is 0 randomly picks a time slot in the frame Copyright2013ACM978-1-4503-1900-3/13/06..$15.00. and replies during that slot.Each C1G2 compliant tag has 293
Probabilistic Optimal Tree Hopping for RFID Identification Muhammad Shahzad and Alex X. Liu Department of Computer Science and Engineering Michigan State University East Lansing, Michigan, USA {shahzadm, alexliu}@cse.msu.edu ABSTRACT Radio Frequency Identification (RFID) systems are widely used in various applications such as supply chain management, inventory control, and object tracking. Identifying RFID tags in a given tag population is the most fundamental operation in RFID systems. While the Tree Walking (TW) protocol has become the industrial standard for identifying RFID tags, little is known about the mathematical nature of this protocol and only some ad-hoc heuristics exist for optimizing it. In this paper, first, we analytically model the TW protocol, and then using that model, propose the Tree Hopping (TH) protocol that optimizes TW both theoretically and practically. The key novelty of TH is to formulate tag identification as an optimization problem and find the optimal solution that ensures the minimal average number of queries. With this solid theoretical underpinning, for different tag population sizes ranging from 100 to 100K tags, TH significantly outperforms the best prior tag identification protocols on the metrics of the total number of queries per tag, the total identification time per tag, and the average number of responses per tag by an average of 50%, 10%, and 30%, respectively, when tag IDs are uniformly distributed in the ID space, and of 26%, 37%, and 26%, respectively, when tag IDs are non-uniformly distributed. Categories and Subject Descriptors C.2.1 [Computer-Communication Networks]: Network Architecture and Design – Wireless communication; C.2.8 [Mobile Computing]: Algorithm Design and Analysis General Terms Algorithms, Design, Performance, Experimentation Keywords RFID; Tags; Identification Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMETRICS’13, June 17-21, 2013, Pittsburgh, PA, USA. Copyright 2013 ACM 978-1-4503-1900-3/13/06 ...$15.00. 1. INTRODUCTION 1.1 Background and Problem Statement As the cost of commercial RFID tags, which is as low as 5 cents per tag [21], has become negligible compared to the prices of the products to which they are attached, RFID systems are being increasingly used in various applications such as supply chain management [13], indoor localization [18], 3D positioning [27], object tracking [17], inventory control, electronic toll collection, and access control [5, 16]. For example, Walmart has started to use RFID tags to track jeans and underwear for better inventory control. An RFID system consists of tags and readers. A tag is a microchip combined with an antenna in a compact package that has limited computing power and communication range. There are two types of tags: (1) passive tags, which do not have their own power source, are powered up by harvesting the radio frequency energy from readers, and have communication ranges often less than 20 feet; (2) active tags, which come with their own power sources and have relatively longer communication ranges. A reader has a dedicated power source with significant computing power. RFID systems mostly work in a query-response fashion where a reader transmits queries to a set of tags and the tags respond with their IDs over a shared wireless medium. This paper addresses the fundamental RFID tag identifi- cation problem, namely reading all IDs of a given set of tags, which is needed in almost all RFID systems. Because tags respond over a shared wireless medium, tag identification protocols are also called collision arbitration, tag singulation, or tag anti-collision protocols. Tag identification protocols need to be scalable as the number of tags that need to be identified could be as large as tens of thousands with the increasing adoption of RFID tags. An RFID system with a large number of tags may require multiple readers with overlapping regions. In this paper, we first focus on the single reader version of the tag identification problem and then extend our solution to the multiple reader problem. 1.2 Summary and Limitations of Prior Art The industrial standard, EPCGlobal Class 1 Generation 2 (C1G2) RFID [9], adopted two tag identification protocols, namely framed slotted Aloha and Tree Walking (TW). In framed slotted Aloha, a reader first broadcasts a value f to the tags in its vicinity where f represents the number of time slots present in a forthcoming frame. Then each tag whose inventory bit is 0 randomly picks a time slot in the frame and replies during that slot. Each C1G2 compliant tag has 293
(a)Nodes visited using TW protocol (b)Nodes visited using TH protocol Figure 1:Identifying a population of 9 tags over ID space of 2 using Tree Walking and Tree Hopping an inventory bit,which is initialized to be 0.In any slot, nodes in the binary tree.Although several heuristics have if exactly one tag responds,the reader successfully gets the been proposed to reduce the number of visits to collision ID of that tag and issues a command to the tag to change nodes 15,19,all these heuristics based methods are not its inventory bit to 1.The key limitation of framed slotted guaranteed to minimize such futile visits.Prior Aloha-TW Aloha is that it can not identify large tag populations due to hybrid protocols also have this limitation. the finite possible size of f,which is typically no more than 512 [23].Qian et al.have shown that framed slotted Aloha 1.3 System Model is most efficient when f is equal to the number of tags 20 As most commercially available tags and readers already Therefore.although theoretically any arbitrarily large tag comply with the C1G2 standard,we do not assume changes population can be identified by indefinitely increasing the to either tags or the physical protocol that they use to com- frame size,practically this is infeasible because during the municate with readers.We assume that readers can be repro- entire identification process,Aloha based protocols require grammed to adopt new tag identification software.For reli- all tags,including those that have been identified,to stay able tag identification,we are given the probability of suc- powered up and listen to all the messages from the reader in cessful query-response communication between the reader order to maintain the value of the inventory bit.This results and a tag. in high instability because any intermittent loss of power at a tag will set its inventory bit back to 0,leading the tag to 1.4 Proposed Approach contend in the subsequent frame.The instability of Aloha To address the fundamental limitations that lie in the based protocols has formally been proven by Rosenkrantz heuristic nature of prior TW based protocols,we propose and Towsley in [22]. a new approach to tag identification called Tree Hopping TW is a fundamental multiple access protocol,which was (TH).The key novel idea of TH is to formulate the tag iden- first invented by U.S.Army for testing soldiers for syphilis tification problem as an optimization problem and find the during World War II [4].TW was proposed as an RFID optimal solution that ensures the minimal expected number tag identification protocol by Law et al.in [12].In TW,a of queries(i.e.,nodes visited on the binary tree).In TH,we reader first queries 0 and all the tags whose IDs start with first quickly estimate the tag population size.Second,based 0 respond.If result of the query is a successful read (i.e., on the estimated tag population size,we calculate the opti- exactly one tag responds)or an empty read (i.e.,no tag mal level to start tree traversal so that the expected number responds),the reader queries 1 and all the tags whose IDs of queries is minimal,hop directly to the left most node start with 1 respond.If the result of the query is a collision, on that level,and then perform DFT on the subtree rooted the reader generates two new query strings by appending a at that node.Third,after that subtree is traversed,we re 0 and a 1 to the previous query string and queries the tags estimate the size of remaining unidentified tag population. with these new query strings.All the tags whose IDs start re-calculate the new optimal level,hop directly to the new with the new query string respond.This process continues optimal node,and perform DFT on the subtree rooted at until all the tags have been identified.This identification that node.Hopping to optimal nodes in this manner skips a process is essentially a partial Depth First Traversal(DFT) large number of collision nodes.This process continues un- on the complete binary tree over the tag ID space,and the til all the tags have been identified.Figure 1(b)shows the actual traversal forms a binary tree where the leaf nodes nodes traversed by TH for the same population of 9 tags as represent successful or empty reads and the internal nodes in Figure 1(a).Here a skipped node is one that TW visits represent collisions.Figure 1(a)shows the tree walking pro- but TH does not.We can see that TH traverses 11 nodes to cess for identifying 9 tags over a tag ID space of size 2.Here identify these 9 tags.In comparison,TW traverses 16 nodes a successful read node is one that an identification proto- as shown in Figure 1(a).This difference scales significantly col visits and there is exactly one tag in the subtree rooted as tag population size increases. at this node,an empty read node is one that an identifica- tion protocol visits and there is no tag in the subtree rooted 1.4.1 Population Size Estimation at this node,and a collision node is one that an identifi- TH first uses a framed slotted Aloha based method to cation protocol visits and there are more than one tags in quickly estimate the tag population size.For this,TH re- the subtree rooted at this node.The key limitation of TW quires each tag to respond to the reader with a probability based protocols is that they visit a large number of collision q.As C1G2 compliant tags do not support this probabilistic 294
(3,0) (3,3) (3,4) (3,5) (3,6) (3,7) 11 12 14 15 6 (3,2) 3 4 7 (3,1) 10 13 2 5 9 16 1 8 1 (2,0) (2,1) (2,2) (2,3) (4,0) (1,0) (1,1) (4,1) (4,2) (4,3) (4,4) (4,5) (4,6) (4,7) (4,8) (4,9) (4,10) (4,11) (4,12) (4,13) (4,14) (4,15) Tag Skipped Node (0,0) 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 (3,0) (3,3) (3,4) (3,5) (3,6) (3,7) 6 7 9 10 3 (3,2) 1 2 4 (3,1) 5 8 11 (2,0) (2,1) (2,2) (2,3) (4,0) (1,0) (1,1) (4,1) (4,2) (4,3) (4,4) (4,5) (4,6) (4,7) (4,8) (4,9) (4,10) (4,11) (4,12) (4,13) (4,14) (4,15) Empty Read Collision Successful Read (0,0) 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 (a) Nodes visited using TW protocol (b) Nodes visited using TH protocol 1 Figure 1: Identifying a population of 9 tags over ID space of 2 4 using Tree Walking and Tree Hopping an inventory bit, which is initialized to be 0. In any slot, if exactly one tag responds, the reader successfully gets the ID of that tag and issues a command to the tag to change its inventory bit to 1. The key limitation of framed slotted Aloha is that it can not identify large tag populations due to the finite possible size of f, which is typically no more than 512 [23]. Qian et al. have shown that framed slotted Aloha is most efficient when f is equal to the number of tags [20]. Therefore, although theoretically any arbitrarily large tag population can be identified by indefinitely increasing the frame size, practically this is infeasible because during the entire identification process, Aloha based protocols require all tags, including those that have been identified, to stay powered up and listen to all the messages from the reader in order to maintain the value of the inventory bit. This results in high instability because any intermittent loss of power at a tag will set its inventory bit back to 0, leading the tag to contend in the subsequent frame. The instability of Aloha based protocols has formally been proven by Rosenkrantz and Towsley in [22]. TW is a fundamental multiple access protocol, which was first invented by U.S. Army for testing soldiers for syphilis during World War II [4]. TW was proposed as an RFID tag identification protocol by Law et al. in [12]. In TW, a reader first queries 0 and all the tags whose IDs start with 0 respond. If result of the query is a successful read (i.e., exactly one tag responds) or an empty read (i.e., no tag responds), the reader queries 1 and all the tags whose IDs start with 1 respond. If the result of the query is a collision, the reader generates two new query strings by appending a 0 and a 1 to the previous query string and queries the tags with these new query strings. All the tags whose IDs start with the new query string respond. This process continues until all the tags have been identified. This identification process is essentially a partial Depth First Traversal (DFT) on the complete binary tree over the tag ID space, and the actual traversal forms a binary tree where the leaf nodes represent successful or empty reads and the internal nodes represent collisions. Figure 1(a) shows the tree walking process for identifying 9 tags over a tag ID space of size 24 . Here a successful read node is one that an identification protocol visits and there is exactly one tag in the subtree rooted at this node, an empty read node is one that an identification protocol visits and there is no tag in the subtree rooted at this node, and a collision node is one that an identifi- cation protocol visits and there are more than one tags in the subtree rooted at this node. The key limitation of TW based protocols is that they visit a large number of collision nodes in the binary tree. Although several heuristics have been proposed to reduce the number of visits to collision nodes [15, 19], all these heuristics based methods are not guaranteed to minimize such futile visits. Prior Aloha-TW hybrid protocols also have this limitation. 1.3 System Model As most commercially available tags and readers already comply with the C1G2 standard, we do not assume changes to either tags or the physical protocol that they use to communicate with readers. We assume that readers can be reprogrammed to adopt new tag identification software. For reliable tag identification, we are given the probability of successful query-response communication between the reader and a tag. 1.4 Proposed Approach To address the fundamental limitations that lie in the heuristic nature of prior TW based protocols, we propose a new approach to tag identification called Tree Hopping (TH). The key novel idea of TH is to formulate the tag identification problem as an optimization problem and find the optimal solution that ensures the minimal expected number of queries (i.e., nodes visited on the binary tree). In TH, we first quickly estimate the tag population size. Second, based on the estimated tag population size, we calculate the optimal level to start tree traversal so that the expected number of queries is minimal, hop directly to the left most node on that level, and then perform DFT on the subtree rooted at that node. Third, after that subtree is traversed, we reestimate the size of remaining unidentified tag population, re-calculate the new optimal level, hop directly to the new optimal node, and perform DFT on the subtree rooted at that node. Hopping to optimal nodes in this manner skips a large number of collision nodes. This process continues until all the tags have been identified. Figure 1(b) shows the nodes traversed by TH for the same population of 9 tags as in Figure 1(a). Here a skipped node is one that TW visits but TH does not. We can see that TH traverses 11 nodes to identify these 9 tags. In comparison, TW traverses 16 nodes as shown in Figure 1(a). This difference scales significantly as tag population size increases. 1.4.1 Population Size Estimation TH first uses a framed slotted Aloha based method to quickly estimate the tag population size. For this, TH requires each tag to respond to the reader with a probability q. As C1G2 compliant tags do not support this probabilistic 294
responding,we implement this by"virtually"extending the tree of (2,2),which is wasteful as all the tags in the subtree frame sizetimes.To estimate the tag population size,the of(2,2)have already been identified.Similarly,if there had reader announces a frame size of I but terminates it after the been a third leftmost node on the new optimal level and if first slot.The reader issues several single-slot frames while TH starts the DFT from that third left most node.it will not reducing q with a geometric distribution (i.e.,q=in visit the subtree of(2,3),resulting in tag (4,13)not being the i-th frame)until the reader gets an empty slot.Suppose identified.To avoid both scenarios,i.e.,some subtrees being the empty slot occurred in the i-th frame,TH estimates the traversed multiple times and some subtrees with tags not tag population size to be 1.2897 x 2-2 based on Flajolet being traversed,after the optimal level is recalculated,TH and Martin's algorithm used in databases 6. hops to the root of the largest subtree that can contain the next tag to be identified but does not contain any previously 1.4.2 Finding Optimal Level identified tag.The level at which this root is located can not To determine the optimal level Yop that TH directly hops be smaller than the new optimal level.For the example in to,we first calculate the expected number of nodes that Figure 1(b),after the subtree rooted at node (2,2)has been TH will visit if it starts DFTs from nodes on any given traversed,the recalculated optimal level is 1 and the next level y.Let b be the number of bits in each tag ID(which node that TH hops to is(2,3). is 64 for C1G2 compliant tags),then,we have 1 <y< Our experimental results in Figure 2 show that when the b.If y is small,more collision nodes will be visited while tags are not uniformly distributed in the ID space,our tech- if it is large,more empty read nodes will be visited.Our nique of dynamically adjusting Yp according to the left- objective is to calculate an optimal level Yop that will result over population size significantly reduces the total number in the smallest number of nodes visited.To find Yop,we first of queries and the average number of responses per tag. derive the expression for calculating the expected number The two curves "TH w re-estimation-Seg"and "TH w/o re of nodes visited by TH if TH directly hops to level y.Then estimation-Seg"show the total number of queries needed. we calculate the value of which minimizes this expression. respectively,with and without the dynamic adjustment of This value of y is the value of optimal level Yp.We present Yop for non-uniformly distributed tag IDs.For example,for the technical details of finding Yop in Section 3. 10K tags,this dynamic level adjustment reduces the total number of queries by 31.5%.Our experimental results in Fig- 1.4.3 Population Size Re-estimation ure 2 also show that when the tags are uniformly distributed If the tags that we want to identify are uniformly dis- in the ID space,there is no need to dynamically adjust Yop. tributed in the ID space [0,2-1],then performing DFTs The two curves“TH w re-estimation-Uni”and“THw/ore from each node on level Yop will result in minimum num- estimation-Uni"show the total number of queries needed. ber of nodes visited.However,in reality,the tags may not respectively,with and without the dynamic adjustment for be uniformly distributed.In such cases,each time when the uniformly distributed tag IDs.These two curves are quite DFT of a subtree is finished,TH needs to re-estimate the close because for uniformly distributed tag IDs,Yop does total tag population size to find the next optimal level and not usually change after each DFT and thus the benefit of the hoping destination node.TH performs the re-estimation dynamically adjusting Yop is relatively small. as follows.Let z be the first tag population size estimated using the Aloha based method,c be the number of tags that have been identified,and s be the size of the tag ID space TH w/o re-estimation-Seq OTH w re-estimation-Sec covered by the nodes visited.Naturally,z-r is an estimate ATH w/o re-estimation-Un of the remaining tag population size;however,we cannot use TH w re-estimation-Unl this estimate to calculate the next optimal level because the remaining leftover ID space may not form a complete binary tree.Instead,based on the node density in the remaining ID space,TH extrapolates the total tag population size to be and uses it to find the next hopping destination node.Note that if tags are uniformly distributed,we have 10 10 ×2=2 #18s 1.4.4 Finding Hopping Destination Figure 2:Impact of re-estimation Each time after a DFT is done and the new optimal level is recalculated,TH needs to find the next node to hop to which may not be the leftmost node on the optimal level 2. RELATED WORK Consider the example shown in Figure 1(b).Assuming a uni- We review existing tag identification protocols,which can form distribution,the optimal level to start the DFT is 3.In be classified as nondeterministic,deterministic.or hybrid this paper,we use(,p)to denote the ph node on level l.TH protocols. performs DFTs on the subtrees of nodes (3,0)to (3,5)and identifies 8 out of 9 tags.Based on the number of remaining 2.1 Nondeterministic Identification Protocols tags after the last DFT,which is 1,the optimal level for the Existing such protocols are either based on framed slot- next hop is changed from 3 to 1.However,if TH starts the ted Aloha [28]or Binary Splitting(BS)[2].As we discussed DFT from the leftmost node on level 1,which is (1,0),it above.Aloha based protocols only work for small tag popu- will result in identifying all tags in its subtree again which lations.In BS[2],the identification process starts with the is wasteful.Similarly,if TH starts the DFT from the second reader asking the tags to respond.If more than one tag eftmost node on level 1,which is (1,1),it will visit the sub- responds,BS divides and subdivides the population into 295
responding, we implement this by “virtually” extending the frame size 1 q times. To estimate the tag population size, the reader announces a frame size of 1 q but terminates it after the first slot. The reader issues several single-slot frames while reducing q with a geometric distribution (i.e., q = 1 2i−1 in the i-th frame) until the reader gets an empty slot. Suppose the empty slot occurred in the i-th frame, TH estimates the tag population size to be 1.2897 × 2 i−2 based on Flajolet and Martin’s algorithm used in databases [6]. 1.4.2 Finding Optimal Level To determine the optimal level γop that TH directly hops to, we first calculate the expected number of nodes that TH will visit if it starts DFTs from nodes on any given level γ. Let b be the number of bits in each tag ID (which is 64 for C1G2 compliant tags), then, we have 1 ≤ γ ≤ b. If γ is small, more collision nodes will be visited while if it is large, more empty read nodes will be visited. Our objective is to calculate an optimal level γop that will result in the smallest number of nodes visited. To find γop, we first derive the expression for calculating the expected number of nodes visited by TH if TH directly hops to level γ. Then we calculate the value of γ which minimizes this expression. This value of γ is the value of optimal level γop. We present the technical details of finding γop in Section 3. 1.4.3 Population Size Re-estimation If the tags that we want to identify are uniformly distributed in the ID space [0, 2 b − 1], then performing DFTs from each node on level γop will result in minimum number of nodes visited. However, in reality, the tags may not be uniformly distributed. In such cases, each time when the DFT of a subtree is finished, TH needs to re-estimate the total tag population size to find the next optimal level and the hoping destination node. TH performs the re-estimation as follows. Let z be the first tag population size estimated using the Aloha based method, x be the number of tags that have been identified, and s be the size of the tag ID space covered by the nodes visited. Naturally, z − x is an estimate of the remaining tag population size; however, we cannot use this estimate to calculate the next optimal level because the remaining leftover ID space may not form a complete binary tree. Instead, based on the node density in the remaining ID space, TH extrapolates the total tag population size to be z−x 2b−s × 2 b and uses it to find the next hopping destination node. Note that if tags are uniformly distributed, we have z−x 2b−s × 2 b = z. 1.4.4 Finding Hopping Destination Each time after a DFT is done and the new optimal level is recalculated, TH needs to find the next node to hop to, which may not be the leftmost node on the optimal level. Consider the example shown in Figure 1(b). Assuming a uniform distribution, the optimal level to start the DFT is 3. In this paper, we use (l, p) to denote the p th node on level l. TH performs DFTs on the subtrees of nodes (3, 0) to (3, 5) and identifies 8 out of 9 tags. Based on the number of remaining tags after the last DFT, which is 1, the optimal level for the next hop is changed from 3 to 1. However, if TH starts the DFT from the leftmost node on level 1, which is (1, 0), it will result in identifying all tags in its subtree again which is wasteful. Similarly, if TH starts the DFT from the second leftmost node on level 1, which is (1, 1), it will visit the subtree of (2, 2), which is wasteful as all the tags in the subtree of (2, 2) have already been identified. Similarly, if there had been a third leftmost node on the new optimal level and if TH starts the DFT from that third left most node, it will not visit the subtree of (2, 3), resulting in tag (4, 13) not being identified. To avoid both scenarios, i.e., some subtrees being traversed multiple times and some subtrees with tags not being traversed, after the optimal level is recalculated, TH hops to the root of the largest subtree that can contain the next tag to be identified but does not contain any previously identified tag. The level at which this root is located can not be smaller than the new optimal level. For the example in Figure 1(b), after the subtree rooted at node (2, 2) has been traversed, the recalculated optimal level is 1 and the next node that TH hops to is (2, 3). Our experimental results in Figure 2 show that when the tags are not uniformly distributed in the ID space, our technique of dynamically adjusting γop according to the leftover population size significantly reduces the total number of queries and the average number of responses per tag. The two curves “TH w re-estimation-Seq” and “TH w/o reestimation-Seq” show the total number of queries needed, respectively, with and without the dynamic adjustment of γop for non-uniformly distributed tag IDs. For example, for 10K tags, this dynamic level adjustment reduces the total number of queries by 31.5%. Our experimental results in Figure 2 also show that when the tags are uniformly distributed in the ID space, there is no need to dynamically adjust γop. The two curves “TH w re-estimation-Uni” and “TH w/o reestimation-Uni” show the total number of queries needed, respectively, with and without the dynamic adjustment for uniformly distributed tag IDs. These two curves are quite close because for uniformly distributed tag IDs, γop does not usually change after each DFT and thus the benefit of dynamically adjusting γop is relatively small. 102 103 104 1 1.5 2 2.5 3 # Tags # Queries / # Tags TH w/o re−estimation − Seq TH w re−estimation − Seq TH w/o re−estimation − Uni TH w re−estimation − Uni Figure 2: Impact of re-estimation 2. RELATED WORK We review existing tag identification protocols, which can be classified as nondeterministic, deterministic, or hybrid protocols. 2.1 Nondeterministic Identification Protocols Existing such protocols are either based on framed slotted Aloha [28] or Binary Splitting (BS) [2]. As we discussed above, Aloha based protocols only work for small tag populations. In BS [2], the identification process starts with the reader asking the tags to respond. If more than one tag responds, BS divides and subdivides the population into 295
smaller groups until each group has only one or no tag. This process of random subdivision incurs a lot of collisions. Table 1:Notations Furthermore.BS requires the tags to perform operations Symbol Description that are not supported by the C1G2 standard.ABS is a BS b of bits in tag ID,which is 64 for C1G2 tags based protocol that is designed for continuous identification n size of the whole ID space,which is 20 of tags 14. node whose top-to-down vertical level is (L,p) 1≤l≤b and left-.to-right horizontal position 2.2 Deterministic Identification Protocols is0≤p≤2b-1 estimated number of tags in the population There are 3 such protocols:(1)the basic TW protocol 12 level from which TH performs DFTs (2)the Adaptive Tree Walking (ATW)protocol 24,and (3) Yop optimal level to perform DFTs the TW-based Smart Trend Traversal (STT)protocol 19 0 tag response probability used in estimation ATW is an optimized version of TW that always starts DFTs I(L.p) indicator random variable,1 if (l,p)is visited from the level of log z,where z is the size of tag population. T random variable for total of nodes visited This is the traditional wisdom for optimizing TW.The key ET expected of nodes visited to identify all tags limitation of ATW is that it is optimal only when all tag in the population IDs are evenly spaced in the ID space;however,this is often Pi(t.p) probability of visiting (l,p)if it is left child Pri.p) probability of visiting (l.p)if it is right child not true in real-world applications.In contrast,during the identification process,our TH protocol adaptively chooses size of ID space covered by the parent of the current node being visited the optimal level to hop to based on distribution of IDs.STT improves TW using some ad-hoc heuristics to select prefixes 不 of tags covered by the parent of the current node being visited for next queries based upon the type of response to previous Ps:Pe probabilities of successful read,collision.or queries.It assumes that the number of tags identified in the P empty read at parent of the current node past k queries is the same as the number of tags that will B repetitions of query to handle unreliable channel be identified in the next k queries.This may not be true in 9 actual probability of reading a tag u reality. required probability of reading a tag maximum of nodes visited to identify all tags 2.3 Hybrid Identification Protocols 0o of subtrees covering no tags 61 of subtrees covering 1 tag Hybrid protocols combine features from nondeterministic and deterministic protocols.There are two major such pro tocols:Multi slotted scheme with Assigned Slots (MAS)[15 3.1 Average Number of Queries and Adaptively Splitting-based Arbitration Protocol (ASAP) 20.MAS is a TW-based protocol in which each tag that Let random variable T denote the total number of nodes matches the reader's query picks up one of the f time slots that TH visits to identify all tags.Note that each node visit to respond.For large populations,due to the finite prac- corresponds to one reader query.We next calculate E[T. tical size of f 512,for queries corresponding to higher Let I(l.p)be an indicator random variable whose value is levels in the binary tree,the response in each of the f slots 1 if and only if node (I,p)is visited.Thus,T is the sum of is most likely a collision.which increases the identification I(I.p)for all l and all p. time.ASAP divides and subdivides the tag population until 14 the size of each subset is below a certain threshold and then T I(L.p) (1) applies Aloha on each subset.For this,ASAP requires tags l=1p=0 to be able to pick slots using a geometric distribution,which makes it incompliant with the C1G2 standard.Furthermore, Let P[(l,p)}be the probability that TH visits node (l,p). subdividing the population before the actual identification Thus,ET can be expressed as follows: is in itself very time consuming. b2-1 3. OPTIMAL TREE HOPPING P{(Lp)} (2) l=1p=0 After quick population size estimation using Flajolet and Martin's algorithm [6],TH needs to find the optimal level Next,we focus on expressing P{(,p)}using variable 7, to hop to.First,we derive an expression to calculate the where y denotes the level that TH hops to.Recall that TH expected number of queries (i.e.,the number of nodes that skips all nodes on levels from 1 to y-1 and performs DET TH will visit)if it starts DFTs from the nodes on level y,as- on each of the2'nodes on level,where1≤y≤b. suming that tags are uniformly distributed in the ID space. Note that the root node of the whole binary tree is al- Second,as the derived expression is too complex to calculate ways meaningless to visit as it corresponds to a query of the optimal value of y that minimizes the expected number length 0.Here P{(l,p)}is calculated differently depending of queries by simply differentiating the expression with re- on whether node (1,p)is the left child of its parent or the spect to y,we present a numerical method to calculate the right.Let P{(,p)}and P{(l,p)}denote the probability optimal level Yop.If tags are not uniformly distributed,each of visiting (I,p)when (l,p)is the left and right child of its time when the DFT on a node is completed,as stated in Sec- parent,respectively.If the estimated total number of tags tion 1.4.TH re-estimates the total population size based on z is zero,then P{(l,p))=P(l,p)}=0 for all I and p. the initial estimate and the number of tags that have been Below we assume z>0.As TH skips all nodes from levels identified,re-calculates the new optimal level,and finds the 1 to y-1,we have hopping destination node.Table 1 summarizes the symbols P(,p)=P,p)=0if1<I<Y (3) used in this paper. 296
smaller groups until each group has only one or no tag. This process of random subdivision incurs a lot of collisions. Furthermore, BS requires the tags to perform operations that are not supported by the C1G2 standard. ABS is a BS based protocol that is designed for continuous identification of tags [14]. 2.2 Deterministic Identification Protocols There are 3 such protocols: (1) the basic TW protocol [12], (2) the Adaptive Tree Walking (ATW) protocol [24], and (3) the TW-based Smart Trend Traversal (STT) protocol [19]. ATW is an optimized version of TW that always starts DFTs from the level of log z, where z is the size of tag population. This is the traditional wisdom for optimizing TW. The key limitation of ATW is that it is optimal only when all tag IDs are evenly spaced in the ID space; however, this is often not true in real-world applications. In contrast, during the identification process, our TH protocol adaptively chooses the optimal level to hop to based on distribution of IDs. STT improves TW using some ad-hoc heuristics to select prefixes for next queries based upon the type of response to previous queries. It assumes that the number of tags identified in the past k queries is the same as the number of tags that will be identified in the next k queries. This may not be true in reality. 2.3 Hybrid Identification Protocols Hybrid protocols combine features from nondeterministic and deterministic protocols. There are two major such protocols: Multi slotted scheme with Assigned Slots (MAS) [15] and Adaptively Splitting-based Arbitration Protocol (ASAP) [20]. MAS is a TW-based protocol in which each tag that matches the reader’s query picks up one of the f time slots to respond. For large populations, due to the finite practical size of f ≤ 512, for queries corresponding to higher levels in the binary tree, the response in each of the f slots is most likely a collision, which increases the identification time. ASAP divides and subdivides the tag population until the size of each subset is below a certain threshold and then applies Aloha on each subset. For this, ASAP requires tags to be able to pick slots using a geometric distribution, which makes it incompliant with the C1G2 standard. Furthermore, subdividing the population before the actual identification is in itself very time consuming. 3. OPTIMAL TREE HOPPING After quick population size estimation using Flajolet and Martin’s algorithm [6], TH needs to find the optimal level to hop to. First, we derive an expression to calculate the expected number of queries (i.e., the number of nodes that TH will visit) if it starts DFTs from the nodes on level γ, assuming that tags are uniformly distributed in the ID space. Second, as the derived expression is too complex to calculate the optimal value of γ that minimizes the expected number of queries by simply differentiating the expression with respect to γ, we present a numerical method to calculate the optimal level γop. If tags are not uniformly distributed, each time when the DFT on a node is completed, as stated in Section 1.4, TH re-estimates the total population size based on the initial estimate and the number of tags that have been identified, re-calculates the new optimal level, and finds the hopping destination node. Table 1 summarizes the symbols used in this paper. Table 1: Notations Symbol Description b # of bits in tag ID, which is 64 for C1G2 tags n size of the whole ID space, which is 2b (l, p) node whose top-to-down vertical level is 1 ≤ l ≤ b and left-to-right horizontal position is 0 ≤ p ≤ 2 b − 1 z estimated number of tags in the population γ level from which TH performs DFTs γop optimal level to perform DFTs q tag response probability used in estimation I(l, p) indicator random variable, 1 if (l, p) is visited T random variable for total # of nodes visited E[T] expected # of nodes visited to identify all tags in the population Pl {(l, p)} probability of visiting (l, p) if it is left child Pr {(l, p)} probability of visiting (l, p) if it is right child m size of ID space covered by the parent of the current node being visited k # of tags covered by the parent of the current node being visited Ps, Pc, probabilities of successful read, collision, or Pe empty read at parent of the current node β repetitions of query to handle unreliable channel g actual probability of reading a tag u required probability of reading a tag V maximum # of nodes visited to identify all tags θ0 # of subtrees covering no tags θ1 # of subtrees covering 1 tag 3.1 Average Number of Queries Let random variable T denote the total number of nodes that TH visits to identify all tags. Note that each node visit corresponds to one reader query. We next calculate E[T ]. Let I(l, p) be an indicator random variable whose value is 1 if and only if node (l, p) is visited. Thus, T is the sum of I(l, p) for all l and all p. T = Xb l=1 2Xl−1 p=0 I(l, p) (1) Let P {(l, p)} be the probability that TH visits node (l, p). Thus, E[T ] can be expressed as follows: E[T ] = Xb l=1 2Xl−1 p=0 P {(l, p)} (2) Next, we focus on expressing P {(l, p)} using variable γ, where γ denotes the level that TH hops to. Recall that TH skips all nodes on levels from 1 to γ − 1 and performs DFT on each of the 2γ nodes on level γ, where 1 ≤ γ ≤ b. Note that the root node of the whole binary tree is always meaningless to visit as it corresponds to a query of length 0. Here P {(l, p)} is calculated differently depending on whether node (l, p) is the left child of its parent or the right. Let Pl {(l, p)} and Pr {(l, p)} denote the probability of visiting (l, p) when (l, p) is the left and right child of its parent, respectively. If the estimated total number of tags z is zero, then Pl {(l, p)} = Pr {(l, p)} = 0 for all l and p. Below we assume z > 0. As TH skips all nodes from levels 1 to γ − 1, we have Pl {(l, p)} = Pr {(l, p)} = 0 if 1 ≤ l < γ (3) 296
As TH performs DFT from each node on level y,it visits n-罗>z-1,then the probability that TH visits(亿,p)is each node on this level.Thus,we have equal to the probability that (l,p-1)is not an empty read, B{(L,p)}=P,{(l,p)}=1ifl=Y (4) which is 1-()/(2)based on Equation (6).Finally,we have For each remaining level yz-1 (10) right child of its parent,if the parent is a collision node ifn-受≤z-1 and (l,p-1)is an empty read node,then (p)will also be a collision node.Thus,instead of visiting (p),TH should directly hop to the left child of(,p).Therefore,P(l,p)}is Case 2:n-m=z-1. equal to the probability that the parent of(I,p)is a collision In this case,.z-k≤n-m=z-l,which means k≥1. node and (l,p-1)is not an empty read node. As the parent of(l,p)covers k 1 tags,the probability Let k denote the number of tags covered by the parent of of the parent of(l,p)being an empty read is 0 and the node (l,p)(i.e.,the number of tags that are in the subtree rooted at the parent of (,p).Let mdenote the probability of the parent of(l,p)being a successful read is m(-T)/(g)=m(-)/(g)=m/(g)based on Equation maximum number of tags that the parent of(亿,p)can cover (7).If (p)is the left child of its parent,then TH visits it and n=2 denote the number of tags in the whole ID space. if and only if the parent of(I,p)is a collision node.Thus, The probability that the parent of(l,p)covers k of z tags the probability of visiting(l,p)is equal to the probability of follows a hypergeometric distribution: the parent of(l,p)being a collision node,which is equal to )二 1-P-P.Thus,we have P(#tags=k}= () (5) A{,p}=1-P-B=1-百 n (11) Let P be the probability that the parent of(l,p)is an empty read.Thus, If (l,p)is the right child of its parent,then TH visits it if and only if both the parent of (I,p)is a collision node and (1,p-1) n一m P.=P{#tags =0}= (6) is not an empty read.The probability that the parent of(,p) is a collision node is 1-m/()as calculated above.Given that the parent of (l,p)is a collision node,the probability Let P be the probability that the parent of(l,p)is a suc- cessful read.Thus. that(亿,p-1)is an empty read is(-)-受)/(g)-m P,P{#tags=1)= m(c) (7) (12) g Let Pe be the probability that the parent of(l,p)is a collision node.Thus, Case 3.n-m>z-1. In this case,k 0.Similar to the above calculations, Pe=1-(P+P)=1- (m)m("m) (8) based on Equations (6)and (7),we have: (G) Next we calculate P{(l,p)}and P{(,p)}for y<I<b B{Lp)}=1-P-P=1- )+m( (13) for the following three cases:n-m<z-1,n-m =2-1. () and n-m z-1.Note that n-m is the size of the ID space that is not covered by the parent of (l,p),and z-k P{L,p)}= 1 )+m(二 is the remaining number of tags that are not covered by the parent of (p).Thus,z-k<n-m. x1- 二)-{m)+受(二] (14) Case 1:n-m<z-1. -{m)+m(-} In this case,.z-k≤n-m<z-l,which means k≥2. Finally,Equations (3)through(14)completely define the Thus,as the parent of(l,p)covers at least two tags,it must probabilities P{(,p)}and P {(l,p)}.Note that as tags are be a collision node,i.e.P=1.Thus,if (l,p)is the left child uniformly distributed,the probability of visiting node(l,p) of its parent,TH for sure visits it: is independent of the horizontal position p. B{(0,p)}=1 (9) The expected number of queries can now be calculated using Theorem 1. If(,p)is the right child of its parent,TH visits it if and only if node (l,p-1),which is the left sibling of (l,p),is THEOREM 1.For a population of z tags uniformly dis- not an empty read.If (,p-1)is an empty read,as its tributed in the ID space,where each tag has an ID of b bits, parent is a collision node,(t,p)must also be a collision node, if TH hops to level y to perform DET from each node on which means that TH will directly visit the left child of this level,the erpected number of queries for identifying all (1,p)instead of (l,p).The size of the ID space covered by z tags is: (L,p-1)is受.Ifn-受≤z-l,then node(l,p-1)covers at least one tag,which means that (l,p-1)is not an empty E四=2+∑2-[B{,p}+P{0,pH(15) read and TH for sure visits (l,p),i.e.,P.{(l,p)}=1.If l=y+1 297
As TH performs DFT from each node on level γ, it visits each node on this level. Thus, we have Pl {(l, p)} = Pr {(l, p)} = 1 if l = γ (4) For each remaining level γ z − 1. Note that n − m is the size of the ID space that is not covered by the parent of (l, p), and z − k is the remaining number of tags that are not covered by the parent of (l, p). Thus, z − k ≤ n − m. Case 1: n − m z − 1, then the probability that TH visits (l, p) is equal to the probability that (l, p − 1) is not an empty read, which is 1 − n− m 2 z / n z based on Equation (6). Finally, we have Pr {(l, p)} = 1 − ( n− m 2 z ) ( n z ) if n − m 2 > z − 1 1 if n − m 2 ≤ z − 1 (10) Case 2: n − m = z − 1. In this case, z − k ≤ n − m = z − 1, which means k ≥ 1. As the parent of (l, p) covers k ≥ 1 tags, the probability of the parent of (l, p) being an empty read is 0 and the probability of the parent of (l, p) being a successful read is m n−m z−1 / n z = m z−1 z−1 / n z = m/n z based on Equation (7). If (l, p) is the left child of its parent, then TH visits it if and only if the parent of (l, p) is a collision node. Thus, the probability of visiting (l, p) is equal to the probability of the parent of (l, p) being a collision node, which is equal to 1 − Pe − Ps. Thus, we have Pl {(l, p)} = 1 − Pe − Ps = 1 − mn z (11) If (l, p) is the right child of its parent, then TH visits it if and only if both the parent of (l, p) is a collision node and (l, p−1) is not an empty read. The probability that the parent of (l, p) is a collision node is 1 − m/n z as calculated above. Given that the parent of (l, p) is a collision node, the probability that (l, p−1) is an empty read is n− m 2 z − m 2 / n z −m . Pr {(l, p)} = 1 − mn z . 1 − n− m 2 z − m 2 n z − m (12) Case 3. n − m > z − 1. In this case, k ≥ 0. Similar to the above calculations, based on Equations (6) and (7), we have: Pl {(l, p)} = 1 − Pe − Ps = 1 − n−m z + m n−m z−1 n z (13) Pr {(l, p)} = 1 − n−m z + m n−m z−1 n z × 1 − n− m 2 z − n−m z + m 2 n−m z−1 n z − n−m z + m n−m z−1 (14) Finally, Equations (3) through (14) completely define the probabilities Pl {(l, p)} and Pr {(l, p)}. Note that as tags are uniformly distributed, the probability of visiting node (l, p) is independent of the horizontal position p. The expected number of queries can now be calculated using Theorem 1. Theorem 1. For a population of z tags uniformly distributed in the ID space, where each tag has an ID of b bits, if TH hops to level γ to perform DFT from each node on this level, the expected number of queries for identifying all z tags is: E[T ] = 2γ + Xb l=γ+1 2 l−1 [Pl {(l, p)} + Pr {(l, p)}] (15) 297
20 00 10 x10 Tags x10 Figure 3:Norm.E[T]vs.Figure 4:Expected Figure 5:Maximum Figure 6:Expected population size Vy queries:TH vs.TW queries:TH vs.TW queries of Reliable TH PROOF.First.on level y,all the 27 nodes are visited by TH.Second,on any level I where y+1 2 tags with b-bit interval [ci,ci+1)in which z lies and then using Yop =i. IDs using Y=Yop.We have The solid line in Figure 3 is plotted using the values of yop V≤z(b-op+1)-2op+20o-01(6-op-1)(16) obtained using the proposed strategy.As values of ci only depend on b,it is a one time cost to calculate them.Table where 2 tabulates the values of ci obtained using this strategy for 00=2p b=64. 0b-ToP We next conduct an analytical comparison between the expected number of queries for TH and that for TW.Figure 4 shows the expected number of queries for TH,which is cal- PROOF.Let Vrw denote the number of queries that TW culated using Equation(15)using y=Yop,and that for TW, may need to identify a>2 tags with b-bit IDs.The upper which is calculated using Equation(15)using y=1,for 64 bound of Vrw is given as follows(proven in [12]): 298
0 200 400 600 800 1000 1 1.5 2 2.5 # Tags Expected # queries / # Tags •←γ = 1 •←γ = 2 •←γ = 3 •←γ = 4 •←γ = 5 γ = 6 γ = 7 γ = 8 γ = 9 γ = 10 γ opt Figure 3: Norm. E[T ] vs. population size ∀γ 1 2 3 4 5 x 104 0 5 10 15x 104 Expected # queries # Tags TH TW Figure 4: Expected # queries: TH vs. TW 1 2 3 4 5 x 104 0 0.5 1 1.5 2 2.5 3 x 106 # Tags Max # queries TH TW Figure 5: Maximum # queries: TH vs. TW 1 2 3 4 5 x 104 0 0.5 1 1.5 2 2.5 3 x 105 # Tags Expected # queries Reliable TH with optimization Reliable TH without optimization Figure 6: Expected # queries of Reliable TH Proof. First, on level γ, all the 2γ nodes are visited by TH. Second, on any level l where γ+1 ≤ l ≤ b, the probabilities of left and right nodes being visited are Pl {(l, p)} and Pr {(l, p)} respectively. As there are 2l−1 pairs of left and right nodes on level l, the expected number of nodes visited by TH on level l is 2l−1 [Pl {(l, p)} + Pr {(l, p)}]. When γ = 1, Equation (15) is also the analytical model for calculating expected number of queries of TW protocol. 3.2 Calculating Optimal Hopping Level Equation (15) shows that E[T ] is a function of γ as n = 2b , m = 2b−l+1, and b is given. For any given z, we want to find the optimal level γ = γop so that E[T ] is minimal. The conventional approach to finding the optimal variable value that minimizes a given function is to differentiate the function with respect to that variable, equate the resulting expression to zero, and solve the equation to obtain the optimal variable value. However, it is very difficult, if not impossible, to use this approach to find the optimal level because Equation (15) for calculating E[T ] is too complex. Next, we present a numerical method to find the optimal level. First, we define normalized E[T ] as the ratio of E[T ] to tag population size. Figure 3 shows the plots of normalized E[T ] vs. the number of tags for different γ values ranging from 1 to b (here we used b = 10 for illustration). From this figure, we observe that for any tag population size, there is a unique optimal value of γ. For example, for a population of 600 tags, γop = 9. Second, we define crossover points as follows: for a given ID length b, the crossover points are the tag population sizes c0 = 0, c1, c2, · · · , cb+1 = 2b such that for any tag population size in [ci, ci+1) (0 ≤ i ≤ b), γop = i. These crossover points are essentially the x-coordinates of the intersection points of the normalized E[T ] curves of consecutive values of γ in Figure 3. Thus, the value of ci can be obtained by putting z = ci and numerically solving E[T, γ = i − 1] = E[T, γ = i] for ci using the bisection method. Once ci is calculated for each 1 ≤ i ≤ b, γop for a given z can be obtained by simply identifying the unique interval [ci, ci+1) in which z lies and then using γop = i. The solid line in Figure 3 is plotted using the values of γop obtained using the proposed strategy. As values of ci only depend on b, it is a one time cost to calculate them. Table 2 tabulates the values of ci obtained using this strategy for b = 64. We next conduct an analytical comparison between the expected number of queries for TH and that for TW. Figure 4 shows the expected number of queries for TH, which is calculated using Equation (15) using γ = γop, and that for TW, which is calculated using Equation (15) using γ = 1, for 64 Table 2: All crossover points for 64-bit tag IDs c0 0.00E+00 c22 3.84E+06 c44 1.61E+13 c1 2.00E+00 c23 7.68E+06 c45 3.22E+13 c2 4.00E+00 c24 1.54E+07 c46 6.45E+13 c3 7.00E+00 c25 3.07E+07 c47 1.29E+14 c4 1.50E+01 c26 6.12E+07 c48 2.58E+14 c5 2.90E+01 c27 1.22E+08 c49 5.16E+14 c6 5.90E+01 c28 2.42E+08 c50 1.03E+15 c7 1.17E+02 c29 4.76E+08 c51 2.06E+15 c8 2.35E+02 c30 9.22E+08 c52 4.13E+15 c9 4.69E+02 c31 1.73E+09 c53 8.25E+15 c10 9.38E+02 c32 3.04E+09 c54 1.65E+16 c11 1.88E+03 c33 8.59E+09 c55 3.30E+16 c12 3.75E+03 c34 1.72E+10 c56 6.59E+16 c13 7.51E+03 c35 3.01E+10 c57 1.32E+17 c14 1.50E+04 c36 6.44E+10 c58 2.63E+17 c15 3.00E+04 c37 1.25E+11 c59 5.24E+17 c16 6.00E+04 c38 2.53E+11 c60 1.04E+18 c17 1.20E+05 c39 5.03E+11 c61 2.04E+18 c18 2.40E+05 c40 1.01E+12 c62 3.96E+18 c19 4.80E+05 c41 2.01E+12 c63 7.42E+18 c20 9.61E+05 c42 4.03E+12 c64 1.30E+19 c21 1.92E+06 c43 8.06E+12 c65 1.84E+19 bit tag IDs. We observe that TH significantly outperforms TW for the expected number of queries. For example, for a population of 10K tags, the expected number of queries for TH is only 54% of that for TW. We will present detailed experimental comparison between TH and other identification protocols in Section 5. 3.3 Maximum Number of Queries Although the primary goal of our TH protocol is to minimize the average number of queries, next, we analyze the maximum number of queries of TH and analytically show that it is still smaller than that of TW. The maximum number of queries that TH may need to identify z tags with b-bit IDs is shown in Theorem 2. Theorem 2. Let V denote the number of queries that TH may need to identify a population of z ≥ 2 tags with b-bit IDs using γ = γop. We have V ≤ z(b − γop + 1) − 2 γop + 2θ0 − θ1(b − γop − 1) (16) where θ0 = 2γop − l z 2 b−γop m θ1 = l z 2 b−γop m − l z − 1 2 b−γop ml1 − γop b m Proof. Let VTW denote the number of queries that TW may need to identify z ≥ 2 tags with b-bit IDs. The upper bound of VTW is given as follows (proven in [12]): 298
Algorithm 1:IdentifyRFIDTags(b) rw≤z6+1-log)-1 (17) Input:Tag ID size b in bits Output:IDs of all the tags IDa Because z≥2,we have VTw≤z(b+1)-1. 1 zinit:=EstimateNumberofTags() When z tags are uniformly distributed in the ID space. 2i1D=0 3 s:=0/counts number of leaf nodes covered/ TH essentially performs TW on all subtrees rooted at nodes 4 :=0/*counts number of tags identified on level Yop.Let 0o and 01 denote the number of subtrees covering 0 and 1 tags,respectively.For these 00+01 subtrees, 61:=0 TH only visits the roots,which are at level Yop.Let a denote 7p=0 the number of remaining subtrees (i.e.,a =2%P-00- s while s≤2bdo YopCalculate Yop using the method of Section 3.2 01)and Ti denote a subtree covering zi >2 tags.For each 10 (I.p):=FindHopDest(Yop,b,(I.p)) subtree Ti,the maximum number of nodes that TH visits is 11 {位,IDs=TreeWalking(L,p) zi(b-Top+1)-1.Summing all 2Top subtrees,we have 12 IDs(iID iD+IDs -1]:IDs 13 i :=i+IDs -1 14 x=x+立 vs∑ (2(6-op+1)-1+00+01 15 8=8+2b-1 i三0 16 2=×2 =2(b-Yop+1)-2ap+200-0(6-1op-1)(18) 17 return IDs The right hand side(RHS)of Equation (18)is maximized when 0o is maximized and 0 is minimized,which happens when all z tag IDs are contiguous and they start from the Algorithm 2:EstimateNumberofTags() left most leaf of a subtree at level Yop.In this case,the Input:None number of subtrees with tags are and therefore Output:Initial estimate zinit of population size 00 =2TOP- 20-Yp Furthermore in this case,when Yop s 3p4=1 b-1,there is at most one subtree at level Yop that has exactly 4 repeat Provide the reader with f/p and a random seed Ri. one tag i.e.,1= 2-2 when Yop =b,01 Run Aloha on reader and get the response equals z.Combining the two cases of yop ((2」+1)2-p)-1them number of queries for TH is 93%of that for TW Algorithms 1 through 4 show the pseudocode of TH. =2op】 DISCUSSION G i=1 4.1 Reliable Tag Identification =p So far we have assumed that the communication channel else if Yop >I then between the reader and tags is reliable,which means that 10 each tag can receive the query from the reader and the reader L户=(p+1)(2-op)-1 can receive either the response if only one tag responds or 11 else if Top ==I then the collision if more than one tag respond.However,this 12 13 p=P assumption often does not hold in reality because wireless 14p=p+1 communication medium is inherently unreliable.There are 1s return (l,p) two existing schemes for making tag identification reliable. 299
VTW ≤ z(b + 1 − log z 2 ) − 1 (17) Because z ≥ 2, we have VTW ≤ z(b + 1) − 1. When z tags are uniformly distributed in the ID space, TH essentially performs TW on all subtrees rooted at nodes on level γop. Let θ0 and θ1 denote the number of subtrees covering 0 and 1 tags, respectively. For these θ0+θ1 subtrees, TH only visits the roots, which are at level γop. Let α denote the number of remaining subtrees (i.e., α = 2γop − θ0 − θ1) and Ti denote a subtree covering zi ≥ 2 tags. For each subtree Ti, the maximum number of nodes that TH visits is zi(b − γop + 1) − 1. Summing all 2γop subtrees, we have V ≤ αX−1 i=0 zi(b − γop + 1) − 1 + θ0 + θ1 = z(b − γop + 1) − 2 γop + 2θ0 − θ1(b − γop − 1) (18) The right hand side (RHS) of Equation (18) is maximized when θ0 is maximized and θ1 is minimized, which happens when all z tag IDs are contiguous and they start from the left most leaf of a subtree at level γop. In this case, the number of subtrees with tags are l z 2 b−γop m and therefore θ0 = 2γop − l z 2 b−γop m . Furthermore in this case, when γop ≤ b−1, there is at most one subtree at level γop that has exactly one tag i.e., θ1 = l z 2 b−γop m − l z−1 2 b−γop m ; when γop = b, θ1 equals z. Combining the two cases of γop ≤ b−1 and γop = b, we have θ1 = l z 2 b−γop m − l z−1 2 b−γop ml1 − γop b m . The proof above gives us the insight that TH requires fewer queries when the tag IDs are distributed more uniformly in the ID space. Intuitively, this makes sense because the more the tag IDs are distributed uniformly, the fewer the number of collisions encountered by TH. Experimentally, our results shown in Figures 7(a) and 7(b) in Section 5 also confirm this insight: for the same number of tags, the number of queries needed by TH when tags are uniformly distributed is less than that when tags are non-uniformly distributed. We now conduct an analytical comparison between the maximum number of queries for TH and that for TW. Figure 5 shows the maximum number of queries for TH, which is calculated using the RHS of Equation (16), and that for TW, which is calculated using the RHS of Equation (17), for 64 bit tag IDs. We observe that TH again outperforms TW for the maximum number of queries, although slightly. For example, for a population of 10K tags, the maximum number of queries for TH is 93% of that for TW. Algorithms 1 through 4 show the pseudocode of TH. 4. DISCUSSION 4.1 Reliable Tag Identification So far we have assumed that the communication channel between the reader and tags is reliable, which means that each tag can receive the query from the reader and the reader can receive either the response if only one tag responds or the collision if more than one tag respond. However, this assumption often does not hold in reality because wireless communication medium is inherently unreliable. There are two existing schemes for making tag identification reliable. Algorithm 1: IdentifyRFIDTags(b) Input: Tag ID size b in bits Output: IDs of all the tags IDs 1 zinit := EstimateNumberofTags() 2 iID := 0 3 s := 0 /*counts number of leaf nodes covered*/ 4 x := 0 /*counts number of tags identified*/ 5 z := zinit 6 l := 0 7 p := 0 8 while s ≤ 2 b do 9 γop := Calculate γop using the method of Section 3.2 10 (l, p) := FindHopDest(γop, b, (l, p)) 11 [˜x, IDs] := ˜ TreeWalking((l, p)) 12 IDs[iID : iID+ |IDs ˜ | − 1] := IDs ˜ 13 iID := iID+ |IDs ˜ | 14 x := x + ˜x 15 s := s + 2b−l 16 z := zinit−x 2b−s × 2 b 17 return IDs Algorithm 2: EstimateNumberofTags() Input: None Output: Initial estimate zinit of population size 1 f := 1 2 i := 1 3 pi := 1 4 repeat 5 Provide the reader with f/pi and a random seed Ri. 6 Run Aloha on reader and get the response. 7 if slot is not empty then 8 pi+1 := pi/2 9 i := i + 1 10 until slot is empty 11 zinit := 1.2897 × 2 i−2 12 return zinit Backes et al. proposed the scheme of letting each tag store the IDs of several other tags [1]. When the reader queries a tag, the tag transmits back its own ID as well as the IDs of other tags stored in it. When identification completes, the reader compares the set of IDs of tags that responded with Algorithm 3: FindHopDest(γop, b, (l, p)) Input: (1) Optimal level γop (2) Tag ID size b in bits (3) Current node location (l, p) Output: Hop destination node location (˜l, p˜) 1 if γop j p 2 l−γop k + 1 (2l−γop ) − 1 then 3 ˜l = γop 4 p˜ = j p 2 l−γop k 5 else 6 ˜l = l 7 p˜ = p 8 else if γop > l then 9 ˜l = γop 10 p˜ = (p + 1)(2l−γop ) − 1 11 else if γop == l then 12 ˜l = l 13 p˜ = p 14 p˜ = ˜p + 1 15 return (˜l, p˜) 299
Algorithm 4:TreeWalking((1,p)) The optimization technique of stop transmitting the query Input:Current node location(1,p) corresponding to a terminal on a successful read significantly Output:(1)Number of tags i identified in the subtree whose reduces the total number of queries.Figure 6 plots the ex- root node is (I,p) pected number of queries per tag for the reliable TH proto- (2)Identifiers IDs of the tags identified in the subtree col with and without this optimization.For example,for a whose root node is (l,p) 1i:=l population of 50000 tags,the number of queries per tag are 2p:=p reduced by 24%. D:=0 4.元=0 4.2 Continuous Scanning 5Ds=【] In some applications,the tag population may change over 6 while节<(p+1)×2-ldo time (i.e.,tags leave and join the population dynamically). 7 [status,tagI D]type of response to query (1,)/a query (亿,)means I bit representation of*/ We adapt the continuous scanning strategy proposed by collision then Myung et al.in 14.In the first scanning of the whole tag =2p population,TH records the queries that resulted in success- i:=i+1 ful or empty reads.If the tag population does not change else if status ==successful then by perfoming DFTs on the subtrees rooted at successful and 12 :=金+1 empty read nodes of the previous scan,TH experiences no 13 IDsD】:=tagID collision.If some new tags join the population,some of the 14 iD :=i+1 successful read nodes of the previous scan can now turn into 18 q=Σ°(L」-=) collision nodes and some empty read nodes can turn into 16 i:=1-q+1 successful or collision nodes.If some old tags leave the pop- 节=29T ulation,some successful read nodes will become empty read return [i,IDs] nodes.If any of the new empty read nodes happens to be a sibling of another empty read node,then TH discards these two nodes from the record and stores the location of their parent because the parent is also an empty read node.This the union of sets of IDs of other tags reported by each re- strategy works well when the tag population size either re- sponding tag.If the sets are not equal,the whole process is mains static or increases.However,when the tag population repeated again to ensure that the missed tags are identified. decreases,the best choice is to re-execute TH for the subse- This scheme has two weaknesses.First,this scheme does not quent scan. comply with the C1G2 standard.Second,it assumes that the tag population remains static for the lifetime of tags as 4.3 Multiple Readers each tag is hard coded with some other tags'IDs.The sec- An application with a large number of RFID tags requires ond scheme is to run an identification protocol on the same multiple readers with overlapping regions because a single population several times until probability of missing a tag reader can not cover all tags due to the short communication falls below a threshold 7,10.They estimate the probability range of tags (usually less than 20 feet).The use of multi- of missing a tag based upon the number of tags that were ple readers introduces several new types of collisions such as identified in some runs of the protocol but not in others. reader-reader collisions and reader-tag collisions.Such colli- While we can use the C1G2 compliant scheme proposed sions can be handled by reader scheduling protocols such as in [7,10]to make TH reliable,i.e.,repeatedly run TH until those proposed in [3,25,26,29].TH is compatible with all of the required reliability is achieved.We observe that in this these reader scheduling protocols. scheme,the leaf nodes in the binary tree are queried multiple times.This is wasteful of time for the nodes that the reader 5. PERFORMANCE COMPARISON successfully reads.To eliminate such waste,we propose to query each node multiple times,instead of querying the whole We implemented TH and all the 8 prior tag identification binary tree multiple times.We define the reliability of suc- protocols in Matlab,namely the 3 nondeterministic proto- cessfully reading a tag to be the probability that both the tag cols (Aloha [28.BS 2,and ABS 14),the 3 deterministic receives the query from the reader and the reader receives the protocols (TW 12,ATW 24,and STT [19),and the 2 hy- response from the tag.For this,we calculate the maximum brid protocols (MAS [15]and ASAP [20).As ATW starts number of times the reader should transmit a query,which DFTs from the level of log z which may not be a whole num- is denoted by B.Let g and u be the given and required reli- ber,we present results for ATW by both ceiling and flooring ability of successfully reading a tag,respectively.Thus,the the values of log z and representing them with ATW-c and probability of successfully identifying a tag is 1-(1-g) ATW-f respectively.In terms of implementation complexity, Equating it to u,we get: TH and all the 8 prior protocols are implemented in the sim- ilar number of lines of code.We performed extensive testing B=log(1-)(1-u) (19) both manually and automatically,to ensure the correctness of each protocol implementation. Our scheme of reliable tag identification works as follows: We performed the side-by-side comparison with TH,al- for each non-terminal node in the binary tree that TH needs though this comparison is not completely fair for TH for to visit,TH transmits a query corresponding to that node two reasons.First,3 of these 8 protocols (i.e.,BS.ABS B times:for each terminal node.TH keeps transmitting the and ASAP)require modification to tags and thus do not query corresponding to that node until either that query has work with standard C1G2 tags,whereas TH is fully com- been transmitted B times or the reader successfully receives pliant with C1G2.Second,for the framed slotted Aloha,to the tag ID. its best advantage,we choose the frame size to be the ideal 300
Algorithm 4: TreeWalking((l, p)) Input: Current node location (l, p) Output: (1) Number of tags ˜x identified in the subtree whose root node is (l, p) (2) Identifiers IDs of the tags identified in the subtree ˜ whose root node is (l, p) 1 ˜l := l 2 p˜ := p 3 iID := 0 4 x˜ := 0 5 IDs := [ ] ˜ 6 while p <˜ (p + 1) × 2 ˜l−l do 7 [status, tagID] = type of response to query (˜l, p˜) /*a query (˜l, p˜) means ˜l bit representation of ˜p*/ 8 if status == collision then 9 p˜ := 2˜p 10 ˜l := ˜l + 1 11 else if status == successful then 12 x˜ := ˜x + 1 13 IDs[ ˜ iID] := tagID 14 iID := iID + 1 15 q := Plog2 p˜ i=0 p˜ 2i − p˜−1 2i 16 ˜l := ˜l − q + 1 17 p˜ := p˜ 2q−1 18 return [˜x, IDs] ˜ the union of sets of IDs of other tags reported by each responding tag. If the sets are not equal, the whole process is repeated again to ensure that the missed tags are identified. This scheme has two weaknesses. First, this scheme does not comply with the C1G2 standard. Second, it assumes that the tag population remains static for the lifetime of tags as each tag is hard coded with some other tags’ IDs. The second scheme is to run an identification protocol on the same population several times until probability of missing a tag falls below a threshold [7,10]. They estimate the probability of missing a tag based upon the number of tags that were identified in some runs of the protocol but not in others. While we can use the C1G2 compliant scheme proposed in [7, 10] to make TH reliable, i.e., repeatedly run TH until the required reliability is achieved. We observe that in this scheme, the leaf nodes in the binary tree are queried multiple times. This is wasteful of time for the nodes that the reader successfully reads. To eliminate such waste, we propose to query each node multiple times, instead of querying the whole binary tree multiple times. We define the reliability of successfully reading a tag to be the probability that both the tag receives the query from the reader and the reader receives the response from the tag. For this, we calculate the maximum number of times the reader should transmit a query, which is denoted by β. Let g and u be the given and required reliability of successfully reading a tag, respectively. Thus, the probability of successfully identifying a tag is 1 − (1 − g) β . Equating it to u, we get: β = log(1−g)(1 − u) (19) Our scheme of reliable tag identification works as follows: for each non-terminal node in the binary tree that TH needs to visit, TH transmits a query corresponding to that node β times; for each terminal node, TH keeps transmitting the query corresponding to that node until either that query has been transmitted β times or the reader successfully receives the tag ID. The optimization technique of stop transmitting the query corresponding to a terminal on a successful read significantly reduces the total number of queries. Figure 6 plots the expected number of queries per tag for the reliable TH protocol with and without this optimization. For example, for a population of 50000 tags, the number of queries per tag are reduced by 24%. 4.2 Continuous Scanning In some applications, the tag population may change over time (i.e., tags leave and join the population dynamically). We adapt the continuous scanning strategy proposed by Myung et al. in [14]. In the first scanning of the whole tag population, TH records the queries that resulted in successful or empty reads. If the tag population does not change, by perfoming DFTs on the subtrees rooted at successful and empty read nodes of the previous scan, TH experiences no collision. If some new tags join the population, some of the successful read nodes of the previous scan can now turn into collision nodes and some empty read nodes can turn into successful or collision nodes. If some old tags leave the population, some successful read nodes will become empty read nodes. If any of the new empty read nodes happens to be a sibling of another empty read node, then TH discards these two nodes from the record and stores the location of their parent because the parent is also an empty read node. This strategy works well when the tag population size either remains static or increases. However, when the tag population decreases, the best choice is to re-execute TH for the subsequent scan. 4.3 Multiple Readers An application with a large number of RFID tags requires multiple readers with overlapping regions because a single reader can not cover all tags due to the short communication range of tags (usually less than 20 feet). The use of multiple readers introduces several new types of collisions such as reader-reader collisions and reader-tag collisions. Such collisions can be handled by reader scheduling protocols such as those proposed in [3, 25, 26, 29]. TH is compatible with all of these reader scheduling protocols. 5. PERFORMANCE COMPARISON We implemented TH and all the 8 prior tag identification protocols in Matlab, namely the 3 nondeterministic protocols (Aloha [28], BS [2], and ABS [14]), the 3 deterministic protocols (TW [12], ATW [24], and STT [19]), and the 2 hybrid protocols (MAS [15] and ASAP [20]). As ATW starts DFTs from the level of log z which may not be a whole number, we present results for ATW by both ceiling and flooring the values of log z and representing them with ATW-c and ATW-f respectively. In terms of implementation complexity, TH and all the 8 prior protocols are implemented in the similar number of lines of code. We performed extensive testing, both manually and automatically, to ensure the correctness of each protocol implementation. We performed the side-by-side comparison with TH, although this comparison is not completely fair for TH for two reasons. First, 3 of these 8 protocols (i.e., BS, ABS, and ASAP) require modification to tags and thus do not work with standard C1G2 tags, whereas TH is fully compliant with C1G2. Second, for the framed slotted Aloha, to its best advantage, we choose the frame size to be the ideal 300
Table 3:Comparison with Prior C1G2 Compliant Protocols (TH/Prior Art) Prior Nondeterministic Prior Deterministic Prior Hybrid Protocol Protocol (=Aloha) Protocols (=MAS) Max Min Mean Best Max Min Mean Max Min prior Mean #queries/tag 0.24 0.10 0.18 ATW-f 0.51 0.50 0.50 0.39 0.38 0.39 query time/tag 0.84 0.71 0.76 ATW-c 0.92 0.89 0.90 0.81 0.78 0.79 #responses/tag 0.35 0.59 0.69 AIW-c X 0.67 0.70 0.64 0.24 0.38 response fairness 1.15 1.10 1.13 TW 1.12 1.0 .1 1.12 1.07 1.10 #queries/tag 0.27 0.20 0.22 ATW-f 0.75 0.68 0.74 0.47 0.18 0.33 query time/tag 0.44 0.34 0.37 ATW-f 0.69 0.61 0.63 0.31 0.16 0.22 #responses/tag 0.85 0.42 0.68 ATW-c 0.86 0.58 0.74 0.59 0.30 0.48 response fairness 1.338 1.25 135 ATW-c .033 .00 102 1.05 0.95 1.02 size,which is equal to the tag population size,disregarding shown in Figures 7(a)to 8(b),for both uniform and non- the practical limitations on the frame sizes.We choose tag uniform distributions.Note that for non-uniform distribu- ID length to be the C1G2 standard 64 bits.We performed tions,we fix the tag population size to be 5000 and range the comparison for both the uniform case (where the tag the block size from 5 to 1000. population is uniformly distributed in the ID space)and the non-uniform case(where the tag population is not uniformly distributed in the ID space).For the uniform case,we range 5.1.1 Normalized Reader Queries tag population sizes from 100 to 100.000 to evaluate the TH reduces the normalized reader queries of the best prior scalability of these protocols.For the non-uniform case,we C1G2 compliant nondeterministic,deterministic,and hybrid distribute tag populations in blocks where each block is a tag identification protocols by an average of 82%,50%,and continuous sequence of tag IDs.We range block sizes from 5 61%,respectively,for uniformly distributed tag populations, to 1000.Our motivation for simulating non-uniform distri- and by an average of 78%,26%,and 67%,respectively,for bution in blocks is that in some applications,such as supply non-uniformly distributed tag populations.Figures 7(a)and chains,tag IDs often come in such blocks when they are 7(b)show the normalized reader queries of all protocols manufactured.For each tag population size,we run each for uniformly and non-uniformly distributed populations protocol 100 times and report the mean.We compare TH respectively.Based on these two figures,we make the fol- with prior protocols from both reader and tag perspectives lowing two observations from the perspective of normalized reader queries for both uniform and non-uniform distribu- 5.1 Reader Side Comparison tions.First,among all the 8 prior protocols,the traditional ATW protocol turns out to be the best.Second,the framed For the reader side,we compared TH with the 8 prior slotted Aloha in the C1G2 standard performs the worst even protocols based on the following two metrics:(1)normal- when we disregard the practical limitations on the frame ized reader queries and (2)identification speed.Normalized sizes.Although BS is the best among the 3 prior nondeter- reader queries is the ratio of the number of queries that the ministic tag identification protocols,it is not compliant with reader transmits to identify a tag population divided by the C1G2.Similarly,although ASAP is the best among the 2 number of tags in the population.Similarly,identification prior hybrid tag identification protocols,it is not compliant speed is the total time that the reader takes to identify a with C1G2. tag population divided by the number of tags in that popu- lation. In general,more queries implies more identification time 5.1.2 Identification Speed However,identification time is not strictly in proportion to TH improves the identification speed of the best prior C1G2 the number of queries because different queries may take compliant nondeterministic,deterministic,and hybrid tag different amounts of time.According to 8 and [9.the time identification protocols by an average of 24%,10%,and 21%. for a successful read t,an empty read te,and a collision te respectively,for uniformly distributed tag populations,and are 3ms,0.3ms,1.5ms,respectively. by an average of 63%,37%,and 78%,respectively,for non- For each metric,in Table 3,we show the value of TH di- uniformly distributed tag populations.Figures 8(a)and 8(b) vided by that for the best prior C1G2 compliant protocol show the identification speed of all protocols for uniformly for this metric in the corresponding category of nondeter- and non-uniformly distributed tag populations,respectively. ministic,deterministic,or hybrid.Note that the only prior Based on these two figures,we make the following two ob- C1G2 compliant nondeterministic tag identification proto- servations from the perspective of identification speed.First col is the framed slotted Aloha and the only prior C1G2 among all 8 prior protocols,the traditional ATW protocol compliant hybrid tag identification protocol is MAS.There turns out to be the best for both uniform and non-uniform are 3 prior C1G2 compliant deterministic tag identification distributions.Second,although framed slotted Aloha is the protocols:TW,ATW,and STT.We report min,max,and worst in terms of normalized reader queries,its identification mean for these ratios for tag populations ranging from 100 speed is not the worst.This is because in our experiments we to100.000. allow it to use unrealistically large frame sizes.which leads For the two metrics defined above,the absolute perfor- to many empty slots and empty read is much faster than mance of TH and all prior 8 tag identification protocols is successful read and collision. 301
Table 3: Comparison with Prior C1G2 Compliant Protocols (TH/Prior Art) Prior Nondeterministic Prior Deterministic Prior Hybrid Protocol Protocol (=Aloha) Protocols (=MAS) Max Min Mean Best Max Min Mean Max Min Mean prior Uniform #queries/tag 0.24 0.10 0.18 ATW-f 0.51 0.50 0.50 0.39 0.38 0.39 query time/tag 0.84 0.71 0.76 ATW-c 0.92 0.89 0.90 0.81 0.78 0.79 #responses/tag 0.85 0.59 0.69 ATW-c 0.85 0.67 0.70 0.64 0.24 0.38 response fairness 1.15 1.10 1.13 TW 1.12 1.07 1.11 1.12 1.07 1.10 Non-Uni #queries/tag 0.27 0.20 0.22 ATW-f 0.75 0.68 0.74 0.47 0.18 0.33 query time/tag 0.44 0.34 0.37 ATW-f 0.69 0.61 0.63 0.31 0.16 0.22 #responses/tag 0.85 0.42 0.68 ATW-c 0.86 0.58 0.74 0.59 0.30 0.48 response fairness 1.38 1.25 1.35 ATW-c 1.03 1.00 1.02 1.05 0.95 1.02 size, which is equal to the tag population size, disregarding the practical limitations on the frame sizes. We choose tag ID length to be the C1G2 standard 64 bits. We performed the comparison for both the uniform case (where the tag population is uniformly distributed in the ID space) and the non-uniform case (where the tag population is not uniformly distributed in the ID space). For the uniform case, we range tag population sizes from 100 to 100, 000 to evaluate the scalability of these protocols. For the non-uniform case, we distribute tag populations in blocks where each block is a continuous sequence of tag IDs. We range block sizes from 5 to 1000. Our motivation for simulating non-uniform distribution in blocks is that in some applications, such as supply chains, tag IDs often come in such blocks when they are manufactured. For each tag population size, we run each protocol 100 times and report the mean. We compare TH with prior protocols from both reader and tag perspectives. 5.1 Reader Side Comparison For the reader side, we compared TH with the 8 prior protocols based on the following two metrics: (1) normalized reader queries and (2) identification speed. Normalized reader queries is the ratio of the number of queries that the reader transmits to identify a tag population divided by the number of tags in the population. Similarly, identification speed is the total time that the reader takes to identify a tag population divided by the number of tags in that population. In general, more queries implies more identification time. However, identification time is not strictly in proportion to the number of queries because different queries may take different amounts of time. According to [8] and [9], the time for a successful read ts, an empty read te, and a collision tc are 3ms, 0.3ms, 1.5ms, respectively. For each metric, in Table 3, we show the value of TH divided by that for the best prior C1G2 compliant protocol for this metric in the corresponding category of nondeterministic, deterministic, or hybrid. Note that the only prior C1G2 compliant nondeterministic tag identification protocol is the framed slotted Aloha and the only prior C1G2 compliant hybrid tag identification protocol is MAS. There are 3 prior C1G2 compliant deterministic tag identification protocols: TW, ATW, and STT. We report min, max, and mean for these ratios for tag populations ranging from 100 to 100, 000. For the two metrics defined above, the absolute performance of TH and all prior 8 tag identification protocols is shown in Figures 7(a) to 8(b), for both uniform and nonuniform distributions. Note that for non-uniform distributions, we fix the tag population size to be 5000 and range the block size from 5 to 1000. 5.1.1 Normalized Reader Queries TH reduces the normalized reader queries of the best prior C1G2 compliant nondeterministic, deterministic, and hybrid tag identification protocols by an average of 82%, 50%, and 61%, respectively, for uniformly distributed tag populations, and by an average of 78%, 26%, and 67%, respectively, for non-uniformly distributed tag populations. Figures 7(a) and 7(b) show the normalized reader queries of all protocols for uniformly and non-uniformly distributed populations, respectively. Based on these two figures, we make the following two observations from the perspective of normalized reader queries for both uniform and non-uniform distributions. First, among all the 8 prior protocols, the traditional ATW protocol turns out to be the best. Second, the framed slotted Aloha in the C1G2 standard performs the worst even when we disregard the practical limitations on the frame sizes. Although BS is the best among the 3 prior nondeterministic tag identification protocols, it is not compliant with C1G2. Similarly, although ASAP is the best among the 2 prior hybrid tag identification protocols, it is not compliant with C1G2. 5.1.2 Identification Speed TH improves the identification speed of the best prior C1G2 compliant nondeterministic, deterministic, and hybrid tag identification protocols by an average of 24%, 10%, and 21%, respectively, for uniformly distributed tag populations, and by an average of 63%, 37%, and 78%, respectively, for nonuniformly distributed tag populations. Figures 8(a) and 8(b) show the identification speed of all protocols for uniformly and non-uniformly distributed tag populations, respectively. Based on these two figures, we make the following two observations from the perspective of identification speed. First, among all 8 prior protocols, the traditional ATW protocol turns out to be the best for both uniform and non-uniform distributions. Second, although framed slotted Aloha is the worst in terms of normalized reader queries, its identification speed is not the worst. This is because in our experiments we allow it to use unrealistically large frame sizes, which leads to many empty slots and empty read is much faster than successful read and collision. 301
十Aloha Aloha 55 ---TW STT ◆ABS -MAS X-MAS 50 -T TT -ABS ATW. AA日 4.5 -STT ATW ABS -A8AF -STT BS ,叠nW ●-ASAF ATW H -TH -TH 10 102 10 10 10 10 10 10 10 0 Block size (non #Tags (uniform distribution) Block size(non-uniform d (a)Uniform distribution (b)Non-unif.distribution (a)Uniform distribution (b)Non-unif.distribution Figure 7:Normalized Queries Figure 8:Identification Speed TH ASAP 12 0. MAS ABS MAS 17 BS h 0 0.65 Block s 10 10 (a)Uniform distribution (b)Non-unif.distribution (a)Uniform distribution (b)Non-unif.distribution Figure 9:Normalized Responses Figure 10:Response Fairness 5.2 Tag Side Comparison For normalized tag responses and response fairness,in Ta- On the tag side,we compare TH with the 8 prior protocols ble 3,we show the value of TH divided by that for the best based on the following four metrics:(1)normalized tag re- prior C1G2 compliant protocol in the corresponding cate- sponses,(2)response fairness,(3)normalized collisions,and gory of nondeterministic,deterministic,or hybrid.The ab- (4)normalized empty reads.Normalized tag responses is the solute performance of TH and all prior 8 tag identification ratio of sum of responses of all tags during the identification protocols is shown in Figures 9(a)to 11(b),for both uniform process to the number of tags in the population.Response and non-uniform distributions. fairness is the Jain's fairness index given by (1x)2 1 where 5.2.1 Normalized Tag Responses ri is the total number of responses by tag i 11.Normalized TH reduces the normalized tag responses of the best prior collisions is the ratio of total number of collisions during the C1G2 compliant nondeterministic,deterministic,and hybrid identification process to the number of tags in the popula- tag identification protocols by an average of 31%.30%.and tion.Normalized empty reads is the ratio of total number of 62%,respectively,for uniformly distributed tag populations, empty reads during the identification process to the number and by an average of 32%,26%,and 52%,respectively,for of tags in the population. non-uniformly distributed tag populations.Figures 9(a)and The first two metrics are important for active tags because 9(b)show the normalized tag responses of all protocols for active tags are powered by batteries.Lesser number of nor- uniformly and non-uniformly distributed tag populations malized tag responses mean lesser power consumption for respectively.We make following 3 observations from these active tags.Response fairness measures the variance in the two figures.First,the normalized tag responses of BS,ABS number of responses per tag.Less fairness results in the de- TW,MAS,and ASAP increase with increasing tag popula- pletion of the batteries of some tags more quickly compared tion size.Second,for non-uniformly distributed tag popula- to others.In large scale tag deployments,it is often nontriv- tions,the normalized tag responses of nondeterministic pro- ial to identify tags with depleted batteries and replace them. tocols is not affected by the block size because their perfor- Using an absolutely fair tag identification protocol,the bat- mance is independent of tag ID distribution.In contrast,the teries of all tags deplete at the same time and therefore all normalized tag responses of deterministic protocols slightly can be replaced at the same time.We use the Jain's fairness increase with increasing block size.Third,among all 8 prior metric defined in [11.For z tags,the fairness value is in the protocols,Aloha has the smallest number of normalized tag range [1].The higher this fairness value is,the more fair responses.This is because of the the unlimitedly large frame the protocol is.The second two metrics are important for sizes that we used for Aloha.With large frame sizes,tags ex- understanding these identification protocols. perience lesser collisions and thus reply fewer times 302
102 103 104 105 1 2 3 4 5 6 7 # Tags (uniform distribution) # Queries / # Tags Aloha STT MAS ASAP ABS BS TW ATW−c ATW−f TH (a) Uniform distribution 100 101 102 103 2 4 6 8 10 Block size (non−uniform distribution) # Queries / # Tags Aloha MAS TW STT ATW−c ASAP ABS BS ATW−f TH (b) Non-unif. distribution Figure 7: Normalized Queries 102 103 104 105 3.5 4 4.5 5 5.5 # Tags (uniform distribution) Time (ms) / # Tags ABS BS TW Aloha MAS STT ASAP ATW−f ATW−c TH (a) Uniform distribution 100 101 102 103 4 5 6 7 8 9 10 Block size (non−uniform distribution) Time (ms) / # Tags TW MAS Aloha ABS BS ATW−c STT ASAP ATW−f TH (b) Non-unif. distribution Figure 8: Identification Speed 102 103 104 105 2 4 6 8 10 12 14 16 # Tags (uniform distribution) # Tag responses / # Tags BS ABS TW ASAP MAS STT ATW−f ATW−c Aloha TH (a) Uniform distribution 100 101 102 103 0 5 10 15 20 Block size (non−uniform distribution) # Tag responses / # Tags BS TW ABS MAS STT ASAP ATW−f Aloha ATW−c TH (b) Non-unif. distribution Figure 9: Normalized Responses 102 103 104 105 0.65 0.7 0.75 0.8 0.85 # Tags (uniform distribution) Fairness TH ASAP ABS MAS TW BS ATW−f Aloha ATW−c STT (a) Uniform distribution 100 101 102 103 0.65 0.7 0.75 0.8 0.85 Block size (non−uniform distribution) Fairness TH MAS TW ATW−f ATW−c ABS BS ASAP Aloha STT (b) Non-unif. distribution Figure 10: Response Fairness 5.2 Tag Side Comparison On the tag side, we compare TH with the 8 prior protocols based on the following four metrics: (1) normalized tag responses, (2) response fairness, (3) normalized collisions, and (4) normalized empty reads. Normalized tag responses is the ratio of sum of responses of all tags during the identification process to the number of tags in the population. Response fairness is the Jain’s fairness index given by ( Pz i=1 xi) 2 z· Pz i=1 x2 i where xi is the total number of responses by tag i [11]. Normalized collisions is the ratio of total number of collisions during the identification process to the number of tags in the population. Normalized empty reads is the ratio of total number of empty reads during the identification process to the number of tags in the population. The first two metrics are important for active tags because active tags are powered by batteries. Lesser number of normalized tag responses mean lesser power consumption for active tags. Response fairness measures the variance in the number of responses per tag. Less fairness results in the depletion of the batteries of some tags more quickly compared to others. In large scale tag deployments, it is often nontrivial to identify tags with depleted batteries and replace them. Using an absolutely fair tag identification protocol, the batteries of all tags deplete at the same time and therefore all can be replaced at the same time. We use the Jain’s fairness metric defined in [11]. For z tags, the fairness value is in the range [ 1 z , 1]. The higher this fairness value is, the more fair the protocol is. The second two metrics are important for understanding these identification protocols. For normalized tag responses and response fairness, in Table 3, we show the value of TH divided by that for the best prior C1G2 compliant protocol in the corresponding category of nondeterministic, deterministic, or hybrid. The absolute performance of TH and all prior 8 tag identification protocols is shown in Figures 9(a) to 11(b), for both uniform and non-uniform distributions. 5.2.1 Normalized Tag Responses TH reduces the normalized tag responses of the best prior C1G2 compliant nondeterministic, deterministic, and hybrid tag identification protocols by an average of 31%, 30%, and 62%, respectively, for uniformly distributed tag populations, and by an average of 32%, 26%, and 52%, respectively, for non-uniformly distributed tag populations. Figures 9(a) and 9(b) show the normalized tag responses of all protocols for uniformly and non-uniformly distributed tag populations, respectively. We make following 3 observations from these two figures. First, the normalized tag responses of BS, ABS, TW, MAS, and ASAP increase with increasing tag population size. Second, for non-uniformly distributed tag populations, the normalized tag responses of nondeterministic protocols is not affected by the block size because their performance is independent of tag ID distribution. In contrast, the normalized tag responses of deterministic protocols slightly increase with increasing block size. Third, among all 8 prior protocols, Aloha has the smallest number of normalized tag responses. This is because of the the unlimitedly large frame sizes that we used for Aloha. With large frame sizes, tags experience lesser collisions and thus reply fewer times. 302