Counting rFld Tags efficiently and Anonymously Hao Hants,Bo Shengt,Chiu C.Tant,Qun Lit,Weizhen Maof,Sanglu Lus fCollege of William and Mary,Williamsburg,VA,USA Northeastern University,Boston,MA,USA SState Key Laboratory of Novel Software Technology,Nanjing University,China Email:fhhan,cct,liqun,wm}@cs.wm.edu,shengbo@ccs.neu.edu,sanglu@nju.edu.cn Abstract-Radio Frequency IDentification (RFID)technology Prior work in [12]and [13]considers this problem by using has attracted much attention due to its variety of applications,e.g. probabilistic estimation based on the framed-slotted ALOHA inventory control and object tracking.One important problem in model.Unfortunately,the scanning time can be considerably RFID systems is how to quickly estimate the number of distinct tags without reading each tag individually.This problem plays a long due to the large frame size required.The performance crucial role in many real-time monitoring and privacy-preserving becomes worse when the mobile tags appear dynamically so applications.In this paper,we present an efficient and anonymous that counting them at a fixed time instant is not possible.That scheme for tag population estimation.This scheme leverages the is because the tags have to be scanned independently with each position of the first reply from a group of tags in a frame.Results counting consuming a long time from mathematical analysis and extensive simulation demonstrate that our scheme outperforms other protocols proposed in the In this paper,we propose a novel scheme for the reader to previous work. quickly estimate the number of distinct tags within a required accuracy.Our scheme is based on a new distinct element I.INTRODUCTION counting method [14],without reading either the actual or Radio Frequency IDentification(RFID)technology is widely pseudo IDs.The main idea of our algorithm is to utilize used in monitoring applications such as inventory control and the position of the first reply from a group of tags in a object tracking [1]-[7].Small RFID tags,each with a unique frame to infer the number of tags.Theoretical analysis and ID,are attached to items under monitoring.An RFID reader extensive simulation show that our scheme outperforms earlier can remotely collect these IDs later for verification.Due to RFID tag estimation schemes.Moreover,our scheme tries to the large number of deployed RFID tags,collecting all tag optimize incremental counting in a mobile environment.Note IDs for verification is inefficient.Some real-time applications. that our approach has a general purpose of counting RFID tags. such as counting the number of tags in a shipping portal,need Combined with other commands,it can be flexibly adopted in more efficient techniques to manage tag data.In this paper,we various applications. consider the problem of efficiently and anonymously estimating Our contributions are summarized as follows the cardinality of a large set of RFID tags with a desired We propose a novel anonymous estimating scheme which accuracy. does not collect the ID from each RFID tag,but is still Efficient techniques for estimating the number of RFID able to estimate the number of tags accurately. tags are important for applications when the time window for We present estimators for both static and dynamic sets of collecting tag data is small.These applications include real- tags.The static set specifies a snapshot of a set of tags, time monitoring or managing a large quantity of products and the dynamic set considers that tags can join or leave For example,a warehouse operator may need to perform a the set with time.Both our estimators are more efficient quick estimation of the number of products left in stock.Such than the existing protocols,even when the cardinality of applications demand efficient estimating schemes instead of the the tag set varies across many orders of magnitude. slow and unnecessary process of reading every tag ID. We propose a novel send-and-reply protocol among the Anonymity is another important issue when dealing with reader and tags to improve performance RFID tags attached to uniquely identifiable items such as The rest of our paper is as follows.Section II contains passports [8]or driver's licenses [9].Either broadcasting tag the related work.Section III presents our problem definition IDs in the open,or revealing IDs to the RFID reader may leak and system model.Section IV outlines the main idea of our personal information.For instance,an adversary could capture schemes.Section V details the algorithms.Our schemes are the communication between the reader and tags or compromise evaluated in Section VI,and Section VII concludes. the reader to track users'activities.Identifying each tag ID increases individual security and privacy risks.An alternative II.RELATED WORK way of providing anonymity is to use cryptographic protocols For a reader to successfully identify every tag in proxim- to mask the actual ID [10],[11].However,the cryptographic ity,collision arbitration protocols must be considered so that techniques require additional modification to the tag hardware, replies from multiple tags will not be garbled due to collision. as well as increase the computational complexity on both tags Collision arbitration protocols are divided into two approaches: and readers. ALOHA-based [15]-[17]and tree-based [18]-[20].In the first
Counting RFID Tags Efficiently and Anonymously Hao Han†§, Bo Sheng‡ , Chiu C. Tan† , Qun Li† , Weizhen Mao† , Sanglu Lu§ †College of William and Mary, Williamsburg, VA, USA ‡Northeastern University, Boston, MA, USA §State Key Laboratory of Novel Software Technology, Nanjing University, China Email: †{hhan, cct, liqun, wm}@cs.wm.edu, ‡ shengbo@ccs.neu.edu, § sanglu@nju.edu.cn Abstract—Radio Frequency IDentification (RFID) technology has attracted much attention due to its variety of applications, e.g., inventory control and object tracking. One important problem in RFID systems is how to quickly estimate the number of distinct tags without reading each tag individually. This problem plays a crucial role in many real-time monitoring and privacy-preserving applications. In this paper, we present an efficient and anonymous scheme for tag population estimation. This scheme leverages the position of the first reply from a group of tags in a frame. Results from mathematical analysis and extensive simulation demonstrate that our scheme outperforms other protocols proposed in the previous work. I. INTRODUCTION Radio Frequency IDentification (RFID) technology is widely used in monitoring applications such as inventory control and object tracking [1]–[7]. Small RFID tags, each with a unique ID, are attached to items under monitoring. An RFID reader can remotely collect these IDs later for verification. Due to the large number of deployed RFID tags, collecting all tag IDs for verification is inefficient. Some real-time applications, such as counting the number of tags in a shipping portal, need more efficient techniques to manage tag data. In this paper, we consider the problem of efficiently and anonymously estimating the cardinality of a large set of RFID tags with a desired accuracy. Efficient techniques for estimating the number of RFID tags are important for applications when the time window for collecting tag data is small. These applications include realtime monitoring or managing a large quantity of products. For example, a warehouse operator may need to perform a quick estimation of the number of products left in stock. Such applications demand efficient estimating schemes instead of the slow and unnecessary process of reading every tag ID. Anonymity is another important issue when dealing with RFID tags attached to uniquely identifiable items such as passports [8] or driver’s licenses [9]. Either broadcasting tag IDs in the open, or revealing IDs to the RFID reader may leak personal information. For instance, an adversary could capture the communication between the reader and tags or compromise the reader to track users’ activities. Identifying each tag ID increases individual security and privacy risks. An alternative way of providing anonymity is to use cryptographic protocols to mask the actual ID [10], [11]. However, the cryptographic techniques require additional modification to the tag hardware, as well as increase the computational complexity on both tags and readers. Prior work in [12] and [13] considers this problem by using probabilistic estimation based on the framed-slotted ALOHA model. Unfortunately, the scanning time can be considerably long due to the large frame size required. The performance becomes worse when the mobile tags appear dynamically so that counting them at a fixed time instant is not possible. That is because the tags have to be scanned independently with each counting consuming a long time. In this paper, we propose a novel scheme for the reader to quickly estimate the number of distinct tags within a required accuracy. Our scheme is based on a new distinct element counting method [14], without reading either the actual or pseudo IDs. The main idea of our algorithm is to utilize the position of the first reply from a group of tags in a frame to infer the number of tags. Theoretical analysis and extensive simulation show that our scheme outperforms earlier RFID tag estimation schemes. Moreover, our scheme tries to optimize incremental counting in a mobile environment. Note that our approach has a general purpose of counting RFID tags. Combined with other commands, it can be flexibly adopted in various applications. Our contributions are summarized as follows. • We propose a novel anonymous estimating scheme which does not collect the ID from each RFID tag, but is still able to estimate the number of tags accurately. • We present estimators for both static and dynamic sets of tags. The static set specifies a snapshot of a set of tags, and the dynamic set considers that tags can join or leave the set with time. Both our estimators are more efficient than the existing protocols, even when the cardinality of the tag set varies across many orders of magnitude. • We propose a novel send-and-reply protocol among the reader and tags to improve performance. The rest of our paper is as follows. Section II contains the related work. Section III presents our problem definition and system model. Section IV outlines the main idea of our schemes. Section V details the algorithms. Our schemes are evaluated in Section VI, and Section VII concludes. II. RELATED WORK For a reader to successfully identify every tag in proximity, collision arbitration protocols must be considered so that replies from multiple tags will not be garbled due to collision. Collision arbitration protocols are divided into two approaches: ALOHA-based [15]–[17] and tree-based [18]–[20]. In the first
approach,the framed-slotted ALOHA (FSA)protocol,which B.System Model is an extension of the pure ALOHA protocol [21],is widely used in RFID standards.Built on that,adaptive FSA protocols, The MAC protocol for our RFID system is based on the where frame size is adaptively adjusted,are explored in [15], adaptive framed-slotted ALOHA model.To read a set of tags,a [22-[241. reader first powers up and transmits continuous wave (CW)to Recent research work [12],[13],[17],[25]is the closest to energize tags.Each tag waits for the reader's command before this paper.A probabilistic analytical model for anonymously replying.This is known as the Reader Talks First mode. estimating tag population is first proposed in [12].The main The communication between the reader and tags is composed idea is to use the framed-slotted ALOHA protocol and monitor of multiple frames.Each frame is partitioned into slots.Here, the number of empty and collision slots to count tags.However, we refer to an individual frame as a round.The reader will first the drawbacks of the estimators in [12]are that all the tags broadcast a begin round command containing the frame size f must be readable by the reader in a single probe and that the in the forthcoming round,and a random seed R.The frame size reader must know approximately the magnitude of the number is the number of slots available for tags to choose in a round of tags to be estimated.Due to these constraints,an Enhanced Each tag picks a slot,and this slot determines when a tag will Zero-Based (EZB)estimator is presented in [13].By tuning reply.An RFID tag uses a hash function h(),f,R,and its ID the parameters for multiple iterations,the number of tags can to pick a slot in the current round,ie.,h(f,R,id)[0,f-1]. be estimated with high accuracy,even when the tag population We assume that the outputs of the hash function have a uniform varies a lot.The key improvement in our work over [12]and random distribution such that the tag has the equal probability [13]is that our scheme does not scan the entire frame,which to select any slot within the round given a seed and ID. drastically reduces time cost.Finally,another novel estimator Each RFID tag has a slot counter which will decrease each for the same problem is proposed in [25]with more focus on time the reader indicates that the current slot has ended.The tag the multiple-reader scenario.However,the scheme requires a will only reply when its slot counter reaches zero.When all the special geometric distribution hash function,which might not slots in the frame have been accounted for,the reader sends an be available in the off-the-shelf RFID systems. end round command to terminate this round.We assume that the reader can issue an end round command to terminate a III.PROBLEM DEFINITION AND SYSTEM MODEL round at any time without waiting for the frame to end.The A.Problem Definition procedure is illustrated in Fig.1.We call this the original send- and-reply protocol. TABLE I NOTATIONS Begin round End slot End round command ommand Symbols Descriptions command Confidence interval Reader Error probability t Number of distinct tags Tag#1 #1 Upper bound of the number of tags Tag#2 #2 t Estimation of the number of tags Random variable for the number of continuous empty Tag#3 #3 slots before the first non-empty slot in a frame Frame size (the number of slots in a frame) R Random seed Tag#t e Load factor t/f Empty Slot 无 Number of waiting slots Collision Slot m Number of rounds (frames) -round a】 Hash function TO Theoretical time cost (in number of slots)in a round Fig.1.Collection sequence of passive RFID systems using the adaptive FSA m Number of sets of tags Since every RFID tag chooses its own slot individually,there Given an RFID reader and a set of tags,we want to quickly will be instances where no tag picks a particular slot.We term and accurately estimate the number of distinct RFID tags in this as an empty slot.A slot that has only been chosen by the set without identifying each tag individually.Our algorithms one tag is known as a singleton slot.A slot that is chosen by allow a user to specify his desired accuracy using two variables, more than one tag is called a collision slot.We refer to both a confidence interval e and an error probability 6.Lower values singleton slot and collision slot as non-empty slot in this paper. of e and o result in a more accurate estimation.Our algorithms After collecting all replies,the reader can generate a bitstring. return an estimation t of the actual number of tags t,such such as that Pr[lf-t1-6.For example,if the set has 5000 RFID tags,and given e=5%and 6=1%,the desired {·1110|11110|11·}, estimator should output the number within [4750,5250]with probability greater than 99%.Table I summarizes the notations where 0 indicates an empty slot,and 1 represents a non-empty used. slot
approach, the framed-slotted ALOHA (FSA) protocol, which is an extension of the pure ALOHA protocol [21], is widely used in RFID standards. Built on that, adaptive FSA protocols, where frame size is adaptively adjusted, are explored in [15], [22]–[24]. Recent research work [12], [13], [17], [25] is the closest to this paper. A probabilistic analytical model for anonymously estimating tag population is first proposed in [12]. The main idea is to use the framed-slotted ALOHA protocol and monitor the number of empty and collision slots to count tags. However, the drawbacks of the estimators in [12] are that all the tags must be readable by the reader in a single probe and that the reader must know approximately the magnitude of the number of tags to be estimated. Due to these constraints, an Enhanced Zero-Based (EZB) estimator is presented in [13]. By tuning the parameters for multiple iterations, the number of tags can be estimated with high accuracy, even when the tag population varies a lot. The key improvement in our work over [12] and [13] is that our scheme does not scan the entire frame, which drastically reduces time cost. Finally, another novel estimator for the same problem is proposed in [25] with more focus on the multiple-reader scenario. However, the scheme requires a special geometric distribution hash function, which might not be available in the off-the-shelf RFID systems. III. PROBLEM DEFINITION AND SYSTEM MODEL A. Problem Definition TABLE I NOTATIONS Symbols Descriptions ǫ Confidence interval δ Error probability t Number of distinct tags tmax Upper bound of the number of tags t˜ Estimation of the number of tags X Random variable for the number of continuous empty slots before the first non-empty slot in a frame f Frame size (the number of slots in a frame) R Random seed ρ Load factor t/f k Number of waiting slots n Number of rounds (frames) h(·) Hash function T(·) Theoretical time cost (in number of slots) in a round m Number of sets of tags Given an RFID reader and a set of tags, we want to quickly and accurately estimate the number of distinct RFID tags in the set without identifying each tag individually. Our algorithms allow a user to specify his desired accuracy using two variables, a confidence interval ǫ and an error probability δ. Lower values of ǫ and δ result in a more accurate estimation. Our algorithms return an estimation t˜ of the actual number of tags t, such that P r[|t˜ − t| ≤ ǫt] ≥ 1 − δ. For example, if the set has 5000 RFID tags, and given ǫ = 5% and δ = 1%, the desired estimator should output the number within [4750, 5250] with probability greater than 99%. Table I summarizes the notations used. B. System Model The MAC protocol for our RFID system is based on the adaptive framed-slotted ALOHA model. To read a set of tags, a reader first powers up and transmits continuous wave (CW) to energize tags. Each tag waits for the reader’s command before replying. This is known as the Reader Talks First mode. The communication between the reader and tags is composed of multiple frames. Each frame is partitioned into slots. Here, we refer to an individual frame as a round. The reader will first broadcast a begin round command containing the frame size f in the forthcoming round, and a random seed R. The frame size is the number of slots available for tags to choose in a round. Each tag picks a slot, and this slot determines when a tag will reply. An RFID tag uses a hash function h(·), f, R, and its ID to pick a slot in the current round, i.e., h(f, R, id) → [0, f −1]. We assume that the outputs of the hash function have a uniform random distribution such that the tag has the equal probability to select any slot within the round given a seed and ID. Each RFID tag has a slot counter which will decrease each time the reader indicates that the current slot has ended. The tag will only reply when its slot counter reaches zero. When all the slots in the frame have been accounted for, the reader sends an end round command to terminate this round. We assume that the reader can issue an end round command to terminate a round at any time without waiting for the frame to end. The procedure is illustrated in Fig. 1. We call this the original sendand-reply protocol. #1 #2 #3 #t Begin round command End slot command End round command Reader ... Tag#1 Tag#2 Tag#3 Tag#t Singleton Slot Collision Slot Empty Slot ... st 1 round ... Fig. 1. Collection sequence of passive RFID systems using the adaptive FSA Since every RFID tag chooses its own slot individually, there will be instances where no tag picks a particular slot. We term this as an empty slot. A slot that has only been chosen by one tag is known as a singleton slot. A slot that is chosen by more than one tag is called a collision slot. We refer to both singleton slot and collision slot as non-empty slot in this paper. After collecting all replies, the reader can generate a bitstring, such as { · · · | 1 | 0 | 1 | 1 | 0 | 1 | · · · }, where 0 indicates an empty slot, and 1 represents a non-empty slot
IV.INTUITION A.Basic Algorithm The previous research [12],[13],[17]takes advantage of Again,our algorithm is based on the idea of making obser- the framed-slotted ALOHA protocol to estimate the number vation on the first non-empty slot.However,if the number of of tags.The basic idea is based on the probability model we tags t is small,the position of the first reply may be located have described previously.The reader scans all the slots and at the end of the frame.Apparently,it is not efficient to use records the status of each slot:empty,singleton,or collision. the original send-and-reply protocol described in Section IIl. By examining the number of empty slots,collision slots,the In that protocol,a reader broadcasts f and R at the beginning reader can then estimate the number of tags. of a round,and waits for the first reply from tags.Therefore, This estimation method while powerful,has some limitations. when the first reply is toward the end of the round,the reader The main limitation is the large frame size,which translates to has to wait for the period of time almost equal to the frame a long protocol running time,when there exist a huge amount size. of tags.Suppose for a large tag population but the frame size is To resolve this issue to improve the query efficiency,we considerably small.All the tags'responses will be packed in a propose a new send-and-reply communication protocol among small number of slots,which means that the number of empty reader and tags.Compared to the original protocol,our new slots will become zero and number of collision slots will be protocol can identify the first non-empty slot in O(log2f)time equal to the frame size.To make estimation accurate,the frame slots instead of (f). size should be in proportion to the number of tags.Therefore, The new send-and-reply protocols for reader and tags are scanning the whole frame is inefficient when tag population is shown in Algorithm 1 and 2 respectively.In the protocols,the large.Furthermore,the performance is even worse in mobile reader sends an extra frame range r to all tags.Initially,the environment,where either RFID tag or reader can move.To reader splits the whole frame into two,and sets the first half count tags over a period of time,we have to use a very large frame as the candidate range,the second half frame as the frame size at the beginning,such that we can superimpose all alternative range.The reader always sends out its candidate the frames and guarantee that the number of empty slots is not range to the tags.Each tag evaluates h(f,R,id)and replies zero in the end [13]. immediately if the result is inside the range r.Otherwise,it To overcome the large frame size problem in the previous keeps silent without doing anything.Then the reader checks protocols,we propose a new idea based on a randomized the forthcoming slot.If the slot is empty,which indicates algorithm for counting.Suppose we have n random numbers there is no tag within the candidate range,the reader splits the uniformly and randomly chosen from (0.1).By examining the alternative range into two and picks the first half as the new smallest number,say x,we can estimate n.Intuitively,the candidate range,and the second half as the new alternative smaller z is,the larger n would be.If all the numbers are range.If the slot is not empty,which indicates there is at uniformly laid out,n should be approximated by 1/x.Of least one tag in the candidate range,the reader then splits the course,this estimation is very crude with a very large variance. candidate range into two,and sets the first half as the the new Fortunately,we can run the same process for a sufficiently large candidate range,and the second half as the new alternative number of times,the estimation will become more accurate. range.The above procedure is like a binary search tree as shown More details will be described later. in Fig.2.The reader keeps traversing from the root to the leaves Our scheme does not require the reader to scan the whole and records the path in each iteration.Finally,the reader can frame.Instead,the reader only needs to identify the first non- identify the first non-empty slot using the equation in line 16 empty slot,and uses the number of consecutive empty slots of Algorithm 1,where z;is a 0/1 bit indicating the state of the before that to estimate the number of tags.Again.the fewer ithiteration. the empty slots appear before the first non-empty slot,the more Fig.2 illustrates a simple example with frame size of 16. tags there are.In practice,certain number of iterations of such In the first iteration,the reader sends the frame size 16,search operations are performed,and the mean value is used to achieve an accurate estimation.For example,given, range [0,7],and a random seed to all tags.No tag replies,so the first slot is empty.Then the reader starts the second iteration {0|0|1} →X1=2 with a new range r=[8,11].At this time,at least one tag {1} +X2=0 replies,so the slot is "1".Repeating the same process twice, {0101010101010101011} →X3=9 the reader identifies the first non-empty slot to be 10. where Xi denotes the number of empty slots before the It is not difficult to find that if the number of tags is relatively first reply position in round i.From theoretical analysis and small to the frame size,our new send-and-reply protocol is extensive simulation,we find even though multiple iterations more efficient than the original protocol.Otherwise,the original are required for accuracy,the total time is still much shorter protocol is better.Therefore,we combine both of them to than the schemes in prior work. determine X.In the combined send-and-reply protocol,we define the number of waiting slots k.At every round,the V.ANONYMOUS ESTIMATING ALGORITHMS original protocol is tried first.Only when there is no reply In this section,we describe our novel RFID tag estimating within k slots,we turn to use our new protocol.So in the scheme,First Non-Empty slots Based (FNEB)estimator. worst case,only k+log2 f slots are required
IV. INTUITION The previous research [12], [13], [17] takes advantage of the framed-slotted ALOHA protocol to estimate the number of tags. The basic idea is based on the probability model we have described previously. The reader scans all the slots and records the status of each slot: empty, singleton, or collision. By examining the number of empty slots, collision slots, the reader can then estimate the number of tags. This estimation method while powerful, has some limitations. The main limitation is the large frame size, which translates to a long protocol running time, when there exist a huge amount of tags. Suppose for a large tag population but the frame size is considerably small. All the tags’ responses will be packed in a small number of slots, which means that the number of empty slots will become zero and number of collision slots will be equal to the frame size. To make estimation accurate, the frame size should be in proportion to the number of tags. Therefore, scanning the whole frame is inefficient when tag population is large. Furthermore, the performance is even worse in mobile environment, where either RFID tag or reader can move. To count tags over a period of time, we have to use a very large frame size at the beginning, such that we can superimpose all the frames and guarantee that the number of empty slots is not zero in the end [13]. To overcome the large frame size problem in the previous protocols, we propose a new idea based on a randomized algorithm for counting. Suppose we have n random numbers uniformly and randomly chosen from (0,1). By examining the smallest number, say x, we can estimate n. Intuitively, the smaller x is, the larger n would be. If all the numbers are uniformly laid out, n should be approximated by 1/x. Of course, this estimation is very crude with a very large variance. Fortunately, we can run the same process for a sufficiently large number of times, the estimation will become more accurate. More details will be described later. Our scheme does not require the reader to scan the whole frame. Instead, the reader only needs to identify the first nonempty slot, and uses the number of consecutive empty slots before that to estimate the number of tags. Again, the fewer the empty slots appear before the first non-empty slot, the more tags there are. In practice, certain number of iterations of such operations are performed, and the mean value is used to achieve an accurate estimation. For example, given, { 0 | 0 | 1 } → X1 = 2 { 1 } → X2 = 0 { 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 } → X3 = 9 where Xi denotes the number of empty slots before the first reply position in round i. From theoretical analysis and extensive simulation, we find even though multiple iterations are required for accuracy, the total time is still much shorter than the schemes in prior work. V. ANONYMOUS ESTIMATING ALGORITHMS In this section, we describe our novel RFID tag estimating scheme, First Non-Empty slots Based (FNEB) estimator. A. Basic Algorithm Again, our algorithm is based on the idea of making observation on the first non-empty slot. However, if the number of tags t is small, the position of the first reply may be located at the end of the frame. Apparently, it is not efficient to use the original send-and-reply protocol described in Section III. In that protocol, a reader broadcasts f and R at the beginning of a round, and waits for the first reply from tags. Therefore, when the first reply is toward the end of the round, the reader has to wait for the period of time almost equal to the frame size. To resolve this issue to improve the query efficiency, we propose a new send-and-reply communication protocol among reader and tags. Compared to the original protocol, our new protocol can identify the first non-empty slot in O(log2f) time slots instead of O(f). The new send-and-reply protocols for reader and tags are shown in Algorithm 1 and 2 respectively. In the protocols, the reader sends an extra frame range r to all tags. Initially, the reader splits the whole frame into two, and sets the first half frame as the candidate range, the second half frame as the alternative range. The reader always sends out its candidate range to the tags. Each tag evaluates h(f, R, id) and replies immediately if the result is inside the range r. Otherwise, it keeps silent without doing anything. Then the reader checks the forthcoming slot. If the slot is empty, which indicates there is no tag within the candidate range, the reader splits the alternative range into two and picks the first half as the new candidate range, and the second half as the new alternative range. If the slot is not empty, which indicates there is at least one tag in the candidate range, the reader then splits the candidate range into two, and sets the first half as the the new candidate range, and the second half as the new alternative range. The above procedure is like a binary search tree as shown in Fig. 2. The reader keeps traversing from the root to the leaves and records the path in each iteration. Finally, the reader can identify the first non-empty slot using the equation in line 16 of Algorithm 1, where zi is a 0/1 bit indicating the state of the i th iteration. Fig. 2 illustrates a simple example with frame size of 16. In the first iteration, the reader sends the frame size 16, search range [0, 7], and a random seed to all tags. No tag replies, so the first slot is empty. Then the reader starts the second iteration with a new range r = [8, 11]. At this time, at least one tag replies, so the slot is “1”. Repeating the same process twice, the reader identifies the first non-empty slot to be 10. It is not difficult to find that if the number of tags is relatively small to the frame size, our new send-and-reply protocol is more efficient than the original protocol. Otherwise, the original protocol is better. Therefore, we combine both of them to determine X. In the combined send-and-reply protocol, we define the number of waiting slots k. At every round, the original protocol is tried first. Only when there is no reply within k slots, we turn to use our new protocol. So in the worst case, only k + log2 f slots are required
Algorithm 1 New send-and-reply protocol for the reader of the FNEB estimator is shown in Algorithm 3.The algorithm 1:if f is not a power of 2 then takes tmar,6 and e as inputs,where tmaz denotes the upper 2: f =2flog2f bound of tag population t.Initially,the reader computes pa- 3:end if rameters f,k,and n by inputs,and then applies the combined 4:a=0,b=f/2-1 send-and-reply protocol n rounds to obtain the average value of 5:Set the search range r [a,b]and random seed R X.denoted by Y.At last,the estimation t is calculated below: 6:for i=1 to log2 f do 7: Reader broadcasts r,f,and R,and listens in the forth- 1+Y coming slot for reply (only one slot) i=f.mn-y (1) 8: if the slot is EMPTY then 9 21=0 10: a=b+1,b=b+rl/2,and updates r 1 else Algorithm 3 FNEB estimator for static tag set 12: 21=1 b=(b-1)/2,and updates r INPUT:tmaz,6,and e 13: OUTPUT:t 14 end if 1:Compute the frame size f and waiting slots k 15:end for i6:ReturnX=∑ef(1-2)·2lo8ef-i 2:Compute the number of rounds n 3: for i =1 to n do 4: Generate a new random seed Ri Algorithm 2 New send-and-reply protocol for each tag 5: Broadcast (f,Ri)to all tags and wait their replies 1:Receive range r,f,and R from reader 6: Run the original send-and-reply protocol 2:Compute slot number sn=h(f.R,id) if receive reply before kth slot then 3:if sn is inside r then 8 Xi slot number of first reply -1 4: Reply immediately 9 else 5:else 10: Run the new send-and-reply protocol 6: Keep silent X;value returned by Algorithm 1 7:end if 12: end if 13:end for Frame Size=16 Empty Slots 14:Add all Xi and get the average Y=Xi/n L-0 15:Return fIn 1tY 3 4 In the next two subsections,we will explain why this algo- ④。5 rithm can achieve the desired accurate estimation and how to 16 0(6 07 compute parameters f,k,and n(lines 1 and 2 in Algorithm 3). 8 To ease understanding,we first present the mathematics behind 0 -10 the algorithm and how to pick parameter n.We then describe 11 12 how to determine f and k. 013 。15 14 B.Pick n 0101☐ Z3 The value of n directly determines the performance of our Fig.2.Illustration of our new send-and-reply protocol scheme.If n is too small,the estimated f cannot meet the de- sired accuracy.However,a large n will increase the estimation time.Next,we first present the theoretical underpinnings for Our combined send-and-reply protocol requires a slight mod- the FNEB algorithm,followed by the bounds for n that can ification to existing RFID tags.We add an optional bit mask satisfy the accuracy requirement. to indicate the search range r in each end slot command sent Given the frame size f,each tag has the probability to by the reader.If the parameter is set to a valid range,those select a specific slot in the frame.For t tags in total,the tags who pick a response slot inside the range will reply in probability of a certain slot to be empty (denoted as Po)is the forthcoming slot,no matter what value their slot counters Po=(1-).Since f is normally large,Po can be simplified are.If the parameter is set to null,the original send-and-reply to Poe-,where p=t.We call p the load factor.Let protocol is then used. the random variable X be the number of consecutive empty With the basic idea described above,the complete algorithm slots before the first non-empty slot in a frame.We then have
Algorithm 1 New send-and-reply protocol for the reader 1: if f is not a power of 2 then 2: f = 2⌈log2 f⌉ 3: end if 4: a = 0, b = f /2 − 1 5: Set the search range r = [a, b] and random seed R 6: for i = 1 to log2 f do 7: Reader broadcasts r, f, and R, and listens in the forthcoming slot for reply (only one slot) 8: if the slot is EMPTY then 9: zi = 0 10: a = b + 1, b = b + |r|/2, and updates r 11: else 12: zi = 1 13: b = (b − 1)/2, and updates r 14: end if 15: end for 16: Return X = Plog2 f i=1 (1 − zi) · 2 log2 f−i Algorithm 2 New send-and-reply protocol for each tag 1: Receive range r, f, and R from reader 2: Compute slot number sn = h(f, R, id) 3: if sn is inside r then 4: Reply immediately 5: else 6: Keep silent 7: end if z1 z2 z3 z4 [0~7] [0~3] 0 2 4 6 8 2 3 4 1 0 1 1 0 0 1 0 1 0 1 0 1 0 1 0 5 6 7 8 9 11 12 13 15 14 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 Empty Slots 0 1 0 1 10 Frame Size = 16 1 10 12 14 [12~13] [8~11] [8~9] [4~5] [0~1] Fig. 2. Illustration of our new send-and-reply protocol Our combined send-and-reply protocol requires a slight modification to existing RFID tags. We add an optional bit mask to indicate the search range r in each end slot command sent by the reader. If the parameter is set to a valid range, those tags who pick a response slot inside the range will reply in the forthcoming slot, no matter what value their slot counters are. If the parameter is set to null, the original send-and-reply protocol is then used. With the basic idea described above, the complete algorithm of the FNEB estimator is shown in Algorithm 3. The algorithm takes tmax, δ and ǫ as inputs, where tmax denotes the upper bound of tag population t. Initially, the reader computes parameters f, k, and n by inputs, and then applies the combined send-and-reply protocol n rounds to obtain the average value of X, denoted by Y . At last, the estimation t˜ is calculated below: t˜= f · ln 1 + Y Y (1) Algorithm 3 FNEB estimator for static tag set INPUT: tmax, δ, and ǫ OUTPUT: t˜ 1: Compute the frame size f and waiting slots k 2: Compute the number of rounds n 3: for i = 1 to n do 4: Generate a new random seed Ri 5: Broadcast (f, Ri) to all tags and wait their replies 6: Run the original send-and-reply protocol 7: if receive reply before kth slot then 8: Xi = slot number of first reply - 1 9: else 10: Run the new send-and-reply protocol 11: Xi = value returned by Algorithm 1 12: end if 13: end for 14: Add all Xi and get the average Y = Pn i=1 Xi/n 15: Return t˜= f ln 1+Y Y In the next two subsections, we will explain why this algorithm can achieve the desired accurate estimation and how to compute parameters f, k, and n (lines 1 and 2 in Algorithm 3). To ease understanding, we first present the mathematics behind the algorithm and how to pick parameter n. We then describe how to determine f and k. B. Pick n The value of n directly determines the performance of our scheme. If n is too small, the estimated t˜ cannot meet the desired accuracy. However, a large n will increase the estimation time. Next, we first present the theoretical underpinnings for the FNEB algorithm, followed by the bounds for n that can satisfy the accuracy requirement. Given the frame size f, each tag has the probability 1 f to select a specific slot in the frame. For t tags in total, the probability of a certain slot to be empty (denoted as P0) is P0 = (1− 1 f ) t . Since f is normally large, P0 can be simplified to P0 ≈ e −ρ , where ρ = t f . We call ρ the load factor. Let the random variable X be the number of consecutive empty slots before the first non-empty slot in a frame. We then have
Pr[X u]=Po(1-Po).The expectation of X is is asymptotically normal with mean 0 and variance 1;that is, f-1 f-1 Z satisfies the standard normal distribution and its cumulative E(X)=∑uPr(X=)=∑uP1-R) distribution function is u=0 2=0 1 f-1)P时+1-fPH+B Φ(x)= V2-心 e-号du. 1-Po We can find a constant c which makes Po(1-P6)-fF8 1-P% Pr[-c≤Z≤d=Φ(c)-Φ(-c) Since that01-6.Combining o and Eq.3 to solve the inequalities,we get where Xi is the random variable X for the ith observation. Note that E(Xi)=E(X)and Var(X;)=Var(X).Since n≥ c2e-o(ep-e-ce)2 the reader gives a different random seed in each broadcast,Xi (1-e-p)2 (1 <i<n)is independent with each other.Therefore,we have ■ EY)=∑EX=nEX =E(X) In practice,the number of tags t is not known a priori. making it difficult to predict the exact number of rounds. and However,the minimum number of rounds n is a monotonically Var(Y)=Var(i)=nVar(X)Var(x) increasing function against the load factor p;that is,the number of rounds calculated by t=tma is large enough for the actual n2 n2 t.Therefore,n in line 2 of Algorithm 3 is computed by Since that E(Y)=E(X),by solving Eq.2 for t,we get n=etma (ctmal-cem t=f.In 1+E(Y) (4) E(Y) (1-e-etmaz/f)2 Then,according to Eq.1,by substituting Y for E(Y),we have C.Determine Optimal Parameters f and k i=f.mnl+y The estimating time of our algorithms is affected by two y factors:the number of rounds and the time cost in each round. Next,we will show how to use Var(Y)to compute the tight Here,the time cost is measured by the number of slots.From the discussion above.we find that the number of rounds n is bound of parameter n. dependent on the frame size f.The time cost in a round is either Theorem 1.Given 6.e,and p.if the number of rounds n is x+1*(if the number of empty slots observed in that round not less than the agorithm described above is smaller than korThat relies on both fand can guarantee the accuracy requirement,that is,Pr-t Hence,if we select inappropriate f and k,the performance of e≥1-6. our scheme will be adversely affected.Our remaining problem is to determine the "best"value for parameter f and k on a Proof:We use u and a to denote the expectation and given upper bound tmaz. standard variance of Y,i.e.,u=E(Y)and o =VVar(Y)= Remember that the probability of the random variable X VVar(X)/n.By the central limit theorem,we know equals to u is Po(1-Po),where Po e-t/f.We use the z=Y-4 *Note that one additional slot is needed for the first non-empty slot
P r[X = u] = P u 0 (1 − P0). The expectation of X is E(X) = f X−1 u=0 uP r(X = u) = f X−1 u=0 uP u 0 (1 − P0) = (f − 1)P f+1 0 − fPf 0 + P0 1 − P0 = P0 1 − P0 (1 − P f 0 ) − fPf 0 . Since that 0 c, then we can guarantee P r[|t˜− t| 6 ǫt] > 1 − δ. Combining σ and Eq. 3 to solve the inequalities, we get n > c 2 e −ρ (e ρ − e −ǫρ) 2 (1 − e−ǫρ) 2 . In practice, the number of tags t is not known a priori, making it difficult to predict the exact number of rounds. However, the minimum number of rounds n is a monotonically increasing function against the load factor ρ; that is, the number of rounds calculated by t = tmax is large enough for the actual t. Therefore, n in line 2 of Algorithm 3 is computed by n = c 2 · e −tmax/f · (e tmax/f − e −ǫtmax/f ) 2 (1 − e−ǫtmax/f ) 2 C. Determine Optimal Parameters f and k The estimating time of our algorithms is affected by two factors: the number of rounds and the time cost in each round. Here, the time cost is measured by the number of slots. From the discussion above, we find that the number of rounds n is dependent on the frame size f. The time cost in a round is either x + 1∗ (if the number of empty slots observed in that round is smaller than k) or k + log2f. That relies on both f and k. Hence, if we select inappropriate f and k, the performance of our scheme will be adversely affected. Our remaining problem is to determine the “best” value for parameter f and k on a given upper bound tmax. Remember that the probability of the random variable X equals to u is P u 0 (1 − P0), where P0 = e −t/f . We use the ∗Note that one additional slot is needed for the first non-empty slot
function T()to denote the time cost in each round.Given k, programming problem,or bound a small search range to t,and f,T()can be expressed as exhaustively find the optimal values of f and k.Both T(k,t,f) methods can reduce the computation cost in practice. Since the ratio is relatively stable,the optimal number =∑=(u+1)Pr(X=u)+∑I=k(k+log2f)Pr(X=u) of rounds n will not get obvious increase,when tmax ∑-6P+log2fPb becomes large.Therefore,as shown in the evaluation = 瑞+-loga fP, section,our algorithm performs well even if we count a huge amount of tags. where the first term describes the cost of using the original send-and-reply protocol,if there is a reply within k slots.The D.Enhancement:Adjusting Skewed tmar second term,indicating the cost of using our new send-and- In practice,users may overestimate the upper bound tmaz. reply protocol,is a constant k+log2 f.Both of them are The actual t may be much smaller than the bound.Thus,the multiplied by their probabilities. optimal parameters f and k computed by tmaz may be too large Therefore,n.T(k,t,f)is the estimating time of our algo- for estimation,since it causes many empty slots before the first rithm for a specified t.Our goal is to find parameters f and k reply in each round.We call this the skewed tmaz problem. to minimize the time cost averaging over all possible values of t from 1 to tmaz.Then,the problem is to minimize (a) ) 80000 20000 © 1 n.T(k,t,f) 10000 tmaz t=1 16000 1000 60000 100 subject to k,f∈N,and0≤k≤f,where 2 12000 C2e-tmarlf(etmarlf-e-etmarl)2 n= 40000 (1-e-etmaz/f)2 8000 T(k,t,f)= 1-P +logz fPo. 1-P% 20000 4000 0 0.05 0.1 .1 0.5 1.0 This is a nonlinear programming problem with two unknown integer variables.Although it is difficult to find an expression Fig.3.Time cost versus the normalized number of tags for different tmax: of f and k,the problem is solvable by enumerating all possible (a)comparison under t/.1;(b)comparison under t/tmar>0.1 parameters to find the optimal values.Given parameters:tmar, e and 6,we first fix f and enumerate all values of k from I to To show the effect of different tmax on the performance of f to find the best value of k which can minimize the objective our FNEB estimator,we plot the estimating time in number of function.Then,we repeat the process to search for the optimal slots against t under three different tmax(see Fig.3).To ease f.Note that these procedures are all computed by the reader comparison,we normalize t by tmaz and separate the figure into offline. two parts.As we see,when the value of t/tmaz approaches 1, Table II shows the optimal parameters for some specified the time cost decreases significantly.Also,for the same value tmaz under that e=5%and 6=1%.In the table,nop,fop, of t/tmar,the smaller tmaz will spend less time,when t is and kop indicate the optimal number of rounds,frame size,and absolutely close to tmaz. number of waiting slots for each tmax respectively.The ratio Based on these observations,we propose an enhanced ap- in the last column is computed by tmaz over fop. proach to solve the skewed tmaz problem.As mentioned before, a larger tmaz usually causes more empty slots.Therefore,we TABLE II can use the position of first reply to decide whether tmaz is too OPTIMAL PARAMETERS FOR DIFFERENT tma WITH 6=0.01,E=0.05 large for t.If it is,we will adaptively shrink tmaz in the next t Kop Ratio (=tmar/fop) round.The main algorithm is shown in Algorithm 4.which 10 3927 55 0 1.818 should be appended at the end of each iteration (between lines 500 4024 264 8 1.894 12 and 13)in Algorithm 3. I000 4058 521 1.919 5000 4014 265T 1.886 Recall that X is the random variable indicating the number 100004024 5279 1.894 of empty slots before the first reply from tags,and Xi is the 50000. 4042 26205 1.908 observed value of X in the ith round.Let variable N enumerate all possible numbers of tags,decreasing from tmar to 1.Then, From the table,we have the following observations. Pr[X =Xilt =N]is the probability of observing Xi empty The ratio of tmar to fop is close to 1.9,and k is close to slots on the condition that t=N.According to Bayes'theorem, log2 f.Based on this observation,we can either directly we have use the quasi-optimal parameters:f1.9 and k log2 f Prt=NX=x]=PK=X北=N (5) for our estimating algorithm without solving the non-linear PrX=Xil
function T (·) to denote the time cost in each round. Given k, t, and f, T (·) can be expressed as T (k, t, f) = Pk−1 u=0(u+1)P r(X=u)+Pf−1 u=k (k+log2 f)P r(X=u) = Pk−1 u=0 P u 0 +log2 f·P k 0 = 1−P k 0 1−P0 +log2 fP k 0 , where the first term describes the cost of using the original send-and-reply protocol, if there is a reply within k slots. The second term, indicating the cost of using our new send-andreply protocol, is a constant k + log2 f. Both of them are multiplied by their probabilities. Therefore, n · T (k, t, f) is the estimating time of our algorithm for a specified t. Our goal is to find parameters f and k to minimize the time cost averaging over all possible values of t from 1 to tmax. Then, the problem is to minimize 1 tmax tXmax t=1 n · T (k, t, f) subject to k, f ∈ N, and 0 6 k 6 f, where n = c 2 e −tmax/f (e tmax/f − e −ǫtmax/f ) 2 (1 − e−ǫtmax/f ) 2 T (k, t, f) = 1 − P k 0 1 − P0 + log2 fPk 0 . This is a nonlinear programming problem with two unknown integer variables. Although it is difficult to find an expression of f and k, the problem is solvable by enumerating all possible parameters to find the optimal values. Given parameters: tmax, ǫ and δ, we first fix f and enumerate all values of k from 1 to f to find the best value of k which can minimize the objective function. Then, we repeat the process to search for the optimal f. Note that these procedures are all computed by the reader offline. Table II shows the optimal parameters for some specified tmax under that ǫ = 5% and δ = 1%. In the table, nop, fop, and kop indicate the optimal number of rounds, frame size, and number of waiting slots for each tmax respectively. The ratio in the last column is computed by tmax over fop. TABLE II OPTIMAL PARAMETERS FOR DIFFERENT tmax WITH δ = 0.01, ǫ = 0.05 tmax nop fop kop Ratio (= tmax/fop) 100 3927 55 6 1.818 500 4024 264 8 1.894 1000 4058 521 9 1.919 5000 4014 2651 12 1.886 10000 4024 5279 13 1.894 50000 4042 26205 15 1.908 From the table, we have the following observations. • The ratio of tmax to fop is close to 1.9, and k is close to log2 f. Based on this observation, we can either directly use the quasi-optimal parameters: f ≈ 1.9 and k ≈ log2 f for our estimating algorithm without solving the non-linear programming problem, or bound a small search range to exhaustively find the optimal values of f and k. Both methods can reduce the computation cost in practice. • Since the ratio is relatively stable, the optimal number of rounds n will not get obvious increase, when tmax becomes large. Therefore, as shown in the evaluation section, our algorithm performs well even if we count a huge amount of tags. D. Enhancement: Adjusting Skewed tmax In practice, users may overestimate the upper bound tmax. The actual t may be much smaller than the bound. Thus, the optimal parameters f and k computed by tmax may be too large for estimation, since it causes many empty slots before the first reply in each round. We call this the skewed tmax problem. 80000 60000 40000 20000 0 0.05 0.1 Spending time (in number of slots) t/tmax (a) tmax: 10000 1000 100 80000 60000 40000 20000 0 0.05 0.1 Spending time (in number of slots) t/tmax (a) tmax: 10000 1000 100 20000 16000 12000 8000 4000 0.1 0.5 1.0 Spending time (in number of slots) t/tmax (b) tmax: 10000 1000 100 Fig. 3. Time cost versus the normalized number of tags for different tmax: (a) comparison under t/tmax 6 0.1; (b) comparison under t/tmax > 0.1 To show the effect of different tmax on the performance of our FNEB estimator, we plot the estimating time in number of slots against t under three different tmax (see Fig. 3). To ease comparison, we normalize t by tmax and separate the figure into two parts. As we see, when the value of t/tmax approaches 1, the time cost decreases significantly. Also, for the same value of t/tmax, the smaller tmax will spend less time, when t is absolutely close to tmax. Based on these observations, we propose an enhanced approach to solve the skewed tmax problem. As mentioned before, a larger tmax usually causes more empty slots. Therefore, we can use the position of first reply to decide whether tmax is too large for t. If it is, we will adaptively shrink tmax in the next round. The main algorithm is shown in Algorithm 4, which should be appended at the end of each iteration (between lines 12 and 13) in Algorithm 3. Recall that X is the random variable indicating the number of empty slots before the first reply from tags, and Xi is the observed value of X in the ith round. Let variable N enumerate all possible numbers of tags, decreasing from tmax to 1. Then, P r[X = Xi |t = N] is the probability of observing Xi empty slots on the condition that t = N. According to Bayes’ theorem, we have P r[t = N|X = Xi ] = P r[X = Xi |t = N] P r[X = Xi ] , (5)
Algorithm 4 Adaptively shrink skewed tmaz directly apply our previous algorithms on each tag set,these /After getting Xi,we test whether to shrink tmaz */ overlapping tags will be counted multiple times,resulting in 1:p=0 erroneous overall estimations. 2:for N tmaz to 1 do We have extended our FNEB algorithms to estimate multiple 3: Pr[X=Xilt=N] p=p+maPr=Xt=司 tag sets.Due to page limit,we cannot include the details in this 4: if 1-p1-6).Second,they are more efficient than other estimators we do not list here Table III shows the performance of our enhanced FNEB All estimators were implemented in Java.We first investigate estimator.From that,we find that the final value of tmar can be the estimators for static set,then the estimators for multiple adjusted close to t within several shrinks.As a result,different sets.Unless otherwise specified,we set the maximum number numbers of tags can lead to almost the same total time. of RFID tags tmaz to 10000,the confidence level e to 0.05, E.Extension:Estimating Multiple Tag Sets and the error probability 6 to 0.01.Each result is the average of 100 iterations.These experiments test the hypothesis that our Previously,we only considered a static tag set.However,for estimators can be more efficient than other estimators. certain applications,we may need to count multiple tag sets in a dynamic environment where either the tags or reader is mobile. A.Time Efficiency For example,a single reader cannot cover all the tags in a large Prior work in [12]and [13]uses the number of slots that warehouse.Instead,we have to either deploy multiple readers a reader has to scan as an indicator of time efficiency.The or dispatch a mobile reader moving through the warehouse reader that scans a few slots will perform faster than the reader to cover all tags.In that case,different tag sets queried by that needs to scan many slots.However,the number of slots readers at different places could have overlapping tags.If we used is misleading,since different types of slots have variant
Algorithm 4 Adaptively shrink skewed tmax /* After getting Xi , we test whether to shrink tmax */ 1: p = 0 2: for N = tmax to 1 do 3: p = p + P P r[X=Xi|t=N] tmax i=1 P r[X=Xi|t=i] 4: if 1 − p < 0.1% and N < tmax then 5: tmax = N 6: Recompute f, k, and n, and restart new rounds 7: break 8: end if 9: end for where P r[X = Xi ] = Ptmax i=1 P r[X = Xi |t = i]. In the algorithm, Eq. 5 is added to variable p as N decreases in each iteration (line 3). So p presents the probability P r[N ≤ t ≤ tmax] on condition that Xi empty slots have been observed, and 1 − p is the probability P r[1 ≤ t < N] correspondingly. Once 1 − p is smaller than a very small probability (like 0.1% in our algorithm), it means that t can not be larger than N with high possibility. Therefore, we can shrink tmax to the value of N. Recall the analysis in Section V-B, P r[X = Xi |t = N] can be computed by (e −N/f ) Xi · (1 − e −N/f ). However, when the shrinking occurs in the latter rounds, restarting new rounds may incur a large overhead. Therefore, we constrain the number of rounds for shrinking. If tmax remains unchanged in certain consecutive rounds, the current tmax is deemed stable enough. We will not run Algorithm 4 after those rounds. In the simulation, we set a heuristic value of 30 rounds which is large enough for adjustment. TABLE III RESULTS FROM THE ADAPTIVE SHRINK ALGORITHM FOR SINGLE SET OF RFID TAGS WITH tmax = 10000, δ = 0.01, AND ǫ = 0.05 No. of No. of Final value Shrinking Total time tags shrinks of tmax overhead (slots) 10 5.6 14.4 350.7 5525.9 50 5.5 69.9 365.1 5738.0 100 5.4 135.7 418.8 5732.4 500 5.2 667.6 444.9 5758.8 1000 4.9 1307.3 441.3 5683.2 5000 1.9 6467.5 371.3 5660.8 Table III shows the performance of our enhanced FNEB estimator. From that, we find that the final value of tmax can be adjusted close to t within several shrinks. As a result, different numbers of tags can lead to almost the same total time. E. Extension: Estimating Multiple Tag Sets Previously, we only considered a static tag set. However, for certain applications, we may need to count multiple tag sets in a dynamic environment where either the tags or reader is mobile. For example, a single reader cannot cover all the tags in a large warehouse. Instead, we have to either deploy multiple readers or dispatch a mobile reader moving through the warehouse to cover all tags. In that case, different tag sets queried by readers at different places could have overlapping tags. If we directly apply our previous algorithms on each tag set, these overlapping tags will be counted multiple times, resulting in erroneous overall estimations. We have extended our FNEB algorithms to estimate multiple tag sets. Due to page limit, we cannot include the details in this paper. The intuition of the protocol is as follows. Suppose we have m tag sets S1, S2, ..., Sm, and for each set the number of empty slots before the first non-empty slot is X1, X2, ..., Xm. In a global view, min(X1, X2, ..., Xm) infers the total tag size |S1 ∪ S2 ∪ ... ∪ Sm|. However, each set i (i ∈ [1, m]) does not know whether Xi is minimal. Therefore, we need to track all sets to record the minimal number. In practice, the optimization is used to speed up the above process. If no tag replied before the minimal number of empty slots that we already know, we just terminate reading such a set, because it does not change the minimal value. The reason why we can minimize slot count from different sets is that the reply slot by each tag is only dependent on the frame size f and random seed R. So long as the same parameters are used, a tag will always pick the same slot in the frame. Based on this property, any reply that occurs before the first reply in other sets must belong to a new tag. In other words, even if the same tags have responded in multiple sets, the first non-empty slot will remain the same. The final result is equivalent to having all distinct tags belong to one large single set. Therefore, our extended approach remains accurate while significantly reducing time cost. VI. PERFORMANCE EVALUATION The goal of this paper is to design an estimator to count tags efficiently and anonymously in both static and dynamic environments. Here, we evaluate the performance of our FNEB estimator, the enhanced FNEB estimator for single set of tags, and the extended FNEB estimator for multiple sets of tags. Through extensive simulation, we compare our estimators against several well-known estimators mentioned in the related work. They are the Combined Simple Estimator (CSE) [12], the Unified Probabilistic Estimator (UPE) [12], and the Enhanced Zero-Based (EZB) estimator [13]. These estimators are selected for two reasons. First, they can all provide the desired estimating accuracy (say, P r[|t˜− t| ≤ ǫt] ≥ 1 − δ). Second, they are more efficient than other estimators we do not list here. All estimators were implemented in Java. We first investigate the estimators for static set, then the estimators for multiple sets. Unless otherwise specified, we set the maximum number of RFID tags tmax to 10000, the confidence level ǫ to 0.05, and the error probability δ to 0.01. Each result is the average of 100 iterations. These experiments test the hypothesis that our estimators can be more efficient than other estimators. A. Time Efficiency Prior work in [12] and [13] uses the number of slots that a reader has to scan as an indicator of time efficiency. The reader that scans a few slots will perform faster than the reader that needs to scan many slots. However, the number of slots used is misleading, since different types of slots have variant
a=0.001,B=0.40 c=0.01,B=0.40 durations in practice.According to the current standards(EPC 310 EZB -EZB global Class-1 Gen-2 [28]),we assume a reader needs almost -FNEB 一FNEB 300 us to detect an empty slot,1500 us to detect a collision slot.and 3000 us to detect a collision slot.Therefore.estimators (like CSE and UPE)that must identify the type of each slot will spend long time on every slot.However,for EZB and our 50 00 00 Num of sets (m) Num of sets (m) FNEB that only distinguish an empty slot from a non-empty a=0.1,=040 10° a=0.5,=0.40 slot,the duration of every slot is equivalent to that of an empty EZB EZB -FNEB -FNEB slot. 1)Single set of RFID tags:In Table IV,we show the number of slots scanned by every estimator.As we see,if we only compare the number appeared,it seems that CSE and UPE Num(m) 100 mofes 100 perform well since the sum of slots is small. However,despite a little more slots needed for estimation, Fig.5. Cumulative number of slots for estimation versus the number of sets, our proposed algorithms do not have poor performance (effi- while increasing a and holding 3 ciency)relative to CSE and UPE,since the duration of each slot in FNEB and enhanced FNEB is much smaller than that a=0.01,=0.20 c=0.01,B=0.40 EZB EZB in CSE and UPE.As described above,CSE and UPE have FNEB FNEE to identify whether a slot is empty,singleton or collision,so additional time is spent to check the CRC(Cyclic Redundancy Check)checksum.Our algorithms otherwise only determine whether a slot is empty or non-empty.Therefore,each slot 50 100 50 00 Num of sets(m) Num of sets(m in our algorithms costs much small time than CSE and UPE. X10 a=0.01,B=0.60 X109 c=0.01,B=0.80 Fig.4 shows the amount of time required by all estimators -E7R --E78 FNEB with respect to variant slot durations.We see that our enhanced FNEB outperforms any other schemes,especially in large-scale RFID systems.In addition,we understand that the skewed tmaz is really a serious problem.Without dynamically shrinking Num of sets (m) 00 Num of ets (m) 100 tmaz,the FNEB spends much longer time than others,when the number of tags is smaller than 2000. Fig.6.Cumulative number of slots for estimation versus the number of sets, while increasing B and holding a CSE A-UPE data sets.Let a denote the percentage of the size of each set ¥EZB 615 FNEB to tmaz,and B denote the percentage of the overlapped tags oEnnanced FNEB between two tag sets.In Fig.5,we hold parameter a and change B to conduct the comparison,and vice versa in Fig.6.From 10 the results,we see that our scheme is more efficient than EZB in all tests. -△A-d合-A4A0-合4-46-合A心A4 B.Additional Discussions 00000006680-0●◆。。-00。 This subsection covers some other issues whose details are 2000 4000.6000 800010000 omitted due to the page limit. Num of tags(t) 1)Accuracy requirements:In our simulation,we randomly Fig.4.Time-efficiency comparison of single set estimators select 1000 possible values for t,ranging from I to tmar. The results show that the estimation falling out of the 2)Multiple sets of RFID tags:Considering multiple sets of range t-et,t+et only twice.The estimating accuracy tags,only two estimators,EZB and our extended FNEB,can holds with more than 1-o probability. be used to estimate the number of tags among all estimators 2) Scalability:The tag population may vary across many mentioned early.So we only compare our extended FNEB orders of magnitude,ranging from tens to thousands of against EZB here.For simplicity,"FNEB"in Fig.5 and 6 tags.In our simulation,we consider the tag population is refer to the "extended FNEB".Also,since both estimators varies in four scales of tmaz:100,1000,10000,and distinguish between empty slot and non-empty slot,we use the 100000.The results show the estimating time does not number of slots instead of the absolute time for evaluation. increase obviously.Our estimator scales well. In the simulation.we set m =100.and use the same model 3)Signal loss:Our scheme leverages the first non-empty described at the beginning of Section VI to generate multiple slot in a frame for estimation.In practice,when the link
durations in practice. According to the current standards (EPC global Class-1 Gen-2 [28]), we assume a reader needs almost 300 µs to detect an empty slot, 1500 µs to detect a collision slot, and 3000 µs to detect a collision slot. Therefore, estimators (like CSE and UPE) that must identify the type of each slot will spend long time on every slot. However, for EZB and our FNEB that only distinguish an empty slot from a non-empty slot, the duration of every slot is equivalent to that of an empty slot. 1) Single set of RFID tags: In Table IV, we show the number of slots scanned by every estimator. As we see, if we only compare the number appeared, it seems that CSE and UPE perform well since the sum of slots is small. However, despite a little more slots needed for estimation, our proposed algorithms do not have poor performance (effi- ciency) relative to CSE and UPE, since the duration of each slot in FNEB and enhanced FNEB is much smaller than that in CSE and UPE. As described above, CSE and UPE have to identify whether a slot is empty, singleton or collision, so additional time is spent to check the CRC (Cyclic Redundancy Check) checksum. Our algorithms otherwise only determine whether a slot is empty or non-empty. Therefore, each slot in our algorithms costs much small time than CSE and UPE. Fig. 4 shows the amount of time required by all estimators with respect to variant slot durations. We see that our enhanced FNEB outperforms any other schemes, especially in large-scale RFID systems. In addition, we understand that the skewed tmax is really a serious problem. Without dynamically shrinking tmax, the FNEB spends much longer time than others, when the number of tags is smaller than 2000. 0 2000 4000 6000 8000 10000 0 5 10 15 20 Num of tags (t) Absolute time (second) CSE UPE EZB FNEB Enhanced FNEB Fig. 4. Time-efficiency comparison of single set estimators 2) Multiple sets of RFID tags: Considering multiple sets of tags, only two estimators, EZB and our extended FNEB, can be used to estimate the number of tags among all estimators mentioned early. So we only compare our extended FNEB against EZB here. For simplicity, “FNEB” in Fig. 5 and 6 is refer to the “extended FNEB”. Also, since both estimators distinguish between empty slot and non-empty slot, we use the number of slots instead of the absolute time for evaluation. In the simulation, we set m = 100, and use the same model described at the beginning of Section VI to generate multiple 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.001, β=0.40 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.01, β=0.40 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.1, β=0.40 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.5, β=0.40 EZB FNEB Fig. 5. Cumulative number of slots for estimation versus the number of sets, while increasing α and holding β 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.01, β=0.20 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.01, β=0.40 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.01, β=0.60 EZB FNEB 0 50 100 0 1 2 3 x 106 Num of sets (m) Cumulative spending time (slots) α=0.01, β=0.80 EZB FNEB Fig. 6. Cumulative number of slots for estimation versus the number of sets, while increasing β and holding α data sets. Let α denote the percentage of the size of each set to tmax, and β denote the percentage of the overlapped tags between two tag sets. In Fig. 5, we hold parameter α and change β to conduct the comparison, and vice versa in Fig. 6. From the results, we see that our scheme is more efficient than EZB in all tests. B. Additional Discussions This subsection covers some other issues whose details are omitted due to the page limit. 1) Accuracy requirements: In our simulation, we randomly select 1000 possible values for t, ranging from 1 to tmax. The results show that the estimation falling out of the range [t − ǫt, t + ǫt] only twice. The estimating accuracy holds with more than 1 − δ probability. 2) Scalability: The tag population may vary across many orders of magnitude, ranging from tens to thousands of tags. In our simulation, we consider the tag population varies in four scales of tmax: 100, 1000, 10000, and 100000. The results show the estimating time does not increase obviously. Our estimator scales well. 3) Signal loss: Our scheme leverages the first non-empty slot in a frame for estimation. In practice, when the link
TABLE IV TOTAL TIME(IN NUMBER OF SLOTS)FOR FIVE SINGLE SET ESTIMATORS.SINCE CSE AND UPE NEED TO IDENTIFY THE TYPE OF A SLOT,WE LIST THE DETAIL:EMPTY SLOTS,SINGLETON SLOTS,AND COLLISION SLOTS.FOR OTHERS,WE SIMPLY SHOW THE SUM. Number Total time (in number of slots) of tags CSE UPE EZB FNEB Enhanced FNEB empty singleton collision sum empty singleton collision sum sum sum sum 10 2220 530 305 3055 1135 384 71 1590 21.052 98.132 5526 50 2264 534 345 3143 53 269 416 840 21.052 91.808 5738 100 2277 642 328 3247 91 239 1050 1380 21,052 84.559 5732 500 1974 972 450 3396 151 380 1509 2040 21.052 46.525 5758 1000 1926 1373 704 4005 150 388 1592 2130 21,052 26,010 5683 5000 971 1822 4358 7151 147 406 1697 2250 21,052 6510 5661 quality is poor,the reader may not be able to detect [8]A.Juels,D.Molnar,and D.Wagner,"Security and privacy issues in the signal sent by RFID tags,resulting in the reader e-passports,"in SECURECOMM,2005. [9]RFID driver's licenses debated. [Onlinel. Available: possibly observing more empty slots.We can compensate http://www.wired.com/politics/security/news/2004/10/65243 by averaging the results over multiple rounds.In addition, [10]A.Juels,"RFID security and privacy:A research survey,"Manuscript, a learning phase can be adopted to characterize the link RSA Laboratories,September 2005. quality before estimation. [11]C.C.Tan,B.Sheng,and Q.Li,"Secure and serverless RFID authenti- cation and search protocols,"IEEE Transactions on Wireless Communi- 4)Active attacks:If an attacker can intentionally generate cations,2008. a reply in an arbitrary slot,there is no feasible solution [12]M.Kodialam and T.Nandagopal,"Fast and reliable estimation schemes to solve this problem till now,since all replies from in RFID systems,"in MOB/COM,2006,pp.322-333. [13]M.Kodialam,T.Nandagopal,and W.C.Lau,"Anonymous Tracking the legitimate tags may be corrupted by the attacker. using RFID tags,"in INFOCOM,2007. Therefore,active attacks are excluded in this paper. [14]Z.Bar-Yossef,T.S.Jayram,R.Kumar,D.Sivakumar,and L.Trevisan, "Counting distinct elements in a data stream,"in RANDOM,2002. VII.CONCLUSIONS [15]J.Zhai and G.-N.Wang,"An anti-collision algorithm using two- functioned estimation for RFID tags"in /CCS4(4)2005,pp.702-711. In this paper,we consider the problem of estimating the num- [16]J.Myung and W.Lee,"Adaptive splitting protocols for rfid tag collision ber of distinct tags without identifying each tag in a large scale arbitration,"in MOB/HOC,2006,pp.202-213. RFID system.We present a new scheme and its variations based [17]H.Vogt,"Efficient object identification with passive RFID tags,"in PERVASIVE,2002,Pp.98-113. on the probability of the position of the first reply from a group [18]C.Law,K.Lee,and K.-Y.Siu,"Efficient memoryless protocol for tag of tags.These schemes can be used to estimate tag population in identification,"in D/AL-M,2000,pp.75-84 both static and dynamic environments.Theoretical analysis and [19]D.Hush and C.Wood,"Analysis of tree algorithms for rfid arbitration," in1ST,1998. extensive simulation show our approach drastically improves [20]F.Zhou,C.Chen,D.Jin,C.Huang,and H.Min,"Evaluating and the time efficiency over prior schemes. optimizing power consumption of anti-collision protocols for applications in rfid systems,"in ISLPED,2004. ACKNOWLEDGMENTS [21]N.Abramson,"The Aloha system -another alterative for computer communications,"in AFIPS Conference,1970 We would like to thank all the reviewers for their helpful [22]J.-R.Cha and J.-H.Kim,"Novel anti-collision algorithms for fast object comments.This project was supported in part by US Na- identification in RFID system,"in /CPADS (2),2005,pp.63-67. tional Science Foundation grants CNS-0721443,CNS-0831904, [23]B.Zhen,M.Kobayashi,and M.Shimizu,"Framed ALOHA for multiple rfid objects identification,"IEICE Transactions 2005 CAREER Award CNS-0747108.the National High-Tech Re- [24]S.-R.Lee,S.-D.Joo,and C.-W.Lee,"An enhanced dynamic framed slot- search and Development Program of China(863)under Grant ted ALOHA algorithm for RFID tag identification,"in MOB/OUITOUS No.2006AA01Z199,the National Natural Science Foundation 2005,pp.166-174. [25]C.Qian,H.Ngan,and Y.Liu,"Cardinality estimation for large-scale of China under Grant No.90718031.No.60721002.No. RFID systems,"in PERCOM 08. 60573106 and the National Basic Research Program of China 26]H.Tijims,Understanding Probability:Chance Rules in Everyday Life. Cambridge University Press,2007. (973)under Grant No.2009CB320705. 27]M.Abramowitz and I.A.Stegun,Handbook of mathematical fimctions REFERENCES with Formulas,Graphs,and Mathematical Tables.Dover Publications, 1972. [1]L.Ni,Y.Liu,Y.C.Lau,and A.Patil,"Landmarc:indoor location sensing [28]"EPC radio-frequency identity protocols class-1 generation-2 UHF RFID using active RFID,"Percom '03. protocol for communications at 860 mhz -960 mhz version 1.1.0," [2]C.Wang.H.Wu,and N.-F.Tzeng,"RFID-based 3-d positioning EPCglobal,Tech.Rep.,2005. schemes,.”FOCOM'07. [3]C.-H.Lee and C.-W.Chung."Efficient storage scheme and query pro- cessing for supply chain management using RFID,"in S/GMOD 08. [4]A.Nemmaluri,M.D.Corner,and P.Shenoy,"Sherlock:automatically locating objects for humans,"in MobiSys 08. [5]L.Ravindranath,V.N.Padmanabhan,and P.Agrawal,"Sixthsense:RFID- based enterprise intelligence,"in MobiSys '08. [6]C.C.Tan,B.Sheng,and Q.Li,"How to monitor for missing RFID tags," in IEEE ICDCS,2008. [7]B.Sheng,C.C.Tan,Q.Li,and W.Mao,"Finding popular categoried for RFID tags,"in ACM Mobihoc,2008
TABLE IV TOTAL TIME (IN NUMBER OF SLOTS) FOR FIVE SINGLE SET ESTIMATORS. SINCE CSE AND UPE NEED TO IDENTIFY THE TYPE OF A SLOT, WE LIST THE DETAIL: EMPTY SLOTS, SINGLETON SLOTS, AND COLLISION SLOTS. FOR OTHERS, WE SIMPLY SHOW THE SUM. Number Total time (in number of slots) of tags CSE UPE EZB FNEB Enhanced FNEB empty singleton collision sum empty singleton collision sum sum sum sum 10 2220 530 305 3055 1135 384 71 1590 21,052 98,132 5526 50 2264 534 345 3143 155 269 416 840 21,052 91,808 5738 100 2277 642 328 3247 91 239 1050 1380 21,052 84,559 5732 500 1974 972 450 3396 151 380 1509 2040 21,052 46,525 5758 1000 1926 1375 704 4005 150 388 1592 2130 21,052 26,010 5683 5000 971 1822 4358 7151 147 406 1697 2250 21,052 6510 5661 quality is poor, the reader may not be able to detect the signal sent by RFID tags, resulting in the reader possibly observing more empty slots. We can compensate by averaging the results over multiple rounds. In addition, a learning phase can be adopted to characterize the link quality before estimation. 4) Active attacks: If an attacker can intentionally generate a reply in an arbitrary slot, there is no feasible solution to solve this problem till now, since all replies from the legitimate tags may be corrupted by the attacker. Therefore, active attacks are excluded in this paper. VII. CONCLUSIONS In this paper, we consider the problem of estimating the number of distinct tags without identifying each tag in a large scale RFID system. We present a new scheme and its variations based on the probability of the position of the first reply from a group of tags. These schemes can be used to estimate tag population in both static and dynamic environments. Theoretical analysis and extensive simulation show our approach drastically improves the time efficiency over prior schemes. ACKNOWLEDGMENTS We would like to thank all the reviewers for their helpful comments. This project was supported in part by US National Science Foundation grants CNS-0721443, CNS-0831904, CAREER Award CNS-0747108, the National High-Tech Research and Development Program of China (863) under Grant No. 2006AA01Z199, the National Natural Science Foundation of China under Grant No. 90718031, No. 60721002, No. 60573106 and the National Basic Research Program of China (973) under Grant No. 2009CB320705. REFERENCES [1] L. Ni, Y. Liu, Y. C. Lau, and A. Patil, “Landmarc: indoor location sensing using active RFID,” Percom ’03. [2] C. Wang, H. Wu, and N.-F. Tzeng, “RFID-based 3-d positioning schemes,” INFOCOM ’07. [3] C.-H. Lee and C.-W. Chung, “Efficient storage scheme and query processing for supply chain management using RFID,” in SIGMOD ’08. [4] A. Nemmaluri, M. D. Corner, and P. Shenoy, “Sherlock: automatically locating objects for humans,” in MobiSys ’08. [5] L. Ravindranath, V. N. Padmanabhan, and P. Agrawal, “Sixthsense: RFIDbased enterprise intelligence,” in MobiSys ’08. [6] C. C. Tan, B. Sheng, and Q. Li, “How to monitor for missing RFID tags,” in IEEE ICDCS , 2008. [7] B. Sheng, C. C. Tan, Q. Li, and W. Mao, “Finding popular categoried for RFID tags,” in ACM Mobihoc, 2008. [8] A. Juels, D. Molnar, and D. Wagner, “Security and privacy issues in e-passports,” in SECURECOMM, 2005. [9] RFID driver’s licenses debated. [Online]. Available: http://www.wired.com/politics/security/news/2004/10/65243 [10] A. Juels, “RFID security and privacy: A research survey,” Manuscript, RSA Laboratories, September 2005. [11] C. C. Tan, B. Sheng, and Q. Li, “Secure and serverless RFID authentication and search protocols,” IEEE Transactions on Wireless Communications, 2008. [12] M. Kodialam and T. Nandagopal, “Fast and reliable estimation schemes in RFID systems,” in MOBICOM, 2006, pp. 322–333. [13] M. Kodialam, T. Nandagopal, and W. C. Lau, “Anonymous Tracking using RFID tags,” in INFOCOM, 2007. [14] Z. Bar-Yossef, T. S. Jayram, R. Kumar, D. Sivakumar, and L. Trevisan, “Counting distinct elements in a data stream,” in RANDOM, 2002. [15] J. Zhai and G.-N. Wang, “An anti-collision algorithm using twofunctioned estimation for RFID tags,” in ICCSA (4), 2005, pp. 702–711. [16] J. Myung and W. Lee, “Adaptive splitting protocols for rfid tag collision arbitration,” in MOBIHOC, 2006, pp. 202–213. [17] H. Vogt, “Efficient object identification with passive RFID tags,” in PERVASIVE, 2002, pp. 98–113. [18] C. Law, K. Lee, and K.-Y. Siu, “Efficient memoryless protocol for tag identification,” in DIAL-M, 2000, pp. 75–84. [19] D. Hush and C. Wood, “Analysis of tree algorithms for rfid arbitration,” in ISIT, 1998. [20] F. Zhou, C. Chen, D. Jin, C. Huang, and H. Min, “Evaluating and optimizing power consumption of anti-collision protocols for applications in rfid systems,” in ISLPED, 2004. [21] N. Abramson, “The Aloha system - another alternative for computer communications,” in AFIPS Conference, 1970. [22] J.-R. Cha and J.-H. Kim, “Novel anti-collision algorithms for fast object identification in RFID system,” in ICPADS (2), 2005, pp. 63–67. [23] B. Zhen, M. Kobayashi, and M. Shimizu, “Framed ALOHA for multiple rfid objects identification,” IEICE Transactions 2005. [24] S.-R. Lee, S.-D. Joo, and C.-W. Lee, “An enhanced dynamic framed slotted ALOHA algorithm for RFID tag identification,” in MOBIQUITOUS, 2005, pp. 166–174. [25] C. Qian, H. Ngan, and Y. Liu, “Cardinality estimation for large-scale RFID systems,” in PERCOM ’08. [26] H. Tijims, Understanding Probability: Chance Rules in Everyday Life. Cambridge University Press, 2007. [27] M. Abramowitz and I. A. Stegun, Handbook of mathematical functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, 1972. [28] “EPC radio-frequency identity protocols class-1 generation-2 UHF RFID protocol for communications at 860 mhz - 960 mhz version 1.1.0,” EPCglobal, Tech. Rep., 2005