II.RELATED WORK backscatters a message or keeps silent according to the reader's One closely related problem is RFID identification which command.Such a communication model has been widely aims at collecting the tag IDs of all RFIDs in the interro- adopted in many RFID systems compliant with the de facto gation area [7,19,20,24,29].The identification protocols EPCglobal C1G2 standard [2]. based on collision arbitration can generally be classified into The practical communication channel is mostly error-prone two categories:Ahola-based protocols [20]and Tree-based depending on various factors including transmission power, protocols [7,29].In small-scale RFID systems,we may interrogation distance,antenna gain,interference,etc [6]. directly apply the identification protocols to count the exact tag Due to the channel attenuation even if there are some tags cardinality.Nevertheless,such a method becomes infeasible transmitting back responses,the reader may fail to detect them due to the low efficiency of identification protocols.Rather in practice.We call such missing detection errors as false than identifying all the tags,probabilistic estimation protocols negatives.On the other hand,the reader may falsely detect a estimate the number of tags which meets the customized busy channel due to the interferences,even when no response accuracy requirement. is transmitted.We call such errors as false positives. Kodialam and Nandagopal present Unified Simple Estimator (USE)and Unified Probabilistic Estimator(UPE)[12].Those B.Problem description schemes are vulnerable to multiple counting problems when The objective of this work is to efficiently and accurately multiple RFID readers are deployed to cover the interrogation estimate the tag cardinality.To meet stringent realtime re- region.Besides,the protocols require a cardinality upper bound known in advance.Kodialam et al.propose Enhanced quirement,the estimation protocol should compute an accurate tag cardinality in an efficient manner.Since the cardinality of Zero-Based (EZB)estimator to estimate relatively large num- ber of tags [13]. RFID tags could easily grow up to tens of thousands (e.g.,a Recent probabilistic estimation approaches achieve typical port inventory application concerns hundreds of con- tainers,each of which contains a large number of products), O(log n)estimation efficiency to the total number of RFID we need to design a scalable and efficient estimation approach tags n.Han et al.propose the First Non-Empty slot Based that can guarantee the customized accuracy requirement. (FNEB)estimator with binary search method [11].Qian et al. [16]propose the Lottery Frame (LoF)based estimator,which Consistent with the existing approaches [11,16,27],the is a replicate-insensitive estimation protocol.Both approaches accuracy requirement is presented by (e,6)-approximation.We denote by n the estimated number of the tag cardinality while require the tags to cooperate with the reader by generating large volume of random numbers and respond accordingly. the actual number is n.Given the accuracy requirement of One most recent protocol,Probabilistic Estimating Tree (PET) (,0)-approximation,we expect an estimation result n,which based estimator [27],advances the estimation efficiency and satisfies Pr-n<En}>1-6.For example,when achieves O(log logn)processing time efficiency to the tag the actual tag cardinality is 10000,=5%,6 =1%)- cardinality n.Such approaches only consider an ideal wireless approximation expects an estimation result within the interval 9500,10500 with a probability of 99%and above. channel which is free of transmission error. We abstract the time efficiency with the total time slots used III.SYSTEM MODEL AND PROBLEM DESCRIPTION to estimate the cardinality.Most recent approaches only need to distinguish an idle slot from a busy slot [11,16,27].The A.System model smaller number of time slots means the shorter processing time We consider a large-scale RFID system consisting of three and thus the higher time efficiency,and vice versa.Meanwhile, major components:a large volume of RFID tags,several RFID we seek to reduce the computation and memory burden at readers,and a backend server connecting the readers.Multiple the RFID tags to facilitate the use of low-cost but resource- RFID readers are normally deployed to ensure a full coverage constrained passive tags rather than expensive active ones. in large-scale RFID systems.The backend server coordinates While most recent estimation protocols assume zero error the RFID readers and initiates the cardinality estimation pro- rates in the underlying wireless channel,we target at practical cess.The RFID readers relay the commands received from channel conditions with false negatives and false positives. the backend server and broadcast to tags,and later report When the bit error rate is high,it becomes very difficult to the tags'responses back to the server.When coordinated and derive accurate cardinality estimation.Nevertheless,when the synchronized,the multiple readers can logically be considered channel is in mild conditions,we expect that the estimation as one RFID reader.The RFID system may use lightweight protocol computes a reasonably accurate estimation result. passive RFID tags or powerful active ones. RFID systems may operate over a wide variety of frequency IV.ESTIMATION PROTOCOL bands (e.g.,13.56/433/900MHz).We exclusively focus on the RFID systems operating in the 900MHz ultra-high frequency In this section,we first discuss the design principle of band.We assume that the RFID system works on frame-slotted ZOE protocol.We then consolidate the essential idea with Aloha model.RFID readers initiate interrogation by sending a cardinality estimation protocol,which provides O(1)time operation codes and specifying PHY/MAC parameters.When efficiency for each estimation round.Table I summarizes the energized by the continuous waves from reader,each tag key notations used across this paper.II. RELATED WORK One closely related problem is RFID identification which aims at collecting the tag IDs of all RFIDs in the interrogation area [7, 19, 20, 24, 29]. The identification protocols based on collision arbitration can generally be classified into two categories: Ahola-based protocols [20] and Tree-based protocols [7, 29]. In small-scale RFID systems, we may directly apply the identification protocols to count the exact tag cardinality. Nevertheless, such a method becomes infeasible due to the low efficiency of identification protocols. Rather than identifying all the tags, probabilistic estimation protocols estimate the number of tags which meets the customized accuracy requirement. Kodialam and Nandagopal present Unified Simple Estimator (USE) and Unified Probabilistic Estimator (UPE) [12]. Those schemes are vulnerable to multiple counting problems when multiple RFID readers are deployed to cover the interrogation region. Besides, the protocols require a cardinality upper bound known in advance. Kodialam et al. propose Enhanced Zero-Based (EZB) estimator to estimate relatively large number of tags [13]. Recent probabilistic estimation approaches achieve O(log n) estimation efficiency to the total number of RFID tags n. Han et al. propose the First Non-Empty slot Based (FNEB) estimator with binary search method [11]. Qian et al. [16] propose the Lottery Frame (LoF) based estimator, which is a replicate-insensitive estimation protocol. Both approaches require the tags to cooperate with the reader by generating large volume of random numbers and respond accordingly. One most recent protocol, Probabilistic Estimating Tree (PET) based estimator [27], advances the estimation efficiency and achieves O(log log n) processing time efficiency to the tag cardinality n. Such approaches only consider an ideal wireless channel which is free of transmission error. III. SYSTEM MODEL AND PROBLEM DESCRIPTION A. System model We consider a large-scale RFID system consisting of three major components: a large volume of RFID tags, several RFID readers, and a backend server connecting the readers. Multiple RFID readers are normally deployed to ensure a full coverage in large-scale RFID systems. The backend server coordinates the RFID readers and initiates the cardinality estimation process. The RFID readers relay the commands received from the backend server and broadcast to tags, and later report the tags’ responses back to the server. When coordinated and synchronized, the multiple readers can logically be considered as one RFID reader. The RFID system may use lightweight passive RFID tags or powerful active ones. RFID systems may operate over a wide variety of frequency bands (e.g., 13.56/433/900MHz). We exclusively focus on the RFID systems operating in the 900MHz ultra-high frequency band. We assume that the RFID system works on frame-slotted Aloha model. RFID readers initiate interrogation by sending operation codes and specifying PHY/MAC parameters. When energized by the continuous waves from reader, each tag backscatters a message or keeps silent according to the reader’s command. Such a communication model has been widely adopted in many RFID systems compliant with the de facto EPCglobal C1G2 standard [2]. The practical communication channel is mostly error-prone depending on various factors including transmission power, interrogation distance, antenna gain, interference, etc [6]. Due to the channel attenuation even if there are some tags transmitting back responses, the reader may fail to detect them in practice. We call such missing detection errors as false negatives. On the other hand, the reader may falsely detect a busy channel due to the interferences, even when no response is transmitted. We call such errors as false positives. B. Problem description The objective of this work is to efficiently and accurately estimate the tag cardinality. To meet stringent realtime requirement, the estimation protocol should compute an accurate tag cardinality in an efficient manner. Since the cardinality of RFID tags could easily grow up to tens of thousands (e.g., a typical port inventory application concerns hundreds of containers, each of which contains a large number of products), we need to design a scalable and efficient estimation approach that can guarantee the customized accuracy requirement. Consistent with the existing approaches [11, 16, 27], the accuracy requirement is presented by (ε,δ)-approximation. We denote by nˆ the estimated number of the tag cardinality while the actual number is n. Given the accuracy requirement of (ε,δ)-approximation, we expect an estimation result nˆ, which satisfies Pr{|nˆ − n| ≤ εn} ≥ 1 − δ. For example, when the actual tag cardinality is 10000, (ε = 5%, δ = 1%)- approximation expects an estimation result within the interval [9500, 10500] with a probability of 99% and above. We abstract the time efficiency with the total time slots used to estimate the cardinality. Most recent approaches only need to distinguish an idle slot from a busy slot [11, 16, 27]. The smaller number of time slots means the shorter processing time and thus the higher time efficiency, and vice versa. Meanwhile, we seek to reduce the computation and memory burden at the RFID tags to facilitate the use of low-cost but resourceconstrained passive tags rather than expensive active ones. While most recent estimation protocols assume zero error rates in the underlying wireless channel, we target at practical channel conditions with false negatives and false positives. When the bit error rate is high, it becomes very difficult to derive accurate cardinality estimation. Nevertheless, when the channel is in mild conditions, we expect that the estimation protocol computes a reasonably accurate estimation result. IV. ESTIMATION PROTOCOL In this section, we first discuss the design principle of ZOE protocol. We then consolidate the essential idea with a cardinality estimation protocol, which provides O(1) time efficiency for each estimation round. Table I summarizes the key notations used across this paper