TABLE I Section V considers additional security issues not directly UNENCRYPTED TABLE IN DATABASE related to user privacy.Finally we review the related work in Section VI and conclude in Section VII D Time Location 101 10.00am Office 1■ II.RELATED WORK 10210.00am Office 3 101 10:5am Office 2 Privacy protection is an important component in ubiquitous computing environments [9],[10].The technique of using mix zones was proposed by [11],[12]where by users could specify certain areas where nobody could trace their movements.Other time is the time that tag was read,and location is the physical researchers [13],[14],[15]proposed techniques to help users location corresponding to that particular reader.We assume define privacy policies.This approach is more flexible than that the database knows the locations of all the RFID readers. mix zones at the cost of additional complexity in specifying and can associate the data from a reader to a location. the policies.The idea of allowing users to specify "virtual The database can represent the data from all the RFID walls"was suggested by [16]to simplify the creation of a readers as a table with 3 attributes,ID,Time,and Location. privacy policy.Physical access control [17],[18]addresses Table I illustrates this database.A user who own tag 101,can the problem of specifying a privacy policy.Users can only query this database later by issuing a query, obtain the location information of people that are were present together at the same time.The intuition is that users that were Select from DB where ID=101 and Time=10:00 am at the same location at the same time already know each and determine he was at Office 1 at 10 am. other's presence,and thus there is no privacy issues when releasing that information later.Our work differs from these Such a setup provides no privacy protections,since anyone proposed techniques in that we do not rely on trusted servers to can query the database to determine anybody's movements. protect user privacy.Our idea of separating location,time,and Under the trusted server assumption,we can protect privacy identity is similar to that proposed by [19].but our solutions by associating a password with each ID.The user running are designed to work with RFID tags. the above query for instance,will have to supply the correct RFID security is an active area of research with many password associated with ID=101,before the database will different protocols being proposed [20],[21].[22].While our release the information.While more complex schemes can paper also proposes a simple security protocol,our focus is be designed to provide better access control,a fundamental less on the security and privacy between RFID reader and tag, requirement is that the database cannot be compromised.An but oriented more towards data already collected and archived. adversary with access to Table I learns everything. Closely related to our paper is research on searching en- In this paper,we consider the problem of how to protect crypted data.In this problem,a user encrypts his data and privacy in the event of such a server is compromised by an stores it at an untrusted server.The user wants to be able to adversary. search of part of his data in an efficient manner.Since the A.Adversary model server is untrustworthy,the user cannot send over his secret key.The user also cannot request the server to transmit all the We assume that the adversary seeks only to track the encrypted data back since it is inefficient.An search system movements of a user.The adversary succeeds if he is able using symmetric key to encrypt data was proposed by [23],to extract the identity of the user from the database,or if he while [24]suggested a public key based scheme.Practical is able to link two entries in the database to the same user. encrypted database query retrieval systems were proposed We assume that the adversary can have free access to the by [25],[26].However,unlike our paper,prior research in this database data such as in Table I,as well as observe the area do not consider the privacy implications of ubiquitous database interactions between a user and the database.The environments such as malicious tracking of users.This was adversary is however,unable to determine the identity of shown in the second strawman approach by using [26]as someone querying the database.For instance,the adversary an example.Furthermore,these prior techniques assume that cannot deduce a user identity through the MAC address of more advance hardware such as laptops are used,rather than the user's device when the user is querying the database. computational weak RFID tags. Techniques such as an anonymizing network [27]can be deployed to achieve this.Also,the adversary cannot reprogram III.PROBLEM FORMULATION the database to execute functions it otherwise will not perform. We consider a network of RFID readers,R1,...,Rn,are In our paper,we assume that the RFID tags are able to per- deployed throughout a facility.Each reader is programmed form simple operations such as generating random numbers, to periodically broadcast a query to read all the RFID tags perform a hash function,and XOR two bitstrings together. within its vicinity.The captured data is then forwarded to a These are common assumptions made in RFID security lit- backend database for storage.In the unencrypted scenario,the erature [28].[29],[30].While our solution in the paper is database stores this information as a ID:time location presented using hash functions,symmetric key encryption can tuple,where ID is a number that identifies that RFID tag be substituted instead with minor modifications. 54Section V considers additional security issues not directly related to user privacy. Finally we review the related work in Section VI and conclude in Section VII. II. RELATED WORK Privacy protection is an important component in ubiquitous computing environments [9], [10]. The technique of using mix zones was proposed by [11], [12] where by users could specify certain areas where nobody could trace their movements. Other researchers [13], [14], [15] proposed techniques to help users define privacy policies. This approach is more flexible than mix zones at the cost of additional complexity in specifying the policies. The idea of allowing users to specify “virtual walls” was suggested by [16] to simplify the creation of a privacy policy. Physical access control [17], [18] addresses the problem of specifying a privacy policy. Users can only obtain the location information of people that are were present together at the same time. The intuition is that users that were at the same location at the same time already know each other’s presence, and thus there is no privacy issues when releasing that information later. Our work differs from these proposed techniques in that we do not rely on trusted servers to protect user privacy. Our idea of separating location, time, and identity is similar to that proposed by [19], but our solutions are designed to work with RFID tags. RFID security is an active area of research with many different protocols being proposed [20], [21], [22]. While our paper also proposes a simple security protocol, our focus is less on the security and privacy between RFID reader and tag, but oriented more towards data already collected and archived. Closely related to our paper is research on searching encrypted data. In this problem, a user encrypts his data and stores it at an untrusted server. The user wants to be able to search of part of his data in an efficient manner. Since the server is untrustworthy, the user cannot send over his secret key. The user also cannot request the server to transmit all the encrypted data back since it is inefficient. An search system using symmetric key to encrypt data was proposed by [23], while [24] suggested a public key based scheme. Practical encrypted database query retrieval systems were proposed by [25], [26]. However, unlike our paper, prior research in this area do not consider the privacy implications of ubiquitous environments such as malicious tracking of users. This was shown in the second strawman approach by using [26] as an example. Furthermore, these prior techniques assume that more advance hardware such as laptops are used, rather than computational weak RFID tags. III. PROBLEM FORMULATION We consider a network of RFID readers, R1, · · · , Rn, are deployed throughout a facility. Each reader is programmed to periodically broadcast a query to read all the RFID tags within its vicinity. The captured data is then forwarded to a backend database for storage. In the unencrypted scenario, the database stores this information as a ID : time : location tuple, where ID is a number that identifies that RFID tag, TABLE I UNENCRYPTED TABLE IN DATABASE ID Time Location 1. 101 10:00am Office 1 2. 102 10:00am Office 3 3. 101 10:15am Office 2 · · · · · · · · · · · · time is the time that tag was read, and location is the physical location corresponding to that particular reader. We assume that the database knows the locations of all the RFID readers, and can associate the data from a reader to a location. The database can represent the data from all the RFID readers as a table with 3 attributes,ID, Time, and Location. Table I illustrates this database. A user who own tag 101, can query this database later by issuing a query, Select * from DB where ID=101 and Time=10:00 am and determine he was at Office 1 at 10 am. Such a setup provides no privacy protections, since anyone can query the database to determine anybody’s movements. Under the trusted server assumption, we can protect privacy by associating a password with each ID. The user running the above query for instance, will have to supply the correct password associated with ID=101, before the database will release the information. While more complex schemes can be designed to provide better access control, a fundamental requirement is that the database cannot be compromised. An adversary with access to Table I learns everything. In this paper, we consider the problem of how to protect privacy in the event of such a server is compromised by an adversary. A. Adversary model We assume that the adversary seeks only to track the movements of a user. The adversary succeeds if he is able to extract the identity of the user from the database, or if he is able to link two entries in the database to the same user. We assume that the adversary can have free access to the database data such as in Table I, as well as observe the database interactions between a user and the database. The adversary is however, unable to determine the identity of someone querying the database. For instance, the adversary cannot deduce a user identity through the MAC address of the user’s device when the user is querying the database. Techniques such as an anonymizing network [27] can be deployed to achieve this. Also, the adversary cannot reprogram the database to execute functions it otherwise will not perform. In our paper, we assume that the RFID tags are able to perform simple operations such as generating random numbers, perform a hash function, and XOR two bitstrings together. These are common assumptions made in RFID security literature [28], [29], [30]. While our solution in the paper is presented using hash functions, symmetric key encryption can be substituted instead with minor modifications. 54