Smartphone Privacy Leakage of Social Relationships and Demographics from Surrounding Access Points

Chen Wang∗, Chuyu Wang∗†, Yingying Chen∗, Lei Xie† and Sanglu Lu†
∗Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ, USA
{cwang42, yingying.chen}@stevens.edu
†State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, China
wangcyu217@dislab.nju.edu.cn, {lxie, sanglu}@nju.edu.cn

Abstract—While mobile users enjoy anytime, anywhere Internet access by connecting their mobile devices through Wi-Fi services, the increasing deployment of access points (APs) has raised a number of privacy concerns. This paper explores the potential of smartphone privacy leakage caused by surrounding APs. In particular, we study to what extent users’ personal information, such as social relationships and demographics, could be revealed by leveraging simple signal information from APs without examining the Wi-Fi traffic. Our approach utilizes users’ activities at daily visited places, derived from the surrounding APs, to infer users’ social interactions and individual behaviors. Furthermore, we develop two new mechanisms: the Closeness-based Social Relationships Inference algorithm captures how closely people interact with each other by evaluating their physical closeness and derives fine-grained social relationships, whereas the Behavior-based Demographics Inference method differentiates various individual behaviors via the extracted activity features (e.g., activeness and time slots) at each daily place to reveal users’ demographics.
Extensive experiments conducted with 21 participants in their real daily lives, covering 257 different places in three cities over a 6-month period, demonstrate that the simple signal information from surrounding APs has a high potential to reveal people’s social relationships and to infer demographics with over 90% accuracy when using our approach.

I. INTRODUCTION

Wi-Fi networks are becoming increasingly pervasive, to the point where public Wi-Fi access is readily available in numerous cities [1]. The number of public Wi-Fi access points (APs) is expected to hit 340 million globally by 2018, resulting in one public Wi-Fi AP for every twenty people worldwide [2]. More commonly, retail stores, offices, universities and homes are Wi-Fi enabled to provide high-bandwidth and cost-effective Internet connectivity for mobile users. While mobile users enjoy anytime, anywhere Internet access by connecting their mobile devices (e.g., smartphones) to Wi-Fi networks, the surrounding APs have raised a number of privacy concerns. For example, mobile users could be located and tracked based on the ubiquitous APs, such as by using the Google location service [3].

In this work, we study the potential of privacy leakage caused by surrounding APs and explore to what extent personal information, in particular users’ social relationships and demographics, could be derived. Prior work on demographics inference based on Wi-Fi networks mainly relies on the context information obtained from passively sniffed user Wi-Fi traffic [4], [5]. For example, Cheng et al. examine users’ Internet browsing activities by collecting their in-the-air traffic in public hotspots [4], whereas Huaxin et al. infer user demographic information by passively sniffing Wi-Fi traffic meta-data [5]. These methods need to examine the Wi-Fi traffic and thus do not scale to a large number of users because of the high deployment overhead involved.
Existing work on social relationships inference primarily depends on the encounter events detected by Bluetooth [6], Wi-Fi SSID lists [7], or GPS locations [8]. These approaches can only perform coarse-grained social relationships inference by examining whether users interact at all, rather than studying users’ behaviors and how closely they interact with each other. They can neither provide fine-grained social relationships (such as advisor-student, colleagues, friends, husband-wife, neighbors) nor identify the specific role of the user in the relationship.

It is known that GPS, motion sensors and contact lists on mobile devices can leak private information, but how much of a user’s privacy could be leaked from the ubiquitous access points is unclear. In this work, we demonstrate that by examining the simple signal features of the surrounding APs, it is possible to infer users’ fine-grained social relationships and demographics without sniffing any Wi-Fi traffic. Specifically, mobile devices periodically scan for the surrounding Wi-Fi APs by default, in order to optimize network service by continuously seeking better Wi-Fi signals and remembered APs [9], [10], and accessing such information only requires a common permission, which is considered low risk [11]. Signal features such as the time series of BSSIDs (i.e., MAC addresses) and Received Signal Strength (RSS) are then extracted from these scanned APs and analyzed to derive users’ activities at daily visited places. Our system exploits the rich information of users’ daily interactions and