behaviors embedded in these derived activities and discloses fine-grained social relationships (including advisor-student, supervisor-employee, colleagues, friends, husband-wife and neighbors) as well as demographic information (such as occupation, gender, religion and marital status).

Our approach of using simple signal features of APs can be easily applied to a large number of users. For example, advertisers or third-party companies could mine users’ personal information for targeted advertising or service recommendations. However, such an approach could cause significant privacy leakage if exploited by advertisers with aggressive business aims: they could simply publish free apps that actively collect users’ surrounding AP information and send it back to a server to derive users’ social relationships and demographics.

In particular, we describe people’s daily places in three dimensions (i.e., temporal, spatial and contextual) to infer people’s activities at each place. For users performing activities at the same place, we calculate the physical closeness of the users (e.g., whether they stay in the same room, in adjacent rooms or inside the same building) and extract users’ activeness (e.g., walking around or sitting) together with other features (e.g., time slots and duration) to characterize their activities at daily places.
We then develop a Closeness-based Social Relationships Inference algorithm to capture where, when and how closely people interact to derive fine-grained social relationships. We design a Behavior-based Demographics Inference method to capture individual behavior based on users’ various daily activities to reveal demographic information including occupation, gender, religion and marital status. We conduct extensive experiments with 21 participants carrying their smartphones to collect surrounding Wi-Fi AP information in their real daily life across three cities over 6 months and study to what extent we can derive these participants’ social relationships and demographic information.

We summarize our main contributions as follows:

• We demonstrate that simple signal information (e.g., time series of MAC addresses and RSS) from users’ surrounding Wi-Fi APs can reveal private information including both social relationships and demographics.
• We develop statistical methods to detect and characterize users’ daily visited places based on the AP signal information and further infer the context of daily places by deriving users’ activity features (e.g., activeness, time slots and duration).
• We design a closeness-based social relationships inference algorithm to analyze when, where and how closely users interact with each other and reveal users’ detailed social relationships (e.g., advisor-student, supervisor-employee, colleagues, friends, husband-wife, customer relationship and neighbors).
• We further abstract people’s various behaviors (e.g., home, working and leisure behaviors) to infer their demographic information such as occupation, gender, religion and marital status.
• We show, in an experimental study of 21 participants, that our system achieves over 91% accuracy in inferring social relationships and over 90% accuracy in deriving demographic information by examining simple signal features from surrounding APs.

II. RELATED WORK

In this work, we aim to understand the privacy leakage of smartphone users, in particular discovering users’ social relationships and demographics, by analyzing only the availability of surrounding APs without sniffing any Wi-Fi traffic. Obtaining such information requires only limited permissions, rather than turning on GPS or accessing contact lists. Our work is related to research efforts that use various information collected from Wi-Fi networks and/or smartphones for meaningful places extraction [12]–[15], social relationships inference [6], [7], [16]–[18], and demographics derivation [4], [5], [19].

As the contextual location can be used for learning a person’s interests and providing content-aware applications, there have been active studies on extracting the contextual meaning of the locations people visit. For example, Kang et al. design a cluster-based method to extract meaningful places from traces of location coordinates collected from GPS and a Wi-Fi-based indoor location system [12]. Kim et al. propose SensLoc, which utilizes a combination of acceleration, Wi-Fi, and GPS sensors to find semantic places, detect user movements, and track travel paths [13]. These existing methods, however, focus only on individual users’ visited locations without analyzing the interactions between users. Besides, the obtained meaningful places may not be sufficient to infer higher-level personal information, such as fine-grained social relationships and demographics, due to the lack of information about the users’ daily behaviors and social interactions.

Information in Wi-Fi networks and smartphones has been used in the literature to infer users’ social relationships. For example, Wiese et al. [16] use the smartphone contact list to mine personal relationships. Moreover, the similarity of smartphones’ SSID lists is used to reveal users’ social relationships [7].
These methods can only derive coarse-grained social relationships without analyzing the behaviors and interactions among people. Vicinity detection via Bluetooth or Wi-Fi signals opens opportunities for social interaction analysis, and the strength of friendship ties can be inferred from such wireless signals [6], [18]. However, these vicinity detection methods only consider the relative interaction between people without interaction context (e.g., place context and behaviors). They are unable to differentiate specific types of social relationships, such as family members and friends. Our previous work focuses on extracting social relationships from information leaked by smartphone apps, such as GPS location, IMEI and network location [20]. It could only derive the social relationships in a coarse-grained manner. In this paper, we take a closer look and study the privacy leakage from the surrounding APs alone, and derive people’s activities and various closeness levels of social interactions for inferring detailed relationships and demographic information.
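
As an illustration only, the idea of grading closeness levels from surrounding APs can be sketched as follows. This is a minimal sketch under stated assumptions, not the inference algorithm of this paper: the function names, the choice of Jaccard overlap and mean RSS difference as features, the "apart" label, and all thresholds are hypothetical and chosen only for the example. Each scan is assumed to be a mapping from AP MAC address to RSS in dBm, taken by two users at the same time.

```python
from typing import Dict

# Hypothetical representation: one Wi-Fi scan = {AP MAC address: RSS in dBm}.
Scan = Dict[str, float]


def jaccard(a: Scan, b: Scan) -> float:
    """Fraction of all observed APs that appear in both scans."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return len(set(a) & set(b)) / len(union)


def mean_rss_gap(a: Scan, b: Scan) -> float:
    """Mean absolute RSS difference (dB) over APs seen in both scans."""
    common = set(a) & set(b)
    if not common:
        return float("inf")
    return sum(abs(a[mac] - b[mac]) for mac in common) / len(common)


def closeness_level(
    a: Scan,
    b: Scan,
    overlap_min: float = 0.5,    # assumed threshold, not from the paper
    same_room_gap: float = 5.0,  # assumed threshold (dB), not from the paper
    adjacent_gap: float = 12.0,  # assumed threshold (dB), not from the paper
) -> str:
    """Map two simultaneous scans to a coarse closeness label."""
    overlap = jaccard(a, b)
    if overlap == 0.0:
        return "apart"  # no AP in common at all
    gap = mean_rss_gap(a, b)
    if overlap >= overlap_min and gap <= same_room_gap:
        return "same room"
    if overlap >= overlap_min and gap <= adjacent_gap:
        return "adjacent rooms"
    return "same building"  # some shared APs, but weak agreement


# Made-up scans for illustration.
alice = {"aa:01": -45.0, "aa:02": -60.0, "aa:03": -72.0}
bob = {"aa:01": -47.0, "aa:02": -63.0, "aa:03": -70.0}
print(closeness_level(alice, bob))
```

In practice, a single pair of scans is noisy; a real system would presumably aggregate such features over time slots and durations before drawing any conclusion about how closely two people interact.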