Computer Networks 83(2015)184-198 Contents lists available at ScienceDirect Computer Computer Networks Networks ELSEVIER journal homepage:www.elsevier.com/locate/comnet MobiCache:Cellular traffic offloading leveraging cooperative CrossMark caching in mobile social networks Sheng Zhang3*,Jie Wu b,Zhuzhong Qian3,Sanglu Lu State Key Laboratory for Novel Software Technology.Nanjing University.China PDepartment of Computer and Information Sciences,Temple University.USA ARTICLE INFO ABSTRACT Article history: Offloading cellular traffic through mobile social networks has arisen as a promising way for Received 3 November 2014 relieving cellular networks.Prior studies mainly focused on caching data in a number of Received in revised form 10 February 2015 pre-selected helpers.However,such a strategy would fail when mobile users enter and Accepted 2 March 2015 leave the target area over time.In this paper,we examine the research decisions and design Available online 20 March 2015 tradeoffs that arise when offloading cellular traffic in such a dynamic area of interest, referred to as a MobiArea,and we design an offloading framework,MobiCache,for maximiz- Keywords: Cellular traffic offloading ing cellular operators'revenues and minimizing the overhead imposed on mobile devices. Content floating On the user side,we propose a content floating-based cooperative caching strategy that Cooperative caching caches data in geographical floating circles.instead of selected helpers in previous studies. Mobile social networks to cope with the dynamics.A geographical routing scheme is designed for delivering data and queries towards floating circles.We also develop a cache replacement scheme to improve caching cost-effectiveness inside floating circles.On the operator side.query his- tory and feedback are maintained for cellular operators to optimize framework parameters that maximize their revenues.Extensive trace-driven simulations show that,compared with a state-of-the-art scheme,MobiCache offloads up to 52%more traffic with 15%shorter delay and 6%less forwarding cost. 2015 Elsevier B.V.All rights reserved. 1.Introduction already made several measures for throttling its customers [2].One straightforward solution to tackling this problem The past few years have witnessed the explosive popu- would be deploying more base stations to expand cellular larity of smart-phones and tablets.According to Cisco network capacity [3].which is limited,however,since it Visual Networking Index (VNI)[1,the mobile data traffic requires high financial input but gets low and diminishing generated in 2014 was nearly 30 times the size of the returns.Addressing this problem needs some paradigm-al- entire global Internet in 2000,and global mobile traffic will tering approaches. grow at a compound annual growth rate of 57 percent Due to the inherent proximity-based sharing ability of from 2014 to 2019;in other words,it will increase 10-fold mobile devices and the delay-tolerant nature of many cel- and reach 24.3 exabytes per month by 2019.This huge lular contents,offloading cellular traffic through mobile amount of data traffic,as a consequence,has degraded ser- social networks (MSNs)[4]has arisen recently as a promis- vice quality and created immense pressure on the limited ing method for relieving cellular networks [5-11].When spectrum of cellular networks.For example,AT&T has users are rewarded with proper incentives [9.10],it is very likely that they will be willing to wait for a predetermined Corresponding author.TeL:+8625 83681369. delay to obtain some cellular contents,and to help store- E-mail addresses:sheng@nju.edu.cn (S.Zhang).jiewu@temple.edu carry-forward them.Traffic offloading then can be realized (J.Wu).qzz@nju.edu.cn (Z.Qian).sanglu@nju.edu.cn (S.Lu). http://dx.doi.org/10.1016/j.comnet.2015.03.011 1389-1286/2015 Elsevier B.V.All rights reserved
MobiCache: Cellular traffic offloading leveraging cooperative caching in mobile social networks Sheng Zhang a,⇑ , Jie Wu b , Zhuzhong Qian a , Sanglu Lu a a State Key Laboratory for Novel Software Technology, Nanjing University, China bDepartment of Computer and Information Sciences, Temple University, USA article info Article history: Received 3 November 2014 Received in revised form 10 February 2015 Accepted 2 March 2015 Available online 20 March 2015 Keywords: Cellular traffic offloading Content floating Cooperative caching Mobile social networks abstract Offloading cellular traffic through mobile social networks has arisen as a promising way for relieving cellular networks. Prior studies mainly focused on caching data in a number of pre-selected helpers. However, such a strategy would fail when mobile users enter and leave the target area over time. In this paper, we examine the research decisions and design tradeoffs that arise when offloading cellular traffic in such a dynamic area of interest, referred to as a MobiArea, and we design an offloading framework, MobiCache, for maximizing cellular operators’ revenues and minimizing the overhead imposed on mobile devices. On the user side, we propose a content floating-based cooperative caching strategy that caches data in geographical floating circles, instead of selected helpers in previous studies, to cope with the dynamics. A geographical routing scheme is designed for delivering data and queries towards floating circles. We also develop a cache replacement scheme to improve caching cost-effectiveness inside floating circles. On the operator side, query history and feedback are maintained for cellular operators to optimize framework parameters that maximize their revenues. Extensive trace-driven simulations show that, compared with a state-of-the-art scheme, MobiCache offloads up to 52% more traffic with 15% shorter delay and 6% less forwarding cost. 2015 Elsevier B.V. All rights reserved. 1. Introduction The past few years have witnessed the explosive popularity of smart-phones and tablets. According to Cisco Visual Networking Index (VNI) [1], the mobile data traffic generated in 2014 was nearly 30 times the size of the entire global Internet in 2000, and global mobile traffic will grow at a compound annual growth rate of 57 percent from 2014 to 2019; in other words, it will increase 10-fold and reach 24.3 exabytes per month by 2019. This huge amount of data traffic, as a consequence, has degraded service quality and created immense pressure on the limited spectrum of cellular networks. For example, AT&T has already made several measures for throttling its customers [2]. One straightforward solution to tackling this problem would be deploying more base stations to expand cellular network capacity [3], which is limited, however, since it requires high financial input but gets low and diminishing returns. Addressing this problem needs some paradigm-altering approaches. Due to the inherent proximity-based sharing ability of mobile devices and the delay-tolerant nature of many cellular contents, offloading cellular traffic through mobile social networks (MSNs) [4] has arisen recently as a promising method for relieving cellular networks [5–11]. When users are rewarded with proper incentives [9,10], it is very likely that they will be willing to wait for a predetermined delay to obtain some cellular contents, and to help storecarry-forward them. Traffic offloading then can be realized http://dx.doi.org/10.1016/j.comnet.2015.03.011 1389-1286/ 2015 Elsevier B.V. All rights reserved. ⇑ Corresponding author. Tel.: +86 25 83681369. E-mail addresses: sheng@nju.edu.cn (S. Zhang), jiewu@temple.edu (J. Wu), qzz@nju.edu.cn (Z. Qian), sanglu@nju.edu.cn (S. Lu). Computer Networks 83 (2015) 184–198 Contents lists available at ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet
S.Zhang et aL /Computer Networks 83(2015)184-198 185 through caching:some users intentionally cache cellular The remainder of this paper is organized as follows.We contents,which can be used to satisfy future queries from go over related work and our contributions in Section 2. other users We provide the system model and motivate our work in An important concern of this paradigm is that it may Section 3.Section 4 overviews the proposed framework. incur a long delay.Obviously not all cellular contents could Sections 5 and 6 present the details of our framework. bear long delays.But there are a large body of contents We discuss known issues and extensions in Section 7. (e.g,optional software update,interesting videos)that Before concluding the paper in Section 9,we evaluate the not only permit offloading,but for which it is essential to do performance of the framework in Section 8. so in order to relieve cellular networks and lower users'sub- scription fees.Prior studies mainly focus on placing data items in several selected helpers [5.10.12].so that future 2.Related work and our contributions queries can be responded to without signaling between cellular networks and users.However,if mobile users enter There is a rich heritage of studies on cellular network and leave the target area over time,these strategies would offloading and delay tolerable networks (DTNs)that fail to maintain data availability,since they could not find informed and inspired our design.We describe a subset an appropriate group of helpers that can stay and serve of these related efforts below inside the target area for a sufficiently long period. An MSN can be considered as a type of DTN,which lacks In this paper,we examine the research decisions and persistent end-to-end connectivity,due to low node den- design tradeoffs that arise when offloading cellular traffic in sity and/or unpredictable node mobility.To cope with the such a dynamic area of interest,referred to as a MobiArea. intermittent connectivity.epidemic routing [13]is pro- Our goal is to maximize cellular operators'revenues and posed to deliver packets via flooding:however,this incurs minimize the overhead imposed on mobile devices.We an extremely high forwarding cost.Some later work develop an offloading framework,MobiCache,which caches [14,15]reduces this cost through intelligent relay selec- data in geographical floating circles instead of fixed nodes. tion.Intentional routing16 translates an administrator- When a user requests a data item,the cellular operator offers specified routing metric into per-packet utilities.Social a choice to him/her:obtain the data either from the cellular features are exploited to facilitate DTN routing [17]or network immediately,or from that data's floating circle improve hypertext results [18].A Global Positioning within a predetermined system-specified delay. System (GPS)-assisted geographical routing protocol is Several challenges have to be addressed to realize our developed in 19 for vehicular delay-tolerant networks. goals.How to minimize the computational overhead Prior studies on cellular traffic offloading through imposed on mobile devices to conserve their batteries? device-to-device connections [20]can be categorized into How to route data or queries towards geographical floating two broad types,based on the hop count between cellular circles without incurring too much forwarding cost?When networks and users:single-hop [10]and multi-hop 6,8. two users with limited buffers have a contact,how to per- Most of them [6,8.10]mainly focus on selecting a group form cache replacement to improve cost-effectiveness? of users as bridges between cellular networks and users. How to determine the positions,radii,and lifetimes of Moreover,WiFi throughput prediction is exploited for floating circles,so as to minimize forwarding cost involved vehicular Internet access in [7.Auction-based incentive in circles,as well as the total access delay over all potential mechanisms for offloading are designed in [91. requesters? Computation offloading is investigated in [11. We investigate techniques to settle these challenges in The problem of placing data copies in nodes with lim- MobiCache,which is generally composed of two parts:the ited memories is investigated in [5,21,22]with different user side and the operator side.On the user side of our sys- goals:maximizing the total size of offloaded data,mini- tem,we propose to cache in floating circles,to overcome mizing total access cost,and maximizing the probability dynamics and develop a ticket-based geographical routing of retrieving a file that contains multiple erasure blocks, scheme for delivering data items and queries;a cache respectively.The publish/subscribe based caching is inves- replacement strategy is also designed to improve caching tigated in [12.Content floating technique [23,24]main- cost-effectiveness.On the operator side,we maintain tains data items through flooding them inside their query history and feedback information collected over respective floating circles and discarding them outside time for every data item,and perform predictions and esti- their respective circles.Theoretical analyses show that mations based on the following principle:similar data the expected data availability can be sufficiently long.pro- attracts similar users.In doing so,the computation-inten- vided that a critical condition is met.A short survey on sive task,ie.,parameter determination,is completed by caching mechanisms for web,ad hoc networks and DTNs cellular operators,which lowers the overhead imposed is presented in 25,which also provides some inspiring on mobile devices and also is convenient for operators to ideas for content caching and retrieval for vehicular maximize their revenues. delay-tolerant networks. Our framework is of course not a panacea.We admit Comparatively,in this paper,we propose to cache in that many challenges remain,e.g,privacy and energy floating circles,instead of fixed users,to cope with the issues.Addressing these concerns is a direction of our dynamics of the target area,and we develop MobiCache future work.While our approach does not generalize to for cellular operators to maximize their revenues,and all scenarios,we hope that our proposal will provide some minimize overhead on mobile devices.Although the idea potential guidelines for future offloading design. of content floating is not new,to the best of our knowledge
through caching: some users intentionally cache cellular contents, which can be used to satisfy future queries from other users. An important concern of this paradigm is that it may incur a long delay. Obviously not all cellular contents could bear long delays. But there are a large body of contents (e.g., optional software update, interesting videos) that not only permit offloading, but for which it is essential to do so in order to relieve cellular networks and lower users’ subscription fees. Prior studies mainly focus on placing data items in several selected helpers [5,10,12], so that future queries can be responded to without signaling between cellular networks and users. However, if mobile users enter and leave the target area over time, these strategies would fail to maintain data availability, since they could not find an appropriate group of helpers that can stay and serve inside the target area for a sufficiently long period. In this paper, we examine the research decisions and design tradeoffs that arise when offloading cellular traffic in such a dynamic area of interest, referred to as a MobiArea. Our goal is to maximize cellular operators’ revenues and minimize the overhead imposed on mobile devices. We develop an offloading framework, MobiCache, which caches data in geographical floating circles instead of fixed nodes. When a user requests a data item, the cellular operator offers a choice to him/her: obtain the data either from the cellular network immediately, or from that data’s floating circle within a predetermined system-specified delay. Several challenges have to be addressed to realize our goals. How to minimize the computational overhead imposed on mobile devices to conserve their batteries? How to route data or queries towards geographical floating circles without incurring too much forwarding cost? When two users with limited buffers have a contact, how to perform cache replacement to improve cost-effectiveness? How to determine the positions, radii, and lifetimes of floating circles, so as to minimize forwarding cost involved in circles, as well as the total access delay over all potential requesters? We investigate techniques to settle these challenges in MobiCache, which is generally composed of two parts: the user side and the operator side. On the user side of our system, we propose to cache in floating circles, to overcome dynamics and develop a ticket-based geographical routing scheme for delivering data items and queries; a cache replacement strategy is also designed to improve caching cost-effectiveness. On the operator side, we maintain query history and feedback information collected over time for every data item, and perform predictions and estimations based on the following principle: similar data attracts similar users. In doing so, the computation-intensive task, i.e., parameter determination, is completed by cellular operators, which lowers the overhead imposed on mobile devices and also is convenient for operators to maximize their revenues. Our framework is of course not a panacea. We admit that many challenges remain, e.g., privacy and energy issues. Addressing these concerns is a direction of our future work. While our approach does not generalize to all scenarios, we hope that our proposal will provide some potential guidelines for future offloading design. The remainder of this paper is organized as follows. We go over related work and our contributions in Section 2. We provide the system model and motivate our work in Section 3. Section 4 overviews the proposed framework. Sections 5 and 6 present the details of our framework. We discuss known issues and extensions in Section 7. Before concluding the paper in Section 9, we evaluate the performance of the framework in Section 8. 2. Related work and our contributions There is a rich heritage of studies on cellular network offloading and delay tolerable networks (DTNs) that informed and inspired our design. We describe a subset of these related efforts below. An MSN can be considered as a type of DTN, which lacks persistent end-to-end connectivity, due to low node density and/or unpredictable node mobility. To cope with the intermittent connectivity, epidemic routing [13] is proposed to deliver packets via flooding; however, this incurs an extremely high forwarding cost. Some later work [14,15] reduces this cost through intelligent relay selection. Intentional routing [16] translates an administratorspecified routing metric into per-packet utilities. Social features are exploited to facilitate DTN routing [17] or improve hypertext results [18]. A Global Positioning System (GPS)-assisted geographical routing protocol is developed in [19] for vehicular delay-tolerant networks. Prior studies on cellular traffic offloading through device-to-device connections [20] can be categorized into two broad types, based on the hop count between cellular networks and users: single-hop [10] and multi-hop [6,8]. Most of them [6,8,10] mainly focus on selecting a group of users as bridges between cellular networks and users. Moreover, WiFi throughput prediction is exploited for vehicular Internet access in [7]. Auction-based incentive mechanisms for offloading are designed in [9]. Computation offloading is investigated in [11]. The problem of placing data copies in nodes with limited memories is investigated in [5,21,22] with different goals: maximizing the total size of offloaded data, minimizing total access cost, and maximizing the probability of retrieving a file that contains multiple erasure blocks, respectively. The publish/subscribe based caching is investigated in [12]. Content floating technique [23,24] maintains data items through flooding them inside their respective floating circles and discarding them outside their respective circles. Theoretical analyses show that the expected data availability can be sufficiently long, provided that a critical condition is met. A short survey on caching mechanisms for web, ad hoc networks and DTNs is presented in [25], which also provides some inspiring ideas for content caching and retrieval for vehicular delay-tolerant networks. Comparatively, in this paper, we propose to cache in floating circles, instead of fixed users, to cope with the dynamics of the target area, and we develop MobiCache for cellular operators to maximize their revenues, and minimize overhead on mobile devices. Although the idea of content floating is not new, to the best of our knowledge, S. Zhang et al. / Computer Networks 83 (2015) 184–198 185
186 S.Zhang et aL /Computer Networks 83(2015)184-198 MobiCache is the first system that supports offloading in There are already some incentive mechanisms 9,10, dynamic areas without any stationary nodes'assistance which motivate users to tolerate a predetermined tolerable (e.g.throwboxes [26,27]).More specifically,the con- delay or to help in forwarding data.We will not discuss the tributions of this paper are summarized as follows.(1) design of such mechanisms,as it is orthogonal to the focus MobiCache intentionally maintains data availability within of this paper.Notations are summarized in Fig.1 for geographical floating circles to overcome the dynamics; reference. (2)we propose a ticket-based geographical routing scheme for delivering data items and queries;(3)a dynamic pro- 3.2.Motivations gramming-based caching replacement policy is designed for improving caching cost-effectiveness;and (4)we pro- 3.2.1.Popular regions vide a set of effective heuristics for operators to optimize We examine the Dartmouth trace[29 to better under- framework parameters to maximize their revenues. stand user mobility patterns.The trace consists of two parts:access point (AP)locations which provides the loca- 3.Background and motivations tions of 623 APs (only 507 of them are valid)on the Dartmouth campus,and client-AP associations which pro- This section overviews the assumptions and models vides the association information of 6022 clients over a used in this paper and motivates the offloading design of period of 703 days.After proper preprocessing on the raw MobiCache based on two observations. data,we find that most of the association records emerge during the last one-fourth of the duration,and most of 3.1.Models and assumptions the clients only have a small number of associations over 703 days.Therefore,we only use the mobility trace of 78 We denote by MobiArea the dynamic area of interest. clients collected from the 525th day to the 703th day. Denote a mobile user that has emerged in the MobiArea Fig.2 shows the locations of 340 APs that have associa- as ng.In this paper,mobile nodes,users,and devices will tions with at least one of the 78 clients during that time be interchangeably used.Users n and ny can communicate period.The number of associations of an AP is indicated with each other through a wireless interface (WiFi or by the type of the corresponding marker that represents Bluetooth)if and only if their distance is not larger than the AP.We see that,there are several extremely popular a fixed range.This range,and the physical size of each APs.These popular regions will facilitate MobiCache,as mobile user,are assumed to be negligibly small compared MobiCache would need to maintain data availability at geo- to the dimensions of the MobiArea. graphical floating circles for future data queries. As assumed in prior studies [5.12,21.22],most users are not willing to contribute all of their mobile storage to cach- 3.2.2.Pricing gap ing cellular data,thus,we denote the available buffer size Since users have to tolerate some delay before getting of user n as B.We also assume that each user is equipped data through offloading,this charge should be cut down with a GPS to determine his position;this can be easily to some extent.For example,Verizon charges a user achieved,since almost all of the smart-phones and tablets roughly 8 USD for 1 GB data [30].i.e.,0.8 cents for 1 MB nowadays have a rich set of sensors.We will discuss how data:if a user gets a data item via offloading.Verizon to deal with errors in positioning and how to reduce may charge it 0.4si cents,where si is the size of the data energy consumption for positioning in Section 7. in MB. Denote a data item as d,and its size as si.For simplicity. When a user contributes one forwarding of a data item, we assume that all necessary data transfer can be com- the cellular operator rewards the user with a few cents, pleted during any contact.If sy is too large to be transferred which should at least compensate for the energy consump- during a contact,then we can divide d into several small tion in forwarding the data.The data transfer rate between segments,each of which follows the assumption.The seg- mobile devices is assumed to be 1 MB/s (the typical rate of mentation overhead is discussed in Section 7. Bluetooth v1.2 [31]).thus,transmitting a data item of s MB User n is associated with an interest vector li.repre- lasts about s seconds.According US Energy Information sented by a M x 1 vector [p.p2.....Pul,where M denotes Administration report[32].the average price of residential the size of the keywords universe and p[0,1]indicates electricity in the US was about 12 cents per kW h in 2013. the probability of this user to be interested in the hth key- Based on measurements in [33].the energy consumption word ]The sum is normalized to 1.Similarly. rate of a mobile phone during transmission is about 2 W. data item d is associated with a characteristic vector Cj. Therefore,an operator should pay (12 x(2 x s))/(1000x represented by a Mx1 vector C=g,e以,e,where 3600)6.67sx 10-5 cents to a user who forwards a data item with a size of s in MB one time. ek[0,1]indicates the extent to which the hth keyword We see that,cellular operators could still make profit,as can characterize this data.Also,>Me=1.Then,the long as the forwarding times of a data item is less than probability that n is interested in d,is defined as: (0.4s)/6.67s×10-6)≈59,970,which is sufficiently large for mobile users to cooperatively cache data in geographi- cal floating circles.As evidenced in Section 8,the average =1 forwarding number of a data item is approximately
MobiCache is the first system that supports offloading in dynamic areas without any stationary nodes’ assistance (e.g., throwboxes [26,27]). More specifically, the contributions of this paper are summarized as follows. (1) MobiCache intentionally maintains data availability within geographical floating circles to overcome the dynamics; (2) we propose a ticket-based geographical routing scheme for delivering data items and queries; (3) a dynamic programming-based caching replacement policy is designed for improving caching cost-effectiveness; and (4) we provide a set of effective heuristics for operators to optimize framework parameters to maximize their revenues. 3. Background and motivations This section overviews the assumptions and models used in this paper and motivates the offloading design of MobiCache based on two observations. 3.1. Models and assumptions We denote by MobiArea the dynamic area of interest. Denote a mobile user that has emerged in the MobiArea as ni. In this paper, mobile nodes, users, and devices will be interchangeably used. Users ni and nj can communicate with each other through a wireless interface (WiFi or Bluetooth) if and only if their distance is not larger than a fixed range. This range, and the physical size of each mobile user, are assumed to be negligibly small compared to the dimensions of the MobiArea. As assumed in prior studies [5,12,21,22], most users are not willing to contribute all of their mobile storage to caching cellular data, thus, we denote the available buffer size of user ni as Bi. We also assume that each user is equipped with a GPS to determine his position; this can be easily achieved, since almost all of the smart-phones and tablets nowadays have a rich set of sensors. We will discuss how to deal with errors in positioning and how to reduce energy consumption for positioning in Section 7. Denote a data item as di, and its size as si. For simplicity, we assume that all necessary data transfer can be completed during any contact. If si is too large to be transferred during a contact, then we can divide di into several small segments, each of which follows the assumption. The segmentation overhead is discussed in Section 7. User ni is associated with an interest vector Ii, represented by a M 1 vector ½pi 1; pi 2; ... ; pi M T , where M denotes the size of the keywords universe and pi h 2 ½0; 1 indicates the probability of this user to be interested in the hth keyword [28]. The sum PM h¼1pi h is normalized to 1. Similarly, data item dj is associated with a characteristic vector Cj, represented by a M 1 vector Cj ¼ ½ej 1; ej 2; ... ; ej M T , where ej h 2 ½0; 1 indicates the extent to which the hth keyword can characterize this data. Also, PM h¼1ej h ¼ 1. Then, the probability that ni is interested in dj is defined as: Pij ¼ I T i Cj ¼ XM h¼1 pi h ej h: ð1Þ There are already some incentive mechanisms [9,10], which motivate users to tolerate a predetermined tolerable delay or to help in forwarding data. We will not discuss the design of such mechanisms, as it is orthogonal to the focus of this paper. Notations are summarized in Fig. 1 for reference. 3.2. Motivations 3.2.1. Popular regions We examine the Dartmouth trace [29] to better understand user mobility patterns. The trace consists of two parts: access point (AP) locations which provides the locations of 623 APs (only 507 of them are valid) on the Dartmouth campus, and client-AP associations which provides the association information of 6022 clients over a period of 703 days. After proper preprocessing on the raw data, we find that most of the association records emerge during the last one-fourth of the duration, and most of the clients only have a small number of associations over 703 days. Therefore, we only use the mobility trace of 78 clients collected from the 525th day to the 703th day. Fig. 2 shows the locations of 340 APs that have associations with at least one of the 78 clients during that time period. The number of associations of an AP is indicated by the type of the corresponding marker that represents the AP. We see that, there are several extremely popular APs. These popular regions will facilitate MobiCache, as MobiCache would need to maintain data availability at geographical floating circles for future data queries. 3.2.2. Pricing gap Since users have to tolerate some delay before getting data through offloading, this charge should be cut down to some extent. For example, Verizon charges a user roughly 8 USD for 1 GB data [30], i.e., 0.8 cents for 1 MB data; if a user gets a data item via offloading, Verizon may charge it 0:4si cents, where si is the size of the data in MB. When a user contributes one forwarding of a data item, the cellular operator rewards the user with a few cents, which should at least compensate for the energy consumption in forwarding the data. The data transfer rate between mobile devices is assumed to be 1 MB/s (the typical rate of Bluetooth v1.2 [31]), thus, transmitting a data item of si MB lasts about si seconds. According US Energy Information Administration report [32], the average price of residential electricity in the US was about 12 cents per kW h in 2013. Based on measurements in [33], the energy consumption rate of a mobile phone during transmission is about 2 W. Therefore, an operator should pay ð12 ð2 siÞÞ=ð1000 3600Þ 6:67si 106 cents to a user who forwards a data item with a size of si in MB one time. We see that, cellular operators could still make profit, as long as the forwarding times of a data item is less than ð0:4siÞ=ð6:67si 106 Þ 59; 970, which is sufficiently large for mobile users to cooperatively cache data in geographical floating circles. As evidenced in Section 8, the average forwarding number of a data item is approximately 186 S. Zhang et al. / Computer Networks 83 (2015) 184–198
S.Zhang et aL /Computer Networks 83 (2015)184-198 187 Notation Definitions n a mobile user/node/device B available buffer size of user n L the interest vector of user ni the size of data d, the characteristic vector of data di P the probability that n;is interested in d, a data item cj the 2-D coordinates of the floating circle center of d; i the radius of the floating circle of d the lifetime of the floating circle of di of tickets for delivering di to its circle k of tickets for delivering n,'s request for di's circle Fig.1.Main notations are summarized for reference 441510 4.41 4.405 0 。89 4.4 0039 0 4.395 80 ● 4.39 6 0 4.385 438 8 4.37 .17 8.18 8.19 8.2 8.21 8.22 8.23 x coordinate x10 Fig.2.The locations(indicated by the centers of markers)and the number of associations(circle:0-100:triangle:100-1000:square:1000-10.000:star: 10,000)of the selected 340 access points in Dartmouth trace [29]. 250-800 in MobiCache,which is much smaller than the 4.1.On the cellular operator side maximum allowable number of forwarding times.This tremendous gap allows us to design a cooperative To maximize the cellular operator's revenues caching-based method for efficient offloading. (Section 6.2),MobiCache estimates the following parame- ters that guide the set-up of the floating circle for d:(1) 4.MobiCache overview The coordinates of the circle center.c1=(c.c) (Section 6.3).which should be determined to minimize Our objective is to offer a practical and feasible offload- the total access delay of all potential requesters of d.For ing service at dynamic areas without any stationary infras- example,there are more potential requesters of d in the tructure assistance.We provide an overview of MobiCache northwest part in this figure,so the circle center for d is through an illustrative example in Fig.3.After user n suc- chosen near that part.(2)The radius of the floating circle, cessfully downloads data d via cellular networks,the cel- r(Section 6.3).A large radius is beneficial to maintaining lular operator detects that this is the first download of d in the availability of d,but incurs unnecessary forwarding the MobiArea.Based on history and feedback information cost.(3)The lifetime of the floating circle,h(Section 6.4). collected in the past (Section 6.1).the cellular operator As time goes on,the potential requesters of di become learns that there may be many forthcoming requests for fewer,while the maintenance cost becomes greater. d:for example,the nodes with horizontal lines are poten- MobiCache should carefully determine h to maximize tial requesters of d.To relieve the crowded cellular net- operators'revenue.When a floating circle expires,all work but,meanwhile,satisfy those upcoming requesters, devices involved in the circle are free to delete the the cellular operator resorts to caching d in a geographical corresponding data.(4)The maximum number of copies floating circle. of d when delivering d to its circle,k(Section 6.5).The
250–800 in MobiCache, which is much smaller than the maximum allowable number of forwarding times. This tremendous gap allows us to design a cooperative caching-based method for efficient offloading. 4. MobiCache overview Our objective is to offer a practical and feasible offloading service at dynamic areas without any stationary infrastructure assistance. We provide an overview of MobiCache through an illustrative example in Fig. 3. After user n1 successfully downloads data d1 via cellular networks, the cellular operator detects that this is the first download of d1 in the MobiArea. Based on history and feedback information collected in the past (Section 6.1), the cellular operator learns that there may be many forthcoming requests for d1; for example, the nodes with horizontal lines are potential requesters of d1. To relieve the crowded cellular network but, meanwhile, satisfy those upcoming requesters, the cellular operator resorts to caching d1 in a geographical floating circle. 4.1. On the cellular operator side To maximize the cellular operator’s revenues (Section 6.2), MobiCache estimates the following parameters that guide the set-up of the floating circle for d1: (1) The coordinates of the circle center, c1 ¼ ðc x 1 ; c y 1 Þ (Section 6.3), which should be determined to minimize the total access delay of all potential requesters of d1. For example, there are more potential requesters of d1 in the northwest part in this figure, so the circle center for d1 is chosen near that part. (2) The radius of the floating circle, r1 (Section 6.3). A large radius is beneficial to maintaining the availability of d1, but incurs unnecessary forwarding cost. (3) The lifetime of the floating circle, l1 (Section 6.4). As time goes on, the potential requesters of d1 become fewer, while the maintenance cost becomes greater. MobiCache should carefully determine l1 to maximize operators’ revenue. When a floating circle expires, all devices involved in the circle are free to delete the corresponding data. (4) The maximum number of copies of d1 when delivering d1 to its circle, k1 (Section 6.5). The Fig. 1. Main notations are summarized for reference. 8.17 8.18 8.19 8.2 8.21 8.22 8.23 x 105 4.375 4.38 4.385 4.39 4.395 4.4 4.405 4.41 4.415 x 105 x coordinate y coordinate Fig. 2. The locations (indicated by the centers of markers) and the number of associations (circle: 0–100; triangle: 100–1000; square: 1000–10,000; star: P10,000) of the selected 340 access points in Dartmouth trace [29]. S. Zhang et al. / Computer Networks 83 (2015) 184–198 187
188 S.Zhang et aL /Computer Networks 83 (2015)184-198 Query User History Feedback d n Parameter Estimation ● Circle Center Circle Radius 18 cle Lifetime ● Ticket No MobiArea Cellular Operator Fig.3.The big picture of MobiCache.c and r specify the floating circle for d(i=1,2 in this example).The nodes with horizontal and vertical lines denote potential requesters of d,and d2.respectively.The cellular operator estimates framework parameters based on query history and user feedback collected over time. operator can use k to control the tradeoff between the 4.2.4.Data query from users and response from floating circles delivery cost (i.e.,the number of data replicates)and the When node nz requests d from cellular networks,the delivery latency. operator detects that there is already a floating circle for d.so the operator offers a choice to nz:obtain the data either from the cellular network immediately,or from 4.2.On the user side the floating circle of d within a predetermined system- specified delay.If n accepts the latter,the operator returns Due to resource constraints of mobile devices.e.g..bat- ci and k17 to nz.Depending on the degree of interest,n tery lifetime and slow processing speed,MobiCache also can freely choose either to physically move to the circle tries to shift computation-intensive tasks to the cellular and get d,or to send queries (Section 5.4).If the latter operator side,so as to reduce the computational overhead happens.nz employs the geographical routing scheme in on mobile devices. Section 5.1 with k tickets to deliver queries to the circle of d,then waits for a response.The operator also guarantees 4.2.1.Delivering data to its floating circle to push d to nz via cellular networks once the system- With proper incentive mechanisms,the cellular opera- specified delay passes. In the next two sections,we provide the design details tor then sends these parameters to n and asks n to deliver of MobiCache on the user side and the cellular operator side, d to the specified circle.User n employs the geographical routing scheme in Section 5.1 to deliver d to its floating respectively. circle. 5.MobiCache on the user side 4.2.2.Maintaining data availability in floating circles In this section,we present the details of the proposed When the first copy of d enters its circle,d would get framework on the user side.Section 5.1 provides a routing replicated whenever a node with it contacts another node scheme for a user sending a data item or query to a geo- without it inside the circle (Section 5.2).When a critical graphical circle.Section 5.2 introduces how to maintain condition is met,the expected availability of d can be suf- data availability in floating circles.Section 5.3 deals with ficiently long,as demonstrated in [23,24].The reason for pairwise encounters within floating circles.Section 5.4 using the circle shape for content floating is that a circle shows how to get data from floating circles.Most parame- can be accurately expressed by its center and radius,yield- ters involved in this section are estimated by cellular ing little additional information that needs to be operators,which will be explained in Section 6. transmitted. 5.1.Delivering data to a floating circle 4.2.3.Cache replacement policy for pairwise encounters Some prior studies [14-16]focused on delivering data As there are some other data items,nodes with limited packets to mobile destinations;different from them,we buffers may meet inside the intersection of two or more are interested in routing data items or queries to geo- circles(e.g.ns and ne in Fig.3).A cache replacement strat- graphical regions.In the following,we propose a simple egy(Section 5.3)is then designed to improve cache cost- scheme based on active positioning.By "active position- effectiveness. ing"we mean that,mobile users actively record their
operator can use k1 to control the tradeoff between the delivery cost (i.e., the number of data replicates) and the delivery latency. 4.2. On the user side Due to resource constraints of mobile devices, e.g., battery lifetime and slow processing speed, MobiCache also tries to shift computation-intensive tasks to the cellular operator side, so as to reduce the computational overhead on mobile devices. 4.2.1. Delivering data to its floating circle With proper incentive mechanisms, the cellular operator then sends these parameters to n1 and asks n1 to deliver d1 to the specified circle. User n1 employs the geographical routing scheme in Section 5.1 to deliver d1 to its floating circle. 4.2.2. Maintaining data availability in floating circles When the first copy of d1 enters its circle, d1 would get replicated whenever a node with it contacts another node without it inside the circle (Section 5.2). When a critical condition is met, the expected availability of d1 can be suf- ficiently long, as demonstrated in [23,24]. The reason for using the circle shape for content floating is that a circle can be accurately expressed by its center and radius, yielding little additional information that needs to be transmitted. 4.2.3. Cache replacement policy for pairwise encounters As there are some other data items, nodes with limited buffers may meet inside the intersection of two or more circles (e.g., n5 and n6 in Fig. 3). A cache replacement strategy (Section 5.3) is then designed to improve cache costeffectiveness. 4.2.4. Data query from users and response from floating circles When node n7 requests d1 from cellular networks, the operator detects that there is already a floating circle for d1, so the operator offers a choice to n7: obtain the data either from the cellular network immediately, or from the floating circle of d1 within a predetermined systemspecified delay. If n7 accepts the latter, the operator returns c1 and k1;7 to n7. Depending on the degree of interest, n7 can freely choose either to physically move to the circle and get d1, or to send queries (Section 5.4). If the latter happens, n7 employs the geographical routing scheme in Section 5.1 with k1;7 tickets to deliver queries to the circle of d1, then waits for a response. The operator also guarantees to push d1 to n7 via cellular networks once the systemspecified delay passes. In the next two sections, we provide the design details of MobiCache on the user side and the cellular operator side, respectively. 5. MobiCache on the user side In this section, we present the details of the proposed framework on the user side. Section 5.1 provides a routing scheme for a user sending a data item or query to a geographical circle. Section 5.2 introduces how to maintain data availability in floating circles. Section 5.3 deals with pairwise encounters within floating circles. Section 5.4 shows how to get data from floating circles. Most parameters involved in this section are estimated by cellular operators, which will be explained in Section 6. 5.1. Delivering data to a floating circle Some prior studies [14–16] focused on delivering data packets to mobile destinations; different from them, we are interested in routing data items or queries to geographical regions. In the following, we propose a simple scheme based on active positioning. By ‘‘active positioning’’ we mean that, mobile users actively record their Fig. 3. The big picture of MobiCache. ci and ri specify the floating circle for di (i ¼ 1; 2 in this example). The nodes with horizontal and vertical lines denote potential requesters of d1 and d2, respectively. The cellular operator estimates framework parameters based on query history and user feedback collected over time. 188 S. Zhang et al. / Computer Networks 83 (2015) 184–198
S.Zhang et aL /Computer Networks 83 (2015)184-198 189 positions over time.In Section 7.we also show how to reduce energy consumption on positioning and cope with positioning errors through passive positioning. In our framework,each user n records his position La(ni using GPS every y time:we denote the position of n at time ky (k =0,1,2...)as pos =(posxeposy).Here, Yis used to trade off between positioning frequency and energy consumption,ie..the user n can decrease y to con- serve energy.We estimate the moving speed v(n)of n at 2 time ky as Fig.4.The GAP scheme.When node n.with c1 ticket of data d. v(n)= Ipos是n-pos&-1m2 encounters node n without d.n then forwards data d and a certain (2) number of tickets.determined by (n).(n).v(n)and v(n).to n. As shown in Fig.4,we estimate the moving direction of If n has data d,and ny waits for d,then n forwards d to n at time ky as the direction of the vector n(lines 3-4);this happens when n wants to download d (pos-pos).Define the deviation angle ()ofn or ny is an intermediate relay of d's query from another at time ky as the intersection angle of the moving direc- node.If t>1 and t=0,if neither of them deflect very tion,and the direction of(ca-pos).so(n)is much from the direction to the floating circle center (the deviation angle threshold A is set to nt/4 in our trace-driven simulations),then the total tickets are split between them (posin -pos-i)-(Ca-posn) (ni)=arccos. (3) based on the ratios of their deviation angles and moving pos4-pos&-im2·Ica-pos,店 speeds (lines 6-9).Here,to avoid the case in which the dominator is zero,we add a sufficiently small positive e Based on this positioning information,we propose GAP, to the formula.If the deviation angles of n and n are larger a GeogrAPhical routing scheme.Remember that,the cellu- and not larger than the threshold A,respectively,the ticket lar operator wants to minimize the overhead imposed on counter of ny is set to t-1.ie.,n leaves only one ticket for mobile devices,where the "overhead"regarding to routing itself(lines 10-13).Any node that has d but with only one is data forwarding and storage.To achieve this goal,the ticket is not allowed to replicate d unless it enters the float- operator choose to restrict the number of forwardings ing circle of d.Any node could discard d if it leaves the and replications of a data item d within an integer kd. floating circle of d,if it leaves the MobiArea,or if d expires. which is also called the number of tickets in this paper. We see from above that,using GAP,either of the num- The operator can adjust kd to reach a compromise between ber of forwardings and the number of copies of data d is delivery delay and forwarding cost. restricted to kd,allowing cellular operators to control the For each data item d,the initial ticket counter in its tradeoff between delivery cost and delay according to the source node is kd.When two nodes n and ny have a contact, statistics of the MobiArea. for each data d in the buffer of n.denote the ticket coun- ters of d in n and ny by t and t,respectively.Algorithm 5.2.Maintaining data availability in floating circles 1 shows the forwarding details in a contact. Within the circle specified by ca and ra,data d is repli- Algorithm 1.The GAP scheme cated whenever a node with it meets another node without it.Since such replication is based purely on the position of 1:When two nodes ni and nj have a contact: mobile nodes,every node in the circle should have a copy 2:for each data d in the buffer of ni do of d eventually.Nodes leaving the circle are free to delete 3: if >1 and nj waits for d do their copies of d.Due to the random nature of users'move- ments,it seems that there is no guarantee for data avail- 4: ni forwards d to nj ability.Surprisingly.as evidenced in [24].content floating 5: if 1 and =0 do can be quite reliable,as long as a critical condition is met; 6: if(n)≤Aand(ni)≤Ado furthermore,the critical condition can be easily satisfied 7: ni forwards d to nj in normal circumstances.For example,when node mobility 8 9← ‘网阿· 1}+e follows the random waypoint model [34].if we denote the transmission range of each node and the total number of 今 9←t号-t吗 nodes by range and total,respectively,and if the ratio h 10: if(n)>Aand(ny)≤Ado of floating circle radius to MobiArea radius tends to zero. 11: nj forwards d nj the critical condition is 12: 9-9-1 range.total.h>16/45 0.356 (4) 13: 9-1 14:end for Some previous studies [12,21]keep data in several fixed popular nodes,while MobiCache stores data within geo- graphical regions,which has the following advantage
positions over time. In Section 7, we also show how to reduce energy consumption on positioning and cope with positioning errors through passive positioning. In our framework, each user ni records his position using GPS every ci time; we denote the position of ni at time kci ðk ¼ 0; 1; 2; ...Þ as posni kci ¼ ðposxni kci ; posyni kci Þ. Here, ci is used to trade off between positioning frequency and energy consumption, i.e., the user ni can decrease ci to conserve energy. We estimate the moving speed mðniÞ of ni at time kci as mðniÞ ¼ kposni kci posni ðk1Þci k2 ci : ð2Þ As shown in Fig. 4, we estimate the moving direction of ni at time kci as the direction of the vector ðposni kci posni ðk1Þci Þ. Define the deviation angle aðniÞ of ni at time kci as the intersection angle of the moving direction, and the direction of ðcd posni kci Þ, so aðniÞ is aðniÞ ¼ arccos ðposni kci posni ðk1Þci Þðcd posni kci Þ kposni kci posni ðk1Þci k2 kcd posni kci k2 : ð3Þ Based on this positioning information, we propose GAP, a GeogrAPhical routing scheme. Remember that, the cellular operator wants to minimize the overhead imposed on mobile devices, where the ‘‘overhead’’ regarding to routing is data forwarding and storage. To achieve this goal, the operator choose to restrict the number of forwardings and replications of a data item d within an integer kd, which is also called the number of tickets in this paper. The operator can adjust kd to reach a compromise between delivery delay and forwarding cost. For each data item d, the initial ticket counter in its source node is kd. When two nodes ni and nj have a contact, for each data d in the buffer of ni, denote the ticket counters of d in ni and nj by t d i and t d j , respectively. Algorithm 1 shows the forwarding details in a contact. Algorithm 1. The GAP scheme 1: When two nodes ni and nj have a contact: 2: for each data d in the buffer of ni do 3: if td i P 1 and nj waits for d do 4: ni forwards d to nj 5: if td i > 1 and t d j ¼ 0 do 6: if aðniÞ 6 A and aðnjÞ 6 A do 7: ni forwards d to nj 8: t d j aðniÞþ aðniÞþaðnjÞþ2 mðnjÞþ mðniÞþmðnjÞþ2 t d i l m 9: t d i td i td j 10: if aðniÞ > A and aðnjÞ 6 A do 11: ni forwards d nj 12: t d j td i 1 13: t d i 1 14: end for If ni has data d, and nj waits for d, then ni forwards d to nj (lines 3–4); this happens when nj wants to download d or nj is an intermediate relay of d0 s query from another node. If t d i > 1 and t d j ¼ 0, if neither of them deflect very much from the direction to the floating circle center (the deviation angle threshold A is set to p=4 in our trace-driven simulations), then the total tickets are split between them based on the ratios of their deviation angles and moving speeds (lines 6–9). Here, to avoid the case in which the dominator is zero, we add a sufficiently small positive to the formula. If the deviation angles of ni and nj are larger and not larger than the threshold A, respectively, the ticket counter of nj is set to t d i 1, i.e., ni leaves only one ticket for itself (lines 10–13). Any node that has d but with only one ticket is not allowed to replicate d unless it enters the floating circle of d. Any node could discard d if it leaves the floating circle of d, if it leaves the MobiArea, or if d expires. We see from above that, using GAP, either of the number of forwardings and the number of copies of data d is restricted to kd, allowing cellular operators to control the tradeoff between delivery cost and delay according to the statistics of the MobiArea. 5.2. Maintaining data availability in floating circles Within the circle specified by cd and rd, data d is replicated whenever a node with it meets another node without it. Since such replication is based purely on the position of mobile nodes, every node in the circle should have a copy of d eventually. Nodes leaving the circle are free to delete their copies of d. Due to the random nature of users0 movements, it seems that there is no guarantee for data availability. Surprisingly, as evidenced in [24], content floating can be quite reliable, as long as a critical condition is met; furthermore, the critical condition can be easily satisfied in normal circumstances. For example, when node mobility follows the random waypoint model [34], if we denote the transmission range of each node and the total number of nodes by range and total, respectively, and if the ratio h of floating circle radius to MobiArea radius tends to zero, the critical condition is range total h > 16=45 0:356: ð4Þ Some previous studies [12,21] keep data in several fixed popular nodes, while MobiCache stores data within geographical regions, which has the following advantage. i j Fig. 4. The GAP scheme. When node ni, with c P 1 ticket of data d, encounters node nj without d; ni then forwards data d and a certain number of tickets, determined by aðniÞ; aðnjÞ; mðniÞ and mðnjÞ, to nj. S. Zhang et al. / Computer Networks 83 (2015) 184–198 189
190 S.Zhang et aL/Computer Networks 83(2015)184-198 Mobile nodes enter and leave the MobiArea over time. Even if we cache data in some popular nodes,and let these nodes hand over their cached data to other nodes when they leave the MobiArea,what do we actually get?The answer is caching data in a region,and the region is,in fact the entire MobiArea. Some other studies[26,27]have proposed to strategi- cally deploy throwbox,a kind of stationary wireless node that acts as a relay,to improve DTN throughput.Why not place some throwboxes in the MobiArea for caching? There are several concerns.First and foremost,throwboxes are expensive and cannot be massively deployed.Second throwboxes restrict our choice of caching positions to a Fig.5.Cache replacement.The weight of a data item da for user n is limited set of candidate locations-locations of throw- determined by the remaining lifetime rle before the floating circle of de boxes,which may result in performance degradation,indi- expires and the chord length cl. cated by the total access delay over all requesters decisions separately.Taking n for example,we formulate 5.3.Cache replacement policy for pairwise encounters the cache replacement as the following Knapsack problem [351: This section focuses on cache replacement for pairwise encounters inside floating circles,and does not take into max account data or query copies incurred by GAP(Section 5.1) k=1 or other routing protocols(Section 5.4).Ideally,a mobile user m maintains a data item if this user is geographically inside subject to X5k≤B时 (5) the floating circle of this data item.However,user n has k=1 a limited buffer of size B;therefore,user n has to strategi- k∈{0,1},k∈{1,2,,m} cally select a subset of items for caching. The cache replacement occurs when two mobile users where xy indicates whether d would be cached in n.and with different sets of data items have a contact.In Fig.5, the product(rlk.cl)serves as the weight of dk for n. user n meets another user n at time ky:without loss of Observing that sk and Bi are integers,we can use dynamic generality,the union set of data items cached at these programming to solve this problem in O(mB time. two users is assumed to be s={d,d2.....dm).The cache replacement decision is based on remaining lifetime and 5.4.Data query from users and response from floating circles chord length.The remaining lifetime rl of data d represents the remaining time before the floating circle of d expires. When a mobile user np requests de from the cellular The chord length clu of data dk for a user n denotes the operator,if the cellular operator detects that there is length that n would travel in the floating circle of dy if n already a floating circle of dk.the operator offers two keeps its moving direction unchanged.Fig.5 illustrates choices:directly download the data from cellular net- cl and clhi.Given the center position and radius of a float- works,or indirectly get the data via opportunistic device- ing circle,the current position and the moving direction of to-device connections within a predetermined delay.If n a user,the chord length can be easily attained by plane chooses the latter,the operator replies with Ck.ck,rk,and geometry. ku(see Fig.1 for notations).User n then checks the interest We note that the popularity of a data item is not consid- probability Pa(see Eq.(1)).Depending on Pik,n can choose ered here.The reason is implicit,if somewhat subtle:the either to physically move to the circle,or to send queries lifetime (Section 6.4)of a floating circle already reflects and wait for a response. the popularity of the respective data,ie..the floating circle If n chooses to send queries,it employs GAP to send of a more popular data has a longer lifetime queries with ku tickets towards the floating circle of dk. We have the following argument:both of n and ny are Any user ny inside the circle of dk exploits existing DTN inside the floating circle of every data item in S.Recall that. routing algorithms,e.g..Spray and Wait [14]and del- each user is free to delete a data item after he leaves the egation forwarding [15].for delivering de to n upon receiv- respective circle,which implies,if a node caches d.this ing the query.Before n receives any response from the node is inside the floating circle of di.In addition,as we floating circle,n might get a copy of dk when contacting assumed in Section 3.1,the transmission range and the another user who happens to have a copy of dk.Anyway, physical size of every mobile user are negligibly small user n can send the request again to the cellular operator compared to the size of the MobiArea;taking these two for free if he does not receive dy after the predetermined sys- facts together completes the argument. tem-specified delay passes,the operator then pushes da to n Each data item has four options:cached at both nodes, directly via cellular networks. at n.at n.or at neither of them,which makes the problem untractable.We propose to approximate it as follows: 1As the physical position ofn is not fixed.we cannot use GAP in this regarding s as the selection pool,n and ny make their response process
Mobile nodes enter and leave the MobiArea over time. Even if we cache data in some popular nodes, and let these nodes hand over their cached data to other nodes when they leave the MobiArea, what do we actually get? The answer is caching data in a region, and the region is, in fact, the entire MobiArea. Some other studies [26,27] have proposed to strategically deploy throwbox, a kind of stationary wireless node that acts as a relay, to improve DTN throughput. Why not place some throwboxes in the MobiArea for caching? There are several concerns. First and foremost, throwboxes are expensive and cannot be massively deployed. Second, throwboxes restrict our choice of caching positions to a limited set of candidate locations—locations of throwboxes, which may result in performance degradation, indicated by the total access delay over all requesters. 5.3. Cache replacement policy for pairwise encounters This section focuses on cache replacement for pairwise encounters inside floating circles, and does not take into account data or query copies incurred by GAP (Section 5.1) or other routing protocols (Section 5.4). Ideally, a mobile user maintains a data item if this user is geographically inside the floating circle of this data item. However, user ni has a limited buffer of size Bi; therefore, user ni has to strategically select a subset of items for caching. The cache replacement occurs when two mobile users with different sets of data items have a contact. In Fig. 5, user ni meets another user nj at time kci ; without loss of generality, the union set of data items cached at these two users is assumed to be S ¼ fd1; d2; ... ; dmg. The cache replacement decision is based on remaining lifetime and chord length. The remaining lifetime rld of data d represents the remaining time before the floating circle of d expires. The chord length clki of data dk for a user ni denotes the length that ni would travel in the floating circle of dk if ni keeps its moving direction unchanged. Fig. 5 illustrates clki and clhi. Given the center position and radius of a floating circle, the current position and the moving direction of a user, the chord length can be easily attained by plane geometry. We note that the popularity of a data item is not considered here. The reason is implicit, if somewhat subtle: the lifetime (Section 6.4) of a floating circle already reflects the popularity of the respective data, i.e., the floating circle of a more popular data has a longer lifetime. We have the following argument: both of ni and nj are inside the floating circle of every data item in S. Recall that, each user is free to delete a data item after he leaves the respective circle, which implies, if a node caches di, this node is inside the floating circle of di. In addition, as we assumed in Section 3.1, the transmission range and the physical size of every mobile user are negligibly small compared to the size of the MobiArea; taking these two facts together completes the argument. Each data item has four options: cached at both nodes, at ni, at nj, or at neither of them, which makes the problem untractable. We propose to approximate it as follows: regarding S as the selection pool, ni and nj make their decisions separately. Taking ni for example, we formulate the cache replacement as the following Knapsack problem [35]: max Xm k¼1 xk rlk clki subject to Xm k¼1 xk sk 6 Bi xk 2 f0; 1g; 8k 2 f1; 2; ... ; mg ð5Þ where xk indicates whether dk would be cached in ni, and the product ðrlk clkiÞ serves as the weight of dk for ni. Observing that sk and Bi are integers, we can use dynamic programming to solve this problem in OðmBiÞ time. 5.4. Data query from users and response from floating circles When a mobile user ni requests dk from the cellular operator, if the cellular operator detects that there is already a floating circle of dk, the operator offers two choices: directly download the data from cellular networks, or indirectly get the data via opportunistic deviceto-device connections within a predetermined delay. If ni chooses the latter, the operator replies with Ck; ck;rk, and kki (see Fig. 1 for notations). User ni then checks the interest probability Pik (see Eq. (1)). Depending on Pik; ni can choose either to physically move to the circle, or to send queries and wait for a response. If ni chooses to send queries, it employs GAP to send queries with kki tickets towards the floating circle of dk. Any user nj inside the circle of dk exploits existing DTN routing algorithms,1 e.g., Spray and Wait [14] and delegation forwarding [15], for delivering dk to ni upon receiving the query. Before ni receives any response from the floating circle, ni might get a copy of dk when contacting another user who happens to have a copy of dk. Anyway, user ni can send the request again to the cellular operator for free if he does not receive dk after the predetermined system-specified delay passes, the operator then pushes dk to ni directly via cellular networks. j i Fig. 5. Cache replacement. The weight of a data item dk for user ni is determined by the remaining lifetime rlk before the floating circle of dk expires and the chord length clki. 1 As the physical position of ni is not fixed, we cannot use GAP in this response process. 190 S. Zhang et al. / Computer Networks 83 (2015) 184–198
S.Zhang et aL /Computer Networks 83(2015)184-198 191 We note that,users may not be willing to decide,on a T per-item basis,whether to use cellular networks or 8 C offloading.To make the procedure seamless,we allow cost bene; Ci users to setup their policies in our framework,e.g.,"down- load Facebook stuff as fast as possible,and optional soft- o to p0t0 ware updates as cheaply as possible".In doing so.when flagy users get data from floating circles,the service they experi- ence is still seamless. In this section,we see that the computation overhead at mobile devices is small.We shall see in the next section that,the computation-intensive task,e.g.,parameter esti- Fig.6.The information maintained by the cellular operator for each data mation,is shifted to the cellular operator side. item d.Parameters in white grids can be obtained based on history and feedback:parameters in shadow grids need to be estimated. 6.MobiCache on the cellular operator side 6.2.Revenue maximization This section presents the details of MobiCache on the operator side.We first introduce the query history and user Denote by bene the benefit that the operator gains from feedback information maintained by cellular operators in delivering data d to user j:denote by cost the reward that Section 6.1,and presents the revenue maximization prob- the operator offers to user j for forwarding data d,one time. lem in Section 6.2.In Sections 6.3-6.5.we explain how to We note that,each data item d would be forwarded many optimize framework parameters based on the collected times during the lifetime of its floating circle;the cellular information. operator tracks how many times a user forwards a data item through feedback,as shown in Fig.6.We define them 6.1.Query history and user feedback as follows: Fig.6 shows the information maintained by a cellular 0. if the delivery is via cellular networks. bene= operator for data di.In the first row,s is the size:Ci is 0.4si,otherwise.(si is the size of di.) the characteristic vector;T is the point of time when the (6) cellular operator starts to set up the floating circle for d:q and f are the number of requests for d and the num- 0 if the delivery is via cellular networks. ber of forwardings that d experiences from time T to cost 6.67s x 10-5.otherwise. T+,respectively.In the second row,cost and bene will (7) be introduced in the next subsection. The third row records the first request for d:user no If the delivery is via cellular networks,we have sends a request for di at time to at position pos and bene=cost=0;this is reasonable,since delivering d to receives d via cellular networks;the cellular operator asks user j via cellular networks implies that,cellular networks no to deliver d to the respective floating circle with k tick- may fail to satisfy another user somewhere,due to its con- ets.Recall the definition of Ti.we have Tto. gestion.Otherwise,we set benef and cost according to the Each following row corresponds to a request for d;dur- pricing gap in Section 3.We see that,for two different ing the period from T to T+:user n (j=1.2.....q) users ny and nk,bene bene and cost costf.therefore, sends a request for di at time ty at position pos.then without causing any confusion,bene and bene will be the operator informs ny to send a query to the respective used interchangeably,and the same holds for cost and circle with ky tickets,finally n receives di at time t at posi- cost.The revenue of an operator is defined as follows: tion pos via offloading (flag,=1)or cellular networks R=>(bene-cost). (8) 1ag=0) diny When users send requests or receive data,they are asked to provide feedback:geographical positions and To maximize R,an operator should offload as much cellu- the number of forwardings that they have helped in the lar traffic as possible,while incurring as little forwarding past(including forwardings inside and outside floating cir- cost as possible.In the next three subsection,we present cles).We mention that,feedback may violate the privacy of how to achieve this goal through parameter optimization, users;we do not consider privacy issues as they are out of based upon query history and user feedback information. the scope of this paper.In doing so,the operator obtains the white fields in Fig.6:the rest of this section then 6.3.Circle center and radius focuses on determining the fields in shadow. At the beginning,there is no query history or user feed- The center position c of data di should be chosen to back for the proposed framework to estimate framework minimize the sum of distances between c and the posi- parameters.We propose to either use part of the trace as tions of potential requesters while they are sending warm-up period to accumulate necessary information,or ini- requests.But how can we know who would download d tialize the settings with empirical values. and where they are in advance?The basic principle is that
We note that, users may not be willing to decide, on a per-item basis, whether to use cellular networks or offloading. To make the procedure seamless, we allow users to setup their policies in our framework, e.g., ‘‘download Facebook stuff as fast as possible, and optional software updates as cheaply as possible’’. In doing so, when users get data from floating circles, the service they experience is still seamless. In this section, we see that the computation overhead at mobile devices is small. We shall see in the next section that, the computation-intensive task, e.g., parameter estimation, is shifted to the cellular operator side. 6. MobiCache on the cellular operator side This section presents the details of MobiCache on the operator side. We first introduce the query history and user feedback information maintained by cellular operators in Section 6.1, and presents the revenue maximization problem in Section 6.2. In Sections 6.3–6.5, we explain how to optimize framework parameters based on the collected information. 6.1. Query history and user feedback Fig. 6 shows the information maintained by a cellular operator for data di. In the first row, si is the size; Ci is the characteristic vector; Ti is the point of time when the cellular operator starts to set up the floating circle for di; qi and fi are the number of requests for di and the number of forwardings that di experiences from time Ti to Ti þ li, respectively. In the second row, costi and benei will be introduced in the next subsection. The third row records the first request for di: user n0 sends a request for di at time t0 at position posn0 t0 and receives di via cellular networks; the cellular operator asks n0 to deliver di to the respective floating circle with ki tickets. Recall the definition of Ti, we have Ti t0. Each following row corresponds to a request for di during the period from Ti to Ti þ li: user nj (j ¼ 1; 2; ... ; qi) sends a request for di at time tj at position posnj tj , then the operator informs nj to send a query to the respective circle with kij tickets, finally nj receives di at time t 0 j at position posnj t0 j via offloading (flagj ¼ 1) or cellular networks (flagj ¼ 0). When users send requests or receive data, they are asked to provide feedback: geographical positions and the number of forwardings that they have helped in the past (including forwardings inside and outside floating circles). We mention that, feedback may violate the privacy of users; we do not consider privacy issues as they are out of the scope of this paper. In doing so, the operator obtains the white fields in Fig. 6; the rest of this section then focuses on determining the fields in shadow. At the beginning, there is no query history or user feedback for the proposed framework to estimate framework parameters. We propose to either use part of the trace as warm-up period to accumulate necessary information, or initialize the settings with empirical values. 6.2. Revenue maximization Denote by benej i the benefit that the operator gains from delivering data di to user j; denote by costj i the reward that the operator offers to user j for forwarding data di one time. We note that, each data item di would be forwarded many times during the lifetime of its floating circle; the cellular operator tracks how many times a user forwards a data item through feedback, as shown in Fig. 6. We define them as follows: benej i ¼ 0; if the delivery is via cellular networks; 0:4si; otherwise: ðsi is the size of di:Þ ð6Þ costj i ¼ 0; if the delivery is via cellular networks; 6:67si 106 ; otherwise: ð7Þ If the delivery is via cellular networks, we have benej i ¼ costj i ¼ 0; this is reasonable, since delivering di to user j via cellular networks implies that, cellular networks may fail to satisfy another user somewhere, due to its congestion. Otherwise, we set benej i and costj i according to the pricing gap in Section 3. We see that, for two different users nj and nk; benej i ¼ benek i and costj i ¼ costk i , therefore, without causing any confusion, benej i and benei will be used interchangeably, and the same holds for costj i and costi. The revenue of an operator is defined as follows: R ¼ X di;nj ðbenej i costj iÞ: ð8Þ To maximize R, an operator should offload as much cellular traffic as possible, while incurring as little forwarding cost as possible. In the next three subsection, we present how to achieve this goal through parameter optimization, based upon query history and user feedback information. 6.3. Circle center and radius The center position ci of data di should be chosen to minimize the sum of distances between ci and the positions of potential requesters while they are sending requests. But how can we know who would download di and where they are in advance? The basic principle is that, Fig. 6. The information maintained by the cellular operator for each data item di. Parameters in white grids can be obtained based on history and feedback; parameters in shadow grids need to be estimated. S. Zhang et al. / Computer Networks 83 (2015) 184–198 191
192 S.Zhang et aL /Computer Networks 83 (2015)184-198 similar data items attract similar users.The degree of simi- device-to-device connections.For each dx esa,denote by larity [36]between d and d can be measured by the N(t)the set of users that receive dy before time t.We correlation coefficient P: use bi(t)to denote the total benefit of maintaining the cir- ∑,e-e- cle of d when the lifetime of the circle is t.We have Pi= (9) V∑ae-eV∑a(e-e b()=∑∑Pabene (11) dx∈sn∈( where el denotes the average of the elements in Ci.Given a Fig.7 shows an example.We assume threshold m,we define sd={dilp>m).that is.the s3=1,s4={d1,d2h,p3=0.3.andp23=0.8.When t=3. correlation coefficient between di and every item in sd is we have N(3)=0 and N(3)={ns).then we have not smaller than m.Then,c is estimated as the weighted b3(3)=P23.benes =0.32:when t=5,we have N3(5)= average of the center positions of items in sd.That is, n1} and N3(5)={ns}.then we have b3(5)= 9-TPa·cx P13.benes +P23.benes =0.44;and so on.We then have (10) 4∑4e the b3(t)curve as in the Fig.7(c). The intuition for estimating the number of forwardings For the radius r of data di.if it is chosen too small,then of di is that,floating circles with physically close centers the critical condition in Eq.(4)cannot be satisfied:if too have similar density of users.Given a threshold 2.we large,then some unnecessary forwardings and replications define Sf={dx lc-cxll212).that is,the distance would be incurred.Hence,we set r to be the smallest value between c and the floating circle center of any item in that satisfies the critical condition. S is not larger than 12.Then,the number of forwardings Please note that,c and r will be updated after the float- of di in unit time,denoted as Aci.can be estimated as ing circle of d expires.This update can improve the accu- racy of estimations for future data,since the updated △Ci ∑dke断 (12) values are realistic and accurate. ∑de(T 6.4.Circle lifetime We use ci(t)to denote the total cost of maintaining the cir- cle of d when the lifetime of the circle is t.We have The circle lifetime l of data d,should be chosen to maxi- ci()=△G,cost·t. (13) mize a cellular operator's revenue.The benefit of maintain- ing floating circles comes from serving data requests.The Fig.7 shows an example,where Acs is assumed to be cost of maintaining floating circles results from data for- 104.We have c3(t)=0.0667t as in Fig.7(c).Then,the life- wardings.Note that,when we need to estimate l,we do time I is set to the value that maximize the revenue not have any information about requests and forwardings b3(t)-c3(t).It is worth mentioning that,if generated of d.Therefore,we need to estimate l based on informa- from our estimation is not positive,then the operator tion of some other data that is similar to di. should not set up a floating circle for di,since it could not Since similar data attract similar users,we assume that, gain any revenue from offloading d. for each request for an item dx in s4,there will be also a request for d.and the operator gains a benefit of 6.5.Number of tickets in GAP (P bene)by satisfying this request.Remember that we have the start time and end time of each request;for exam- As we mentioned before,the ticket number is used to ple,in Fig.7(a).t is the time when n receives d via trade off between delivery delay and forwarding cost:if ni =5 7 2 t2 2 p05t t均=8 0.1 3 t3 3 =10 0.12 =14 (a)Request information of d 0.80 ns t=3 0.40 ca(t) 6 6=8 t7 pos 7=12 0 358101214 (b)Request information of d2 (c)The benefit ba(t)and cost ca(t)of da Fig.7.An example of estimating circle lifetime.We assume=(d.dP=0.3.P=0.8.5a=1.and Ac3=10.Then,we can plot ba(t)and ca(t)as in the figure,where the maximum marginal revenue happens when t =12,thus.I is set to 12
similar data items attract similar users. The degree of similarity [36] between dj and di can be measured by the correlation coefficient qji: qji ¼ PM k¼1ðej k ej Þðei k ei Þ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi PM k¼1ðej k ej Þ 2 q ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi PM k¼1ðei k ei Þ 2 q : ð9Þ where ei denotes the average of the elements in Ci. Given a threshold g1, we define Sdi ¼ fdjjqji P g1g, that is, the correlation coefficient between di and every item in Sdi is not smaller than g1. Then, ci is estimated as the weighted average of the center positions of items in Sdi . That is, ci ¼ X dx2Sd i qxi P cx dy2Sd i qyi : ð10Þ For the radius ri of data di, if it is chosen too small, then the critical condition in Eq. (4) cannot be satisfied; if too large, then some unnecessary forwardings and replications would be incurred. Hence, we set ri to be the smallest value that satisfies the critical condition. Please note that, ci and ri will be updated after the floating circle of di expires. This update can improve the accuracy of estimations for future data, since the updated values are realistic and accurate. 6.4. Circle lifetime The circle lifetime li of data di should be chosen to maximize a cellular operator’s revenue. The benefit of maintaining floating circles comes from serving data requests. The cost of maintaining floating circles results from data forwardings. Note that, when we need to estimate li, we do not have any information about requests and forwardings of di. Therefore, we need to estimate li based on information of some other data that is similar to di. Since similar data attract similar users, we assume that, for each request for an item dx in Sdi , there will be also a request for di, and the operator gains a benefit of ðqxi beneiÞ by satisfying this request. Remember that we have the start time and end time of each request; for example, in Fig. 7(a), t 0 1 is the time when n1 receives d1 via device-to-device connections. For each dx 2 Sdi , denote by Ni xðtÞ the set of users that receive dx before time t. We use biðtÞ to denote the total benefit of maintaining the circle of di when the lifetime of the circle is t. We have biðtÞ ¼ X dx2Sd i X nj2Ni x ðtÞ qxi benej i : ð11Þ Fig. 7 shows an example. We assume s3 ¼ 1; Sd3 ¼ fd1; d2g; q13 ¼ 0:3, and q23 ¼ 0:8. When t ¼ 3, we have N3 1ð3Þ¼; and N3 2ð3Þ¼fn5g, then we have b3ð3Þ ¼ q23 bene3 ¼ 0:32; when t ¼ 5, we have N3 1ð5Þ ¼ fn1g and N3 2ð5Þ¼fn5g, then we have b3ð5Þ ¼ q13 bene3 þ q23 bene3 ¼ 0:44; and so on. We then have the b3ðtÞ curve as in the Fig. 7(c). The intuition for estimating the number of forwardings of di is that, floating circles with physically close centers have similar density of users. Given a threshold g2, we define Sc i ¼ fdxj jjci cxjj2 6 g2g, that is, the distance between ci and the floating circle center of any item in Sc i is not larger than g2. Then, the number of forwardings of di in unit time, denoted as Dci, can be estimated as Dci ¼ P dx2Sc i f x P dx2Sc i ðlxðrx Þ 2Þ ðriÞ 2 : ð12Þ We use ciðtÞ to denote the total cost of maintaining the circle of di when the lifetime of the circle is t. We have ciðtÞ ¼ Dci costi t: ð13Þ Fig. 7 shows an example, where Dc3 is assumed to be 104 . We have c3ðtÞ ¼ 0:0667t as in Fig. 7(c). Then, the lifetime l3 is set to the value that maximize the revenue b3ðtÞ c3ðtÞ. It is worth mentioning that, if li generated from our estimation is not positive, then the operator should not set up a floating circle for di, since it could not gain any revenue from offloading di. 6.5. Number of tickets in GAP As we mentioned before, the ticket number is used to trade off between delivery delay and forwarding cost: if Fig. 7. An example of estimating circle lifetime. We assume Sd3 ¼ fd1; d2g; q13 ¼ 0:3; q23 ¼ 0:8; s3 ¼ 1, and Dc3 ¼ 104 . Then, we can plot b3ðtÞ and c3ðtÞ as in the figure, where the maximum marginal revenue happens when t ¼ 12, thus, l3 is set to 12. 192 S. Zhang et al. / Computer Networks 83 (2015) 184–198
S.Zhang et aL /Computer Networks 83(2015)184-198 193 the delivery delay is too long.some subsequent requests the assumption.Let us take a look at the segmentation from other users may happen before the floating circle is overhead.Denote by cc the communication capacity set up:if it is too short,the floating circle may have to wait between two nodes,and by hd the length of the for a long time before any request arrives,which incurs Bluetooth protocol header.For a data item d of size s,if unnecessary forwarding cost.It is hard to analyse the s+hd cc.then we divide it into [ainadl segments. relationship between ticket number,delivery delay,and Remember that,the proposed framework uses ticket coun- forwarding cost;furthermore,estimating the generation ters to restrict the number of replications of any item to be time of subsequent requests seems impossible.We there- not larger than the number of tickets k(see Section 5.1). fore resort to the following heuristic. Therefore,the overall forwarding or storing overhead incurred by segmentation is not larger than [hd.k. 6.5.1.Delivering a data item to a floating circle Though this value increases as the length of protocol This happens when setting up a floating circle.A data header hd increases,in the current design of Bluetooth pro- delivery can be characterized by its start and end positions. tocols,hd is far less than cc.Thus,the segmentation over- The degree of similarity between two data deliveries can head is negligible. be measured by the Euclidean distance between their start and end points.For example,user ng is asked to send d 7.2.Active vs.passive positioning towards c at time ty:given a threshold n.we can select a set of data items:for each item d,in this set,1)the dis- Location information is used in three aspects in the pro- tance between pos and where the first requester of dx posed framework.One is to help the framework determine sends a request for dx,is not larger than 1s.2)the distance whether the moving direction of a mobile user deflect between cx and ci is not larger than 1s.Denote this set as much from the direction of a floating circle(Section 5.1); then the ticket number k can be estimated as the second one is to help the framework determine whether a mobile user is within the floating circle of a data item(Sections 5.2 and 5.3):and the last one is to provide (14) the location feedback when a user sends a query or receives a data item at the end (Section 5.4). The current design of MobiCache uses active positioning. where S denotes the cardinality of the set.Note that, ie.,mobile users actively record their positions over time. In fact,when there is energy concern,we can switch to after the circle of d,expires,k should be updated as fol- passive positioning,ie.,mobile users record their positions lows:k=k+1,if the floating circle is set up too late that only when necessary.Specifically.for the first aspect some data requests have to retrieve data via cellular net- above,we can use Spray and Wait [14]or Delegation works;otherwise,k=k-1. Forwarding [15],both of which do not need geographical information,instead of GAP:for the second and third 6.5.2.Delivering a data query to a floating circle aspects,mobile users turn on their GPS only when encoun- This happens when a user ny sends a data request tering the others,sending data queries or finally receiving towards the floating circle of di.Using the same method data items.We see that,this kind of passive positioning as above,we can estimate ky.The only difference is that, decreases the dependence on location information,and data di has only one k,because the first requester of di is meanwhile conserve energy to some extent. unique;but there are many k's,since there are many sub- Besides,when there are positioning errors,for the sec- sequent requesters.When user ny eventually receives d.it ond aspect above,we just need to set the radius of a float- informs the cellular operator of the delay he experienced. ing circle to be the one calculated in Section 6.3 plus the Then,ku is updated as follows:k=ky+1,if n obtains d maximum positioning error,which can guarantee that via cellular networks;otherwise,k=ky-1. the critical condition in Eg.(4)is still met.For the third The expositions hitherto focus on the scenario where aspect,position errors would not be amplified when decid- one data item has only one floating circle.However,when ing the center of a circle.Therefore,passive positioning can there is some sort of extremely popular data that users mitigate the negative impact of position errors. wish to have in a very small delay.multiple floating circles could be set up within a mobiArea to reduce access delay. 8.Performance evaluation The main idea of MobiCache can adapt naturally to this new context. In this section,we conduct extensive trace-driven sim- ulations to evaluate our design under different network 7.Discussion settings and reveal insights of the proposed design performance. 7.1.Segmentation overhead 8.1.Realistic traces and baseline algorithms We have assumed that all necessary data transfer can be completed during any contact in Section 3.1.If a data Our evaluations are conducted on three realistic traces, item is too large to be transferred in one contact,then where Bluetooth-enables devices (e.g.,iMotes)periodically we can divide it into several small segments which follow detects their peers nearly and record the contacts over
the delivery delay is too long, some subsequent requests from other users may happen before the floating circle is set up; if it is too short, the floating circle may have to wait for a long time before any request arrives, which incurs unnecessary forwarding cost. It is hard to analyse the relationship between ticket number, delivery delay, and forwarding cost; furthermore, estimating the generation time of subsequent requests seems impossible. We therefore resort to the following heuristic. 6.5.1. Delivering a data item to a floating circle This happens when setting up a floating circle. A data delivery can be characterized by its start and end positions. The degree of similarity between two data deliveries can be measured by the Euclidean distance between their start and end points. For example, user nk is asked to send di towards ci at time tk; given a threshold g3, we can select a set of data items: for each item dx in this set, 1) the distance between posnk tk and where the first requester of dx sends a request for dx, is not larger than g3, 2) the distance between cx and ci is not larger than g3. Denote this set as Sdi njtj , then the ticket number ki can be estimated as ki ¼ X dx2Sdi nj t j kx 0 BB@ 1 CCA , jSdi njtj j ð14Þ where jSdi njtj j denotes the cardinality of the set. Note that, after the circle of di expires, ki should be updated as follows: ki ¼ ki þ 1, if the floating circle is set up too late that some data requests have to retrieve data via cellular networks; otherwise, ki ¼ ki 1. 6.5.2. Delivering a data query to a floating circle This happens when a user nj sends a data request towards the floating circle of di. Using the same method as above, we can estimate kij. The only difference is that, data di has only one ki, because the first requester of di is unique; but there are many kij’s, since there are many subsequent requesters. When user nj eventually receives di, it informs the cellular operator of the delay he experienced. Then, kij is updated as follows: kij ¼ kij þ 1, if nj obtains di via cellular networks; otherwise, kij ¼ kij 1. The expositions hitherto focus on the scenario where one data item has only one floating circle. However, when there is some sort of extremely popular data that users wish to have in a very small delay, multiple floating circles could be set up within a MobiArea to reduce access delay. The main idea of MobiCache can adapt naturally to this new context. 7. Discussion 7.1. Segmentation overhead. We have assumed that all necessary data transfer can be completed during any contact in Section 3.1. If a data item is too large to be transferred in one contact, then we can divide it into several small segments which follow the assumption. Let us take a look at the segmentation overhead. Denote by cc the communication capacity between two nodes, and by hd the length of the Bluetooth protocol header. For a data item d of size s, if s þ hd > cc, then we divide it into d s cchde segments. Remember that, the proposed framework uses ticket counters to restrict the number of replications of any item to be not larger than the number of tickets k (see Section 5.1). Therefore, the overall forwarding or storing overhead incurred by segmentation is not larger than d s cchde hd k. Though this value increases as the length of protocol header hd increases, in the current design of Bluetooth protocols, hd is far less than cc. Thus, the segmentation overhead is negligible. 7.2. Active vs. passive positioning Location information is used in three aspects in the proposed framework. One is to help the framework determine whether the moving direction of a mobile user deflect much from the direction of a floating circle (Section 5.1); the second one is to help the framework determine whether a mobile user is within the floating circle of a data item (Sections 5.2 and 5.3); and the last one is to provide the location feedback when a user sends a query or receives a data item at the end (Section 5.4). The current design of MobiCache uses active positioning, i.e., mobile users actively record their positions over time. In fact, when there is energy concern, we can switch to passive positioning, i.e., mobile users record their positions only when necessary. Specifically, for the first aspect above, we can use Spray and Wait [14] or Delegation Forwarding [15], both of which do not need geographical information, instead of GAP; for the second and third aspects, mobile users turn on their GPS only when encountering the others, sending data queries or finally receiving data items. We see that, this kind of passive positioning decreases the dependence on location information, and meanwhile conserve energy to some extent. Besides, when there are positioning errors, for the second aspect above, we just need to set the radius of a floating circle to be the one calculated in Section 6.3 plus the maximum positioning error, which can guarantee that the critical condition in Eq. (4) is still met. For the third aspect, position errors would not be amplified when deciding the center of a circle. Therefore, passive positioning can mitigate the negative impact of position errors. 8. Performance evaluation In this section, we conduct extensive trace-driven simulations to evaluate our design under different network settings and reveal insights of the proposed design performance. 8.1. Realistic traces and baseline algorithms Our evaluations are conducted on three realistic traces, where Bluetooth-enables devices (e.g., iMotes) periodically detects their peers nearly and record the contacts over S. Zhang et al. / Computer Networks 83 (2015) 184–198 193