Computer Communications 83 (2016)45-55 Contents lists available at ScienceDirect computer cmmuncatons Computer Communications ELSEVIER journal homepage:www.elsevier.com/locate/comcom Joint storage assignment for D2D offloading systems CrossMark Wei Wang",Xiaobing Wu,Lei Xie,Sanglu Lu State Key Laboratory for Novel Software Technology.Nanjing University.Nanjing 210093.China ARTICLE INFO ABSTRACT Article history: D2D offloading reduces the load of cellular network by asking mobile nodes to download content directly Received 3 September 2014 from storage of neighboring helpers via short range links.In this paper.we introduce a novel storage Revised 18 December 2015 Accepted 22 February 2016 assignment scheme that can enhance storage utilization for D2D networks that have different types of Available online 3 March 2016 storage nodes.Unlike traditional D2D systems that only use storage as static content cache,our scheme uses on-demand relaying to enhance storage utilization.Our on-demand relaying scheme replicates rare Keywords: content when it is requested.Therefore,the proposed scheme can greatly increase the amount of con- Device-to-Device communication tent supported by the offloading system.We develop a convex optimization based algorithm to find the DTN optimal storage assignment tradeoff between static caching and on-demand relaying.Numerical results Opportunistic networks and real-world trace-driven simulations show that our algorithm can achieve 30%reduction in offloading failure rate compared to static schemes. 2016 Elsevier B.V.All rights reserved. 1.Introduction starts downloading the movie directly through the cellular network 5.To make offloading successful,the movie must be cached in at Mobile traffic has increased at a compound annual growth rate least one of the helpers in a small virtual region that Alice can of more than 70%in recent years [1].Meeting such a large surge contact before she becomes impatient.The size of the "region"is in traffic demand is a great challenge for both cellular network de- determined by the user mobility pattern and the time that Alice signers and operators.One way to relieve the pressure on current can wait.Because the total storage contributed by helpers in this cellular networks is to offload part of the mobile traffic through virtual region is limited,the offloading system can only provide a Device-to-Device (D2D)communication [2.Because most mobile limited amount of content to Alice.Even if the offloading system traffic comes from content downloading [1].one of the impor- covers a large area and has a huge number of helpers,it can only tant offloading targets is mobile content,which includes applica- offload a constant amount of content due to the locality of the con- tion updates and video clips.In D2D offloading systems for con- tent in the static caching scheme. tent downloading,mobile helpers serve as content caches for other In this paper,we improve the scalability of D2D systems by in- nodes.These helpers directly deliver content to neighboring de- troducing different roles to mobile helpers.We assign some of the vices through short-range and low-cost wireless links (e.g.,Blue- mobile helpers as on-demand relays instead of static caches,which tooth,WiFi Direct or LTE D2D links [3]).In this way,the download- are referred to as seeds in this paper.Due to the inherent non- ing traffic does not pass through base stations and core networks uniformity in the contact pattern,relays may have a higher con- so that the limited resource of cellular networks can be saved [4. tact probability to the subscriber than seeds.For example,Alice Existing D2D offloading systems "statically"assign content to may request her friends to act as mobile relays to download the mobile helpers based on content popularity [5.6].Subscribers need movie.Although it might be difficult for Alice to directly contact to directly contact mobile helpers that have the content of interest a seed,there will be a higher chance that one of her friends can in their cache.Such static caching schemes have limited scalabil- successfully download the movie.Afterwards,Alice can download ity with respect to network size.Consider the case where a sub- the movie from her friends when she intentionally or accidentally scriber called Alice wishes to download a movie through the D2D meets them.In this way.the number of helpers that Alice can con- offloading system.Due to limited mobility.Alice can only contact a tact is effectively enlarged with the help of her friends. small number of mobile helpers before she becomes impatient and In addition to using friends as relays,mobile nodes mounted on buses or static nodes deployed at public places (e.g..entrance of subways)can also act as relay nodes.These types of nodes have Corresponding author.Tel:+86 13851658076. regular contact patterns to a given subset of subscribers.For exam- E-mail addresses:ww@nju.edu.cn,wangwei.ww@gmail.com (W.Wang). ple,a subscriber may pass by a given subway station every day.If wuxb@nju.edu.cn (X.Wu).Ixie@nju.edu.cn (L Xie).sanglu@nju.edu.cn (S.Lu). http://dx.doiorg/10.1016/j.comcom.2016.02.012 0140-3664/6 2016 Elsevier B.V.All rights reserved
Computer Communications 83 (2016) 45–55 Contents lists available at ScienceDirect Computer Communications journal homepage: www.elsevier.com/locate/comcom Joint storage assignment for D2D offloading systems Wei Wang∗ , Xiaobing Wu, Lei Xie, Sanglu Lu State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China a r t i c l e i n f o Article history: Received 3 September 2014 Revised 18 December 2015 Accepted 22 February 2016 Available online 3 March 2016 Keywords: Device-to-Device communication DTN Opportunistic networks a b s t r a c t D2D offloading reduces the load of cellular network by asking mobile nodes to download content directly from storage of neighboring helpers via short range links. In this paper, we introduce a novel storage assignment scheme that can enhance storage utilization for D2D networks that have different types of storage nodes. Unlike traditional D2D systems that only use storage as static content cache, our scheme uses on-demand relaying to enhance storage utilization. Our on-demand relaying scheme replicates rare content when it is requested. Therefore, the proposed scheme can greatly increase the amount of content supported by the offloading system. We develop a convex optimization based algorithm to find the optimal storage assignment tradeoff between static caching and on-demand relaying. Numerical results and real-world trace-driven simulations show that our algorithm can achieve 30% reduction in offloading failure rate compared to static schemes. © 2016 Elsevier B.V. All rights reserved. 1. Introduction Mobile traffic has increased at a compound annual growth rate of more than 70% in recent years [1]. Meeting such a large surge in traffic demand is a great challenge for both cellular network designers and operators. One way to relieve the pressure on current cellular networks is to offload part of the mobile traffic through Device-to-Device (D2D) communication [2]. Because most mobile traffic comes from content downloading [1], one of the important offloading targets is mobile content, which includes application updates and video clips. In D2D offloading systems for content downloading, mobile helpers serve as content caches for other nodes. These helpers directly deliver content to neighboring devices through short-range and low-cost wireless links (e.g., Bluetooth, WiFi Direct or LTE D2D links [3]). In this way, the downloading traffic does not pass through base stations and core networks so that the limited resource of cellular networks can be saved [4]. Existing D2D offloading systems “statically” assign content to mobile helpers based on content popularity [5,6]. Subscribers need to directly contact mobile helpers that have the content of interest in their cache. Such static caching schemes have limited scalability with respect to network size. Consider the case where a subscriber called Alice wishes to download a movie through the D2D offloading system. Due to limited mobility, Alice can only contact a small number of mobile helpers before she becomes impatient and ∗ Corresponding author. Tel.: +86 13851658076. E-mail addresses: ww@nju.edu.cn, wangwei.ww@gmail.com (W. Wang), wuxb@nju.edu.cn (X. Wu), lxie@nju.edu.cn (L. Xie), sanglu@nju.edu.cn (S. Lu). starts downloading the movie directly through the cellular network [5]. To make offloading successful, the movie must be cached in at least one of the helpers in a small virtual region that Alice can contact before she becomes impatient. The size of the “region” is determined by the user mobility pattern and the time that Alice can wait. Because the total storage contributed by helpers in this virtual region is limited, the offloading system can only provide a limited amount of content to Alice. Even if the offloading system covers a large area and has a huge number of helpers, it can only offload a constant amount of content due to the locality of the content in the static caching scheme. In this paper, we improve the scalability of D2D systems by introducing different roles to mobile helpers. We assign some of the mobile helpers as on-demand relays instead of static caches, which are referred to as seeds in this paper. Due to the inherent nonuniformity in the contact pattern, relays may have a higher contact probability to the subscriber than seeds. For example, Alice may request her friends to act as mobile relays to download the movie. Although it might be difficult for Alice to directly contact a seed, there will be a higher chance that one of her friends can successfully download the movie. Afterwards, Alice can download the movie from her friends when she intentionally or accidentally meets them. In this way, the number of helpers that Alice can contact is effectively enlarged with the help of her friends. In addition to using friends as relays, mobile nodes mounted on buses or static nodes deployed at public places (e.g., entrance of subways) can also act as relay nodes. These types of nodes have regular contact patterns to a given subset of subscribers. For example, a subscriber may pass by a given subway station every day. If http://dx.doi.org/10.1016/j.comcom.2016.02.012 0140-3664/© 2016 Elsevier B.V. All rights reserved
6 W.Wang et aL /Computer Communications 83 (2016)45-55 there is a relay deployed at the subway station,this relay can ac- tively search for the content requested by the subscriber and make sure the content can be delivered when the user passes by.Our analysis shows that using one relay with regular contact patterns can be as effective as increasing the number of seeds by a constant D2D ratio. communication An important issue in a relaying system is to allocate storage for content.Because users can choose to statically cache the content or RelaySeed dynamically download new content for their friends,mobile relays Cellular and seeds are actually different roles played by the same group coverace of helpers.In this way,relays and seeds need to share the lim- ited storage resources in mobile helpers and storage allocation be- comes critical to the performance of the offloading system.When Subscriber too many helpers are acting as seeds,there will be an insufficient amount of relays to"ferry"content between seeds and subscribers. Conversely,if there were too few seeds storing the requested con- Fig.1.Illustration of the communication model. tent,it would be difficult for relays to find a seed,making relaying useless.To find the optimal tradeoff point,we introduce the con- Information searching in opportunistic networks:Content of- cept of offloading efficiency to incorporate mobile relays and seeds floading systems focus on reducing the data delivery cost for a in the same optimization framework.We prove that the storage given set of content [5,6,13-16].As contents are cached in a dis- optimization problem is convex for various types of contact pat- tributed way,one important issue in content offloading systems terns and can be efficiently solved in a distributed manner.Our is to search information in a network with dynamic topology and numerical and trace-driven results show that relaying can reduce intermittent connections.User interests and network structures the offloading failure rate by more than 30%in D2D systems. can be used to help the content searching process.SPOON uti- The main contributions of this paper are as follows: lizes interest based searching to improve the performance of P2P file sharing [17].Social contact information is utilized in [18]to 1.We propose a new content download pattern that uses relays in find the optimal caching point based on user interests.Seeker- addition to traditional seeds to improve the scalability of D2D Assisted Searching (SAS)use seekers to help node to search infor- offloading systems. mation,while assuming users with the same interest meets more 2.We introduce a storage optimization framework that can be ap- plied to different types of helpers in D2D offloading systems. frequently [19].Optimal searching strategy in linear opportunistic networks is studied in [201. 3.Assuming constant helper density and Poisson contact pro- Replication optimization in opportunistic networks:The goal cesses,we show that offloading systems with n helpers can of optimal content replication is to maximize the social welfare, support (Vn)pieces of content when using both relays and which is defined as the expected utility gain from offloading seeds,while the amount of content supported by a static Delay-utility models are introduced in [5]to describe the impa- caching system is only 0(1). tience of subscribers.The utility optimization problem is shown 4.We show that the joint storage assignment problem for relays to be NP-hard [5.6].There are several ways to obtain approximate and seeds is convex and can be efficiently solved.Furthermore, solutions,including greedy heuristic algorithms [6.convex relax- our model can be extended to other types of helpers with con- ation [5]and a distributed algorithm based on voting [13.Dis- cave offloading efficiency functions.This leads to a better un- tributed caching in Femtocells is considered in 21.The storage derstanding of storage allocation when the network has multi- allocation problem under this scenario is similar to D2D networks ple types of helpers. with fixed contact patterns.This problem can be formulated as a The rest of this paper is organized as follows:Section 2 reviews set cover problem that is NP-hard [21].Utility based replication al- related works in D2D content dissemination and Section 3 intro- gorithm is used in 22]to distribute mobile videos via human mo- duces the system model used in this paper.Section 4 analyzes the bility between fixed venues.In [23].Chen et al.propose a new file D2D offloading efficiency for relays and shows that on-demand re- replica optimization algorithm that considers both the node stor- lays can asymptotically increase the amount of content supported age and contact frequency. by the system.The storage assignment problem is described in In comparison,we study tradeoffs between using storage for re- Section 5 and the benefits of our scheme are shown by numerical laying and static caching under the D2D scenario.Our model com- results and trace-driven simulations in Section 6.Finally,Section 7 bines static caching systems [5,6]and message relaying methods in presents conclusions of this work. DTN[24].Unlike the traditional DTN routing problems,which only consider cache size in the asymptotical sense [8].we study cache utility with finite storage in the offloading system.In contrast to 2.Related work traditional DTN systems,where the source tries to find a best way to replicate the message,our system adopts a destination centric Message delivery in opportunistic networks:Opportunistic model,where the destination actively searches for the content. networks utilize the intermittent connection between mobile de- vices to transfer data from the source to the destination [7.For 3.Network model message delivery applications,the source first replicates the mes- sage to several relays using short-range links.Relays then deliver 3.1.System overview the message to the destination when they contact the destina- tion [8,9].Most research works in message delivery applications The D2D offloading system studied in this paper combines two consider how to reduce the end-to-end delay under storage con- communication networks:a cellular network and an opportunistic straints in relays [10-12].The content offloading systems studied in network of D2D communication,as shown in Fig 1.The cellular this paper are different to message delivery applications,since the network provides pervasive network coverage and a global com- initial request comes from the destination rather than the source. munication channel for all mobiles.Due to the high cost of cellular
46 W. Wang et al. / Computer Communications 83 (2016) 45–55 there is a relay deployed at the subway station, this relay can actively search for the content requested by the subscriber and make sure the content can be delivered when the user passes by. Our analysis shows that using one relay with regular contact patterns can be as effective as increasing the number of seeds by a constant ratio. An important issue in a relaying system is to allocate storage for content. Because users can choose to statically cache the content or dynamically download new content for their friends, mobile relays and seeds are actually different roles played by the same group of helpers. In this way, relays and seeds need to share the limited storage resources in mobile helpers and storage allocation becomes critical to the performance of the offloading system. When too many helpers are acting as seeds, there will be an insufficient amount of relays to “ferry” content between seeds and subscribers. Conversely, if there were too few seeds storing the requested content, it would be difficult for relays to find a seed, making relaying useless. To find the optimal tradeoff point, we introduce the concept of offloading efficiency to incorporate mobile relays and seeds in the same optimization framework. We prove that the storage optimization problem is convex for various types of contact patterns and can be efficiently solved in a distributed manner. Our numerical and trace-driven results show that relaying can reduce the offloading failure rate by more than 30% in D2D systems. The main contributions of this paper are as follows: 1. We propose a new content download pattern that uses relays in addition to traditional seeds to improve the scalability of D2D offloading systems. 2. We introduce a storage optimization framework that can be applied to different types of helpers in D2D offloading systems. 3. Assuming constant helper density and Poisson contact processes, we show that offloading systems with n helpers can support (√n) pieces of content when using both relays and seeds, while the amount of content supported by a static caching system is only O(1). 4. We show that the joint storage assignment problem for relays and seeds is convex and can be efficiently solved. Furthermore, our model can be extended to other types of helpers with concave offloading efficiency functions. This leads to a better understanding of storage allocation when the network has multiple types of helpers. The rest of this paper is organized as follows: Section 2 reviews related works in D2D content dissemination and Section 3 introduces the system model used in this paper. Section 4 analyzes the D2D offloading efficiency for relays and shows that on-demand relays can asymptotically increase the amount of content supported by the system. The storage assignment problem is described in Section 5 and the benefits of our scheme are shown by numerical results and trace-driven simulations in Section 6. Finally, Section 7 presents conclusions of this work. 2. Related work Message delivery in opportunistic networks: Opportunistic networks utilize the intermittent connection between mobile devices to transfer data from the source to the destination [7]. For message delivery applications, the source first replicates the message to several relays using short-range links. Relays then deliver the message to the destination when they contact the destination [8,9]. Most research works in message delivery applications consider how to reduce the end-to-end delay under storage constraints in relays [10–12]. The content offloading systems studied in this paper are different to message delivery applications, since the initial request comes from the destination rather than the source. Fig. 1. Illustration of the communication model. Information searching in opportunistic networks: Content of- floading systems focus on reducing the data delivery cost for a given set of content [5,6,13–16]. As contents are cached in a distributed way, one important issue in content offloading systems is to search information in a network with dynamic topology and intermittent connections. User interests and network structures can be used to help the content searching process. SPOON utilizes interest based searching to improve the performance of P2P file sharing [17]. Social contact information is utilized in [18] to find the optimal caching point based on user interests. SeekerAssisted Searching (SAS) use seekers to help node to search information, while assuming users with the same interest meets more frequently [19]. Optimal searching strategy in linear opportunistic networks is studied in [20]. Replication optimization in opportunistic networks: The goal of optimal content replication is to maximize the social welfare, which is defined as the expected utility gain from offloading. Delay-utility models are introduced in [5] to describe the impatience of subscribers. The utility optimization problem is shown to be NP-hard [5,6]. There are several ways to obtain approximate solutions, including greedy heuristic algorithms [6], convex relaxation [5] and a distributed algorithm based on voting [13]. Distributed caching in Femtocells is considered in [21]. The storage allocation problem under this scenario is similar to D2D networks with fixed contact patterns. This problem can be formulated as a set cover problem that is NP-hard [21]. Utility based replication algorithm is used in [22] to distribute mobile videos via human mobility between fixed venues. In [23], Chen et al. propose a new file replica optimization algorithm that considers both the node storage and contact frequency. In comparison, we study tradeoffs between using storage for relaying and static caching under the D2D scenario. Our model combines static caching systems [5,6] and message relaying methods in DTN [24]. Unlike the traditional DTN routing problems, which only consider cache size in the asymptotical sense [8], we study cache utility with finite storage in the offloading system. In contrast to traditional DTN systems, where the source tries to find a best way to replicate the message, our system adopts a destination centric model, where the destination actively searches for the content. 3. Network model 3.1. System overview The D2D offloading system studied in this paper combines two communication networks: a cellular network and an opportunistic network of D2D communication, as shown in Fig 1. The cellular network provides pervasive network coverage and a global communication channel for all mobiles. Due to the high cost of cellular
W.Wang et aL/Computer Communications 83(2016)45-55 communication,it is mainly used for control message exchange Section 4.2.Relays are notified through the cellular network and content delivery traffic is offloaded through D2D communi- when they are selected.Each relay then reserves relay storage cation.D2D communication leverages opportunistic short-range segments for this downloading session and starts searching for links formed between nearby mobiles to reduce content delivery the given content.As the notification is sent through the cellu- cost.The links used for D2D communication can be WiFi Direct. lar network,the delay of the notification process can be ignored Bluetooth,LTE/LTE-A,etc.Compared to directly communicating compared to the delay of D2D offloading. with the base station,these short-range links are usually faster, 2.Relays try to replicate the requested content in the reserved re- consumes less energy and free of mobile data charges.For exam- lay storage when contact a seed.The content is replicated to ple,LTE only provides 300 Mbps shared transmission rate,while the relay via short range D2D links. the 802.11ac protocol used by WiFi provides transmission rate 3.When the subscriber contacts a seed or a relay that has already higher than 500 Mbps [25.Real-world measurements also show replicated the content,the subscriber can download the content that WiFi is 14.5 times and 23 times more energy efficient than through D2D links.We call this as an offloading success. 3G and LTE,respectively [26].However,these D2D links have short 4.If the subscriber becomes impatient before reaching any seed communication ranges.For example,WiFi only has communication or relay with the requested content,he/she will directly down- range up to 100 m.Therefore,short-range links only exist when load the content through the cellular network.We call this as mobiles move close to each other,and subscribers may become an offloading failure. impatient before they can find a D2D transmission opportunity[5] 5.Relays receive notifications when the offloading process suc- One important issue for D2D systems is to encourage nodes to ceeds or fails.The storage reserved for the downloading session act as helpers.Helpers contribute their storage as content caches is then released. and allow other nodes to download content from them through D2D links.Therefore,users should have motivations to contribute The advantage of the above on-demand relaying scheme is that storage and battery power of their devices.Helpers may come from it only uses the cellular network to deliver a small amount of con- two sources:mobile devices provided by users and special devices trol information,while the content replications are delivered by deployed by the network operator.The incentive for mobile helpers D2D links between seeds and relays,so the overall relaying cost may come from discounts or other benefits provided by the net- is negligible.Mobile helpers acting as relays also do not consume work operator.Such mechanisms have been well studied in exist- significantly more energy or bandwidth than those acting as seeds, ing work,e.g.[27].In D2D system.it is also possible to utilize because the only extra cost for relays is to receive the control social connections to provide incentives for helpers.For example, message.Note that our relaying scheme is different from relaying users may ask their friends to help them in content downloading. schemes in traditional DTN networks [8.10-12].which mainly fo- Note that incentive schemes developed for traditional D2D offload- cus on point-to-point message delivery rather than content distri- ing systems can be directly applied to our system,as our system bution. do not incur higher storage and energy cost to helpers than tradi tional offloading schemes. 3.2.System model Network operators are also willing to deploy helpers to offload the traffic of cellular systems so that they do not need to contin- We assume that content in the system is divided into segments uously deploy more base stations to handle the increasing traf- with lengths of M.Such a segmentation mechanism is commonly fic.These helpers can be mobile nodes mounted on buses [28 used in streaming and content delivery systems [29].Typical val- or fixed caching points on Femtocells [21].As D2D helpers do ues for segment lengths are 1 to 2 M Bytes.Segments of the same not need high-throughput backhaul,deployment of D2D helpers is file are treated as if they are independent pieces of content,so we much easier than deploying small base stations,e.g,microcells or use the term "content"instead of segments in the rest of this pa- picocells.The advantage of these specialized helpers is that they per.Therefore,it is possible that the user downloads some of the can provide heterogeneous mobility patterns,and it is easier for segments of a given file using the offloading system,while other network operators to optimize the content distribution process on segments of the same file are directly downloaded from the cellu- specialized devices lar network. Different types of helpers can coexist in D2D offloading sys- We assume that there are N pieces of content in the system and tems.Helpers with different mobility patterns,storage sizes or de- that each content i has a mean request rate of Ri.We further as- livery strategies may lead to different performance tradeoffs.In sume that the contact duration between nodes is long enough so this paper,we mainly focus on two types of helpers:Firstly,seeds that at least one segment can be downloaded during the contact are stable content caches that serve for all requests to a given period.As short-range links normally have high transmission rate content.The system allocates a certain number of seeds for each e.g.WiFi has transmission rates from 54-500 Mbps and Bluetooth piece of content to ensure the availability of content.Secondly,re- has rate of 24 Mbps,it takes less than one second to transmit a lays are temporary storage allocated for a specific request.They single segment.Thus,the duration for content transmission is neg- are selected according to the information of the given request to ligible compared to the durations of the contacts,which have av- enhance storage utility.The content cached in relays may dynami- erage length of 370 s [30].Systems with limited contact durations cally change due to arrival or completion of relaying requests.Note can be further optimized using schemes described in [31. that a single node may act as different roles at the same time:it Contacts between nodes are assumed to be independent non- may have part of its storage used as seeds and part of its storage lattice renewal processes.Our model covers most of the existing used as relays. mobility models,which assume that the contact patterns are Pois- Relaying procedure: son processes or have power-law inter-contact times [5,321.In the The content downloading process for a system with both seeds later part of this paper,we may further limit the contact processes and relays is described as follows: to be only Poisson processes,which have been verified by real- world traces [33]as well as theoretical analysis [32].To verify our 1.A subscriber sends a content request to the cellular operator. model in real systems,we also use real-world traces in our exper- Either the operator or the subscriber can select some relays iments,see Section 6.2.In real-world,contact processes between among all available relays to help the subscriber downloading different types of nodes can have different inter-contact time dis- the content.Detailed relay selection procedure is described in tributions.For example,the contact process between relays and
W. Wang et al. / Computer Communications 83 (2016) 45–55 47 communication, it is mainly used for control message exchange and content delivery traffic is offloaded through D2D communication. D2D communication leverages opportunistic short-range links formed between nearby mobiles to reduce content delivery cost. The links used for D2D communication can be WiFi Direct, Bluetooth, LTE/LTE-A, etc. Compared to directly communicating with the base station, these short-range links are usually faster, consumes less energy and free of mobile data charges. For example, LTE only provides 300 Mbps shared transmission rate, while the 802.11ac protocol used by WiFi provides transmission rate higher than 500 Mbps [25]. Real-world measurements also show that WiFi is 14.5 times and 23 times more energy efficient than 3G and LTE, respectively [26]. However, these D2D links have short communication ranges. For example, WiFi only has communication range up to 100 m. Therefore, short-range links only exist when mobiles move close to each other, and subscribers may become impatient before they can find a D2D transmission opportunity [5]. One important issue for D2D systems is to encourage nodes to act as helpers. Helpers contribute their storage as content caches and allow other nodes to download content from them through D2D links. Therefore, users should have motivations to contribute storage and battery power of their devices. Helpers may come from two sources: mobile devices provided by users and special devices deployed by the network operator. The incentive for mobile helpers may come from discounts or other benefits provided by the network operator. Such mechanisms have been well studied in existing work, e.g., [27]. In D2D system, it is also possible to utilize social connections to provide incentives for helpers. For example, users may ask their friends to help them in content downloading. Note that incentive schemes developed for traditional D2D offloading systems can be directly applied to our system, as our system do not incur higher storage and energy cost to helpers than traditional offloading schemes. Network operators are also willing to deploy helpers to offload the traffic of cellular systems so that they do not need to continuously deploy more base stations to handle the increasing traf- fic. These helpers can be mobile nodes mounted on buses [28] or fixed caching points on Femtocells [21]. As D2D helpers do not need high-throughput backhaul, deployment of D2D helpers is much easier than deploying small base stations, e.g., microcells or picocells. The advantage of these specialized helpers is that they can provide heterogeneous mobility patterns, and it is easier for network operators to optimize the content distribution process on specialized devices. Different types of helpers can coexist in D2D offloading systems. Helpers with different mobility patterns, storage sizes or delivery strategies may lead to different performance tradeoffs. In this paper, we mainly focus on two types of helpers: Firstly, seeds are stable content caches that serve for all requests to a given content. The system allocates a certain number of seeds for each piece of content to ensure the availability of content. Secondly, relays are temporary storage allocated for a specific request. They are selected according to the information of the given request to enhance storage utility. The content cached in relays may dynamically change due to arrival or completion of relaying requests. Note that a single node may act as different roles at the same time: it may have part of its storage used as seeds and part of its storage used as relays. Relaying procedure: The content downloading process for a system with both seeds and relays is described as follows: 1. A subscriber sends a content request to the cellular operator. Either the operator or the subscriber can select some relays among all available relays to help the subscriber downloading the content. Detailed relay selection procedure is described in Section 4.2. Relays are notified through the cellular network when they are selected. Each relay then reserves relay storage segments for this downloading session and starts searching for the given content. As the notification is sent through the cellular network, the delay of the notification process can be ignored compared to the delay of D2D offloading. 2. Relays try to replicate the requested content in the reserved relay storage when contact a seed. The content is replicated to the relay via short range D2D links. 3. When the subscriber contacts a seed or a relay that has already replicated the content, the subscriber can download the content through D2D links. We call this as an offloading success. 4. If the subscriber becomes impatient before reaching any seed or relay with the requested content, he/she will directly download the content through the cellular network. We call this as an offloading failure. 5. Relays receive notifications when the offloading process succeeds or fails. The storage reserved for the downloading session is then released. The advantage of the above on-demand relaying scheme is that it only uses the cellular network to deliver a small amount of control information, while the content replications are delivered by D2D links between seeds and relays, so the overall relaying cost is negligible. Mobile helpers acting as relays also do not consume significantly more energy or bandwidth than those acting as seeds, because the only extra cost for relays is to receive the control message. Note that our relaying scheme is different from relaying schemes in traditional DTN networks [8,10–12], which mainly focus on point-to-point message delivery rather than content distribution. 3.2. System model We assume that content in the system is divided into segments with lengths of M. Such a segmentation mechanism is commonly used in streaming and content delivery systems [29]. Typical values for segment lengths are 1 to 2 M Bytes. Segments of the same file are treated as if they are independent pieces of content, so we use the term “content” instead of segments in the rest of this paper. Therefore, it is possible that the user downloads some of the segments of a given file using the offloading system, while other segments of the same file are directly downloaded from the cellular network. We assume that there are N pieces of content in the system and that each content i has a mean request rate of Ri. We further assume that the contact duration between nodes is long enough so that at least one segment can be downloaded during the contact period. As short-range links normally have high transmission rate, e.g., WiFi has transmission rates from 54–500 Mbps and Bluetooth has rate of 24 Mbps, it takes less than one second to transmit a single segment. Thus, the duration for content transmission is negligible compared to the durations of the contacts, which have average length of 370 s [30]. Systems with limited contact durations can be further optimized using schemes described in [31]. Contacts between nodes are assumed to be independent nonlattice renewal processes. Our model covers most of the existing mobility models, which assume that the contact patterns are Poisson processes or have power-law inter-contact times [5,32]. In the later part of this paper, we may further limit the contact processes to be only Poisson processes, which have been verified by realworld traces [33] as well as theoretical analysis [32]. To verify our model in real systems, we also use real-world traces in our experiments, see Section 6.2. In real-world, contact processes between different types of nodes can have different inter-contact time distributions. For example, the contact process between relays and
W.Wang et aL /Computer Communications 83 (2016)45-55 subscribers may be different from that between seeds and sub- and relays can deliver contents to the subscriber via short range scribers. links.The efficiency of the seed only depends on the contact pro- We assume that there are n helpers willing to contribute their cess between subscribers and seeds.However,the efficiency of the storage.We assume that the storage contributed by each helper is relay is dependent on the number of seeds,in addition to the con- able to hold I pieces of content.This assumption can be relaxed tact process between subscribers and relays.This is because re- and will be discussed in Section 5.The storage in helpers can be lays need to first download the content from seeds before they can used as either seeds or relays.We use nsi and n.i to represent the transmit it to the subscribers. number of seeds and relays used by offloading sessions for content We first consider the offloading efficiency of a single relay node. When the contact pattern and number of seeds are known,the We define the offloading failure probability as the probability efficiency of the relays can be calculated as follows: that the subscriber cannot discover the content through D2D com- munication within a given time period of T.To evaluate the effi- Theorem 1.Suppose there are ns.i seeds for content i,and the inter- ciency of a single helper,we define the offloading efficiency Es of contact time between seeds and relays follows a Cumulative Distribu- a single seed as follows: tion Function (CDF)of Y(t)with a mean of uy.If the inter-contact time between relay and subscriber follows CDF of X(t)with a mean Es =-In Ps. (1) of ux,then the offloading efficiency of the relay node for content i is where Ps is the probability that the subscriber cannot receive the given by: content from that particular seed.A higher offloading efficiency means that the seed is more efficient in transmitting the content E,=-ln()- (T-t)"dx( (4) through the D2D link.Similarly.we define the efficiency of a relay Er as-In P.where P,is the probability that the subscriber cannot where: receive the content from the given relay.Table 1 summarizes the symbols used in this paper. (t)=1 As the offloading failure probability depends on the size of the (-X()dr/u (5) cache allocated for the given content,it is a function of the number of relays and seeds for content i.We denote the offloading failure probability for content i as Fnn).Therefore,the total amount Y)=1-(1-Y(t)dr1y (6) of traffic offloaded by D2D communication can be written as: Proof.There are two cases that the given relay will fail to deliver N D=>MRi(1-F(ns.i.nr.i)). (2) the content. (i).The relay never meets the subscriber within time T.For non-lattice renewal processes,the CDF of the time between a For example,suppose that the contact process between a seed randomly picked time point t and the next contact is given by and a subscriber is a Poisson process with a rate of As.Then,the (1-X())dr/ux [34].Therefore,the probability that the relay probability that the seed cannot meet the subscriber during the never meets the subscriber within time T is given by time period of T is given by e-T.Consequently.the offloading ef- ficiency of a single seed is Es =AsT. (T)=1- (1-X(t)dt/μx: (7) Offloading fails when all the ns.i seeds and n.i relays are un- 0 able to deliver the content to the subscriber.When contacts be- (ii).The relay meets the subscriber within time T,but it never tween different pairs of nodes are independent,the event that the meets any seed before the last contact with the subscriber.Sup- given seed or relay can successfully deliver the content is also in- pose that last contact between the relay and the subscriber hap- dependent of other relays/seeds.Therefore,we can multiply the pens in(T-t.T-t+dt),with 0<t<T.The CDF of the time du- failure probability of a single seed/relay to get: ration from last contact to the point of request expiration (at time F(ns.i.nr.)=e-nse-nE =p prd (3) T)is the age distribution of the renewal process.Therefore,the probability that the last contact happens in(T-t.T-t+dt)(with Eq.(3)allows us to directly compare the efficiency of relays and seeds.When there are more than two types of helpers,the offload- age of t at the point of request expire)can be given by-dX(t).Be- cause the probability that the relay does not meet any of the ns.i ing failure probability can be written in a similar way.For details, please refer to Section 5.3. seeds before T-t is given by ((T-t))"s.the probability for the Existing studies that use the static caching model are special second failure case is given by: cases of our system model in Eg.(3).When there are only seeds, we have F(ns..)=P.For a Poisson process with a rate of s. (Y(T-t))""dx(t). (8) we have F(ns..0)=e-mT,which is the same as the results of static caching systems 5.6]. Taking the sum of the failure probability of case (i)and (ii) and translating it into offloading efficiency,we get the result of 4.On-demand relaying in D2D systems Theorem1.▣ Similarly.the offloading efficiency of seeds can be written as: 4.1.Offloading efficiency for helpers Es=-InP=-InY(T) (9) We study the offloading efficiency of our relaying scheme in From Theorem 1,we can see that the ratio of Er/Es is bounded this section.In the relaying scheme,the identity of the content above by the number of seeds: that the subscriber is searching for is delivered to relays through the cellular network.Therefore,only two contacts are needed in Corollary 1.The efficiency ratio of Er/Es is bounded above by ns.i our relaying scheme.In contrast,traditional relaying scheme re- quires three contacts with an extra contact between the subscriber Proof.Because both X(t)and Y(t)are Complementary Cumula- and the relay to deliver the identity of the content.Both the seed tive Distribution Function (CCDF)for age distribution of a renewal
48 W. Wang et al. / Computer Communications 83 (2016) 45–55 subscribers may be different from that between seeds and subscribers. We assume that there are n helpers willing to contribute their storage. We assume that the storage contributed by each helper is able to hold I pieces of content. This assumption can be relaxed and will be discussed in Section 5. The storage in helpers can be used as either seeds or relays. We use ns, i and nr, i to represent the number of seeds and relays used by offloading sessions for content i. We define the offloading failure probability as the probability that the subscriber cannot discover the content through D2D communication within a given time period of T. To evaluate the effi- ciency of a single helper, we define the offloading efficiency Es of a single seed as follows: Es = − ln Ps, (1) where Ps is the probability that the subscriber cannot receive the content from that particular seed. A higher offloading efficiency means that the seed is more efficient in transmitting the content through the D2D link. Similarly, we define the efficiency of a relay Er as − ln Pr, where Pr is the probability that the subscriber cannot receive the content from the given relay. Table 1 summarizes the symbols used in this paper. As the offloading failure probability depends on the size of the cache allocated for the given content, it is a function of the number of relays and seeds for content i. We denote the offloading failure probability for content i as F(ns, i, nr, i). Therefore, the total amount of traffic offloaded by D2D communication can be written as: D = N i=1 MRi(1 − F (ns,i, nr,i)). (2) For example, suppose that the contact process between a seed and a subscriber is a Poisson process with a rate of λs. Then, the probability that the seed cannot meet the subscriber during the time period of T is given by e−λsT . Consequently, the offloading ef- ficiency of a single seed is Es = λsT. Offloading fails when all the ns, i seeds and nr, i relays are unable to deliver the content to the subscriber. When contacts between different pairs of nodes are independent, the event that the given seed or relay can successfully deliver the content is also independent of other relays/seeds. Therefore, we can multiply the failure probability of a single seed/relay to get: F (ns,i, nr,i) = e−ns,iEse−nr,iEr = Pns,i s Pnr,i r . (3) Eq. (3) allows us to directly compare the efficiency of relays and seeds. When there are more than two types of helpers, the offloading failure probability can be written in a similar way. For details, please refer to Section 5.3. Existing studies that use the static caching model are special cases of our system model in Eq. (3). When there are only seeds, we have F (ns,i, 0) = P ns,i s . For a Poisson process with a rate of λs, we have F (ns,i, 0) = e−ns,iλsT , which is the same as the results of static caching systems [5,6]. 4. On-demand relaying in D2D systems 4.1. Offloading efficiency for helpers We study the offloading efficiency of our relaying scheme in this section. In the relaying scheme, the identity of the content that the subscriber is searching for is delivered to relays through the cellular network. Therefore, only two contacts are needed in our relaying scheme. In contrast, traditional relaying scheme requires three contacts with an extra contact between the subscriber and the relay to deliver the identity of the content. Both the seed and relays can deliver contents to the subscriber via short range links. The efficiency of the seed only depends on the contact process between subscribers and seeds. However, the efficiency of the relay is dependent on the number of seeds, in addition to the contact process between subscribers and relays. This is because relays need to first download the content from seeds before they can transmit it to the subscribers. We first consider the offloading efficiency of a single relay node. When the contact pattern and number of seeds are known, the efficiency of the relays can be calculated as follows: Theorem 1. Suppose there are ns, i seeds for content i, and the intercontact time between seeds and relays follows a Cumulative Distribution Function (CDF) of Y(t) with a mean of μy. If the inter-contact time between relay and subscriber follows CDF of X(t) with a mean of μx, then the offloading efficiency of the relay node for content i is given by: Er = − ln Xˆ(T ) − T 0 Yˆ(T − t) ns,i dXˆ(t) , (4) where: Xˆ(t) = 1 − t 0 (1 − X(τ ))dτ /μx (5) Yˆ(t) = 1 − t 0 (1 − Y (τ ))dτ /μy. (6) Proof. There are two cases that the given relay will fail to deliver the content. (i). The relay never meets the subscriber within time T. For non-lattice renewal processes, the CDF of the time between a randomly picked time point t and the next contact is given by t 0 (1 − X(τ ))dτ /μx [34]. Therefore, the probability that the relay never meets the subscriber within time T is given by Xˆ(T ) = 1 − T 0 (1 − X(τ ))dτ /μx. (7) (ii). The relay meets the subscriber within time T, but it never meets any seed before the last contact with the subscriber. Suppose that last contact between the relay and the subscriber happens in (T − t, T − t + dt), with 0 ≤ t ≤ T. The CDF of the time duration from last contact to the point of request expiration (at time T) is the age distribution of the renewal process. Therefore, the probability that the last contact happens in (T − t, T − t + dt) (with age of t at the point of request expire) can be given by −dXˆ(t). Because the probability that the relay does not meet any of the ns, i seeds before T − t is given by Yˆ(T − t) ns,i , the probability for the second failure case is given by: − T 0 Yˆ(T − t) ns,i dXˆ(t). (8) Taking the sum of the failure probability of case (i) and (ii) and translating it into offloading efficiency, we get the result of Theorem 1. Similarly, the offloading efficiency of seeds can be written as: Es = − ln Ps = − lnYˆ(T ). (9) From Theorem 1, we can see that the ratio of Er/Es is bounded above by the number of seeds: Corollary 1. The efficiency ratio of Er/Es is bounded above by ns, i. Proof. Because both Xˆ(t) and Yˆ(t) are Complementary Cumulative Distribution Function (CCDF) for age distribution of a renewal
W.Wang et aL/Computer Communications 83(2016)45-55 Table 1 Notations used in this paper. Symbol Description The number of contents The segment length for contents The total number of helpers L The side length for the network region The time before the subscriber becomes impatient The storage size for helpers R Request rate for content i fls.i The number of seeds caching content i The number of relays for content i Es.E Offloading efficiency for one seed or one relay Ps,Pr The probability that subscriber cannot receive content from a seed or relay The subscriber contact rate for seeds and relays X(t) CDF of the inter-contact time between relays and subscribers Y(t) CDF of the inter-contact time between seeds and relays process,which are decreasing functions within range of [0,1].we larger than AT,which means a relay is always less effective than a have: seed when As =Ar.This is reasonable because a relay must down- load the content from the seed before it can deliver it to the sub- R(T)- ((T-t))"d(t) scriber. In summary.the offloading efficiency of relays can reach the ((T-t))"dx(t) upper bound of ns.i times the number of seeds when relays have high contact rates to subscribers.However,relays are less efficient =E(T-t)10≤t≤T than seeds when they have same contact rates as seeds. ≥min(T-t)"1o≤t≤T 4.2.Relay selection =((T) (10) According to the above analysis,relaying through a node with Therefore the same contact pattern as seeds is not helpful.However,D2D contact patterns are inherently non-uniform.The contact processes Er =-In T)- (T-t))Td(t) between different groups of mobile nodes have different contact patterns.As seeds serve for all the subscribers in the network,the ≤-nsiln?(T). (11) contact process between a seed and a subscriber should follow the typical contact pattern between random pairs of nodes.However. AsEs=-lnY(T).we have Er≤ns.iEs.□ relays may have different contact patterns to a given subscriber. We can intentionally select relays for a given offloading request Example:poisson contact process Consider the case where both contact processes are Poisson based on previous knowledge of their contact patterns.For exam- process.We have X(t)=1-e-Art and ux=1/r.where Ar is the ple,a device belonging to a friend of the user or a device mounted rate for the contact process between a relay and the subscriber. at a subway station that the user always passes by,can be selected Similarly,we have Y(t)=1-e-Ast and uy =1/As.Using Eqs.(5) as the relay in order to improve the efficiency of content distribu- and (6),we have(t)=e-t and ?(t)=e-st.We can obtain: tion.As discussed in Section 4.1,the offloading efficiency mainly depends on the contact rates of relays.Therefore,we propose to -In nsi入se-dT-入re-nT select relays based on their contact rates to reduce the relay selec- Er ns.is-入 入r≠ns.i入s (12) tion complexity. λrT-ln(1+rT) 入r=几s.i入s The relay selection process can be carried out either by the net- work operator or by the subscriber.When the relay is selected by Note that the sum of two exponentially distributed inter-contact the network operator,the network operator needs to record the times actually follows a hypoexponential distribution as expected. top-k nodes with the highest contact rates for each subscriber. Eq.(12)shows some characteristics of the offloading efficiency Such records can also be generated by the user devices and sub- for relay nodes. mitted to the operator along with the offloading request.These (i)When r ns.is,Er approaches ns.isT.Compared to the nodes will have higher contact rate to the subscriber compared to offloading efficiency of the seed Es =AsT.we see that a single relay randomly selected nodes. has the same efficiency as ns i seeds.where the upper bound in To verify the existence of nodes with higher contact rates,we Corollary 1 is achieved.Therefore,for rare content with low ns.i. studied the MIT reality trace [35],which contains Bluetooth traces using nr.i frequently contacting relays to download the content can from 100 mobiles for more than 9 months.Fig.2 shows the CDF of have effects similar to multiplying nsi by nr.i. the contact rates for the top-k frequently contacted neighbors for (ii)When a relay has the same contact pattern to the subscriber each mobile,normalized by the average contact rate.We see that as seeds,i.e,入r=λs=入,we have a relay offloading efficiency more than 90%of the subscribers have at least 5 neighbors with of a contact rate 10 times higher than average.More than 80%sub- scribers have 10 neighbors with contact rates 10 times higher than Er=入T+ln ns.i-1 几.i-e-(a4-1)x7 (13) average.Similar distributions of frequently contacted neighbors are also observed in recently collected traces as in [36]. Note that we have ns.-1s ns.i-e-(ns-)AT when ns.i 1.Thus Relays can also be selected by the subscribers.A subscriber the second term in Eq.(13)is always negative and Er can never be can request friends or colleagues to help with downloading.As
W. Wang et al. / Computer Communications 83 (2016) 45–55 49 Table 1 Notations used in this paper. Symbol Description N The number of contents M The segment length for contents n The total number of helpers L The side length for the network region T The time before the subscriber becomes impatient I The storage size for helpers Ri Request rate for content i ns, i The number of seeds caching content i nr, i The number of relays for content i Es , Er Offloading efficiency for one seed or one relay Ps , Pr The probability that subscriber cannot receive content from a seed or relay λs , λr The subscriber contact rate for seeds and relays X(t) CDF of the inter-contact time between relays and subscribers Y(t) CDF of the inter-contact time between seeds and relays process, which are decreasing functions within range of [0, 1], we have: Xˆ(T ) − T 0 Yˆ(T − t) ns,i dXˆ(t) ≥ − T 0 Yˆ(T − t) ns,i dXˆ(t) = E Yˆ(T − t) ns,i |0 ≤ t ≤ T ≥ min Yˆ(T − t) ns,i |0 ≤ t ≤ T = Yˆ(T ) ns,i . (10) Therefore, Er = − ln Xˆ(T ) − T 0 Yˆ(T − t) ns,i dXˆ(t) ≤ −ns,i lnYˆ(T ). (11) As Es = − lnYˆ(T ), we have Er ≤ ns, iEs. Example: poisson contact process Consider the case where both contact processes are Poisson process. We have X(t) = 1 − e−λrt and μx = 1/λr, where λr is the rate for the contact process between a relay and the subscriber. Similarly, we have Y (t) = 1 − e−λst and μy = 1/λs. Using Eqs. (5) and (6), we have Xˆ(t) = e−λrt and Yˆ(t) = e−λst. We can obtain: Er = ⎧ ⎨ ⎩ − lnns,iλse−λrT − λre−ns,iλsT ns,iλs − λr λr = ns,iλs λrT − ln(1 + λrT ) λr = ns,iλs . (12) Note that the sum of two exponentially distributed inter-contact times actually follows a hypoexponential distribution as expected. Eq. (12) shows some characteristics of the offloading efficiency for relay nodes. (i) When λr ns, iλs, Er approaches ns, iλsT. Compared to the offloading efficiency of the seed Es = λsT, we see that a single relay has the same efficiency as ns, i seeds, where the upper bound in Corollary 1 is achieved. Therefore, for rare content with low ns, i, using nr, i frequently contacting relays to download the content can have effects similar to multiplying ns, i by nr, i. (ii) When a relay has the same contact pattern to the subscriber as seeds, i.e., λr = λs = λ, we have a relay offloading efficiency of: Er = λT + ln ns,i − 1 ns,i − e−(ns,i−1)λT . (13) Note that we have ns,i − 1 ≤ ns,i − e−(ns,i−1)λT when ns, i ≥ 1. Thus the second term in Eq. (13) is always negative and Er can never be larger than λT, which means a relay is always less effective than a seed when λs = λr. This is reasonable because a relay must download the content from the seed before it can deliver it to the subscriber. In summary, the offloading efficiency of relays can reach the upper bound of ns, i times the number of seeds when relays have high contact rates to subscribers. However, relays are less efficient than seeds when they have same contact rates as seeds. 4.2. Relay selection According to the above analysis, relaying through a node with the same contact pattern as seeds is not helpful. However, D2D contact patterns are inherently non-uniform. The contact processes between different groups of mobile nodes have different contact patterns. As seeds serve for all the subscribers in the network, the contact process between a seed and a subscriber should follow the typical contact pattern between random pairs of nodes. However, relays may have different contact patterns to a given subscriber. We can intentionally select relays for a given offloading request based on previous knowledge of their contact patterns. For example, a device belonging to a friend of the user or a device mounted at a subway station that the user always passes by, can be selected as the relay in order to improve the efficiency of content distribution. As discussed in Section 4.1, the offloading efficiency mainly depends on the contact rates of relays. Therefore, we propose to select relays based on their contact rates to reduce the relay selection complexity. The relay selection process can be carried out either by the network operator or by the subscriber. When the relay is selected by the network operator, the network operator needs to record the top-k nodes with the highest contact rates for each subscriber. Such records can also be generated by the user devices and submitted to the operator along with the offloading request. These nodes will have higher contact rate to the subscriber compared to randomly selected nodes. To verify the existence of nodes with higher contact rates, we studied the MIT reality trace [35], which contains Bluetooth traces from 100 mobiles for more than 9 months. Fig. 2 shows the CDF of the contact rates for the top-k frequently contacted neighbors for each mobile, normalized by the average contact rate. We see that more than 90% of the subscribers have at least 5 neighbors with a contact rate 10 times higher than average. More than 80% subscribers have 10 neighbors with contact rates 10 times higher than average. Similar distributions of frequently contacted neighbors are also observed in recently collected traces as in [36]. Relays can also be selected by the subscribers. A subscriber can request friends or colleagues to help with downloading. As
50 W.Wang et aL /Computer Communications 83 (2016)45-55 Request for Content B Request for Content C -t(n+)spL21.It is with regular contact patterns as relays.For example,nodes de- easy to see that N=(L)in this case.Consequently.the amount of ployed at subway stations have high contact rates to many users. content increase as N=©(√m)when the network size L increases.. If we use these nodes as seeds to cache popular content,their ef- Note that the relaying scheme needs (vn)relays,which re- ficiency is equivalent to Ar/s regular seeds.However,if we use quires more friends with constant contact rates when the network them as relays.their efficiency is multiplied by n/2 times,which size increases.In cases where subscribers only have constant num- can be higher than Ar/As.Therefore,it is preferable to use these ber of friends,multi-hop relaying can be used.In multi-hop re- nodes as relays instead of as seeds. laying.friends of friends are requested to act as relays.It can be Note that our relay selection process only considers the contact shown that offloading efficiency of (ns.i-s)sT can be achieved rate of a given relay.Relays can improve the offloading efficiency by using relays that are h hop away from the subscriber when each as long as they have higher contact rates than a randomly selected hop satisfiesrnsis and h is o(logn).Therefore,we can still node.In the case that two relays always have the same mobility find (n)relays via multi-hop relaying in the case where each pattern,e.g.,two mobile phones belonging to the same person,the subscriber/relay only has a constant number of friends. offloading efficiency will be worse than relays have independent mobility patterns.How to select relays under correlated mobility 4.4.Reuse of relays patterns is leaved as future works. 4.3.Asymptotic analysis for Poisson process Relays can be reused for offloading different pieces of content. This is especially useful for offloading rare contents.Define a relay session as the period which starts at the time when the request Relays with higher contact rates increase the amount of content reaches the relay,and ends when either the content is delivered to that the system can support when the network size increases.Con- the subscriber or the request expires.As shown in Fig.3.content sider a network deployed on a square region with side length of L. stored in a single relay segment changes over time while the stor- Suppose that the density of helpers in the network is p.The num- age segment always retain some content even after the session is ber of helpers n then increases as pL-when the network size L finished.Moreover,relays who do not contact a seed will do noth- increases.As the network size increases,the contact rate between ing during the relay session,as shown in the relay session for con- a particular pair of nodes decreases at a speed of (L-2)for vari- ous types of mobility models [321. tent C in Fig.3.Among the©(√n)relays reserving resources for We first show that the total amount of content that the of- the session,only a small fraction F(nsi.0)of them actually down- load the content. floading system can support is (1)under static caching systems. In static caching systems with Poisson contact process,we have As described in Section 3.1.the relay storage segment is com- F(ns..0)=e-nsT.wheres=(L-2).Suppose that nsi=o(12). mitted to caching the requested content during the relay session. i.e.0s ns.i<c1L2,forarbitrary small constant c when Loo.We Since the number of relay segments used for a session for content can make F(ns,0)arbitrarily close to 1 when L approaches infin- i is n.i.which can be obtained through methods described in 5.1. ity by taking the constant c small enough.Therefore,to achieve the average number of relay segments committed to relaying con- tent i can be derived by Little's Theorem: an acceptable offloading success probability,nsi at least should be 2(L2).i.e..ns.iz c2L2 for some constant c2 when L-oo.The nni Rinr.iTi, (15)
50 W. Wang et al. / Computer Communications 83 (2016) 45–55 0 10 20 30 40 50 60 0 0.2 0.4 0.6 0.8 1 λr /λs CDF Top−5 Top−10 Fig. 2. CDF of λr/λs for top frequent contacting friends in MIT reality trace. subscribers are familiar with their friends, they will know the exact contact patterns of these relays. Such relays may exhibit different contact patterns compared to relays selected by the network operator. For example, the subscriber may contact colleagues regularly on workdays [35]. Suppose that inter-contact time between relay and subscriber is bounded above by T0, with T0 < T. By the reversibility of the contact process [34], the efficiency of a single relay is at least as good as ns, i seeds that duplicated the content at time T0 after the subscriber requested the content. By Theorem 1, we have the efficiency Er bounded below by −ns,i ln(Yˆ(T − T0)). Considering the case that the contact process between the seed and relay is a Poisson process, Er is bounded below by ns,i(T − T0)λs. In this case, a single relay has the same efficiency as ns, i/2 seeds when T0 = T/2. The above analysis shows the potential benefits of using nodes with regular contact patterns as relays. For example, nodes deployed at subway stations have high contact rates to many users. If we use these nodes as seeds to cache popular content, their ef- ficiency is equivalent to λr/λs regular seeds. However, if we use them as relays, their efficiency is multiplied by ns, i/2 times, which can be higher than λr/λs. Therefore, it is preferable to use these nodes as relays instead of as seeds. Note that our relay selection process only considers the contact rate of a given relay. Relays can improve the offloading efficiency as long as they have higher contact rates than a randomly selected node. In the case that two relays always have the same mobility pattern, e.g., two mobile phones belonging to the same person, the offloading efficiency will be worse than relays have independent mobility patterns. How to select relays under correlated mobility patterns is leaved as future works. 4.3. Asymptotic analysis for Poisson process Relays with higher contact rates increase the amount of content that the system can support when the network size increases. Consider a network deployed on a square region with side length of L. Suppose that the density of helpers in the network is ρ. The number of helpers n then increases as ρL2 when the network size L increases. As the network size increases, the contact rate between a particular pair of nodes decreases at a speed of (L−2) for various types of mobility models [32]. We first show that the total amount of content that the of- floading system can support is O(1) under static caching systems. In static caching systems with Poisson contact process, we have F (ns,i, 0) = e−ns,iλsT , where λs = (L−2). Suppose that ns,i = o(L2), i.e., 0 ≤ ns, i < c1L2, forarbitrary small constant c1 when L → ∞. We can make F(ns, i, 0) arbitrarily close to 1 when L approaches infinity by taking the constant c1 small enough. Therefore, to achieve an acceptable offloading success probability, ns, i at least should be (L2), i.e., ns, i ≥ c2L2 for some constant c2 when L → ∞. The Fig. 3. Content replacement process for a single relay storage segment. total number of available storage segments is ρL2I and we have N i=1 ns,i ≤ ρL2I. Therefore, the total amount of content that the system can support is N = O(1). Now consider system with relays. As friends contact regularly irrespective of the network size, it is reasonable to assume that the contact rate between relays and subscribers, λr, remains constant when the network size increases. As λs = (L−2), we have λs ≥ cλL−2 for some constant cλ. To ensure that F(ns, i, nr, i) ≤ F0 for some constant F0 ≤ 1, we can choose ns,i = nr,i = L − ln F0/(cλT ). For sufficiently large L, we have λr ns, iλs due to ns, iλs is O(L−1) in this case. Using the analysis in Section 4.1, we have Er ≥ (ns,i − ε)λsT for arbitrarily small ε. Therefore, F (ns,i, nr,i) ≤ e−(ns,i+nr,ins,i−nr,iε)λsT ≤ e−(L √− ln F0/(cλT ))2cλL−2T = F0. (14) Thus, we can ensure that there are at least N pieces of content with bounded failure probability, while N i=1(ns,i + nr,i) ≤ ρL2I. It is easy to see that N = (L) in this case. Consequently, the amount of content increase as N = (√n) when the network size L increases. Note that the relaying scheme needs (√n) relays, which requires more friends with constant contact rates when the network size increases. In cases where subscribers only have constant number of friends, multi-hop relaying can be used. In multi-hop relaying, friends of friends are requested to act as relays. It can be shown that offloading efficiency of (ns,i − ε)λsT can be achieved by using relays that are h hop away from the subscriber when each hop satisfies λr ns, iλs and h is O(log n). Therefore, we can still find (√n) relays via multi-hop relaying in the case where each subscriber/relay only has a constant number of friends. 4.4. Reuse of relays Relays can be reused for offloading different pieces of content. This is especially useful for offloading rare contents. Define a relay session as the period which starts at the time when the request reaches the relay, and ends when either the content is delivered to the subscriber or the request expires. As shown in Fig. 3, content stored in a single relay segment changes over time while the storage segment always retain some content even after the session is finished. Moreover, relays who do not contact a seed will do nothing during the relay session, as shown in the relay session for content C in Fig. 3. Among the (√n) relays reserving resources for the session, only a small fraction F(ns, i, 0) of them actually download the content. As described in Section 3.1, the relay storage segment is committed to caching the requested content during the relay session. Since the number of relay segments used for a session for content i is nr, i, which can be obtained through methods described in 5.1, the average number of relay segments committed to relaying content i can be derived by Little’s Theorem: nr,i = Rinr,i Ti, (15)
W.Wang et aL/Computer Communications 83(2016)45-55 where Ri is the request rate of content i,n.i is the number of contact patterns and content request rates.Therefore,we should relays for each request and Ti is the average duration of a relay carefully set the number of relays used for each piece of content session.The fraction of offloading session with length longer than based on these parameters.Seeds need to statically cache a piece t can be written as F(nsi.nr.i.t)by substituting the time limit of content to ensure that there are minimal copies of the given T with a variable t in the offloading failure probability.Therefore, content in the network.This leaves limited storage for relays.Thus, we have Ti=F(nsi.n.i.t)dt as subscribers waiting at most for we need to dynamically adjust the number of relays allocated for time T. each piece of content,depending on its popularity and the number From Eq.(15).we see that nr.i can be smaller than nr.i when of seeds currently caching the content. R1.This shows rare content do not actually use nr.i storage In this section,we jointly optimize the storage assigned to segments on average.Thus,it would be more efficient to allocate seeds and relays so that the overall offloading failure probability more relays for rare content,since a relay storage segment can dy- is minimized for content with different request rates. namically switch between different content to get better storage The storage assignment problem for a system with both relays utilization.For "hot"contents that are always simultaneously re- and seeds can be formulated as follows: quested by multiple subscribers,i.e.,RiT>1.a better strategy is Minimize∑,MRF(nfi,nfrs)) (17) to statically cache the content in seeds,since spreading the "hot" content among different relays is less efficient when Ar is compa rable to ns.is as shown in Section 4.1. Relays stay idle between relay sessions.The average length of s.t∑1(fsi+ku.ifri)≤I (18) idle period can be derived using the G/G/m/m queueing model.The utilization factor for relay segments depends on both the request arrival process and the node contact patterns,which do not have fsi+fr s1Vi (19) a general closed form formula.To simplify our model,we use the relay reuse factor ki=RiT to approximate the utility of relays for content i in Section 5. fs.i≥0,ri≥0i (20) 4.5.Dynamical loads wheref=andfr=are the fractions of nodes that serve as seeds and relays for content i.Eg.(18)limits the sum of stor- There are two cases where the requesting rate of R;can change age segments allocated for seeds and relays to be smaller than nl. The first case is a long-term evolution of content popularity.where which is the total number of storage segments in the system.The R;changes slowly.The second case is short-term load surges fraction of relay segments is multiplied by kwhich reflects the where R;can temporally exceed its average rate. impact of relay reuse as described in Section 4.4.Eq.(19)requires To handle long-term load changes,the cellular network keeps that the number of seeds plus the number of the relays should not track of the requests of each content.As all offloading requests are be larger than n for any content. sent through the cellular network,we can estimate the long-term In the above formulation,we relax the storage optimization request rates for a given content i as follows.Suppose that there problem from an integer programming problem,which is NP- are ri(t)requests for content i in a given time slot t,we use Expo- hard [5].to a convex optimization problem that can be efficiently nential Moving Average (EMA)to estimate the long-term request solved.This is because we use real variables fsi and fr.i instead rate as: of integer variables ns.i and nr.i.We can show that the approxi- mation ratio of the above relaxation approaches 1 when n goes to R(t)=aR(t-1)+(1-a)ri(t) (16) infinity,using similar arguments as in 5. where R(t)is the long-term request rate at time t and o is a con- stant smoothing factor with value between 0.8 and 0.95.The size Theorem 2.The objective function Eg.(17)in the storage assignment of the time slot can be chosen as one hour and the smoothing problem is convex,when Er(fs.i)is concave and twice differentiable. factor a determines the smoothness of the estimation.After esti- Proof.The convexity of the objective function in Eq.(17)can be mated the requesting rate,we execute the optimization algorithm derived as follows: described in Section 5 to calculate the optimal values for ns.i and As the nonnegative weighted sum of convex functions is con- nr.i To handle short-term surges in content request,we apply a vex,it is enough to show that F(nfs.i.nfr.i)is convex respect to fs.i rate limiting algorithm for relay requests.Note that the number of and fr.i.We have: seeds,ns.i.is independent of short-term requesting rate changes. F(nfs.i.nfrt)=e-n-n听a (21) We only need to modify the number of relays nr.i for the given re- quest.In case that the instantaneous request rate ri(t)is larger than Define g(fs.i.fr.)as -nfs.Es-nfr.Er.Since F(nfs.i.nfr.i)= long-term average Ri(t).we set the number of relays allocated for e().if g(fsfr)is convex then Fnfs.nf.)is also convex a given request toso that the total number of relays allo- [371. cated for content i will not exceed BR(t)n i.where B is an over- As we have shown in Section 4.1,Er is normally a function of load factor which allows a single content to use more caches than fs i and Es is a constant with respect to fs.i and fr.i.Therefore, the long-term optimal value.In case of load surge,the number of when g(fs.i.fr.)is twice differentiable,we have: relays allocated for the content will be limited and the offloading a2E(5.i) efficiency for the given content will decrease temporally. V2g(fa.fr)=nf (22) 5.Storage assignment optimization Since nfr.i is always positive,gufs.fr.)is convex when E(fs)is concave. 5.1.Problem formulation Corollary 2.The storage assignment problem in Egs.(17)-(20)is con- The analysis in Section 4 shows that the efficiency of relays is vex for relays with nonlattice renewal contact process and twice dif- determined by various factors,including the number of seeds,relay ferentiable offloading efficiency function
W. Wang et al. / Computer Communications 83 (2016) 45–55 51 where Ri is the request rate of content i, nr, i is the number of relays for each request and Ti is the average duration of a relay session. The fraction of offloading session with length longer than t can be written as F(ns, i, nr, i, t) by substituting the time limit T with a variable t in the offloading failure probability. Therefore, we have Ti = T 0 F (ns,i, nr,i,t)dt as subscribers waiting at most for time T. From Eq. (15), we see that nr,i can be smaller than nr, i when Ri Ti 1, a better strategy is to statically cache the content in seeds, since spreading the “hot” content among different relays is less efficient when λr is comparable to ns, iλs as shown in Section 4.1. Relays stay idle between relay sessions. The average length of idle period can be derived using the G/G/m/m queueing model. The utilization factor for relay segments depends on both the request arrival process and the node contact patterns, which do not have a general closed form formula. To simplify our model, we use the relay reuse factor ku,i = RiT to approximate the utility of relays for content i in Section 5. 4.5. Dynamical loads There are two cases where the requesting rate of Ri can change. The first case is a long-term evolution of content popularity, where Ri changes slowly. The second case is short-term load surges, where Ri can temporally exceed its average rate. To handle long-term load changes, the cellular network keeps track of the requests of each content. As all offloading requests are sent through the cellular network, we can estimate the long-term request rates for a given content i as follows. Suppose that there are ri(t) requests for content i in a given time slot t, we use Exponential Moving Average (EMA) to estimate the long-term request rate as: Ri(t) = αRi(t − 1) + (1 − α)ri(t) (16) where Ri(t) is the long-term request rate at time t and α is a constant smoothing factor with value between 0.8 and 0.95. The size of the time slot can be chosen as one hour and the smoothing factor α determines the smoothness of the estimation. After estimated the requesting rate, we execute the optimization algorithm described in Section 5 to calculate the optimal values for ns, i and nr, i. To handle short-term surges in content request, we apply a rate limiting algorithm for relay requests. Note that the number of seeds, ns, i, is independent of short-term requesting rate changes. We only need to modify the number of relays nr, i for the given request. In case that the instantaneous request rate ri(t) is larger than long-term average Ri(t), we set the number of relays allocated for a given request to βRi(t) ri(t) nr,i so that the total number of relays allocated for content i will not exceed βRi(t)nr, i, where β is an overload factor which allows a single content to use more caches than the long-term optimal value. In case of load surge, the number of relays allocated for the content will be limited and the offloading efficiency for the given content will decrease temporally. 5. Storage assignment optimization 5.1. Problem formulation The analysis in Section 4 shows that the efficiency of relays is determined by various factors, including the number of seeds, relay contact patterns and content request rates. Therefore, we should carefully set the number of relays used for each piece of content based on these parameters. Seeds need to statically cache a piece of content to ensure that there are minimal copies of the given content in the network. This leaves limited storage for relays. Thus, we need to dynamically adjust the number of relays allocated for each piece of content, depending on its popularity and the number of seeds currently caching the content. In this section, we jointly optimize the storage assigned to seeds and relays so that the overall offloading failure probability is minimized for content with different request rates. The storage assignment problem for a system with both relays and seeds can be formulated as follows: Minimize N i=1 MRiF (n fs,i, n fr,i) (17) s.t. N i=1 (fs,i + ku,i fr,i) ≤ I (18) fs,i + fr,i ≤ 1 ∀ i (19) fs,i ≥ 0, fr,i ≥ 0 ∀ i (20) where fs,i = ns,i n and fr,i = nr,i n are the fractions of nodes that serve as seeds and relays for content i. Eq. (18) limits the sum of storage segments allocated for seeds and relays to be smaller than nI, which is the total number of storage segments in the system. The fraction of relay segments is multiplied by ku, i, which reflects the impact of relay reuse as described in Section 4.4. Eq. (19) requires that the number of seeds plus the number of the relays should not be larger than n for any content. In the above formulation, we relax the storage optimization problem from an integer programming problem, which is NPhard [5], to a convex optimization problem that can be efficiently solved. This is because we use real variables fs, i and fr, i instead of integer variables ns, i and nr, i. We can show that the approximation ratio of the above relaxation approaches 1 when n goes to infinity, using similar arguments as in [5]. Theorem 2. The objective function Eq. (17) in the storage assignment problem is convex, when Er(fs, i) is concave and twice differentiable. Proof. The convexity of the objective function in Eq. (17) can be derived as follows: As the nonnegative weighted sum of convex functions is convex, it is enough to show that F(nfs, i, nfr, i) is convex respect to fs, i and fr, i. We have: F (n fs,i, n fr,i) = e−n fs,iEs−n fr,iEr . (21) Define g(fs, i, fr, i) as −n fs,iEs − n fr,iEr. Since F (n fs,i, n fr,i) = eg(fs,i,fr,i) , if g(fs, i, fr, i) is convex then F(nfs, i, nfr, i) is also convex [37]. As we have shown in Section 4.1, Er is normally a function of fs, i and Es is a constant with respect to fs, i and fr, i. Therefore, when g(fs, i, fr, i) is twice differentiable, we have: ∇2g(fs,i, fr,i) = −n fr,i ∂2Er(fs,i) ∂ f 2 s,i . (22) Since nfr, i is always positive, g(fs, i, fr, i) is convex when Er(fs, i) is concave. Corollary 2. The storage assignment problem in Eqs. (17)–(20) is convex for relays with nonlattice renewal contact process and twice differentiable offloading efficiency function.
52 W.Wang et aL/Computer Communications 83 (2016)45-55 Proof.To show E(fs)=-In P(fs)is concave,we have: R9(,f片)/kui=-6/M (33) P,()=(T)- ((T-)dr(0). (23) Note that-v/M is a constant for all content i.When vo is given, o we can separately solve Eqs.(32)and(33)for each content to get Note that(t)is a decreasing function,therefore we the optimal allocation.If the solution has+f1.it is evident have -dX(t)/dt z0.Since exponential functions are log-convex. that content i is so popular that we can simply set=1 (-d(t)/dt)((Tt)is a log-convex function offon ev- and v>0 for content i.Therefore,one way to solve the above problems is to use vo as a global pricing signal [38]to coordinate ery point t[0.T].Taking integral over t preserves log-convexity the distributed storage allocation fsi and fr.i for each content i. 137sot)d(t)is also log-convex.Adding a pos- itive constant X(T)preserves log-convexity.thus Pr(fs)is log- 5.3.Extension of the optimization framework convex and E(fsi)is concave.This directly leads to the convex- ity of the objective function as discussed above.It is clear that In the above discussion,we mainly focus on systems with two all the constraints in the storage assignment optimization problem types of helpers,namely,seeds and relays.The optimization frame- in Egs.(18)-(20)are linear,so the storage assignment problem is work in Section 5.1 can be extended to cases where there are more convex.▣ than two types of helpers. Consider the case where there are WiFi APs or Femtocells act- Corollary 2 shows the convexity of the storage assignment ing as helpers in the D2D offloading system.These APs or Femto- problem.Therefore,the optimal solution for this problem can be cells can provide low cost Internet access to nearby nodes.Thus, found by efficient algorithms such as gradient projection methods. they can be treated as if they were seeds for all content.Sup- pose that there are nA APs in the system.Then,we can change 5.2.Distributed solution Eq.(3)to F(ns.i.n.i.n)=e-ere-A.where E is the of- The Lagrangian of the storage assignment optimization problem floading efficiency of APs which can be calculated through the con- tact patterns between nodes and APs.The offloading efficiency of can be written as: Er is a function of both ns.i and nA in this scenario,because relays L(fs.r,fri.v) can also download from APs or Femtocells.However,the general structure of the problem does not change and the framework of -∑MRF(nf,n)+∑5+-1) Section 5.1 can also be applied.Using our optimization framework, tradeoffs between deploying APs and mobile helpers can be easily characterized. +Vo (h+k) (24) The systems may contain different types of helpers that have different contact patterns or different storage sizes.In these cases, each type of helpers will have its own fraction parameter fh.i for where vi.ie (0.1.....N)are Lagrangian multipliers for the con- content i in the optimization problem and each type of helper may straints. have a different offloading efficiency Eh.By Theorem 2,the general The optimal solutions ff must satisfy the KKT condi- tradeoff point for storage assignment in different types of helpers tions [37].which require can be solved when the offloading efficiency functions are concave. E+f后-1≤0.i=1,,N (25) 5.4.Practical issues ∑(+kf周)-1≤0 (26) The above optimization framework provides a way to allocate i= the storage in an offloading system.Although centralized optimiza- tions in cellular networks are possible,there are several practical 20,i=0,,N (27) issues to be considered in implementation. Number of contents:First,the number of content pieces may be very large in real systems.Therefore,solving fs.i and fr.i for (g+后-1)=0.i=1,,N (28) each content may incur high computational costs.To address this problem.we propose to group contents into a fixed number of content categories based on the popularity of the content.We (+u-)=0 observe that the solutions for different contents only depend (29) on the value of Ri.Therefore,we can aggregate contents with similar request rates into one content category and optimize over content categories to obtain approximate solutions.In practice MR9()++哈=0,i=1,,N (30) content can be grouped into bins with geometric sequences of request rates.e.g..the first bin contains contents with request rate smaller than 0.1 requests per day and the rest bins covers MR9(f)++ku.i哈=0,i=1,,N (31)】 content request rates of 0.1-0.2.0.2-0.4,0.4-0.8.....respectively. Consider the case there are I categories of contents,each with m where(and)By contents of request rate R.We can change the objective function condition Eq.(28).we see that either v=0 orf+=1 must Eq.(17)to MmiRiF(nfs.i.nfri).the constraint in Eq.(18)to be satisfied.Therefore,if we do not allocate all nodes as seeds mi(f+kifr)sI and keep all other constraints unchanged or relays for a specific content i(which meansf+f is smaller to reduce the problem size. than 1).we should have v*=0.Thus except for extremely popular Limited number of friends:Second,when the optimal solution content withf+=1.KKT conditions require: has large fr.i for some pieces of content,the subscriber may not R9(,)=-6M have enough friends with high contact rates.To address this limita- (32) tion,we add constraints,such as nfr.is5.Vi,to limit the number
52 W. Wang et al. / Computer Communications 83 (2016) 45–55 Proof. To show Er(fs,i) = − ln Pr(fs,i) is concave, we have: Pr(fs,i) = Xˆ(T ) − T 0 Yˆ(T − t) n fs,i dXˆ(t). (23) Note that Xˆ(t) is a decreasing function, therefore we have −dXˆ(t)/dt ≥ 0. Since exponential functions are log-convex, −dXˆ(t)/dtYˆ(T − t) n fs,i is a log-convex function of fs, i on every point t ∈ [0, T]. Taking integral over t preserves log-convexity [37], so − T 0 Yˆ(T − t) n fs,i dXˆ(t) is also log-convex. Adding a positive constant Xˆ(T ) preserves log-convexity, thus Pr(fs, i) is logconvex and Er(fs, i) is concave. This directly leads to the convexity of the objective function as discussed above. It is clear that all the constraints in the storage assignment optimization problem in Eqs. (18)–(20) are linear, so the storage assignment problem is convex. Corollary 2 shows the convexity of the storage assignment problem. Therefore, the optimal solution for this problem can be found by efficient algorithms such as gradient projection methods. 5.2. Distributed solution The Lagrangian of the storage assignment optimization problem can be written as: L(fs,r, fr,i, ν) = N i=1 MRiF (n fs,i, n fr,i) +N i=1 νi(fs,i + fr,i − 1) +ν0 N i=1 (fs,i + ku,i fr,i) − I (24) where νi, i ∈ {0, 1, . . ., N} are Lagrangian multipliers for the constraints. The optimal solutions f ∗ s,i , f ∗ r,i , ν∗ i must satisfy the KKT conditions [37], which require f ∗ s,i + f ∗ r,i − 1 ≤ 0, i = 1, . . ., N (25) N i=1 f ∗ s,i + ku,i f ∗ r,i − I ≤ 0 (26) ν∗ i ≥ 0, i = 0, . . ., N (27) ν∗ i f ∗ s,i + f ∗ r,i − 1 = 0, i = 1, . . ., N (28) ν∗ 0 N i=1 f ∗ s,i + ku,i f ∗ r,i − I = 0 (29) MRiϕs (f ∗ s,i , f ∗ r,i ) + ν∗ i + ν∗ 0 = 0, i = 1, . . ., N (30) MRiϕr(f ∗ s,i , f ∗ r,i ) + ν∗ i + ku,iν∗ 0 = 0, i = 1, . . ., N (31) where ϕs (fs,i, fr,i) = ∂F (n fs,i,n fr,i) ∂ fs,i and ϕr(fs,i, fr,i) = ∂F (n fs,i,n fr,i) ∂ fr,i . By condition Eq. (28), we see that either ν∗ i = 0 or f ∗ s,i + f ∗ r,i = 1 must be satisfied. Therefore, if we do not allocate all nodes as seeds or relays for a specific content i (which means f ∗ s,i + f ∗ r,i is smaller than 1), we should have ν∗ i = 0. Thus except for extremely popular content with f ∗ s,i + f ∗ r,i = 1, KKT conditions require: Riϕs (f ∗ s,i , f ∗ r,i ) = −ν∗ 0/M (32) Riϕr(f ∗ s,i , f ∗ r,i )/ku,i = −ν∗ 0/M. (33) Note that −ν∗ 0/M is a constant for all content i. When ν∗ 0 is given, we can separately solve Eqs. (32) and (33) for each content to get the optimal allocation. If the solution has f ∗ s,i + f ∗ r,i ≥ 1, it is evident that content i is so popular that we can simply set f ∗ s,i + f ∗ r,i = 1 and ν∗ i > 0 for content i. Therefore, one way to solve the above problems is to use ν0 as a global pricing signal [38] to coordinate the distributed storage allocation fs, i and fr, i for each content i. 5.3. Extension of the optimization framework In the above discussion, we mainly focus on systems with two types of helpers, namely, seeds and relays. The optimization framework in Section 5.1 can be extended to cases where there are more than two types of helpers. Consider the case where there are WiFi APs or Femtocells acting as helpers in the D2D offloading system. These APs or Femtocells can provide low cost Internet access to nearby nodes. Thus, they can be treated as if they were seeds for all content. Suppose that there are nA APs in the system. Then, we can change Eq. (3) to F (ns,i, nr,i, nA) = e−ns,iEse−nr,iEr e−nAEA , where EA is the of- floading efficiency of APs which can be calculated through the contact patterns between nodes and APs. The offloading efficiency of Er is a function of both ns, i and nA in this scenario, because relays can also download from APs or Femtocells. However, the general structure of the problem does not change and the framework of Section 5.1 can also be applied. Using our optimization framework, tradeoffs between deploying APs and mobile helpers can be easily characterized. The systems may contain different types of helpers that have different contact patterns or different storage sizes. In these cases, each type of helpers will have its own fraction parameter fh, i for content i in the optimization problem and each type of helper may have a different offloading efficiency Eh. By Theorem 2, the general tradeoff point for storage assignment in different types of helpers can be solved when the offloading efficiency functions are concave. 5.4. Practical issues The above optimization framework provides a way to allocate the storage in an offloading system. Although centralized optimizations in cellular networks are possible, there are several practical issues to be considered in implementation. Number of contents: First, the number of content pieces may be very large in real systems. Therefore, solving fs, i and fr, i for each content may incur high computational costs. To address this problem, we propose to group contents into a fixed number of content categories based on the popularity of the content. We observe that the solutions for different contents only depend on the value of Ri. Therefore, we can aggregate contents with similar request rates into one content category and optimize over content categories to obtain approximate solutions. In practice, content can be grouped into bins with geometric sequences of request rates, e.g., the first bin contains contents with request rate smaller than 0.1 requests per day and the rest bins covers content request rates of 0.1–0.2, 0.2–0.4, 0.4–0.8, . . ., respectively. Consider the case there are l categories of contents, each with mi contents of request rate Ri. We can change the objective function Eq. (17) to l i=1 MmiRiF (n fs,i, n fr,i), the constraint in Eq. (18) to l i=1 mi fs,i + ku,i fr,i ≤ I and keep all other constraints unchanged to reduce the problem size. Limited number of friends: Second, when the optimal solution has large fr, i for some pieces of content, the subscriber may not have enough friends with high contact rates. To address this limitation, we add constraints, such as nfr, i ≤ 5, ∀i, to limit the number
W.Wang et aL/Computer Communications 83(2016)45-55 53 ◆Content A with relay Content B with relav 0 nt A no relay 00 Conte Content B no relay 0.2 50 0.2 488089-9 0.4 0.6 0.8 02 0.4 0.6 0.8 0.2 0.4 0.6 0.8 (a)Storage allocation for relaying scheme (b)Storage allocation for static scheme (c)Failure probability for relay and static scenario Fig.4.Convex optimization result (n 5000,k =10,T=1). of relays for each session to be smaller than five,i.e.,less than five This shows that relaying can support more content with low re- relays are used in each session.Our numerical results in Section 6 questing rates.The numerical results under various different pa- show that such limitations do not significantly change the system rameter settings also has similar trends that demonstrate relaying performance. can improve the offloading of rare content. Communication overhead:Relaying requires content requests to (2)Offloading performance be sent to relays through the cellular network,which incurs addi- Fig.5(a)illustrates systems with 100 content categories,where tional control traffic above that of traditional static solutions [6]. the content request rate follows a Zipf distribution with a=0.8 However,content delivery systems normally have large segment [39].We observe that the overall failure probability in the optimal sizes,such as 2M bytes in P2P-VoD systems [29].It only takes sev- relaying case reduces much faster than in the static case.We fur- eral hundred bytes to send relay requests to 10 relays,which incurs ther consider the impact of different factors on the offloading ef- less than 1%overhead.Compared to the 30%gain in offloading ef- ficiency.For the curve labeled no reuse,we set ki=1 so that re- ficiency.the 1%communication overhead is small. lays are never reused between contents.We observe that the per- Unreliable contacts:In real networks,detection of the existence formance of relaying is still better in this case than in the static of D2D links might be unreliable,and data transfer over D2D links case,because relays with higher contact rates to subscribers per- could be dropped.Moreover,some contacts may be too short to form better than seeds.On the other hand,if relays do not have transmit even a single segment.Considering these cases,we use higher contact rates but can be reused,the offloading failure prob- the effective contact rate in our system.Suppose that among all ability reduces faster when A is higher,as shown in the no friend contacts,there are a portion y of contacts that are unreliable or case.This is because relays with high reuse rates can be especially short contacts.The effective contact processes for seeds and re- helpful for rare content,which has more chances to be offloaded lays can then be modeled as Poisson processes with intensity of when A is high.We also observe that limiting the number of re- (1-y)As and (1-y)Ar.respectively,by the properties of Poisson lays for each session has little impact on performance,because the process [34].Our results and optimization schemes are still valid curve for limited friends,where nfr.i is smaller than five,is almost in this case. identical to the optimal solution. (3)Performance under different scenarios 6.Simulation results Fig.5(b)shows the relationship between offloading perfor- mance and the number of content categories in the system.In the 6.1.Numerical results experiments,we increase the number of content categories from 5 to 100.In the uniform case,we divide available storage slots in We use the numerical results obtained from gradient projection helpers evenly to these content categories without considering the algorithms to show the properties of the storage assignment prob request rates.Therefore,the offloading failure probability does not lem. change when the number of content categories increases.The static (1)Example for two categories of content scheme allocates storage based on content request rates,where We use an offloading system with only two content categories A popular content gets more replicas.The relay scheme uses part of and B to demonstrate the tradeoffs between relays and seeds.The the storage as seeds and allocates relays for content request based requesting rates for the two categories of content are 0.5 and 0.1 on our optimization algorithm.Both the static and relay scheme respectively.and the amount of content in each category is 1000. consider the request rates of contents,so that their offloading fail- We assume that there are n=5000 helpers,each of which con- ure probabilities decrease as the number of content categories in- tributes one storage segment.We assume the contact processes are creases,ie.,the request rates for content become different.The re- Poisson processes with kr =10 (i.e.,relays have 10 times higher lay scheme has lower offloading failure probability than the static contact rates than seeds.).Fig.4(a)and (b)shows the fractions of scheme storage allocated for each type of contents in the relaying scheme Fig.5(c)shows the performance of offloading schemes with dif- and static cache scheme,respectively.From Fig.4(b).we see that ferent content request rates.Both the uniform and static scheme static cache assignment tends to allocate all storage to the popular have constant offloading failure probability when the request rate content category A when A is small.This implies that a static cache increases,as they only use seeds for offloading.The relay scheme scheme only tries to offload a small number of contents.In the re- has better performance than the other two schemes.However,the laying scheme,the system allocates a high fraction of relays to the gap reduces when the request rate increases.This is because re- less popular content category B to help deliver content B even in lays reserve storage for each individual request.Thus,relays be- the case that A is quite low.The offloading failure probabilities are come congested when the request rates increase.However,the per- shown in Fig 4(c).Offloading failure probabilities of the relaying formance of relay scheme is always better than the static scheme, schemes for both content categories are always smaller than those since the static solutions are in a subset of the relaying solutions. of the static cases.The failure probability for the less popular con- Fig.5(d)shows the offloading performance under different tent category B has been greatly improved in the relaying systems numbers of helpers.We observe that all schemes have better
W. Wang et al. / Computer Communications 83 (2016) 45–55 53 Fig. 4. Convex optimization result (n = 5000, kr = 10, T = 1). of relays for each session to be smaller than five, i.e., less than five relays are used in each session. Our numerical results in Section 6 show that such limitations do not significantly change the system performance. Communication overhead: Relaying requires content requests to be sent to relays through the cellular network, which incurs additional control traffic above that of traditional static solutions [6]. However, content delivery systems normally have large segment sizes, such as 2M bytes in P2P-VoD systems [29]. It only takes several hundred bytes to send relay requests to 10 relays, which incurs less than 1% overhead. Compared to the 30% gain in offloading ef- ficiency, the 1% communication overhead is small. Unreliable contacts: In real networks, detection of the existence of D2D links might be unreliable, and data transfer over D2D links could be dropped. Moreover, some contacts may be too short to transmit even a single segment. Considering these cases, we use the effective contact rate in our system. Suppose that among all contacts, there are a portion γ of contacts that are unreliable or short contacts. The effective contact processes for seeds and relays can then be modeled as Poisson processes with intensity of (1 − γ )λs and (1 − γ )λr, respectively, by the properties of Poisson process [34]. Our results and optimization schemes are still valid in this case. 6. Simulation results 6.1. Numerical results We use the numerical results obtained from gradient projection algorithms to show the properties of the storage assignment problem. (1) Example for two categories of content We use an offloading system with only two content categories A and B to demonstrate the tradeoffs between relays and seeds. The requesting rates for the two categories of content are 0.5 and 0.1 respectively, and the amount of content in each category is 1000. We assume that there are n = 5000 helpers, each of which contributes one storage segment. We assume the contact processes are Poisson processes with kr = 10 (i.e., relays have 10 times higher contact rates than seeds.). Fig. 4(a) and (b) shows the fractions of storage allocated for each type of contents in the relaying scheme and static cache scheme, respectively. From Fig. 4(b), we see that static cache assignment tends to allocate all storage to the popular content category A when λ is small. This implies that a static cache scheme only tries to offload a small number of contents. In the relaying scheme, the system allocates a high fraction of relays to the less popular content category B to help deliver content B even in the case that λ is quite low. The offloading failure probabilities are shown in Fig 4(c). Offloading failure probabilities of the relaying schemes for both content categories are always smaller than those of the static cases. The failure probability for the less popular content category B has been greatly improved in the relaying systems. This shows that relaying can support more content with low requesting rates. The numerical results under various different parameter settings also has similar trends that demonstrate relaying can improve the offloading of rare content. (2) Offloading performance Fig. 5(a) illustrates systems with 100 content categories, where the content request rate follows a Zipf distribution with α = 0.8 [39]. We observe that the overall failure probability in the optimal relaying case reduces much faster than in the static case. We further consider the impact of different factors on the offloading ef- ficiency. For the curve labeled no reuse, we set ku,i = 1 so that relays are never reused between contents. We observe that the performance of relaying is still better in this case than in the static case, because relays with higher contact rates to subscribers perform better than seeds. On the other hand, if relays do not have higher contact rates but can be reused, the offloading failure probability reduces faster when λ is higher, as shown in the no friend case. This is because relays with high reuse rates can be especially helpful for rare content, which has more chances to be offloaded when λ is high. We also observe that limiting the number of relays for each session has little impact on performance, because the curve for limited friends, where nfr, i is smaller than five, is almost identical to the optimal solution. (3) Performance under different scenarios Fig. 5(b) shows the relationship between offloading performance and the number of content categories in the system. In the experiments, we increase the number of content categories from 5 to 100. In the uniform case, we divide available storage slots in helpers evenly to these content categories without considering the request rates. Therefore, the offloading failure probability does not change when the number of content categories increases. The static scheme allocates storage based on content request rates, where popular content gets more replicas. The relay scheme uses part of the storage as seeds and allocates relays for content request based on our optimization algorithm. Both the static and relay scheme consider the request rates of contents, so that their offloading failure probabilities decrease as the number of content categories increases, i.e., the request rates for content become different. The relay scheme has lower offloading failure probability than the static scheme. Fig. 5(c) shows the performance of offloading schemes with different content request rates. Both the uniform and static scheme have constant offloading failure probability when the request rate increases, as they only use seeds for offloading. The relay scheme has better performance than the other two schemes. However, the gap reduces when the request rate increases. This is because relays reserve storage for each individual request. Thus, relays become congested when the request rates increase. However, the performance of relay scheme is always better than the static scheme, since the static solutions are in a subset of the relaying solutions. Fig. 5(d) shows the offloading performance under different numbers of helpers. We observe that all schemes have better
54 W.Wang et aL /Computer Communications 83 (2016)45-55 static 女optimal 0.6 0.6 06 0.5 0.g ◇limited friends 0.4 0-0-0-000-0-0-0-00-0 20.4 0000 公 女relay 0.3 relay 0-0-0-0 02 o-unifor uniform 0.9 0. ostatic 0.1 static 0.2 0.4 0.6 0.8 20 Nun2otcateg8ries 60 80 100 Normalize (a)Different offloading schemes (b)Different number of content categories (c)Different requesting rates 10 ☆relav 女Relay 08 uniform -Relay with friend limit ostatic Static O-static 出0 10 8 0.4 0.2 50 n(numof nodes) 150 200 10 102 0 10 120 n(num of nodes) 10 40 (hours 100 (d)Different number of helpers (e)Amount of content supported by different (f)Performance for MIT reality traces schemes (Offloading failure probability=0.2). Fig.5.Offloading failure probability performances. performance when the number of helpers increases.The offload- T>80.Comparing Fig.5(a)-(f),we observe that the trace-driven ing failure probability deceases in a much faster speed in the relay results have trends similar to those in the numerical results,while case than in the static and uniform case. the trace-driven result is worse than that predicted by theory.This Fig.5(e)considers the amount of content that an offloading sys- may be caused by the limited choices of relays in the trace-driven tem can support when the network size increases as discussed simulation. in Section 4.3.The number of helpers n increases from 100 to 100,000 and we use convex optimization to find the amount of 7.Conclusion content that the system can offload with a failure probability smaller than 20%.We observe that the static system can only of- In this paper,we proposed a novel relay-based storage assign- fload a constant amount of content while in the relaying case,the ment framework for D2D systems.Instead of using storage as static amount of content increases with n.We also evaluate the case content cache,our scheme uses on-demand relaying to enhance where subscribers have less than 10 friends.In this case,we ob- the storage utilization.We demonstrated the usefulness of this serve that the relay system can support 10 times more content relaying framework through both theoretical analysis and trace than the static caching system.However,the amount of content driven simulations.To further improve our system,one important that the system can support converges after the number of helpers future research direction is to study the incentives of the helpers exceeds 5000.This implies that supporting larger amounts of con- so that more users will participate in the offloading system. tent will likely require multi-hop relaying or common relays de- ployed by the network operator for all subscribers. Acknowledgment 6.2.Trace-driven simulations This work is partially supported by NSFC Grants (No.61373129. No.91218302.No.61321491,No.61472185).JiangSu NSF(No. We use MIT reality traces [35]to verify the performance of our BK20151390).and EU FP7 IRSES MobileCloud Project Grant (No. scheme in real networks.The trace is collected by 100 students 612212). and faculties who carries smart phones for more than 9 months to collect Bluetooth devices in proximity [35.The Bluetooth trace References are collected by 100 internal devices which have discovered ap- proximately 20,000 external devices in total.Therefore,we only [1]Cisco.Cisco Visual Networking Index:Global Mobile Data Traffic Forecast Up- have a partial view of the entire contact process.The average con- date,Cisco White Paper,2014,pp.2013-2018. 2]3GPP,Feasibility study for proximity services.SA1 TR 28.803(2012). tact rate for users is 31.44 per day,after removing empty days,i.e. [3]K.Doppler.M.Rinne.C.Wijting.C.Ribeiro,K Hugl,Device-to-device commu- days with no contacts.Based on the estimation of average contact nication as an underlay to LTE-advanced networks.Commun.Mag.IEEE 47 rate,we allocate storage to 100 Zipf distributed pieces of content 12)(2009)42-49. [4]B.Han.P.Hui,V.A.Kumar.M.V.Marathe,J.Shao.A.Srinivasan,Mobile data using our optimization algorithm,while assuming that each node offloading through opportunistic communications and social participation.IEEE can store two pieces of contents.Only the 100 internal devices is Trans.Mobile Comput.11 (5)(2012)821-834. selected as relays in the simulation,while all devices can be seeds 5].Reich.A.Chaintreau.The age of impatience:optimal replication schemes for opportunistic networks,in:Proceedings of ACM CoNEXT,2009. Fig.5(f)depicts the offloading failure probabilities under dif- [6]Y.Li,G.Su,P.Hui.D.Jin,L Su,L Zeng.Multiple mobile data offloading through ferent schemes.We observe that both the static and relay scheme delay tolerant networks.in:Proceedings of ACM CHANTS,2011. outperform the uniform scheme.The offloading failure probability 17]M.Conti,M.Kumar,Opportunities in opportunistic computing.Computer 43 (1)(2010)42-50. of the relay scheme is similar to the static scheme when T is small. [8]A.Balasubramanian,B.Levine.A.Venkataramani.DTN routing as a resource However,the failure probability is reduced by more than 30%when allocation problem,in:Proceedings of ACM SIGCOMM.2007
54 W. Wang et al. / Computer Communications 83 (2016) 45–55 Fig. 5. Offloading failure probability performances. performance when the number of helpers increases. The offloading failure probability deceases in a much faster speed in the relay case than in the static and uniform case. Fig. 5(e) considers the amount of content that an offloading system can support when the network size increases as discussed in Section 4.3. The number of helpers n increases from 100 to 100,000 and we use convex optimization to find the amount of content that the system can offload with a failure probability smaller than 20%. We observe that the static system can only of- fload a constant amount of content while in the relaying case, the amount of content increases with n. We also evaluate the case where subscribers have less than 10 friends. In this case, we observe that the relay system can support 10 times more content than the static caching system. However, the amount of content that the system can support converges after the number of helpers exceeds 5000. This implies that supporting larger amounts of content will likely require multi-hop relaying or common relays deployed by the network operator for all subscribers. 6.2. Trace-driven simulations We use MIT reality traces [35] to verify the performance of our scheme in real networks. The trace is collected by 100 students and faculties who carries smart phones for more than 9 months to collect Bluetooth devices in proximity [35]. The Bluetooth trace are collected by 100 internal devices which have discovered approximately 20,000 external devices in total. Therefore, we only have a partial view of the entire contact process. The average contact rate for users is 31.44 per day, after removing empty days, i.e., days with no contacts. Based on the estimation of average contact rate, we allocate storage to 100 Zipf distributed pieces of content using our optimization algorithm, while assuming that each node can store two pieces of contents. Only the 100 internal devices is selected as relays in the simulation, while all devices can be seeds. Fig. 5(f) depicts the offloading failure probabilities under different schemes. We observe that both the static and relay scheme outperform the uniform scheme. The offloading failure probability of the relay scheme is similar to the static scheme when T is small. However, the failure probability is reduced by more than 30% when T > 80. Comparing Fig. 5(a)–(f), we observe that the trace-driven results have trends similar to those in the numerical results, while the trace-driven result is worse than that predicted by theory. This may be caused by the limited choices of relays in the trace-driven simulation. 7. Conclusion In this paper, we proposed a novel relay-based storage assignment framework for D2D systems. Instead of using storage as static content cache, our scheme uses on-demand relaying to enhance the storage utilization. We demonstrated the usefulness of this relaying framework through both theoretical analysis and trace driven simulations. To further improve our system, one important future research direction is to study the incentives of the helpers so that more users will participate in the offloading system. Acknowledgment This work is partially supported by NSFC Grants (No. 61373129, No. 91218302, No. 61321491, No. 61472185), JiangSu NSF (No. BK20151390), and EU FP7 IRSES MobileCloud Project Grant (No. 612212). References [1] Cisco, Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, Cisco White Paper, 2014, pp. 2013–2018. [2] 3GPP, Feasibility study for proximity services, SA1 TR 28.803 (2012). [3] K. Doppler, M. Rinne, C. Wijting, C. Ribeiro, K. Hugl, Device-to-device communication as an underlay to LTE-advanced networks, Commun. Mag., IEEE 47 (12) (2009) 42–49. [4] B. Han, P. Hui, V.A. Kumar, M.V. Marathe, J. Shao, A. Srinivasan, Mobile data offloading through opportunistic communications and social participation, IEEE Trans. Mobile Comput. 11 (5) (2012) 821–834. [5] J. Reich, A. Chaintreau, The age of impatience: optimal replication schemes for opportunistic networks, in: Proceedings of ACM CoNEXT, 2009. [6] Y. Li, G. Su, P. Hui, D. Jin, L. Su, L. Zeng, Multiple mobile data offloading through delay tolerant networks, in: Proceedings of ACM CHANTS, 2011. [7] M. Conti, M. Kumar, Opportunities in opportunistic computing, Computer 43 (1) (2010) 42–50. [8] A. Balasubramanian, B. Levine, A. Venkataramani, DTN routing as a resource allocation problem, in: Proceedings of ACM SIGCOMM, 2007