W.Wang et aL/Computer Communications 83(2016)45-55 Table 1 Notations used in this paper. Symbol Description The number of contents The segment length for contents The total number of helpers L The side length for the network region The time before the subscriber becomes impatient The storage size for helpers R Request rate for content i fls.i The number of seeds caching content i The number of relays for content i Es.E Offloading efficiency for one seed or one relay Ps,Pr The probability that subscriber cannot receive content from a seed or relay The subscriber contact rate for seeds and relays X(t) CDF of the inter-contact time between relays and subscribers Y(t) CDF of the inter-contact time between seeds and relays process,which are decreasing functions within range of [0,1].we larger than AT,which means a relay is always less effective than a have: seed when As =Ar.This is reasonable because a relay must down- load the content from the seed before it can deliver it to the sub- R(T)- ((T-t))"d(t) scriber. In summary.the offloading efficiency of relays can reach the ((T-t))"dx(t) upper bound of ns.i times the number of seeds when relays have high contact rates to subscribers.However,relays are less efficient =E(T-t)10≤t≤T than seeds when they have same contact rates as seeds. ≥min(T-t)"1o≤t≤T 4.2.Relay selection =((T) (10) According to the above analysis,relaying through a node with Therefore the same contact pattern as seeds is not helpful.However,D2D contact patterns are inherently non-uniform.The contact processes Er =-In T)- (T-t))Td(t) between different groups of mobile nodes have different contact patterns.As seeds serve for all the subscribers in the network,the ≤-nsiln?(T). (11) contact process between a seed and a subscriber should follow the typical contact pattern between random pairs of nodes.However. AsEs=-lnY(T).we have Er≤ns.iEs.□ relays may have different contact patterns to a given subscriber. We can intentionally select relays for a given offloading request Example:poisson contact process Consider the case where both contact processes are Poisson based on previous knowledge of their contact patterns.For exam- process.We have X(t)=1-e-Art and ux=1/r.where Ar is the ple,a device belonging to a friend of the user or a device mounted rate for the contact process between a relay and the subscriber. at a subway station that the user always passes by,can be selected Similarly,we have Y(t)=1-e-Ast and uy =1/As.Using Eqs.(5) as the relay in order to improve the efficiency of content distribu- and (6),we have(t)=e-t and ?(t)=e-st.We can obtain: tion.As discussed in Section 4.1,the offloading efficiency mainly depends on the contact rates of relays.Therefore,we propose to -In nsi入se-dT-入re-nT select relays based on their contact rates to reduce the relay selec- Er ns.is-入 入r≠ns.i入s (12) tion complexity. λrT-ln(1+rT) 入r=几s.i入s The relay selection process can be carried out either by the net- work operator or by the subscriber.When the relay is selected by Note that the sum of two exponentially distributed inter-contact the network operator,the network operator needs to record the times actually follows a hypoexponential distribution as expected. top-k nodes with the highest contact rates for each subscriber. Eq.(12)shows some characteristics of the offloading efficiency Such records can also be generated by the user devices and sub- for relay nodes. mitted to the operator along with the offloading request.These (i)When r ns.is,Er approaches ns.isT.Compared to the nodes will have higher contact rate to the subscriber compared to offloading efficiency of the seed Es =AsT.we see that a single relay randomly selected nodes. has the same efficiency as ns i seeds.where the upper bound in To verify the existence of nodes with higher contact rates,we Corollary 1 is achieved.Therefore,for rare content with low ns.i. studied the MIT reality trace [35],which contains Bluetooth traces using nr.i frequently contacting relays to download the content can from 100 mobiles for more than 9 months.Fig.2 shows the CDF of have effects similar to multiplying nsi by nr.i. the contact rates for the top-k frequently contacted neighbors for (ii)When a relay has the same contact pattern to the subscriber each mobile,normalized by the average contact rate.We see that as seeds,i.e,入r=λs=入,we have a relay offloading efficiency more than 90%of the subscribers have at least 5 neighbors with of a contact rate 10 times higher than average.More than 80%sub- scribers have 10 neighbors with contact rates 10 times higher than Er=入T+ln ns.i-1 几.i-e-(a4-1)x7 (13) average.Similar distributions of frequently contacted neighbors are also observed in recently collected traces as in [36]. Note that we have ns.-1s ns.i-e-(ns-)AT when ns.i 1.Thus Relays can also be selected by the subscribers.A subscriber the second term in Eq.(13)is always negative and Er can never be can request friends or colleagues to help with downloading.AsW. Wang et al. / Computer Communications 83 (2016) 45–55 49 Table 1 Notations used in this paper. Symbol Description N The number of contents M The segment length for contents n The total number of helpers L The side length for the network region T The time before the subscriber becomes impatient I The storage size for helpers Ri Request rate for content i ns, i The number of seeds caching content i nr, i The number of relays for content i Es , Er Offloading efficiency for one seed or one relay Ps , Pr The probability that subscriber cannot receive content from a seed or relay λs , λr The subscriber contact rate for seeds and relays X(t) CDF of the inter-contact time between relays and subscribers Y(t) CDF of the inter-contact time between seeds and relays process, which are decreasing functions within range of [0, 1], we have: Xˆ(T ) − T 0 Yˆ(T − t) ns,i dXˆ(t) ≥ − T 0 Yˆ(T − t) ns,i dXˆ(t) = E Yˆ(T − t) ns,i |0 ≤ t ≤ T ≥ min Yˆ(T − t) ns,i |0 ≤ t ≤ T = Yˆ(T ) ns,i . (10) Therefore, Er = − ln Xˆ(T ) − T 0 Yˆ(T − t) ns,i dXˆ(t) ≤ −ns,i lnYˆ(T ). (11) As Es = − lnYˆ(T ), we have Er ≤ ns, iEs. Example: poisson contact process Consider the case where both contact processes are Poisson process. We have X(t) = 1 − e−λrt and μx = 1/λr, where λr is the rate for the contact process between a relay and the subscriber. Similarly, we have Y (t) = 1 − e−λst and μy = 1/λs. Using Eqs. (5) and (6), we have Xˆ(t) = e−λrt and Yˆ(t) = e−λst. We can obtain: Er = ⎧ ⎨ ⎩ − lnns,iλse−λrT − λre−ns,iλsT ns,iλs − λr λr = ns,iλs λrT − ln(1 + λrT ) λr = ns,iλs . (12) Note that the sum of two exponentially distributed inter-contact times actually follows a hypoexponential distribution as expected. Eq. (12) shows some characteristics of the offloading efficiency for relay nodes. (i) When λr ns, iλs, Er approaches ns, iλsT. Compared to the offloading efficiency of the seed Es = λsT, we see that a single relay has the same efficiency as ns, i seeds, where the upper bound in Corollary 1 is achieved. Therefore, for rare content with low ns, i, using nr, i frequently contacting relays to download the content can have effects similar to multiplying ns, i by nr, i. (ii) When a relay has the same contact pattern to the subscriber as seeds, i.e., λr = λs = λ, we have a relay offloading efficiency of: Er = λT + ln ns,i − 1 ns,i − e−(ns,i−1)λT . (13) Note that we have ns,i − 1 ≤ ns,i − e−(ns,i−1)λT when ns, i ≥ 1. Thus the second term in Eq. (13) is always negative and Er can never be larger than λT, which means a relay is always less effective than a seed when λs = λr. This is reasonable because a relay must download the content from the seed before it can deliver it to the subscriber. In summary, the offloading efficiency of relays can reach the upper bound of ns, i times the number of seeds when relays have high contact rates to subscribers. However, relays are less efficient than seeds when they have same contact rates as seeds. 4.2. Relay selection According to the above analysis, relaying through a node with the same contact pattern as seeds is not helpful. However, D2D contact patterns are inherently non-uniform. The contact processes between different groups of mobile nodes have different contact patterns. As seeds serve for all the subscribers in the network, the contact process between a seed and a subscriber should follow the typical contact pattern between random pairs of nodes. However, relays may have different contact patterns to a given subscriber. We can intentionally select relays for a given offloading request based on previous knowledge of their contact patterns. For example, a device belonging to a friend of the user or a device mounted at a subway station that the user always passes by, can be selected as the relay in order to improve the efficiency of content distribution. As discussed in Section 4.1, the offloading efficiency mainly depends on the contact rates of relays. Therefore, we propose to select relays based on their contact rates to reduce the relay selection complexity. The relay selection process can be carried out either by the network operator or by the subscriber. When the relay is selected by the network operator, the network operator needs to record the top-k nodes with the highest contact rates for each subscriber. Such records can also be generated by the user devices and submitted to the operator along with the offloading request. These nodes will have higher contact rate to the subscriber compared to randomly selected nodes. To verify the existence of nodes with higher contact rates, we studied the MIT reality trace [35], which contains Bluetooth traces from 100 mobiles for more than 9 months. Fig. 2 shows the CDF of the contact rates for the top-k frequently contacted neighbors for each mobile, normalized by the average contact rate. We see that more than 90% of the subscribers have at least 5 neighbors with a contact rate 10 times higher than average. More than 80% subscribers have 10 neighbors with contact rates 10 times higher than average. Similar distributions of frequently contacted neighbors are also observed in recently collected traces as in [36]. Relays can also be selected by the subscribers. A subscriber can request friends or colleagues to help with downloading. As