1592 IEEE/ACM TRANSACTIONS ON NETWORKING.VOL.17.NO.5.OCTOBER 2009 Opportunistic Energy-Efficient Contact Probing in Delay-Tolerant Applications Wei Wang,Member;IEEE,Mehul Motani,Member;IEEE,and Vikram Srinivasan,Member;IEEE Abstract-In many delay-tolerant applications,information is governed by the mobility of information carriers and their un- is opportunistically exchanged between mobile devices that en- derlying contact patterns. counter each other.In order to affect such information exchange, The notion of delay tolerance is useful in a variety of sce- mobile devices must have knowledge of other devices in their vicinity.We consider scenarios in which there is no infrastructure narios.One compelling application is providing connectivity and devices must probe their environment to discover other and network services during disasters and in rural environments. devices.This can be an extremely energy-consuming process where network infrastructure is minimal or nonexistent.An- and highlights the need for energy-conscious contact-probing other example comes from software developers who are devel- mechanisms.If devices probe very infrequently,they might miss oping dating applications for mobile phones.The profile of an many of their contacts.On the other hand,frequent contact ideal partner is entered into a Bluetooth-based mobile phone, probing might be energy inefficient.In this paper,we investigate which alerts the user whenever a matching profile is detected in the tradeoff between the probability of missing a contact and the contact-probing frequency.First,via theoretical analysis,we char. the vicinity (e.g.,www.bedd.com). acterize the tradeoff between the probability of a missed contact Current research and development efforts for delay-tolerant and the contact-probing interval for stationary processes.Next, applications fall broadly into two categories:delay-tolerant for time-varying contact arrival rates,we provide an optimization networking (DTN)[2]and delay-tolerant databases (DTD). framework to compute the optimal contact-probing interval as a For DTN applications,the goal is to enable communication function of the arrival rate.We characterize real-world contact between specific source-destination pairs in the network. patterns via Bluetooth phone contact-logging experiments and show that the contact arrival process is self-similar.We design Research in this area has involved studying algorithmic issues STAR,a contact-probing algorithm that adapts to the contact such as routing in networks [3],fundamental issues such as arrival process.Instead of using constant probing intervals, scaling laws [4],and performance bounds of routing algorithms STAR dynamically chooses the probing interval using both the [5]based on real-world contact patterns. short-term contact history and the long-term history based on DTD applications have been driven by the observation that time of day information.Via trace-driven simulations on our mobile devices are becoming increasingly powerful in terms experimental data,we demonstrate that STAR requires three of computation and storage and have multiple radio interfaces to five times less energy for device discovery than a constant such as Bluetooth,3G,WiFi,etc.[6].An effort is also being contact-probing interval scheme. made by phone manufacturers to embed sensors in these phones Index Terms-Bluetooth,delay-tolerant networking (DTN),en- to acquire and store personal information (health-related)and ergy efficiency. for environmental monitoring [7].As a consequence,these de- vices store large volumes of digital information such as songs, photographs,and sensory data and constitute a distributed ge- I.INTRODUCTION ographic database.The dating application stated earlier is an INCE its inception.the goal of networking research has example of a DTD application.The research community has in- been to provide instant,anytime,anywhere access to in- vestigated opportunistic query propagation and data aggregation formation.However,in recent times,research interest has been algorithms,based on device proximity.in [6]and [8]-10]. piqued by a new class of applications that are tolerant to delay. For both DTN and DTD applications,the common funda- In several of these applications,information is exchanged op- mental primitive is the opportunistic exchange of information portunistically between devices when they are within communi- between mobile devices when they are in communication range cation range of each other.In other words,information transport of each other.In order to enable such exchanges,devices will have to constantly probe the environment to discover other de- vices in the vicinity.Not surprisingly,device discovery!is an Manuscript received November 25.2007:revised August 12.2008;approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor B.Levine.First pub- extremely energy-consuming process.To understand this better, lished June 30,2009:current version published October 14,2009.Part of this we made measurements on a Nokia 6600 mobile phone.The work was presented at ACM MobiCom 2007,Montreal,QC,Canada. current drawn was:1)38.61 mA for Bluetooth device discovery; W.Wang was with the National University of Singapore,Singapore 119620. Singapore.He is now with Microsoft Research Asia,Beijing 100190,China 2)9.33 mA for enabling the device to be discoverable;3)51.47 (e-mail:wei.wang@microsoft.com:wangwei.ww@gmail.com) mA for Bluetooth data transfer;and 4)38.68 mA for making a M.Motani is with the Department of Electrical and Computer Engineering, phone call.In other words,the device discovery process is as National University of Singapore,Singapore 119620.Singapore (e-mail: energy-intensive as making a phone call. motani@nus.edu.sg). V.Srinivasan is with Bell Labs Research,Bangalore 560095,India (e-mail: Our measurements clearly motivate the need for energy-con- vikramsr@alcatel-lucent.com). scious device-discovery algorithms.Although the measure- Color versions of one or more of the figures in this paper are available online ments in this paper are based on Bluetooth devices,our at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNET.2008.2008990 We use device discovery and contact probing interchangeably in this paper. 1063-6692/S26.00©2009EEE
1592 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 5, OCTOBER 2009 Opportunistic Energy-Efficient Contact Probing in Delay-Tolerant Applications Wei Wang, Member, IEEE, Mehul Motani, Member, IEEE, and Vikram Srinivasan, Member, IEEE Abstract—In many delay-tolerant applications, information is opportunistically exchanged between mobile devices that encounter each other. In order to affect such information exchange, mobile devices must have knowledge of other devices in their vicinity. We consider scenarios in which there is no infrastructure and devices must probe their environment to discover other devices. This can be an extremely energy-consuming process and highlights the need for energy-conscious contact-probing mechanisms. If devices probe very infrequently, they might miss many of their contacts. On the other hand, frequent contact probing might be energy inefficient. In this paper, we investigate the tradeoff between the probability of missing a contact and the contact-probing frequency. First, via theoretical analysis, we characterize the tradeoff between the probability of a missed contact and the contact-probing interval for stationary processes. Next, for time-varying contact arrival rates, we provide an optimization framework to compute the optimal contact-probing interval as a function of the arrival rate. We characterize real-world contact patterns via Bluetooth phone contact-logging experiments and show that the contact arrival process is self-similar. We design STAR, a contact-probing algorithm that adapts to the contact arrival process. Instead of using constant probing intervals, STAR dynamically chooses the probing interval using both the short-term contact history and the long-term history based on time of day information. Via trace-driven simulations on our experimental data, we demonstrate that STAR requires three to five times less energy for device discovery than a constant contact-probing interval scheme. Index Terms—Bluetooth, delay-tolerant networking (DTN), energy efficiency. I. INTRODUCTION SINCE its inception, the goal of networking research has been to provide instant, anytime, anywhere access to information. However, in recent times, research interest has been piqued by a new class of applications that are tolerant to delay. In several of these applications, information is exchanged opportunistically between devices when they are within communication range of each other. In other words, information transport Manuscript received November 25, 2007; revised August 12, 2008; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor B. Levine. First published June 30, 2009; current version published October 14, 2009. Part of this work was presented at ACM MobiCom 2007, Montreal, QC, Canada. W. Wang was with the National University of Singapore, Singapore 119620, Singapore. He is now with Microsoft Research Asia, Beijing 100190, China (e-mail: wei.wang@microsoft.com; wangwei.ww@gmail.com). M. Motani is with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 119620, Singapore (e-mail: motani@nus.edu.sg). V. Srinivasan is with Bell Labs Research, Bangalore 560095, India (e-mail: vikramsr@alcatel-lucent.com). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNET.2008.2008990 is governed by the mobility of information carriers and their underlying contact patterns. The notion of delay tolerance is useful in a variety of scenarios. One compelling application is providing connectivity and network services during disasters and in rural environments, where network infrastructure is minimal or nonexistent. Another example comes from software developers who are developing dating applications for mobile phones. The profile of an ideal partner is entered into a Bluetooth-based mobile phone, which alerts the user whenever a matching profile is detected in the vicinity (e.g., www.bedd.com). Current research and development efforts for delay-tolerant applications fall broadly into two categories: delay-tolerant networking (DTN) [2] and delay-tolerant databases (DTD). For DTN applications, the goal is to enable communication between specific source–destination pairs in the network. Research in this area has involved studying algorithmic issues such as routing in networks [3], fundamental issues such as scaling laws [4], and performance bounds of routing algorithms [5] based on real-world contact patterns. DTD applications have been driven by the observation that mobile devices are becoming increasingly powerful in terms of computation and storage and have multiple radio interfaces such as Bluetooth, 3G, WiFi, etc. [6]. An effort is also being made by phone manufacturers to embed sensors in these phones to acquire and store personal information (health-related) and for environmental monitoring [7]. As a consequence, these devices store large volumes of digital information such as songs, photographs, and sensory data and constitute a distributed geographic database. The dating application stated earlier is an example of a DTD application. The research community has investigated opportunistic query propagation and data aggregation algorithms, based on device proximity, in [6] and [8]–[10]. For both DTN and DTD applications, the common fundamental primitive is the opportunistic exchange of information between mobile devices when they are in communication range of each other. In order to enable such exchanges, devices will have to constantly probe the environment to discover other devices in the vicinity. Not surprisingly, device discovery1 is an extremely energy-consuming process. To understand this better, we made measurements on a Nokia 6600 mobile phone. The current drawn was: 1) 38.61 mA for Bluetooth device discovery; 2) 9.33 mA for enabling the device to be discoverable; 3) 51.47 mA for Bluetooth data transfer; and 4) 38.68 mA for making a phone call. In other words, the device discovery process is as energy-intensive as making a phone call. Our measurements clearly motivate the need for energy-conscious device-discovery algorithms. Although the measurements in this paper are based on Bluetooth devices, our 1We use device discovery and contact probing interchangeably in this paper. 1063-6692/$26.00 © 2009 IEEE
WANG etaL:OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1593 device-discovery model can be used in other protocols.such C.Algorithm Design and Validation as 802.11 or UWB.As there are no centralized coordinators in DTNs,it is hard to synchronize devices in the network,and Finally,using insights gleaned from the theory and the data devices normally do not have information on which channel analysis,we investigate adaptive contact-probing algorithms. should be used in device discovery.In device-discovery proto- We design STAR,an algorithm that strives to estimate the ar- cols used in DTN,a mobile may need to send multiple packets rival rate and adapt the contact-probing interval based on this over a number of channels in order to scan for devices around estimate.We consider using several estimation methods in con- it.Therefore,the high probing energy consumption could be a junction with STAR and compare them using trace-driven simu- common problem in delay-tolerant networks. lations.We show that a simple weighted average of previous ar- One strategy to conserve energy is to increase the time be- rival rate estimates (STAR-PTS)works well and has 1/3 the en- tween subsequent device discoveries.The consequence of this ergy consumption of the constant nonadaptive contact-probing is that devices in the vicinity may go undiscovered,and opportu- algorithm for a missing probability of 0.2.The minimum mean nities to exchange data are lost.This points to a tradeoff between squared error estimator approach(STAR-MMSE)is even better energy and missed opportunities.For strategies that use a con- and has 1/5 the energy consumption of the nonadaptive scheme. stant device-discovery interval,the larger the probing interval, See Sections V and VI. the larger the missed opportunities and vice-versa.However,in stochastic environments,device discovery should be done adap- II.MODELING THE CONTACT PROCESS tively by choosing the probing interval based on the state of the environment.For example,late at night at home,device dis- covery can occur at large intervals without missing many con- A.System Model and Assumptions tacts,while on the subway to work,device discovery should be Assume that every device probes the environment governed done more frequently to catch the myriad of new contacts. In this paper,we investigate the design of energy-conscious, by some contact-probing algorithm.When a device A probes its environment,all devices that hear the probe respond to device adaptive contact-probing algorithms that trade off energy con- A with some information (e.g.,identity,services available,etc.). sumption and the probability of missing a contact.Specifically, Based on this information,A may choose to exchange data with we make the following contributions. some of its neighbors.We define two device,s A and B,to be in contact if they are within communication range of each other2. A.Theoretical framework The duration over which devices A and B are continuously in contact is called the contact duration for that contact.If neither We first develop a theoretical framework to characterize the device probes its environment during the contact duration,then tradeoff between the average contact-probing interval 7 and the we have a missed contact.We further assume,for the sake of contact missing probability Piss.We show that if the contact analysis,that each probe is an impulse and does not consume duration distribution is independent and identically distributed any time. (i.i.d.)and the contact arrival rate is constant,then for a given We assume that each probe consumes the same amount of missing probability constraint,the optimal contact-probing in- energy of ep.When the average contact-probing interval is T, terval is 1)constant and 2)depends only on contact duration dis-the energy consumption rate of the device will bep/T.With the tribution,independent of the contact arrival distribution.When same number of missed contacts,the algorithm that uses fewer the contact arrival rate is time varying,the optimal contact- probes (longer average probing interval)will be more energy probing interval is a function of the contact arrival rate.We pro- efficient. vide an optimization framework to compute the optimal con- For a given device,we assume that the contact durations tact-probing interval as a function of the contact arrival rate. tp()are i.i.d.random variables with cumulative distribution This theoretical base provides us with bounds on performance function (cdf)of Fp(t)and mean Eftp}=1/u. and also aids in the investigation and design of adaptive con- We assume that the contact arrival process is stationary.The tact-probing algorithms.See Sections II and III. contact arrival process is characterized by the intercontact time tc(i),defined as the time between the ith and (+1)-th contacts. B.Real-World Contact Pattern Experiments and Analysis The stationary assumption is that the sequence of intercontact times is a wide-sense stationary process;i.e.,the tc()have The theoretical development above depends on the contact a constant mean of Eftc}=1/A.We relax this assumption pattern statistics.To understand real-world contact patterns,we and deal with nonstationary contact arrival processes in the next conducted a large-scale data logging experiment [11].Nine vol- section. unteers were given Bluetooth devices equipped with a software See Fig.I for an illustration of contact duration tp(i)and program that probed for contacts every 30s and logged informa- intercontact time tc(). tion about other Bluetooth devices that came within range.Our database contains the largest number of unique devices discov- B.Missing Probability ered,compared to existing work [12]-[14].We conducted rig- orous analysis on the data.We confirmed that the contact dura- The missing probability Pmiss is the probability a contact that tion is Pareto-distributed.Moreover,our data analysis indicates occurs is not detected.For the following analysis,we assume weak correlations in contact patterns at 24-h time lags.Finally, we show that the contact process is self-similar with a Hurst pa- 2We do not assume a perfect communication region here.The imperfectness in communication is embedded in our contact data gathering process that uses rameter of 0.76.See Section IV. real devices
WANG et al.: OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1593 device-discovery model can be used in other protocols, such as 802.11 or UWB. As there are no centralized coordinators in DTNs, it is hard to synchronize devices in the network, and devices normally do not have information on which channel should be used in device discovery. In device-discovery protocols used in DTN, a mobile may need to send multiple packets over a number of channels in order to scan for devices around it. Therefore, the high probing energy consumption could be a common problem in delay-tolerant networks. One strategy to conserve energy is to increase the time between subsequent device discoveries. The consequence of this is that devices in the vicinity may go undiscovered, and opportunities to exchange data are lost. This points to a tradeoff between energy and missed opportunities. For strategies that use a constant device-discovery interval, the larger the probing interval, the larger the missed opportunities and vice-versa. However, in stochastic environments, device discovery should be done adaptively by choosing the probing interval based on the state of the environment. For example, late at night at home, device discovery can occur at large intervals without missing many contacts, while on the subway to work, device discovery should be done more frequently to catch the myriad of new contacts. In this paper, we investigate the design of energy-conscious, adaptive contact-probing algorithms that trade off energy consumption and the probability of missing a contact. Specifically, we make the following contributions. A. Theoretical Framework We first develop a theoretical framework to characterize the tradeoff between the average contact-probing interval and the contact missing probability . We show that if the contact duration distribution is independent and identically distributed (i.i.d.) and the contact arrival rate is constant, then for a given missing probability constraint, the optimal contact-probing interval is 1) constant and 2) depends only on contact duration distribution, independent of the contact arrival distribution. When the contact arrival rate is time varying, the optimal contactprobing interval is a function of the contact arrival rate. We provide an optimization framework to compute the optimal contact-probing interval as a function of the contact arrival rate. This theoretical base provides us with bounds on performance and also aids in the investigation and design of adaptive contact-probing algorithms. See Sections II and III. B. Real-World Contact Pattern Experiments and Analysis The theoretical development above depends on the contact pattern statistics. To understand real-world contact patterns, we conducted a large-scale data logging experiment [11]. Nine volunteers were given Bluetooth devices equipped with a software program that probed for contacts every 30 s and logged information about other Bluetooth devices that came within range. Our database contains the largest number of unique devices discovered, compared to existing work [12]–[14]. We conducted rigorous analysis on the data. We confirmed that the contact duration is Pareto-distributed. Moreover, our data analysis indicates weak correlations in contact patterns at 24-h time lags. Finally, we show that the contact process is self-similar with a Hurst parameter of 0.76. See Section IV. C. Algorithm Design and Validation Finally, using insights gleaned from the theory and the data analysis, we investigate adaptive contact-probing algorithms. We design STAR, an algorithm that strives to estimate the arrival rate and adapt the contact-probing interval based on this estimate. We consider using several estimation methods in conjunction with STAR and compare them using trace-driven simulations. We show that a simple weighted average of previous arrival rate estimates (STAR-PTS) works well and has 1/3 the energy consumption of the constant nonadaptive contact-probing algorithm for a missing probability of 0.2. The minimum mean squared error estimator approach (STAR-MMSE) is even better and has 1/5 the energy consumption of the nonadaptive scheme. See Sections V and VI. II. MODELING THE CONTACT PROCESS A. System Model and Assumptions Assume that every device probes the environment governed by some contact-probing algorithm. When a device A probes its environment, all devices that hear the probe respond to device A with some information (e.g., identity, services available, etc.). Based on this information, A may choose to exchange data with some of its neighbors. We define two device,s A and B, to be in contact if they are within communication range of each other2. The duration over which devices A and B are continuously in contact is called the contact duration for that contact. If neither device probes its environment during the contact duration, then we have a missed contact. We further assume, for the sake of analysis, that each probe is an impulse and does not consume any time. We assume that each probe consumes the same amount of energy of . When the average contact-probing interval is , the energy consumption rate of the device will be . With the same number of missed contacts, the algorithm that uses fewer probes (longer average probing interval) will be more energy efficient. For a given device, we assume that the contact durations are i.i.d. random variables with cumulative distribution function (cdf) of and mean . We assume that the contact arrival process is stationary. The contact arrival process is characterized by the intercontact time , defined as the time between the th and -th contacts. The stationary assumption is that the sequence of intercontact times is a wide-sense stationary process; i.e., the have a constant mean of . We relax this assumption and deal with nonstationary contact arrival processes in the next section. See Fig. 1 for an illustration of contact duration and intercontact time . B. Missing Probability The missing probability is the probability a contact that occurs is not detected. For the following analysis, we assume 2We do not assume a perfect communication region here. The imperfectness in communication is embedded in our contact data gathering process that uses real devices
1594 IEEE/ACM TRANSACTIONS ON NETWORKING.VOL.17.NO.5.OCTOBER 2009 te(1): te(2) The overall missing probability Pmiss can be derived in a similar way(see the Appendix).Similar to the reliable probes case,the Contact 3 missing probability for unreliable probes is also independent of Contact 2 to(3) the intercontact time distribution.In the rest of this paper,we to(2) Contact 1 only consider cases where all contact probes are reliable. tp(1) C.Optimal Contact-Probing Scheme We will now prove the following theorem about the constant contact-probing scheme described earlier. Fig.1.Illustrating the contacts for a specific user probing at a constant in- terval T. Theorem 1:Consider a contact process for which the con- tact durations are i.i.d and the distribution of intercontact times is stationary,with an expected intercontact time of Consider the class of contact-probing strategies that do not exploit knowl- that for a device A.a contact B is missed if B is not discovered by A's probes.We will relax this later to compute the missing edge of the contact process.Then,among all contact-probing probability when neither A nor B's probes discover each other. strategies with the same average contact-probing interval,the Let us first consider the simplest possible contact strategy,where strategy that probes at constant intervals achieves the minimum each device probes for contacts at constant intervals of T(see missing probability. Fig.1). Proof:Consider a large interval of time L.Let us con- sider all strategies that probe the environment n times in this We first consider the case when contact probes are reliable; i.e.,every probe can successfully detect all contacts around the interval.As shown previously,for the strategy that probes at device.If tp T,the contact will never be missed.Consider a constant intervalsT the missing probability over dura- contact i that is initiated at time nT-x,nT-+dr],where tion L is Piss(T)=FD()dz.Assume that an arbi- d is an arbitrarily small value so that the time interval can be trary scheme probes n times at t1,t2,...,tn.Define to =0 treated as a time point.This contact will not be detected by the and tn+1 =nT.Then,we have n+1 intervals of I1=t- mobile at time nT if tp(i)<t,which happens with prob- to:12=t2-t1,...,In+1 =tn+1-tn.Since the scheme se- ability of Fp(t).By Blackwell's Theorem in renewal theory lects probe time ti without knowledge of the contact process,the [15],when the contact process is a general renewal process with expected number of missed contacts in an interval ofti1,ti a nonlattice distribution function,we can calculate the long-term isFp()dr,for1.All the contacts that occur in average missing probability given the probing interval of T as [tn,nT will be missed.The expected missing probability is ims(四= Fp(x)dx. (1) Fp(t)dr+Xn+ (3) Note that we do not need to restrict the intercontact time distri- since the expected number of contacts arriving in duration L is bution to be memoryless to perform the averaging procedure in λL.For Ii≥T,we have (1). Consider the mean value of tp given that tp T,which is E(tpltp <T}o [1-Fp()/Fp(T)ldr.Then, Pmiss(T)can be expressed as Pnis(1)=T-E(tpltp<Tx E(T). FD(T)d (2) =1 p(x)dx+(Ii-T)FD(T).(4) For a constant contact-probing interval of T,any contact with duration larger than T will always be detected.Equation Similarly,when Ii<T we have (2)shows that for a contact with duration shorter than T, the expected probability that the contact will be missed is T-EftpltT.Therefore,only contacts with duration smaller than T will be missed.If all the contacts with tp T have 入 zero contact duration,then Pniss(T)will be exactly Fp(T). The key observation here is that for a constant probing rT scheme,if the contact durations are i.i.d.and the intercontact ≥入F()dc-入 Fp(T)dc times are stationary,then the missing probability depends only on the distribution of the contact duration and the con- Fp(E)dx+(Ii-T)Fp(T).(5) tact-probing interval T.It is independent of the intercontact time distribution. We also have In real networks,contact probes are unreliable due to the nondeterministic nature of wireless channel.Consider the case where a probe can miss a present contact with probability of K. 入Ln+1≥λIn+1FD(T). (6)
1594 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 5, OCTOBER 2009 Fig. 1. Illustrating the contacts for a specific user probing at a constant interval . that for a device A, a contact B is missed if B is not discovered by A’s probes. We will relax this later to compute the missing probability when neither A nor B’s probes discover each other. Let us first consider the simplest possible contact strategy, where each device probes for contacts at constant intervals of (see Fig. 1). We first consider the case when contact probes are reliable; i.e., every probe can successfully detect all contacts around the device. If , the contact will never be missed. Consider a contact that is initiated at time , where is an arbitrarily small value so that the time interval can be treated as a time point. This contact will not be detected by the mobile at time if , which happens with probability of . By Blackwell’s Theorem in renewal theory [15], when the contact process is a general renewal process with a nonlattice distribution function, we can calculate the long-term average missing probability given the probing interval of as (1) Note that we do not need to restrict the intercontact time distribution to be memoryless to perform the averaging procedure in (1). Consider the mean value of given that , which is . Then, can be expressed as (2) For a constant contact-probing interval of , any contact with duration larger than will always be detected. Equation (2) shows that for a contact with duration shorter than , the expected probability that the contact will be missed is . Therefore, only contacts with duration smaller than will be missed. If all the contacts with have zero contact duration, then will be exactly . The key observation here is that for a constant probing scheme, if the contact durations are i.i.d. and the intercontact times are stationary, then the missing probability depends only on the distribution of the contact duration and the contact-probing interval . It is independent of the intercontact time distribution. In real networks, contact probes are unreliable due to the nondeterministic nature of wireless channel. Consider the case where a probe can miss a present contact with probability of . The overall missing probability can be derived in a similar way (see the Appendix). Similar to the reliable probes case, the missing probability for unreliable probes is also independent of the intercontact time distribution. In the rest of this paper, we only consider cases where all contact probes are reliable. C. Optimal Contact-Probing Scheme We will now prove the following theorem about the constant contact-probing scheme described earlier. Theorem 1: Consider a contact process for which the contact durations are i.i.d and the distribution of intercontact times is stationary, with an expected intercontact time of . Consider the class of contact-probing strategies that do not exploit knowledge of the contact process. Then, among all contact-probing strategies with the same average contact-probing interval, the strategy that probes at constant intervals achieves the minimum missing probability. Proof: Consider a large interval of time . Let us consider all strategies that probe the environment times in this interval. As shown previously, for the strategy that probes at constant intervals , the missing probability over duration is . Assume that an arbitrary scheme probes times at . Define and . Then, we have intervals of . Since the scheme selects probe time without knowledge of the contact process, the expected number of missed contacts in an interval of is , for . All the contacts that occur in will be missed. The expected missing probability is (3) since the expected number of contacts arriving in duration is . For , we have (4) Similarly, when we have (5) We also have (6)
WANG etaL:OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1595 Putting all these into (3),we have 0.40 量一Exponential 0.35 -Uniform (瓜 ●-Pareto(k=2) FD()dx+(Ii-T)FD(T 0.30 0.25 +λIn+1FD(T) 80.20 0.15 F(c)+fD(T)(∑I-nT) i= 0.10 =Pmiss(T). (7) 0.05 0.00 681012141618202 D.Tradeoffs in Energy Efficiency and Pmiss Energy Consumption 1/(uT)) Having established that a constant contact-probing interval is Fig.2.Tradeoff between energy consumption and missing probability on dif- optimal under certain assumptions,we now explore the tradeoff ferent distributions. between energy efficiency and the missing probability.When contact durations are distributed according to a given distribu- tion,we can analytically determine the relationship between T interval should be around 7 to achieve a near-zero missing and Pniss(T)according to (1).Here,we demonstrate the rela- probability.In other words,for a constant arrival rate and a tionship between energy efficiency and missing probability for Pareto contact duration distribution,it is difficult to tradeoff several typical distributions.In Section IV,we will focus on dis- between Pmiss and T. tributions obtained from real-world Bluetooth contact logs. 1)Exponential Distribution:When the contact duration is E.Double Detection exponentially distributed,we have Fp()=1-e-u.Using As we stated earlier,a contact between device A and B is (1),we have missed only if neither device probes the environment during the mis(T)=I-1+e-r contact.Consider the case when two users A and B are inde- T (8)pendently probing the environment.Assume that both users are using the same constant contact-probing interval of T.Then, 2)Uniform Distribution:The uniform distribution is the probability that A does not discover B is Pniss(T).How- ever,the probability that neither A nor B discovers the other E2/4 user probes at times of T,2T,...,nT,and the other probes at Additionally,we have ,y+T,...,+(n-1)T.Without loss of generality,we can assume that y1.The mean when y=0.Since the two users are probing independently, is unbounded whenT (12) 户niss(T) Fig.2 shows the tradeoff between energy consumption Fp(x)dx+ and missing probability for these distributions.The energy consumption is computed as where we set ep1 and normalize the energy consumption rate by the average contact Fp(x)drdy duration of u.We see that for exponential and uniform distribu- tions,the missing probability of 5%-10%is near the knee of the 2 curve that is a good tradeoff point between energy consump- Fp(x)dxdy+ Fp(x)dxdy tion and missing probability.This means the contact-probing [G interval should be around 1/6 to 1/3 of the average contact 2 duration.However,for Pareto distribution,the contact-probing Fp(c)dxdy. (13)
WANG et al.: OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1595 Putting all these into (3), we have (7) D. Tradeoffs in Energy Efficiency and Having established that a constant contact-probing interval is optimal under certain assumptions, we now explore the tradeoff between energy efficiency and the missing probability. When contact durations are distributed according to a given distribution, we can analytically determine the relationship between and according to (1). Here, we demonstrate the relationship between energy efficiency and missing probability for several typical distributions. In Section IV, we will focus on distributions obtained from real-world Bluetooth contact logs. 1) Exponential Distribution: When the contact duration is exponentially distributed, we have . Using (1), we have (8) 2) Uniform Distribution: The uniform distribution is (9) Additionally, we have (10) 3) Pareto Distribution: We have (11) In this case, we have when . The mean is unbounded when . Using (1), we have (12) Fig. 2 shows the tradeoff between energy consumption and missing probability for these distributions. The energy consumption is computed as , where we set and normalize the energy consumption rate by the average contact duration of . We see that for exponential and uniform distributions, the missing probability of 5%–10% is near the knee of the curve that is a good tradeoff point between energy consumption and missing probability. This means the contact-probing interval should be around 1/6 to 1/3 of the average contact duration. However, for Pareto distribution, the contact-probing Fig. 2. Tradeoff between energy consumption and missing probability on different distributions. interval should be around to achieve a near-zero missing probability. In other words, for a constant arrival rate and a Pareto contact duration distribution, it is difficult to tradeoff between and . E. Double Detection As we stated earlier, a contact between device A and B is missed only if neither device probes the environment during the contact. Consider the case when two users A and B are independently probing the environment. Assume that both users are using the same constant contact-probing interval of . Then, the probability that A does not discover B is . However, the probability that neither A nor B discovers the other during a contact is much higher than , even though their contact-probing processes are independent. Suppose one user probes at times of , and the other probes at . Without loss of generality, we can assume that . Then, the probability that during a contact, neither user discovers the other is given by When , has a minimum value of , and has maximum value of when . Since the two users are probing independently, is uniformly distributed in , and the average missing probability is (13)
1596 IEEE/ACM TRANSACTIONS ON NETWORKING.VOL.17.NO.5.OCTOBER 2009 For example,when the contact duration is uniformly dis-It can be verified that U(ti)is always a concave and increasing tributed as shown n(9,we have户nmis(T)='=景rmis(T), function for the three distributions we discussed above.Using which is smaller than the single user missing probability only by the Karush-Kuhn-Tucker (KKT)conditions [16],we have a constant factor. U()=c (16) III.NONSTATIONARY INTERCONTACT TIMES Until now,we have made the simplifying assumption that the for all nonzero optimal solutions 4.Note that =0 when intercontact times are stationary.This assumption is clearly not c.The energy cost cis a constant that can be obtained true.For example,one would expect the intercontact time pat- by solving the dual of(15).The optimal T*can then be solved terns late at night to differ significantly from the intercontact from time distribution during lunch hours or peak traffic hours.This implies that the optimal contact-probing interval will vary with )=关 (17) time.In the following sections,we study the contact arrival pro- cesses where the average contact arrival rate varies with time.In for a given Ai.In practice,we can set the parameter c instead this case,we assume that the contact arrival rate is constant over of N to achieve a certain tradeoff point between energy and some short time period of duration L.As before,we assume the missing probability.A larger value of c means energy is more contact durations are i.i.d..This assumption will be justified in precious(smaller N)and less energy will be used in the M time Section VI. slots.Consequently,the overall contact-missing probability will be larger.Once the parameter c is set,we can use(17)to calcu- A.Problem Formulation late the optimal T*based on the estimate of Ai for a time slot i. This solution minimizes the missing probability over all time, We divide the time to slots with equal duration of L.We de- subject to energy constraints,which are determined by the en- note the average contact arrival rate in the ith time slot as Ai. ergy cost c. We proved in Theorem 1 that using a constant con- Example /Let the contact duration distribution be uniform. tact-probing interval of T is optimal during the ith time L=1 h,and N is the number of times the device can probe slot.In this case,the expected number of contacts captured in 24 h.Then,the optimal contact-probing interval T*for the during time period i will be Ai(1-Pniss(T).However,to arrival rate.入zis minimize the overall contact-missing probability,the optimal probing intervals Ti could be different for different time slots AC that have a different arrival rate of Ai.Intuitively,more energy (18) u入 should be spent in time slots with high arrival rates to capture more contacts so that the overall contact missing probability by(I7).If the:are all equal,.then clearlyT=装,∀i.There- can be minimized. Consider the problem of maximizing the number of contacts ore,c=学(装)2 and Pi(T)=裴when T≤a captured over M time slots,given a constraint on the total en- ergy used in the M time slots.This problem is equivalent to min- B.Discussion imizing the contact-missing probability over the M time slots. In Section II,we show that a constant probing rate is optimal As each probe consumes equal energy,the constraint on total for stationary processes,and the optimal tradeoff points can energy consumption can be converted to a constraint of total be derived from (1).In this section,we show that the average number of probes used in the M time slots.Assume that the de- vice can take N probes during the M time slots.We can then missing probability can be further reduced when the contact arrival rate is time-varying.When the contact arrival ratesAi formulate the following optimization problem to find the op- are different in different time slots,we can use adaptive contact timal probing interval T for each time slot: probing to redistribute the limited probing energy according to the contact arrival rate.For a time slot with high contact Maximize 入(1-Pmis(T) arrival rate,we can increase the probing rate and reduce the missing probability so that most contacts arriving in this time i=1 M slot can be captured.When the contact arrival rate is low,the ≤N T≥0 i. (14) contact-probing rate can be reduced since the number of missed s.t i=1 contacts入:LPmiss(T)will be small for a small入i,even if Pisquite high.In this way.the overall missing probability Defining variable ti as 1/Ti and Ui(i)=Ai(1-Pmiss(1/i)), (can be minimized.However,the missing the optimization in(14)can be recast as M AL probability for time slots with low arrival rates could be very M high in this case. Maximize Ui(xi) We now make two observations regarding the the optimiza- =1 tion in (14).First,when the contact arrival rates are equal for all M time slots,it is not necessary to use different probing intervals S.t Lx≤N x≥0 Vi. (15) in different time slots,and a constant probing rate is optimal. In this scenario.the solution to (14)reduces to the solution for
1596 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 5, OCTOBER 2009 For example, when the contact duration is uniformly distributed as shown in (9), we have , which is smaller than the single user missing probability only by a constant factor. III. NONSTATIONARY INTERCONTACT TIMES Until now, we have made the simplifying assumption that the intercontact times are stationary. This assumption is clearly not true. For example, one would expect the intercontact time patterns late at night to differ significantly from the intercontact time distribution during lunch hours or peak traffic hours. This implies that the optimal contact-probing interval will vary with time. In the following sections, we study the contact arrival processes where the average contact arrival rate varies with time. In this case, we assume that the contact arrival rate is constant over some short time period of duration . As before, we assume the contact durations are i.i.d.. This assumption will be justified in Section VI. A. Problem Formulation We divide the time to slots with equal duration of . We denote the average contact arrival rate in the th time slot as . We proved in Theorem 1 that using a constant contact-probing interval of is optimal during the th time slot. In this case, the expected number of contacts captured during time period will be . However, to minimize the overall contact-missing probability, the optimal probing intervals could be different for different time slots that have a different arrival rate of . Intuitively, more energy should be spent in time slots with high arrival rates to capture more contacts so that the overall contact missing probability can be minimized. Consider the problem of maximizing the number of contacts captured over time slots, given a constraint on the total energy used in the time slots. This problem is equivalent to minimizing the contact-missing probability over the time slots. As each probe consumes equal energy, the constraint on total energy consumption can be converted to a constraint of total number of probes used in the time slots. Assume that the device can take probes during the time slots. We can then formulate the following optimization problem to find the optimal probing interval for each time slot: (14) Defining variable as and , the optimization in (14) can be recast as (15) It can be verified that is always a concave and increasing function for the three distributions we discussed above. Using the Karush–Kuhn–Tucker (KKT) conditions [16], we have (16) for all nonzero optimal solutions . Note that when . The energy cost is a constant that can be obtained by solving the dual of (15). The optimal can then be solved from (17) for a given . In practice, we can set the parameter instead of to achieve a certain tradeoff point between energy and missing probability. A larger value of means energy is more precious (smaller ) and less energy will be used in the time slots. Consequently, the overall contact-missing probability will be larger. Once the parameter is set, we can use (17) to calculate the optimal based on the estimate of for a time slot . This solution minimizes the missing probability over all time, subject to energy constraints, which are determined by the energy cost . Example 1: Let the contact duration distribution be uniform, h, and is the number of times the device can probe in 24 h. Then, the optimal contact-probing interval for the arrival rate is (18) by (17). If the are all equal, then clearly . Therefore, and when . B. Discussion In Section II, we show that a constant probing rate is optimal for stationary processes, and the optimal tradeoff points can be derived from (1). In this section, we show that the average missing probability can be further reduced when the contact arrival rate is time-varying. When the contact arrival rates are different in different time slots, we can use adaptive contact probing to redistribute the limited probing energy according to the contact arrival rate. For a time slot with high contact arrival rate, we can increase the probing rate and reduce the missing probability so that most contacts arriving in this time slot can be captured. When the contact arrival rate is low, the contact-probing rate can be reduced since the number of missed contacts will be small for a small , even if is quite high. In this way, the overall missing probability can be minimized. However, the missing probability for time slots with low arrival rates could be very high in this case. We now make two observations regarding the the optimization in (14). First, when the contact arrival rates are equal for all time slots, it is not necessary to use different probing intervals in different time slots, and a constant probing rate is optimal. In this scenario, the solution to (14) reduces to the solution for
WANG etaL:OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1597 1.0 1.0 ●-Stationary process A-Time-varying processs(var=1) 0.8 Time-varying processs (var=5) 0.8 -Time-varying processs(var=10) 0.6 0.6 4 0.4 0.2 Stationary process No prediction error 0.2 -Prediction MSE 1 ■-Prediction MSE=4 0.0 ■ 10 12 14 16 18 0.0 20 0 101214 16 20 Normalized Probing Interval (T/t) Normalized Probing Interval(T/) Fig.4.Missing probability for time-varying arrival process (var =5)with different prediction mean squared errors. Fig.3.Optimal missing probability for time-varying arrival process than the constant probing interval scheme,even for large pre- stationary intercontact times,derived in Section II.This is illus- diction errors.This suggests that we can use the adaptive con- trated in Example 1. tact-probing algorithm even when we cannot accurately predict Second,with time-varying contact arrival rates,the optimiza- the arrival rate. tion algorithm should have enough information to predict the arrival rate Ai.For a fixed value of c,the algorithm needs the IV.HUMAN CONTACT PROCESSES arrival rate in the next time slot to calculate the optimal probing In Sections II and III,we derived analytical models for gen- interval from(17).If the algorithm does not correctly predict eral contact processes,which can be applied to different sce- the contact arrival rate A;in the next slot,the probing rate cal- narios,such as contacts between humans or between mobile culated based on the predicted A:will not be optimal. robots.In the next three sections,we mainly focus on the human We use an example to demonstrate the performance of adap- contact process based on a set of Bluetooth contact logging ex- tive contact probing in time-varying processes.Fig.3 shows the periments. performance for time-varying processes with different variances in contact arrival rates A;.In this example,the contact duration A.Data Collection Experiments is distributed as a Pareto distribution with :=2.For contact ar- In order to characterize contact distributions.we handed rival rates Ai,we assume they are random variables distributed Bluetooth phones to nine volunteers.We also installed static according to a truncated Gaussian distribution.3 We vary the mean and variance of the Gaussian distribution to change the Bluetooth probes in high-traffic areas.The phones had a J2ME program running on them that initiated a Bluetooth device variance of the contact arrival rate while keeping the mean ar- discovery every 30 s.If other Bluetooth devices are discovered rival rate to 1.To expand the curve,we change the t-axis to nor- malized probing interval TT instead of the energy consumption in the vicinity,then the time of contact and Bluetooth address rate 1/(uT)as in Fig.2.In Fig.3,a smaller value of Pmiss for were captured.Since the probing software could terminate due to lack of energy,crashes,etc.,we captured the start time and the same normalized probing interval represents better perfor- end time of each probing session.This allowed us to capture mance.Compared to the constant probing interval scheme used contact duration and intercontact time distributions accurately for stationary processes,we see that adaptive contact-probing Overall,we did 424 man days of data collection,and 12 332 scheme performs better when the variance of contact arrival rate increases.For a missing probability of 20%,the average con- unique devices were discovered in our experiment.To the best of our knowledge,this is the largest volume of unique devices tact-probing interval for processes with arrival rate variance of discovered compared to other comparable studies.4 For the 10 is about 12,which is nearly six times longer than the probing details of the experiment,please refer to [11]. interval used by the constant probing interval scheme.As we as- For the nine volunteers,we have the complete contact log sume each probe uses the same amount of energy,the adaptive over the experiment period.Our analysis and simulations are contact-probing scheme uses only 1/6 energy compared to the based on the contact patterns of these nine volunteers.Each constant probing interval scheme. In Fig.3,we assume that there are no prediction errors in contiguous set of scans during which a device is discovered is counted as a contact.Assume that a device(say D)is discovered Ai.Fig.4 demonstrates the performance of adaptive probing in m contiguous scans.Then,the duration of the contact with D scheme under different prediction mean squared errors(MSEs). We see that when the prediction error increases,the performance is the difference between the time of D's discovery in the mth becomes worse.However,the performance is still much better 4In the Haggle [14]studies,a maximum of 41 volunteers probing at 120-s intervals discovered 200 unique devices over five days.In the Serendipity [13] 3Note that the arrival rate cannot be negative;all negative arrival rates are set study,100 volunteers probing at 5-min intervals discovered 2798 unique devices to zero. over a nine-month period
WANG et al.: OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1597 Fig. 3. Optimal missing probability for time-varying arrival process. stationary intercontact times, derived in Section II. This is illustrated in Example 1. Second, with time-varying contact arrival rates, the optimization algorithm should have enough information to predict the arrival rate . For a fixed value of , the algorithm needs the arrival rate in the next time slot to calculate the optimal probing interval from (17). If the algorithm does not correctly predict the contact arrival rate in the next slot, the probing rate calculated based on the predicted will not be optimal. We use an example to demonstrate the performance of adaptive contact probing in time-varying processes. Fig. 3 shows the performance for time-varying processes with different variances in contact arrival rates . In this example, the contact duration is distributed as a Pareto distribution with . For contact arrival rates , we assume they are random variables distributed according to a truncated Gaussian distribution.3 We vary the mean and variance of the Gaussian distribution to change the variance of the contact arrival rate while keeping the mean arrival rate to 1. To expand the curve, we change the -axis to normalized probing interval instead of the energy consumption rate as in Fig. 2. In Fig. 3, a smaller value of for the same normalized probing interval represents better performance. Compared to the constant probing interval scheme used for stationary processes, we see that adaptive contact-probing scheme performs better when the variance of contact arrival rate increases. For a missing probability of 20%, the average contact-probing interval for processes with arrival rate variance of 10 is about 12, which is nearly six times longer than the probing interval used by the constant probing interval scheme. As we assume each probe uses the same amount of energy, the adaptive contact-probing scheme uses only 1/6 energy compared to the constant probing interval scheme. In Fig. 3, we assume that there are no prediction errors in . Fig. 4 demonstrates the performance of adaptive probing scheme under different prediction mean squared errors (MSEs). We see that when the prediction error increases, the performance becomes worse. However, the performance is still much better 3Note that the arrival rate cannot be negative; all negative arrival rates are set to zero. Fig. 4. Missing probability for time-varying arrival process ( ) with different prediction mean squared errors. than the constant probing interval scheme, even for large prediction errors. This suggests that we can use the adaptive contact-probing algorithm even when we cannot accurately predict the arrival rate. IV. HUMAN CONTACT PROCESSES In Sections II and III, we derived analytical models for general contact processes, which can be applied to different scenarios, such as contacts between humans or between mobile robots. In the next three sections, we mainly focus on the human contact process based on a set of Bluetooth contact logging experiments. A. Data Collection Experiments In order to characterize contact distributions, we handed Bluetooth phones to nine volunteers. We also installed static Bluetooth probes in high-traffic areas. The phones had a J2ME program running on them that initiated a Bluetooth device discovery every 30 s. If other Bluetooth devices are discovered in the vicinity, then the time of contact and Bluetooth address were captured. Since the probing software could terminate due to lack of energy, crashes, etc., we captured the start time and end time of each probing session. This allowed us to capture contact duration and intercontact time distributions accurately. Overall, we did 424 man days of data collection, and 12 332 unique devices were discovered in our experiment. To the best of our knowledge, this is the largest volume of unique devices discovered compared to other comparable studies.4 For the details of the experiment, please refer to [11]. For the nine volunteers, we have the complete contact log over the experiment period. Our analysis and simulations are based on the contact patterns of these nine volunteers. Each contiguous set of scans during which a device is discovered is counted as a contact. Assume that a device (say D) is discovered in contiguous scans. Then, the duration of the contact with D is the difference between the time of D’s discovery in the th 4In the Haggle [14] studies, a maximum of 41 volunteers probing at 120-s intervals discovered 200 unique devices over five days. In the Serendipity [13] study, 100 volunteers probing at 5-min intervals discovered 2798 unique devices over a nine-month period
1598 IEEE/ACM TRANSACTIONS ON NETWORKING.VOL.17.NO.5.OCTOBER 2009 0.1 点 0.01 Contact duration 鲁一Buletooth Record --Number of new contacts 8 -Curve fitting of (2x) ----Number of contacts 0.1 10 100 0.1 10 100 x(contact duration in minutes) Minutes Fig.5.Contact duration distribution. Fig.6.Normalized correlation of contact duration and contact number over short time period ('=2 min). scan and the first scan.If a device is only detected in one scan. Number ofnew contacts:The number of newly arrived con- we treat the duration of contact as 30 s. tacts during the time window of [(-1)7,ir].This mea- surement determines the correlation of intercontact time. B.Distribution of Contact Duration Number of contacts:The number of contacts the device ob- We first look at contact duration distribution of the Bluetooth serves during the time window,including those that arrived contact data.Fig.5 plots the complementary cumulative distri- in previous windows. bution function (ccdf)curve,which is 1-Fp(z)(probability As shown in Fig.6,the number of contacts and number of of tp >x)in log-log scale.We see that the contact duration new contacts are highly correlated over short periods,and the distribution follows a power law(Pareto distribution).By curve correlation dropped faster when the correlation period is longer fitting,we can estimate FD()=1-(x/T)withT =30s than 60 min.The contact duration is less correlated.The high and =0.85.Although the mean for Pareto distribution with correlation in the number of new contacts makes it possible to 1 is unbounded,the average contact duration is around predict the contact arrival rate in future time slots,which will be 370 s in our case.This is because the probability of long contacts discussed in Section V. decreases rapidly when contact duration is longer than 2 h,and in our experiments,there are no contacts lasting longer than 5 h. D.Correlation at Large Time Scales The fact that the contact duration distribution follows a power The contact processes are highly related to human behavior. law has also been verified by other studies(see [14]and refer- Therefore,we expect to see diurnal variations in the contact pat- ences therein). terns.Also,since most people are creatures of habit,one might The fact that the contact duration distribution is Pareto-dis- expect to see periodicity in the contact processes over daily or tributed is a pessimistic result.For the Pareto distribution,be- weekly time scales.Fig.7 shows the correlation over long time yond a small threshold probing interval,the missing probability periods.We clearly see the diurnal variations.The autocorrela- rises sharply from close to zero as shown in Fig.2.Fortunately, tion drops to zero over 12 h and had a period of 24 h.Finally, as a consequence of human mobility,contact arrival rates are we do not see any significant correlation increase over a 168-h time-varying.As we show in later sections,this can be exploited (one-week)period. to achieve significant energy savings E.Self-Similar Nature C.Correlation Analysis The slowly decreasing slope of the autocorrelation for the In order to see if there is any predictability in the contact number of new contacts hints that the contact arrival process patterns,which will aid us in designing good adaptive contact- is self-similar [18],[19].We test for self-similarity over four probing mechanisms,we investigate the autocorrelation of the time scales of 10.100.1000.and 10 000 minutes.We use the contact processes. RS measurement to verify this conjecture.Let Xi denote the Here,we use the autocorrelation function [17]:Define an ag- number of new contacts seen in 1 min and Yi=>Xi be gregated measurement series f(i)as the average of certain mea- the aggregated number of new contacts inj minutes.Define surements,over a time window of [(i-1),.We calculate the autocorrelation of the measurement as Anf( f(+n).In this paper,we consider the measurement of three types: ,=婴+--+-) Contact duration:The average contact duration for the contacts occurring in a time window. 盟Y+i-4-20+-)
1598 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 5, OCTOBER 2009 Fig. 5. Contact duration distribution. scan and the first scan. If a device is only detected in one scan, we treat the duration of contact as 30 s. B. Distribution of Contact Duration We first look at contact duration distribution of the Bluetooth contact data. Fig. 5 plots the complementary cumulative distribution function (ccdf) curve, which is (probability of ) in log-log scale. We see that the contact duration distribution follows a power law (Pareto distribution). By curve fitting, we can estimate with s and . Although the mean for Pareto distribution with is unbounded, the average contact duration is around 370 s in our case. This is because the probability of long contacts decreases rapidly when contact duration is longer than 2 h, and in our experiments, there are no contacts lasting longer than 5 h. The fact that the contact duration distribution follows a power law has also been verified by other studies (see [14] and references therein). The fact that the contact duration distribution is Pareto-distributed is a pessimistic result. For the Pareto distribution, beyond a small threshold probing interval, the missing probability rises sharply from close to zero as shown in Fig. 2. Fortunately, as a consequence of human mobility, contact arrival rates are time-varying. As we show in later sections, this can be exploited to achieve significant energy savings. C. Correlation Analysis In order to see if there is any predictability in the contact patterns, which will aid us in designing good adaptive contactprobing mechanisms, we investigate the autocorrelation of the contact processes. Here, we use the autocorrelation function [17]: Define an aggregated measurement series as the average of certain measurements, over a time window of . We calculate the autocorrelation of the measurement as . In this paper, we consider the measurement of three types: • Contact duration: The average contact duration for the contacts occurring in a time window. Fig. 6. Normalized correlation of contact duration and contact number over short time period ( min). • Number of new contacts: The number of newly arrived contacts during the time window of . This measurement determines the correlation of intercontact time. • Number of contacts: The number of contacts the device observes during the time window, including those that arrived in previous windows. As shown in Fig. 6, the number of contacts and number of new contacts are highly correlated over short periods, and the correlation dropped faster when the correlation period is longer than 60 min. The contact duration is less correlated. The high correlation in the number of new contacts makes it possible to predict the contact arrival rate in future time slots, which will be discussed in Section V. D. Correlation at Large Time Scales The contact processes are highly related to human behavior. Therefore, we expect to see diurnal variations in the contact patterns. Also, since most people are creatures of habit, one might expect to see periodicity in the contact processes over daily or weekly time scales. Fig. 7 shows the correlation over long time periods. We clearly see the diurnal variations. The autocorrelation drops to zero over 12 h and had a period of 24 h. Finally, we do not see any significant correlation increase over a 168-h (one-week) period. E. Self-Similar Nature The slowly decreasing slope of the autocorrelation for the number of new contacts hints that the contact arrival process is self-similar [18], [19]. We test for self-similarity over four time scales of 10, 100, 1000, and 10 000 minutes. We use the measurement to verify this conjecture. Let denote the number of new contacts seen in 1 min and be the aggregated number of new contacts in minutes. Define
WANG etaL:OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1599 1.0 TABLE I CONTACT ARRIVAL RATE IN DIFFERENT TIME SCALES. Contact duration 0.8 -Number of new contacts Time slot Mean arrival rate Arrival rate Normalized ····Number of contacts size (contact/time slot) variance variance 5 minutes 0.2150 1.3350 28.8923 10 minutes 0.4299 4.4575 24.1172 0.6 30 minutes 1.2901 30.4295 18.2930 I hour 2.5973 113.8747 17.1142 0.4 12 600 (noy. Average contact 10 arrival rate 500 0.2 ■-Variance of contact arrival rate 0.0 风风A风AA风 400 72 96 120 144168192 216240 Hours 6 300 Fig.7.Normalized correlation of contact duration and contact number over 200 long time periods (=5 min). 2 100 1000 Linear fitting (slope=0.7636) ·-·slope0.5 0 0 9 12 H(time of day) 100 Fig.9.Contact arrival rate and its variance over time of day. the process,the variance increases faster than CL1/2 when the 10. time scale L increases.The normalized variance of the arrival rate is the variance divided by the mean arrival rate.As shown in Fig.3,a larger normalized variance of the arrival rate leads to potentially better performance.In our human contact process, the normalized variance of the contact process is larger than 10 100 1000 10000 when the time slot size L is smaller than I h.Therefore,adaptive L(minutes) contact probing should perform better than the lowest curve in Fig.8.The R/S statistic of the contact arrival process. Fig.3 when the arrival rate in the next hour is known. G.Contact Arrival Rate versus Time of Day and The contact arrival rate distribution varies with time of day. In Fig.9,we plot the average rate at which new contacts are seen at different times of the day.The contact arrival rate during the early morning is quite small.This implies that we can use longer S(t,L (X-X五)2,X,五 contact-probing intervals during the early morning to save en- L ergy as described in Section III.The variance of the contact ar- i=t+1 rival rate over different days is also plotted in Fig.9.We see If ER(L/S(L)}=CL,with Hurst parameter hE(0.5,1). that the variance is quite large.This shows the contact arrival then the process is self-similar [19].From Fig.8,we see that our rate at the same hour in different days may vary drastically.Al- process has a Hurst parameter close to 0.76. though the arrival rates exhibit some patterns in a 24-h period, This implies that the contact arrivals are not only long-range we cannot simply use the time of day to infer the arrival rate due dependent but also bursty.An intuitive model for this is that ar- to the large variance at the same time of day. rivals follow an ON-OFF process where the ON and OFF durations have memory.This can potentially be exploited to achieve en- V.ADAPTIVE PROBING ALGORITHMS ergy savings. In Section IV,we saw that human contact patterns have time- F.Mean and Variance of Contact Arrival Rate varying contact arrival rates when the time slot size L is smaller than 1 h.Also,the contact arrival rates have high short-term cor- As discussed in Section III,we can adapt the probing interval relation and exhibit a 24-h periodic pattern.These observations according to the contact arrival rate when the arrival rate is time- suggest that the human contact process is both time-varying and varying.Table I shows the mean and variance of the contact predictable.We exploit these characteristics to design an effi- arrival rates at different time scales.Due to the self-similarity of cient adaptive contact-probing scheme in this section
WANG et al.: OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1599 Fig. 7. Normalized correlation of contact duration and contact number over long time periods ( min). Fig. 8. The statistic of the contact arrival process. and If , with Hurst parameter , then the process is self-similar [19]. From Fig. 8, we see that our process has a Hurst parameter close to 0.76. This implies that the contact arrivals are not only long-range dependent but also bursty. An intuitive model for this is that arrivals follow an ON–OFF process where the ON and OFF durations have memory. This can potentially be exploited to achieve energy savings. F. Mean and Variance of Contact Arrival Rate As discussed in Section III, we can adapt the probing interval according to the contact arrival rate when the arrival rate is timevarying. Table I shows the mean and variance of the contact arrival rates at different time scales. Due to the self-similarity of TABLE I CONTACT ARRIVAL RATE IN DIFFERENT TIME SCALES. Fig. 9. Contact arrival rate and its variance over time of day. the process, the variance increases faster than when the time scale increases. The normalized variance of the arrival rate is the variance divided by the mean arrival rate. As shown in Fig. 3, a larger normalized variance of the arrival rate leads to potentially better performance. In our human contact process, the normalized variance of the contact process is larger than 10 when the time slot size is smaller than 1 h. Therefore, adaptive contact probing should perform better than the lowest curve in Fig. 3 when the arrival rate in the next hour is known. G. Contact Arrival Rate versus Time of Day The contact arrival rate distribution varies with time of day. In Fig. 9, we plot the average rate at which new contacts are seen at different times of the day. The contact arrival rate during the early morning is quite small. This implies that we can use longer contact-probing intervals during the early morning to save energy as described in Section III. The variance of the contact arrival rate over different days is also plotted in Fig. 9. We see that the variance is quite large. This shows the contact arrival rate at the same hour in different days may vary drastically. Although the arrival rates exhibit some patterns in a 24-h period, we cannot simply use the time of day to infer the arrival rate due to the large variance at the same time of day. V. ADAPTIVE PROBING ALGORITHMS In Section IV, we saw that human contact patterns have timevarying contact arrival rates when the time slot size is smaller than 1 h. Also, the contact arrival rates have high short-term correlation and exhibit a 24-h periodic pattern. These observations suggest that the human contact process is both time-varying and predictable. We exploit these characteristics to design an effi- cient adaptive contact-probing scheme in this section
1600 IEEE/ACM TRANSACTIONS ON NETWORKING.VOL.17.NO.5.OCTOBER 2009 Assumptions and Methodology 800 In this section.we investigate the design of adaptive contact- -c=20 ●—G=50 probing algorithms that can achieve a good tradeoff between 600 ▲c=100 the probability of a missed contact and energy consumption. As described in Section II-A,the energy consumption rate is inversely proportional to the average probing interval;e.g.,a device with average probing interval of 30 s will consume two 400 times more energy than a device with average probing interval of 60 s. From our earlier analysis in Section III,we see that in order to 200- compute the optimal contact-probing interval,we require k,the power law exponent for the contact duration distribution and the arrival rate.We contend that variations in the contact duration distribution,if any,will occur over large time scales.It is reason- 10 15 able to hypothesize that contact duration distribution will always ,(contact/hour) follow a power law but with different decay coefficientsk over Fig.10. Relationship between optimal contact-probing interval and the arrival large time scales.This is validated by the fact that experiments rate. at widely differing geographical locations have all displayed a power-law contact duration distribution but with different decay coefficients [14],[12],[20].For the rest of this section,we as- STAR-TOD first collects the historical average arrival rate sume that the contact duration distribution is known and focus Ar at hour H (e.g.,A12 is the arrival rate averaged over dif- on the problem of adapting the contact-probing interval to vari- ferent days for the same time period of 1200-1259 h).Then, ations in arrival rate.This assumption will be further justified in STAR-TOD reads the current hour H from the system clock of Section VI. the device and sets As shown in Section III,we can find the optimal value of the contact-probing interval by solving (17)when the arrival rate 入=AH (20) is known.Since our contact duration is Pareto-distributed,we have In other words,STAR-TOD implicitly assumes the contact ar- rival rate is highly correlated to time of day,and the arrival rate 1-5 will not vary dramatically in different days. Ti=T( (1-+1 (19) 入kT 2)STAR-PTS:The STAR-previous time slot(STAR-PTS) scheme uses the detected arrival rate in the previous time slots where k:=0.85 and T=30 s for our logging data as shown as the estimation i.More specifically,it sets in Section IV.With a different energy cost of c,we can achieve 入 different tradeoff points between Pmiss and T.Fig.10 shows 入=-j (21) the optimal contact-probing interval for different arrival rates. We see that when the arrival rate is high,the optimal T*quickly where Ai is the nearest time slot with nonzero arrival rate.If the converges to T.This means we can tolerate a certain amount arrival rate in the last time slot Ai_1 is nonzero,(21)will set the of estimation error in A;when the contact arrival rate is high. estimate to入i=入i-l. However,when the contact rate is small,we need to estimate Note that the historical arrival rate Aj used in STAR-PTS it accurately so that we will not underestimate the rate and use comes from the arrival rate observed by the device.When T extremely long contact-probing intervals. is large,the missing probability will be quite high,andj could be inaccurate.To compensate this effect,we set Aj as A.Contact Arrival Rate Estimation 入=1-Dmis(Tj刀 (22) Based on our analysis,we propose a class of adaptive con- tact-probing schemes based on short-term arrival rate estima- tion (STAR).STAR updates the estimate of the contact arrival where n is the number of new contacts detected by the device rate Ai periodically using a time slot length of L.The estimation in time slot j. is based on information available to the probing devices,such STAR-PTS uses only the short-term history of the arrival as time of day,statistics of historical arrival rates,and observed process.This scheme may converge to certain nonoptimal op- arrival rates in previous time slots.STAR then sets the probing erating points for human contact process.For example,after a interval for next probing according to the optimal T*calculated night where no contacts arrive,the Ai will converge to zero,and from(l9)using the estimated arrival rate入: the T*could be very long.In the next morning,it will be difficult We will consider the following estimation methods for STAR to detect new contacts since the missing probability is high due and compare their performance in Section VI. to the large T*.Therefore,STAR-PTS will keep on assuming 1)STAR-TOD:The STAR-time of day(STAR-TOD)scheme Ai is zero until it detects some contact by chance.To reduce this is motivated by two facts.First,people are creatures of habit. long“wam-up time”'of STAR-PTS in the morning,we set入i Second,Fig.7 shows that the auto-correlation peaks at 24-h as lags.These suggest that it may be possible to estimate the arrival rate using the time-of-day information. 入= A5+(1-w)AH (23) 2-7
1600 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 5, OCTOBER 2009 Assumptions and Methodology In this section, we investigate the design of adaptive contactprobing algorithms that can achieve a good tradeoff between the probability of a missed contact and energy consumption. As described in Section II-A, the energy consumption rate is inversely proportional to the average probing interval; e.g., a device with average probing interval of 30 s will consume two times more energy than a device with average probing interval of 60 s. From our earlier analysis in Section III, we see that in order to compute the optimal contact-probing interval, we require , the power law exponent for the contact duration distribution and the arrival rate. We contend that variations in the contact duration distribution, if any, will occur over large time scales. It is reasonable to hypothesize that contact duration distribution will always follow a power law but with different decay coefficients over large time scales. This is validated by the fact that experiments at widely differing geographical locations have all displayed a power-law contact duration distribution but with different decay coefficients [14], [12], [20]. For the rest of this section, we assume that the contact duration distribution is known and focus on the problem of adapting the contact-probing interval to variations in arrival rate. This assumption will be further justified in Section VI. As shown in Section III, we can find the optimal value of the contact-probing interval by solving (17) when the arrival rate is known. Since our contact duration is Pareto-distributed, we have (19) where and s for our logging data as shown in Section IV. With a different energy cost of , we can achieve different tradeoff points between and . Fig. 10 shows the optimal contact-probing interval for different arrival rates. We see that when the arrival rate is high, the optimal quickly converges to . This means we can tolerate a certain amount of estimation error in when the contact arrival rate is high. However, when the contact rate is small, we need to estimate it accurately so that we will not underestimate the rate and use extremely long contact-probing intervals. A. Contact Arrival Rate Estimation Based on our analysis, we propose a class of adaptive contact-probing schemes based on short-term arrival rate estimation (STAR). STAR updates the estimate of the contact arrival rate periodically using a time slot length of . The estimation is based on information available to the probing devices, such as time of day, statistics of historical arrival rates, and observed arrival rates in previous time slots. STAR then sets the probing interval for next probing according to the optimal calculated from (19) using the estimated arrival rate . We will consider the following estimation methods for STAR and compare their performance in Section VI. 1) STAR-TOD: The STAR-time of day (STAR-TOD) scheme is motivated by two facts. First, people are creatures of habit. Second, Fig. 7 shows that the auto-correlation peaks at 24-h lags. These suggest that it may be possible to estimate the arrival rate using the time-of-day information. Fig. 10. Relationship between optimal contact-probing interval and the arrival rate. STAR-TOD first collects the historical average arrival rate at hour (e.g., is the arrival rate averaged over different days for the same time period of 1200–1259 h). Then, STAR-TOD reads the current hour from the system clock of the device and sets (20) In other words, STAR-TOD implicitly assumes the contact arrival rate is highly correlated to time of day, and the arrival rate will not vary dramatically in different days. 2) STAR-PTS: The STAR-previous time slot (STAR-PTS) scheme uses the detected arrival rate in the previous time slots as the estimation . More specifically, it sets (21) where is the nearest time slot with nonzero arrival rate. If the arrival rate in the last time slot is nonzero, (21) will set the estimate to . Note that the historical arrival rate used in STAR-PTS comes from the arrival rate observed by the device. When is large, the missing probability will be quite high, and could be inaccurate. To compensate this effect, we set as (22) where is the number of new contacts detected by the device in time slot . STAR-PTS uses only the short-term history of the arrival process. This scheme may converge to certain nonoptimal operating points for human contact process. For example, after a night where no contacts arrive, the will converge to zero, and the could be very long. In the next morning, it will be difficult to detect new contacts since the missing probability is high due to the large . Therefore, STAR-PTS will keep on assuming is zero until it detects some contact by chance. To reduce this long “warm-up time” of STAR-PTS in the morning, we set as (23)
WANG etaL:OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1601 where w is the weight for different estimations.In(23),the amount of memory to store the A;corresponding to different time-of-day information is combined with estimation from combinations of H and Ai-i's.Also,it needs a large amount short-term history to prevent the algorithm from getting stuck of training data for calculating the conditional mean.Therefore, at the nonoptimal operating point.This approach combines it has a much higher complexity compared to STAR-LMMSE both the long-term history of TOD and the short-term history. and STAR-PTS.Though it may not be optimal in terms of min- The long-term history can serve as a baseline of the everyday imizing missing probability,STAR-MMSE can serve as a per- routine of the user.The short-term history will track dynamic formance benchmark for estimation-based contact-probing al- changes in user behavior. gorithms. 3)STAR-LMMSE:STAR linear minimum mean-squared 5)STAR-Genie:STAR-Genie is a noncausal algorithm that error estimation(STAR-LMMSE)uses more short-term corre- assumes that we know the actual arrival rate in the next time lation information than STAR-PTS.It sets the estimation of the slot.It serves as a lower bound for the missing probability of all arrival rate in the next time slot to STAR-based algorithms. 入i-j (24) VI.SIMULATIONS A.Simulation Models Suppose the auto-correlation of arrival rates is We now compare the performance of different adaptive con- tact-probing algorithms by running trace-driven simulations on An= 入i入i+n (25) the data that we have collected. When calculating the missing probability,we use the Blue- tooth contact logs,which are based on a 30-s contact-probing We define interval as the baseline.We apply different adaptive contact- Ao A1 A2 Ag-11 probing algorithms to filter the contact log data.Assume that A0 A1 1-2 a device D has been logged in our experiments from time t to (26) t+T.Then,for a specific adaptive contact-probing algorithm, we say that the contact has been missed if the algorithm does not 40 initiate any probe in the intervalt,t+.The contact missing probability is computed as the ratio of the number of contacts and missed by the adaptive contact-probing algorithm to the total V=[A1 A2...A]. number of contacts made in our data logging experiments. (27) Then setting B.Verification of Analysis First,we verify that the relationship between Tand Pmiss de- [a1a2,...,ad]=V(A-1) (28) rived in(12)is correct for a constant contact-probing interval T. From our logging data,we have computed the decay exponent will minimize the MSE of the linear estimator [21],and the cor- of the contact distribution =0.85.Since the granularity of responding MSE is given by [21] our logging experiments is over 30-s intervals,we setT=30. The comparison between the trace-driven simulations and the E{(-A)2}=A0- (29) analytical results are shown in Fig.11.We see that the results of the simulation are quite close to the theoretical results.Since the constant contact-probing interval algorithm does not adapt Note that STAR-LMMSE only uses short-term history and to the changes in the contact arrival rate,we will use this as the may have the same"warm-up"problem as STAR-PTS.In this lower bound for the average contact-probing interval to achieve case,we can use the same weighted average method as dis- a given missing probability. cussed in STAR-PTS. 4)STAR-MMSE:We can also use the general nonlinear C.Performance Comparison MMSE scheme to estimate the arrival rate by setting [21] We use data-driven simulations to compare the performance 入=E{λlH,入-1,,入i-g} (30)of different adaptive probing algorithms.In the simulation,we use a time slot size of 10 min based on the correlation of human where H is the time of day as in STAR-TOD.In (30).H and contacts shown in Fig.6.Since the correlation drops below 0.1 Acan only take integer values.Therefore.the algorithm after 2 h,we use the arrival rate in the previous 2 h(i.e..g=12) needs to store the Ai corresponding to different combinations to estimate A;in STAR-LMMSE.For STAR-MMSE,we use of H and Ai-js.As we can see from Fig.10,the curve becomes g =4 due to the large storage space and training data size re- flat when Ai is large.Therefore,the algorithm can tolerate large quired by the algorithm.Additionally,we assume STAR-MMSE estimation errors when A:is large.In practice,we can"quan-has perfect knowledge of the arrival rates in the previous time tize"large values of A's so that the number of combinations slots.In this sense,it is a lower bound to the performance of the of H and Ai-i's can be reduced. algorithm that uses the detected arrivals only. The STAR-MMSE algorithm uses both the short-term his- Fig.12 shows the relationship between missing probability tory and time-of-day information.However,it requires a large and the long-term average probing interval for different algo-
WANG et al.: OPPORTUNISTIC ENERGY-EFFICIENT CONTACT PROBING IN DELAY-TOLERANT APPLICATIONS 1601 where is the weight for different estimations. In (23), the time-of-day information is combined with estimation from short-term history to prevent the algorithm from getting stuck at the nonoptimal operating point. This approach combines both the long-term history of TOD and the short-term history. The long-term history can serve as a baseline of the everyday routine of the user. The short-term history will track dynamic changes in user behavior. 3) STAR-LMMSE: STAR linear minimum mean-squared error estimation (STAR-LMMSE) uses more short-term correlation information than STAR-PTS. It sets the estimation of the arrival rate in the next time slot to (24) Suppose the auto-correlation of arrival rates is (25) We define . . . ... ... ... . . . (26) and (27) Then setting (28) will minimize the MSE of the linear estimator [21], and the corresponding MSE is given by [21] (29) Note that STAR-LMMSE only uses short-term history and may have the same “warm-up” problem as STAR-PTS. In this case, we can use the same weighted average method as discussed in STAR-PTS. 4) STAR-MMSE: We can also use the general nonlinear MMSE scheme to estimate the arrival rate by setting [21] (30) where is the time of day as in STAR-TOD. In (30), and can only take integer values. Therefore, the algorithm needs to store the corresponding to different combinations of and s. As we can see from Fig. 10, the curve becomes flat when is large. Therefore, the algorithm can tolerate large estimation errors when is large. In practice, we can “quantize” large values of ’s so that the number of combinations of and ’s can be reduced. The STAR-MMSE algorithm uses both the short-term history and time-of-day information. However, it requires a large amount of memory to store the corresponding to different combinations of and ’s. Also, it needs a large amount of training data for calculating the conditional mean. Therefore, it has a much higher complexity compared to STAR-LMMSE and STAR-PTS. Though it may not be optimal in terms of minimizing missing probability, STAR-MMSE can serve as a performance benchmark for estimation-based contact-probing algorithms. 5) STAR-Genie: STAR-Genie is a noncausal algorithm that assumes that we know the actual arrival rate in the next time slot. It serves as a lower bound for the missing probability of all STAR-based algorithms. VI. SIMULATIONS A. Simulation Models We now compare the performance of different adaptive contact-probing algorithms by running trace-driven simulations on the data that we have collected. When calculating the missing probability, we use the Bluetooth contact logs, which are based on a 30-s contact-probing interval as the baseline. We apply different adaptive contactprobing algorithms to filter the contact log data. Assume that a device D has been logged in our experiments from time to . Then, for a specific adaptive contact-probing algorithm, we say that the contact has been missed if the algorithm does not initiate any probe in the interval . The contact missing probability is computed as the ratio of the number of contacts missed by the adaptive contact-probing algorithm to the total number of contacts made in our data logging experiments. B. Verification of Analysis First, we verify that the relationship between and derived in (12) is correct for a constant contact-probing interval . From our logging data, we have computed the decay exponent of the contact distribution . Since the granularity of our logging experiments is over 30-s intervals, we set . The comparison between the trace-driven simulations and the analytical results are shown in Fig. 11. We see that the results of the simulation are quite close to the theoretical results. Since the constant contact-probing interval algorithm does not adapt to the changes in the contact arrival rate, we will use this as the lower bound for the average contact-probing interval to achieve a given missing probability. C. Performance Comparison We use data-driven simulations to compare the performance of different adaptive probing algorithms. In the simulation, we use a time slot size of 10 min based on the correlation of human contacts shown in Fig. 6. Since the correlation drops below 0.1 after 2 h, we use the arrival rate in the previous 2 h (i.e., ) to estimate in STAR-LMMSE. For STAR-MMSE, we use due to the large storage space and training data size required by the algorithm. Additionally, we assume STAR-MMSE has perfect knowledge of the arrival rates in the previous time slots. In this sense, it is a lower bound to the performance of the algorithm that uses the detected arrivals only. Fig. 12 shows the relationship between missing probability and the long-term average probing interval for different algo-