1 Distributed Multicast Tree Construction in Wireless Sensor Networks Hongyu Gong',Luoyi Fu2,Xinzhe Fu2,Lutian Zhao3,Kainan Wang2,and Xinbing Wang Dept.of Electronic Engineering,Shanghai Jiao Tong University,China.Email:{ann,xwang8@sjtu.edu.cn. 2Dept.of Computer Science,Shanghai Jiao Tong University,China. Email:yiluofu,fxz0114,sunnywkn @sjtu.edu.cn 3Dept.of Mathematics,Shanghai Jiao Tong University,China.Email:golbez@sjtu.edu.cn. Abstract-Multicast tree is a key structure for data dissemina- causing more energy consumption as well as more serious tion from one source to multiple receivers in wireless networks. interference to neighboring nodes.Besides,as the transmission Minimum length multica modeled as the Steiner Tree Problem, and is proven to be NP-hard.In this paper,we explore how distance increases,the messages suffer from higher probability to efficiently generate minimum length mult wireless sensor of transmission failure. networks (WSNs),where only limited knowledge of network Recently,GEographic Multicast(GEM),inspired by Eu- topology is available at each node.We design and analyze a clidean Steiner Tree,was proposed for routing in dense simple algorithm,which we call Toward Source Tree (TST),to wireless networks [11].Formally,given a network G=(V,E), build multicast trees in WSNs.We show three metrics of TST the weight of each edges,and a set of terminals S C V. algorithm,i.e.,running and energy efficiency.We prove that its running time is O(Vn log n),the best among all existing solutions the Steiner Tree Problem is to find a tree in G that spans S to our best knowledge.We prove that TST tree length is in the with the minimum total weight [13].This problem has been same order as Steiner tree,give a theoretical upper bound and proven to be NP-hard [14],and has not been visited for a use simulations to show the ratio be only 1.114 when nodes are long time.Former forms of its approximate implementation uniformly distributed.We evaluate energy efficiency in terms of message complexity and the number of forwardin prove that were not appropriate for constructing multicast trees in WSNs they are both order-optimal.We give an efficient way to construct for various reasons (details will be discussed in the following multicast tree in support of transmission of voluminous data. section.)In GEM,the authors took the first step to utilize the Steiner tree for constructing multicast trees in WSNs, achieving routing scalability and efficiency.This approach can I.INTRODUCTION potentially reduce the tree length,but this very simple form Wireless Sensor Network (WSN)is a network of wireless of utilization only considers the hop count in an unweighted sensor nodes into which sensing,computation and commu- graph,but not the total length of the multicast tree in a nication functions are integrated.Sensors are self-organizing weighted graph.Further,as for the performance analysis,the and deployed over a geographical region [2].Multicasting, statistical properties were all under the assumption that all i.e.,one-to-many message transmission,is one of the most nodes are uniformly distributed,making it difficult to tell its common data transmission patterns in WSNs.Tree is the topol- efficiency under a realistic network environment. ogy for non-redundant data transmission.To enable efficient In this paper,inspired by taking the advantage of the multicast,multicast tree has been proposed and widely used. Steiner tree property,we design a novel distributed algorithm It has not only been used for multicast capacity analysis in to construct an approximate minimum-length multicast tree wireless networks [3]-[5],but in practice,multicast supports for wireless sensor networks,aiming at achieving energy a wide range of applications like distance education,military efficiency,ease of implementation and low computational command and intelligent system [6]. complexity,at an affordable cost on the sub-optimality of tree Many researchers have been working on constructing ef length.In what follows,we call our design Toward Source ficient multicast trees [7],[12].They have proposed a Tree Algorithm,or TST for short.We quantitatively evaluate number of algorithms so as to minimize the routing complexity TST algorithm performance under general node distribution, as well as achieve the time and energy efficiency (for details,and show that TST has the following satisfactory metrics: please refer to the next section),but most of them did not Its running time is O(vn log n).the best among all focus on an important performance measure:the tree length. existing solutions for large multicast groups. This is a critical metric since larger tree length clearly results Its tree length is in the same order as Steiner tree,and in longer delay.To enable the messages to be forwarded simulation shows the constant ratio between them is only farther,sensors have to increase their transmission power, 1.114 with uniformly distributed nodes. Its message complexity (which we will formally define Part of this work was reported in the MobiHoc 2015 [1] later)and the number of nodes that participate in for- Copyright (c)2014 IEEE.Personal use of this material is permitted. However,permission to use this material for any other purposes must be warding are both order-optimal,yielding high energy obtained from the IEEE by sending a request to pubs-permissions@ieee.org. efficiency for sensor networks
1 Distributed Multicast Tree Construction in Wireless Sensor Networks Hongyu Gong1 , Luoyi Fu2 , Xinzhe Fu2 , Lutian Zhao3 , Kainan Wang2 , and Xinbing Wang 1 1Dept. of Electronic Engineering, Shanghai Jiao Tong University, China. Email:{ann, xwang8}@sjtu.edu.cn. 2Dept. of Computer Science, Shanghai Jiao Tong University, China. Email:{yiluofu, fxz0114, sunnywkn}@sjtu.edu.cn 3Dept. of Mathematics, Shanghai Jiao Tong University, China. Email: golbez@sjtu.edu.cn. Abstract—Multicast tree is a key structure for data dissemination from one source to multiple receivers in wireless networks. Minimum length multica modeled as the Steiner Tree Problem, and is proven to be NP-hard. In this paper, we explore how to efficiently generate minimum length mult wireless sensor networks (WSNs), where only limited knowledge of network topology is available at each node. We design and analyze a simple algorithm, which we call Toward Source Tree (TST), to build multicast trees in WSNs. We show three metrics of TST algorithm, i.e., running and energy efficiency. We prove that its running time is O( √ n log n), the best among all existing solutions to our best knowledge. We prove that TST tree length is in the same order as Steiner tree, give a theoretical upper bound and use simulations to show the ratio be only 1.114 when nodes are uniformly distributed. We evaluate energy efficiency in terms of message complexity and the number of forwardin prove that they are both order-optimal. We give an efficient way to construct multicast tree in support of transmission of voluminous data. I. INTRODUCTION Wireless Sensor Network (WSN) is a network of wireless sensor nodes into which sensing, computation and communication functions are integrated. Sensors are self-organizing and deployed over a geographical region [2]. Multicasting, i.e., one-to-many message transmission, is one of the most common data transmission patterns in WSNs. Tree is the topology for non-redundant data transmission. To enable efficient multicast, multicast tree has been proposed and widely used. It has not only been used for multicast capacity analysis in wireless networks [3]–[5], but in practice, multicast supports a wide range of applications like distance education, military command and intelligent system [6]. Many researchers have been working on constructing ef- ficient multicast trees [7]–[9], [12]. They have proposed a number of algorithms so as to minimize the routing complexity as well as achieve the time and energy efficiency (for details, please refer to the next section), but most of them did not focus on an important performance measure: the tree length. This is a critical metric since larger tree length clearly results in longer delay. To enable the messages to be forwarded farther, sensors have to increase their transmission power, Part of this work was reported in the MobiHoc 2015 [1] Copyright (c) 2014 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. causing more energy consumption as well as more serious interference to neighboring nodes. Besides, as the transmission distance increases, the messages suffer from higher probability of transmission failure. Recently, GEographic Multicast (GEM), inspired by Euclidean Steiner Tree, was proposed for routing in dense wireless networks [11]. Formally, given a network G = (V, E), the weight of each edges, and a set of terminals S ⊆ V , the Steiner Tree Problem is to find a tree in G that spans S with the minimum total weight [13]. This problem has been proven to be NP-hard [14], and has not been visited for a long time. Former forms of its approximate implementation were not appropriate for constructing multicast trees in WSNs for various reasons (details will be discussed in the following section.) In GEM, the authors took the first step to utilize the Steiner tree for constructing multicast trees in WSNs, achieving routing scalability and efficiency. This approach can potentially reduce the tree length, but this very simple form of utilization only considers the hop count in an unweighted graph, but not the total length of the multicast tree in a weighted graph. Further, as for the performance analysis, the statistical properties were all under the assumption that all nodes are uniformly distributed, making it difficult to tell its efficiency under a realistic network environment. In this paper, inspired by taking the advantage of the Steiner tree property, we design a novel distributed algorithm to construct an approximate minimum-length multicast tree for wireless sensor networks, aiming at achieving energy efficiency, ease of implementation and low computational complexity, at an affordable cost on the sub-optimality of tree length. In what follows, we call our design Toward Source Tree Algorithm, or TST for short. We quantitatively evaluate TST algorithm performance under general node distribution, and show that TST has the following satisfactory metrics: • Its running time is O( √ n log n), the best among all existing solutions for large multicast groups. • Its tree length is in the same order as Steiner tree, and simulation shows the constant ratio between them is only 1.114 with uniformly distributed nodes. • Its message complexity (which we will formally define later) and the number of nodes that participate in forwarding are both order-optimal, yielding high energy efficiency for sensor networks
2 Its theoretical properties and distributive nature render apply the MST algorithm locally for routing [8].Dijkstra- it suitable for sensor network architecture and protocol based Localized Energy-Efficient Multicast Algorithm (DLE- design for performance improvement. MA)finds energy shortest paths leading through nodes with The rest of paper is organized as follows.Section II states maximal geographical advance towards desired destinations related work.In Section III,we introduce our network model. [9].Hierarchical geographic multicast routing (HGMR)tries In Section IV,we present our Toward Source Tree Algorithm to combine the advantages of geographic multicast routing to construct a multicast tree.In Section V,VI and VII. (GMR)and hierarchical rendezvous point multicast (HRPM) we evaluate the performance of our algorithm mainly from [1O].It achieves transmission times close to GMR,encoding three aspects:multicast tree length,running time and energy overhead close to HRPM,and good packet delivery ratio efficiency separately.In Section VIl,we use extensive simu- in simulations.Our concern for delay and the running time lations to further evaluate the performance and also illustrate of multicast tree construciton is orthogonal to the evaluation how to apply our TST algorithm to practical applications.We metrics in HGMR.In summary,few works have considered conclude this paper and present discussions of future works minimizing the distance of multicast routing or providing in Section IX. comprehensive quantitative analysis theoretically on the per- formance of routing policies. II.RELATED WORK Approximate Steiner Tree.Shortest Path Heuristic(SPH)and We review related works in three categories:the application Kruskal Shortest Path Heuristic (KSPH)add new nodes to of minimum multicast tree,multicast routing and the approx- existing subtrees through the shortest path [15].Average Dis- imate Steiner tree. tance Heuristic (ADH)joins subtrees that contain receivers by Minimum-length Multicast Tree.The most significant ad- a path passing non-receivers with minimal average distance to vantage of minimum multicast tree in wireless sensor networks existing subtrees [16].Santos et al.pushed forward distributed is energy efficiency.As sensor nodes are often powered by dual ascent (DA)algorithm,achieving good performance in batteries that drain rather fast and are difficult to replace, practice [17].The comparison of these algorithms with our energy conserving is extremely crucial in sensor networks. TST algorithm is shown in Table 2.1.These algorithms Furthermore,nodes in sensor network consume most of its were proposed for point-to-point networks.In this paper, energy in communication [18].Hence,minimum multicast we consider the Steiner Tree Problem in wireless sensor tree-based routing is desirable in many cases.Specifically,it networks that are broadcast in nature.In addition,each node is extensively used in the following two applications in sensor has limited computation and storage capability.Devices are networks-user query and data aggregation. usually battery-powered,therefore energy-efficiency is of great As wireless sensor networks are mostly data centric [19], importance.Due to these specific features and requirements, users have to query for information and disseminate it in the existing algorithms for P2P are not suitable for WSN. network.To spread the query in a network as energy-efficient To sum up,there have been extensive existing works focus- as possible,we need to build a minimum multicast tree and ing on multicast tree construction or the approximate Steiner route the data following the trees topology [20].To achieve tree problems,but we have not found a perfect adoption of this,some existing works apply a Steiner tree-based approach Steiner tree into constructing multicast trees. [21]. Data aggregation is to integrate the data from different sources III.NETWORK MODEL and route for eliminating redundancy.It saves energy by reducing the number of transmissions [22].Minimum-length Let us first use mathematical model to capture a wireless tree topology is a widely used technique to solve the implosion sensor network.We assume the network consists of n nodes problems in data centric routing [20].In data aggregation,the in total (or we call the network size is n),distributed indepen- routing pattern of a sensor network is similar to a reverse dently and identically in a unit square.Each node is assigned multicast tree [20].Achieving the optimal data aggregation, with a unique identifier to be distinguished from others.Each i.e.,constructing the minimum multicast tree,is also treated time when a source needs to transmit messages,it chooses m as a Steiner tree problem [22]. receivers randomly.In other words,m is the number of nodes Besides improve energy efficiency and extending network life- that participate in a multicast transmission,or we call it the time,routing on a minimum multicast tree also has underlying multicast group size.For our statistical analysis,we focus on merits,such as indirectly reducing network delay [23].Since the dense network and large multicast group where m and n the total length of tree is minimized,it is obvious that the path are both very large,and m<n.This is particularly suitable won't be too long between the source and any destination. to describe a wireless sensor network. Multicase tree construction.Many studies focus on multicast The geographical distribution of nodes is described by a routing in wireless networks,and useful techniques for routing density function f(x)where x is the position vector.Here we have been proposed in WSN.Sanchez et al.proposed Geo-allow x to be of any dimension;in the rest of this paper we let graphic Multicast Routing (GMR),a heuristic neighborhood it be a two-dimensional vector for ease of presentation,but it selection algorithm based on local geographic information does not hurt any generality.We assume f(x)is independent [7].Later Park et al.[24]combined distributed geographic of n and m.We also assume that 0 <e1 f(x)<e2 multicasting with beaconless routing.In Localized Energy- where e and e2 are both constants,i.e.,a node has a positive Efficient Multicast Algorithm (LEMA),forwarding elements probability to be located in any region of this area
2 • Its theoretical properties and distributive nature render it suitable for sensor network architecture and protocol design for performance improvement. The rest of paper is organized as follows. Section II states related work. In Section III, we introduce our network model. In Section IV, we present our Toward Source Tree Algorithm to construct a multicast tree. In Section V, VI and VII, we evaluate the performance of our algorithm mainly from three aspects: multicast tree length, running time and energy efficiency separately. In Section VIII, we use extensive simulations to further evaluate the performance and also illustrate how to apply our TST algorithm to practical applications. We conclude this paper and present discussions of future works in Section IX. II. RELATED WORK We review related works in three categories: the application of minimum multicast tree, multicast routing and the approximate Steiner tree. Minimum-length Multicast Tree. The most significant advantage of minimum multicast tree in wireless sensor networks is energy efficiency. As sensor nodes are often powered by batteries that drain rather fast and are difficult to replace, energy conserving is extremely crucial in sensor networks. Furthermore, nodes in sensor network consume most of its energy in communication [18]. Hence, minimum multicast tree-based routing is desirable in many cases. Specifically, it is extensively used in the following two applications in sensor networks - user query and data aggregation. As wireless sensor networks are mostly data centric [19], users have to query for information and disseminate it in the network. To spread the query in a network as energy-efficient as possible, we need to build a minimum multicast tree and route the data following the trees topology [20]. To achieve this, some existing works apply a Steiner tree-based approach [21]. Data aggregation is to integrate the data from different sources and route for eliminating redundancy. It saves energy by reducing the number of transmissions [22]. Minimum-length tree topology is a widely used technique to solve the implosion problems in data centric routing [20]. In data aggregation, the routing pattern of a sensor network is similar to a reverse multicast tree [20]. Achieving the optimal data aggregation, i.e., constructing the minimum multicast tree, is also treated as a Steiner tree problem [22]. Besides improve energy efficiency and extending network lifetime, routing on a minimum multicast tree also has underlying merits, such as indirectly reducing network delay [23]. Since the total length of tree is minimized, it is obvious that the path won’t be too long between the source and any destination. Multicase tree construction. Many studies focus on multicast routing in wireless networks, and useful techniques for routing have been proposed in WSN. Sanchez et al. proposed Geographic Multicast Routing (GMR), a heuristic neighborhood selection algorithm based on local geographic information [7]. Later Park et al. [24] combined distributed geographic multicasting with beaconless routing. In Localized EnergyEfficient Multicast Algorithm (LEMA), forwarding elements apply the MST algorithm locally for routing [8]. Dijkstrabased Localized Energy-Efficient Multicast Algorithm (DLEMA) finds energy shortest paths leading through nodes with maximal geographical advance towards desired destinations [9]. Hierarchical geographic multicast routing (HGMR) tries to combine the advantages of geographic multicast routing (GMR) and hierarchical rendezvous point multicast (HRPM) [10]. It achieves transmission times close to GMR, encoding overhead close to HRPM, and good packet delivery ratio in simulations. Our concern for delay and the running time of multicast tree construciton is orthogonal to the evaluation metrics in HGMR. In summary, few works have considered minimizing the distance of multicast routing or providing comprehensive quantitative analysis theoretically on the performance of routing policies. Approximate Steiner Tree. Shortest Path Heuristic (SPH) and Kruskal Shortest Path Heuristic (KSPH) add new nodes to existing subtrees through the shortest path [15]. Average Distance Heuristic (ADH) joins subtrees that contain receivers by a path passing non-receivers with minimal average distance to existing subtrees [16]. Santos et al. pushed forward distributed dual ascent (DA) algorithm, achieving good performance in practice [17]. The comparison of these algorithms with our TST algorithm is shown in Table 2.1. These algorithms were proposed for point-to-point networks. In this paper, we consider the Steiner Tree Problem in wireless sensor networks that are broadcast in nature. In addition, each node has limited computation and storage capability. Devices are usually battery-powered, therefore energy-efficiency is of great importance. Due to these specific features and requirements, existing algorithms for P2P are not suitable for WSN. To sum up, there have been extensive existing works focusing on multicast tree construction or the approximate Steiner tree problems, but we have not found a perfect adoption of Steiner tree into constructing multicast trees. III. NETWORK MODEL Let us first use mathematical model to capture a wireless sensor network. We assume the network consists of n nodes in total (or we call the network size is n), distributed independently and identically in a unit square. Each node is assigned with a unique identifier to be distinguished from others. Each time when a source needs to transmit messages, it chooses m receivers randomly. In other words, m is the number of nodes that participate in a multicast transmission, or we call it the multicast group size. For our statistical analysis, we focus on the dense network and large multicast group where m and n are both very large, and m ≤ n. This is particularly suitable to describe a wireless sensor network. The geographical distribution of nodes is described by a density function f(x) where x is the position vector. Here we allow x to be of any dimension; in the rest of this paper we let it be a two-dimensional vector for ease of presentation, but it does not hurt any generality. We assume f(x) is independent of n and m. We also assume that 0 < 1 ≤ f(x) ≤ 2 where 1 and 2 are both constants, i.e., a node has a positive probability to be located in any region of this area
TABLE 2.1:Comparison of Distributed Algorithm for Approximate Steiner Construction Algorithm Expected tree length Expected time Expected messages Assumptions SPH [15] 2-approximation O(m logn 72 O(mn) Known shortest KSPG [15] Steiner tree. O(m/logn O(mn) paths;Point- ADH[I6可 O(m) to-point network. O(m1ogn n O(nlogn+mn) Unknown shortest DA[17] O(m 0(n2) 0(mn2) paths;Applied in Point-to-point network. Unknown shortest TST O(√m) O(vn logn) 0(m) paths;Applied in Wireless network. To ensure the connectivity of the whole network,we set Algorithm 1 Neighbor Request from Multicast Members the transmission ranger[25].For all nodes, 1:for all receiver R in a multicast group do r is the same and fixed.We assume that two nodes u and 2: the number of request session:0 v can communicate with each other directly if and only if 3: coverage range:re←r the Euclidean distance between them,duv,is no larger than 4: time out interval::To←日(2 klog n) r.Every node can obtain its own geographical location,e.g. 5: set the node sequence as {R via the Global Position System (GPS).However,nodes do 6: total hop:H←0 not know the exact location of other nodes until they receive 7: path length:p←0 messages containing that piece of information. 8: forward the request message to its neighborhood 9 while no response is received when time is out for the TABLE 3.1:System Parameter threquest session do 10: k←k+1 number of all nodes in the network 11 re←2r m number of receivers 12: Tk+1←2Tk transmission range 13: set the node sequence as {R 14: H←0 Te search coverage range 15: p←0 Lv length of temporary tree 16: forward the request message to its neighborhood LM length of multicast tree 17: end while 18:end for IV.ALGORITHM In this section,we describe our Toward Source Tree algo- rithm in detail.This algorithm consists of three phases.In the a message containing all receivers'identifiers is sent from the first phase,the source broadcasts a message and wakes up all source so that all nodes in the network can be aware whether receivers it chooses.In the second phase,every receiver choos- they are receivers.Upon receiving this message,receivers then es the closest neighboring receiver that has shorter Euclidean wake up,label themselves with"R"and be ready to participate distances to the source node than the receiver node itself.and in the multicast routing.The source will also specify its own then a"temporary tree"can be established among all receivers. location in this message. However"temporary tree"is a virtual topology since multicast This step is necessary for multicast routing since no one group members may not be connected directly to each other except the source knows which nodes the messages are desti- given limited transmission range.We select appropriate relays nated for.In this phase,the broadcast information will notify to keep these members connected while controlling the tree the nodes who are selected into the multicast group,and all length.However,till the end of this stage cycles might exist. receivers will be awakened. Hence we eliminate these cycles to further reduce tree length and avoid redundant transmissions in the third phase.In what follows we describe the process in detail,and we will use an B.Phase 2:Connecting All Receivers example to illustrate how to generate such a tree at the end of In this phase,we first build a"temporary tree"consisting of this section. only the multicast group members,and then find the minimum- hop shortest path between each pair of members that are A.Phase 1:Identifying Receivers directly connected in the "temporary tree".All multicast group Each node has a label indicating its role in the multicast tree: members will be connected with the newly added relays. “S”stands for the source and“R”for receivers.in this phase, Step 1:Searching Receivers in the Neighorhood
3 TABLE 2.1: Comparison of Distributed Algorithm for Approximate Steiner Construction Algorithm Expected tree length Expected time Expected messages Assumptions SP H [15] 2-approximation Steiner tree, O( √ m) O(m ➮ n log n ) O(mn) Known shortest paths; Pointto-point network. KSP G [15] O(m ➮ n log n ) O(mn) ADH [16] O(m ➮ n log n ) O(n log n + mn) DA [17] O( √ m) O(n 2 ) O(mn2 ) Unknown shortest paths; Applied in Point-to-point network. T ST O( √ m) O( √ n log n) O(n) Unknown shortest paths; Applied in Wireless network. To ensure the connectivity of the whole network, we set the transmission range r = Θ ✏➮log n n ✑ [25]. For all nodes, r is the same and fixed. We assume that two nodes u and v can communicate with each other directly if and only if the Euclidean distance between them, duv, is no larger than r. Every node can obtain its own geographical location, e.g., via the Global Position System (GPS). However, nodes do not know the exact location of other nodes until they receive messages containing that piece of information. TABLE 3.1: System Parameter n number of all nodes in the network m number of receivers r transmission range rc search coverage range LV length of temporary tree LM length of multicast tree IV. ALGORITHM In this section, we describe our Toward Source Tree algorithm in detail. This algorithm consists of three phases. In the first phase, the source broadcasts a message and wakes up all receivers it chooses. In the second phase, every receiver chooses the closest neighboring receiver that has shorter Euclidean distances to the source node than the receiver node itself, and then a “temporary tree” can be established among all receivers. However “temporary tree” is a virtual topology since multicast group members may not be connected directly to each other given limited transmission range. We select appropriate relays to keep these members connected while controlling the tree length. However, till the end of this stage cycles might exist. Hence we eliminate these cycles to further reduce tree length and avoid redundant transmissions in the third phase. In what follows we describe the process in detail, and we will use an example to illustrate how to generate such a tree at the end of this section. A. Phase 1: Identifying Receivers Each node has a label indicating its role in the multicast tree: “S” stands for the source and “R” for receivers. In this phase, Algorithm 1 Neighbor Request from Multicast Members 1: for all receiver R in a multicast group do 2: the number of request session: k ← 0 3: coverage range: rc ← r 4: time out interval: T0 ← Θ ⑨ 2 k log n ❾ 5: set the node sequence as {R} 6: total hop: H ← 0 7: path length: p ← 0 8: forward the request message to its neighborhood 9: while no response is received when time is out for the k th request session do 10: k ← k + 1 11: rc ← 2 k r 12: Tk+1 ← 2Tk 13: set the node sequence as {R} 14: H ← 0 15: p ← 0 16: forward the request message to its neighborhood 17: end while 18: end for a message containing all receivers’ identifiers is sent from the source so that all nodes in the network can be aware whether they are receivers. Upon receiving this message, receivers then wake up, label themselves with “R” and be ready to participate in the multicast routing. The source will also specify its own location in this message. This step is necessary for multicast routing since no one except the source knows which nodes the messages are destinated for. In this phase, the broadcast information will notify the nodes who are selected into the multicast group, and all receivers will be awakened. B. Phase 2: Connecting All Receivers In this phase, we first build a “temporary tree” consisting of only the multicast group members, and then find the minimumhop shortest path between each pair of members that are directly connected in the “temporary tree”. All multicast group members will be connected with the newly added relays. Step 1: Searching Receivers in the Neighorhood
In this step,each multicast member chooses an appropriate the multicast group.At the same time,all relay nodes on neighbor to connect to.The neighboring member selection the minimum-hop shortest path record this pair of members, criteria is:each member chooses the closest one from the set previous hop and the next hop on the path.When all receivers of members that have shorter Euclidean distances to the source send the connect message,a "temporary tree"among all node than this node itself.If no such neighboring member can mutlicast group members including the source is constructed. be found,then this multicast member directly connects to the source. Algorithm 2 Request Forwarding When a member tries to contact its neighbor members. 1:for all node u receiving request message do it is regarded as the sender that sends request message.Its 2: dist location of u-sender location form is:.Sender id is used to identify the multicast members 5 H←H+1 sending the request message,and path length can be updated 6: newDist =location of u-location of previous hop with the location of previous hop and current hop.The coverage range re sets the range within which the multicast 7: p←-p+newDist member searches for its neighboring members.The Euclidean 果 forward the request message to its neighborhood distance between the sender and current node can be calculated if u is in the multicast group then with sender location,and messages will be discarded if the 10: nodeSourceDist =location of u-source location distance is larger than r.Node sequence records in order the nodes through which this message has passed,which acts as 11: senderSourceDist =sender location-source a guide for response from neighboring receivers so that the location response can be routed via the available path.The hop count 12: if nodeSourceDist i senderSourceDist then H is the number of hops the message has passed through,and 13: send respond message back to the sender p is the path length the message have been through when it 14: end if reaches the current node. 15: end if In each search session,the member broadcasts the request 16: end if message within search coverage range.The sender sets an 17:end for appropriate timeout interval.Once the sender receives replies from neighboring nodes,the search session terminates.Then it enters step 2.However,if time runs out and no reply is C.Phase 3:Eliminating Cycles obtained,it means that no appropriate neighboring members are found.The sender then doubles its search range and initi- In Phase 2,we construct a "temporary tree"made up of multicast group members.However,when other nodes are ates another search.In Algorithm 1,we show how a multicast group member connects to their neighboring members or the added to it as relays,cycles might be formed.In particular, when paths connecting different pairs of multicast members source. A node may receive more than one request message from share the same relay nodes,such node may receive redundant information,which indicates that cycles come into being the same sender.If it is within coverage range,it will choose Therefore,we check the existence of cycles in this phase and the one with the fewest hops among all the messages.If the eliminate them if any. numbers of hops are the same,it picks out the message with the shortest path length.Then it modifies this message.It adds Suppose a node u acts as a relay for k (>1)pairs of nodes itself to the node sequence,increases the hop count by 1 and in the multicast group,which are directly connected in the tem- calculates new path length given the location of the previous porary tree,denoted as (R11,R12).(R21,R22).....(Rk1,Rk2). Let us assume that in each pair,Ri is closer to source than hop.With these information updated,it forwards the message. Algorithm 2 describes how nodes deal with request messages Ri2 (1,廿R1≠R1.Last, id,node sequence,total hop H,path length p>.The respond it sends "Eliminate message "and its previous hops delete message can be routed with the path information provided by unnecessary edges accordingly.In Algorithm 3,we show how to wipe out the cycles. the node sequence. Step 2:Connecting to the Nearest Neighbor With respond messages,every member selects the closest D.Proof of Tree Topology neighbor.Once a neighbor is chosen,the connect message is The previous subsections describe how we can connect forwarded via the minimum-hop shortest path.The connect mutlicast group members using our TST algorithm.Now let message is used to establish a connection between nodes in us prove that the topology constructed by TST algorithm is
4 In this step, each multicast member chooses an appropriate neighbor to connect to. The neighboring member selection criteria is: each member chooses the closest one from the set of members that have shorter Euclidean distances to the source node than this node itself. If no such neighboring member can be found, then this multicast member directly connects to the source. When a member tries to contact its neighbor members, it is regarded as the sender that sends request message. Its form is: . Sender id is used to identify the multicast members sending the request message, and path length can be updated with the location of previous hop and current hop. The coverage range rc sets the range within which the multicast member searches for its neighboring members. The Euclidean distance between the sender and current node can be calculated with sender location, and messages will be discarded if the distance is larger than rc. Node sequence records in order the nodes through which this message has passed, which acts as a guide for response from neighboring receivers so that the response can be routed via the available path. The hop count H is the number of hops the message has passed through, and p is the path length the message have been through when it reaches the current node. In each search session, the member broadcasts the request message within search coverage range. The sender sets an appropriate timeout interval. Once the sender receives replies from neighboring nodes, the search session terminates. Then it enters step 2. However, if time runs out and no reply is obtained, it means that no appropriate neighboring members are found. The sender then doubles its search range and initiates another search. In Algorithm 1, we show how a multicast group member connects to their neighboring members or the source. A node may receive more than one request message from the same sender. If it is within coverage range, it will choose the one with the fewest hops among all the messages. If the numbers of hops are the same, it picks out the message with the shortest path length. Then it modifies this message. It adds itself to the node sequence, increases the hop count by 1 and calculates new path length given the location of the previous hop. With these information updated, it forwards the message. Algorithm 2 describes how nodes deal with request messages in detail. When a multicast member finds it closer to the source than the sender of the request message, it might be chosen as the neighbor by the sender. Therefore, this member will choose a path to the sender and respond with the respond message. The form of the respond message is: . The respond message can be routed with the path information provided by the node sequence. Step 2: Connecting to the Nearest Neighbor With respond messages, every member selects the closest neighbor. Once a neighbor is chosen, the connect message is forwarded via the minimum-hop shortest path. The connect message is used to establish a connection between nodes in the multicast group. At the same time, all relay nodes on the minimum-hop shortest path record this pair of members, previous hop and the next hop on the path. When all receivers send the connect message, a “temporary tree” among all mutlicast group members including the source is constructed. Algorithm 2 Request Forwarding 1: for all node u receiving request message do 2: dist = k location of u - sender location k 3: if dist 1) pairs of nodes in the multicast group, which are directly connected in the temporary tree, denoted as (R11, R12), (R21, R22),..., (Rk1, Rk2). Let us assume that in each pair, Ri1 is closer to source than Ri2 (1 ≤ i ≤ k). A relay stores its previous and the next hop of the path from Ri1 to Ri2, and they are denoted as P Hi and NHi respectively. Then it chooses one pair randomly, say, (Rj1, Rj2) and keeps the information: (Rj1, Rj2, P Hj , NHj ). For other pairs (Ri1, Ri2) where Ri1 6= Rj1, the relay modifies their information as (Rj1, Ri2, P Hj , NHi). Define a set Q, where Q = {q | q =, ∀ Ri1 6= Rj1}. Last, it sends “Eliminate message Q” and its previous hops delete unnecessary edges accordingly. In Algorithm 3, we show how to wipe out the cycles. D. Proof of Tree Topology The previous subsections describe how we can connect mutlicast group members using our TST algorithm. Now let us prove that the topology constructed by TST algorithm is
exactly a tree.We first show that temporary tree formed in 1(a).Solid nodes represent source nodes labeled by "S",or the second phase has a tree topology in Lemma 1.But relays multicast members labeled by "R".The hollow nodes can are added into the temporary tree to connect receivers,which be chosen as relays.The first step is to build a temporary might result in the existence of cycles.In Lemma 2 we show tree spanning all multicast members.The dashed lines denote that cycle elimination can in fact guarantee the tree topology. virtual connections between two members.Then nodes on the minimal-hop shortest path are engaged as relays between two Lemma 1:The temporary tree connecting m receivers has neighboring members.They form the topology as shown in a tree topology. Figure 1(b).Note that there exists a cycle marked with dotted Proof:Assume that each wireless node is identified with rectangular box.The last step is to eliminate unnecessary a unique label ni.We order these nodes based on their edges as is done in Figure 1(c).Finally we obtain the multicast distance from the source node,and we have an ordered set: tree as is shown in Figure 1(d). (n1,n2,...,nm}such that nj is closer to the source than ni for 1 5:end for 6:end for 7:for all node w receiving "Eliminate message Qi"do 8: if w is exactly PH;then 9: if w is not in the multicast group then 10: PHi+previous hop of w on path (Ri1,Ri2) 11: forward the modified Qi to the previous hop 12: eliminate information:(Ri,Ri2,PHi,NHi) 13: end if (c)Eliminating redundant edges and (d)Constructing the multicast tree 14 end if maintaining the topology of tree with relays added 15:end for Fig.4.1:Steps of the TST algorithm Based on Lemma 1,we have the following lemma: Lemma 2:The topology connecting nodes generated by TST V.LENGTH ANALYSIS algorithm is a tree that spans all multicast group members. The previous section described our Toward Source Tree Proof:The existence of cycles means that some nodes in algorithm.In the next three sections,we will discuss its the multicast tree may receive redundant messages,i.e.,some performance in terms of tree length,time complexity,and nodes have more than one previous hop.For these nodes,they energy efficiency.In this section,we discuss the length of TST. send "Eliminate messages"and ensure that they have only one We first obtain the length of temporary tree,first assuming previous hop.When all nodes in the multicast tree have only uniform distribution nodes and then extending to a general one previous hop,no cycle exists. setting.Next we explore the length of minimal-hop path that Multiple previous hops also indicate that multiple paths connects two receivers.Combining the length of temporary may exist between two nodes.Once some previous hops tree and the path,we can derive the upper bound for the are unnecessary,the paths involving these hops can also be multicast tree length. eliminated.Thus Algorithm 3 can eliminate these unnecessary paths,and this completes our proof. A.Temporary Tree in Uniform Distribution We start by discussing the tree length of the temporary tree. Lemma 3:Assume nodes are uniformly distributed in a unit E.Illustration square.The expected length of the temporary tree spanning m We use an example to illustrate our TST algorithm in Figure receivers is upper bounded by cvm,where c=5.622. 4.1.Nodes are distributed in the unit square as shown in Figure Proof:See Appendix A
5 exactly a tree. We first show that temporary tree formed in the second phase has a tree topology in Lemma 1. But relays are added into the temporary tree to connect receivers, which might result in the existence of cycles. In Lemma 2 we show that cycle elimination can in fact guarantee the tree topology. Lemma 1: The temporary tree connecting m receivers has a tree topology. Proof: Assume that each wireless node is identified with a unique label ni . We order these nodes based on their distance from the source node, and we have an ordered set: {n1, n2, ..., nm} such that nj is closer to the source than ni for 1 ≤ j 5: end for 6: end for 7: for all node w receiving “Eliminate message Qi” do 8: if w is exactly P Hi then 9: if w is not in the multicast group then 10: P Hi ← previous hop of w on path (Ri1, Ri2) 11: forward the modified Qi to the previous hop 12: eliminate information: (Ri1, Ri2, P Hi , NHi) 13: end if 14: end if 15: end for Based on Lemma 1, we have the following lemma: Lemma 2: The topology connecting nodes generated by TST algorithm is a tree that spans all multicast group members. Proof: The existence of cycles means that some nodes in the multicast tree may receive redundant messages, i.e., some nodes have more than one previous hop. For these nodes, they send “Eliminate messages” and ensure that they have only one previous hop. When all nodes in the multicast tree have only one previous hop, no cycle exists. Multiple previous hops also indicate that multiple paths may exist between two nodes. Once some previous hops are unnecessary, the paths involving these hops can also be eliminated. Thus Algorithm 3 can eliminate these unnecessary paths, and this completes our proof. E. Illustration We use an example to illustrate our TST algorithm in Figure 4.1. Nodes are distributed in the unit square as shown in Figure 1(a). Solid nodes represent source nodes labeled by “S”, or multicast members labeled by “R”. The hollow nodes can be chosen as relays. The first step is to build a temporary tree spanning all multicast members. The dashed lines denote virtual connections between two members. Then nodes on the minimal-hop shortest path are engaged as relays between two neighboring members. They form the topology as shown in Figure 1(b). Note that there exists a cycle marked with dotted rectangular box. The last step is to eliminate unnecessary edges as is done in Figure 1(c). Finally we obtain the multicast tree as is shown in Figure 1(d). R R R R R R R R R S (a) Building a temporary tree spanning multicast group members (b) Adding relay nodes (c) Eliminating redundant edges and maintaining the topology of tree (d) Constructing the multicast tree with relays added Fig. 4.1: Steps of the TST algorithm V. LENGTH ANALYSIS The previous section described our Toward Source Tree algorithm. In the next three sections, we will discuss its performance in terms of tree length, time complexity, and energy efficiency. In this section, we discuss the length of TST. We first obtain the length of temporary tree, first assuming uniform distribution nodes and then extending to a general setting. Next we explore the length of minimal-hop path that connects two receivers. Combining the length of temporary tree and the path, we can derive the upper bound for the multicast tree length. A. Temporary Tree in Uniform Distribution We start by discussing the tree length of the temporary tree. Lemma 3: Assume nodes are uniformly distributed in a unit square. The expected length of the temporary tree spanning m receivers is upper bounded by c √ m, where c = 5.622. Proof: See Appendix A
6 B.Temporary Tree in General Distribution For a node R,we use NR to denote the regions where nodes might be selected by R as a its neighbor.We use a rectangular Based on the conclusions of tree length in uniform distribu- region as approximate neighbor region N.and N C NR. tion,we further study the case that nodes are non-uniformly The approximate neighbor region is the region marked with distributed.We partition the unit square into k small squares, where m =k1+7 and 0. connect to the closest neighbor that has shorter Euclidean We claim that if a-<,the minimal length between two distance to S than the node itself.If no such receiver exists. adjacent cells is in an order of This comes from the it doesn't connect to other nodes.A tree can be constructed observation that we can connect adjacent cells by connecting among m nodes,and the expected length of such a tree is upper bounded by cvm,where c=5.622. nodes in adjacent grids whose edge length is as is shown in Figure 5.2.In this figure,the yellow and the black squares Proof:For those nodes that not closest to S,they can always are two adjacent cells with edge length of The blue grids find another node to connect to.For the node that is closest contained in them are the smaller squares with edge length of to S,it will be connected to by other nodes.We can prove Green lines are used to show that nodes in the adjacent that the topology formed by m nodes is exactly a tree with grids are connected. Lemma 1.We denote this tree as T. As we can see from Figure 5.2,for two adjacent cells with There are two differences of this lemma from Lemma 3. One is that the source is located outside the square,and the edge length of12 pairs of nodes in adjacent grids might exist.Denote P as the probability that a node exists other is that a node won't connect to others when it can't find another one that has shorter Euclidean distance to the source. in a grid with edge length of 1/k.Since the area of each Now we find a point S'that is closest to S in the boundary square is very small,we can regard nodes in the same square uniformly distributed.We have of square region,as is shown in Figure 5.1.With S as the source,a temporary tree as mentioned in TST algorithm can be established spanning all nodes in the network.We denote A=1-1-学 the temporary tree as T.In the following we demonstrate that Denote P2 as the probability that nodes exist in both of the the tree length of T can be used to estimate the upper bound adjacent grids of length of T. P2=1-P
6 B. Temporary Tree in General Distribution Based on the conclusions of tree length in uniform distribution, we further study the case that nodes are non-uniformly distributed. We partition the unit square into k small squares, where m = k 1+γ and 0 1 2 . We claim that if α−γ < 1 2 , the minimal length between two adjacent cells is in an order of o ⑨ √ 1 k ❾ . This comes from the observation that we can connect adjacent cells by connecting nodes in adjacent grids whose edge length is 1 kα , as is shown in Figure 5.2. In this figure, the yellow and the black squares are two adjacent cells with edge length of √ 1 k . The blue grids contained in them are the smaller squares with edge length of 1 kα . Green lines are used to show that nodes in the adjacent grids are connected. As we can see from Figure 5.2, for two adjacent cells with edge length of √ 1 k , k α−1/2 pairs of nodes in adjacent grids might exist. Denote P1 as the probability that a node exists in a grid with edge length of 1/kα. Since the area of each square is very small, we can regard nodes in the same square uniformly distributed. We have P1 = 1 − (1 − 1 k 2α−1 ) m1 k . Denote P2 as the probability that nodes exist in both of the adjacent grids. P2 = 1 − P 2 1
7 There are k-1/2 pairs of nodes in adjacent grids,and we C.Path With Minimal Hops denote P as the probability that at least one pair exist.We Receivers are connected by the minimal-hop path.In this have P=1-P5-a part,we study the relationship between the path length and Euclidean distance between two nodes. Hence we have Lemma 7:Let n nodes be independently and identically distributed over [0,1]x [0,1]with distribution function f(x). Suppose that the Euclidean distance between two nodes u and P-1-1-1-1-)学-w (5-1)vis z.The following properties hold: (a)The expectation of fewest relays that are needed to In order to let k squares connected by inter-square edges,it connect u and v converges to as n approaches oo; should hold that pk1.Therefore,we need the following (b)The length expectation of the path connecting uv and in- condition volving the fewest relays converges to Euclidean distance 1-k72(1-(1-a)学尺 1 (5-2) T. Proof:See Appendix A. The expression that r1()>r2()means that r2()/r1() 0 as k-oo.Condition (5-2)is equivalent to condition(5-3). D.Multicast Tree logk 1 k-1/2+y+ae1 ←2a-i (5-3) We divide the [0,1]x [0,1]network region into k squares.In each square,we construct a tree and connect nodes with intra- Condition(5-3)can be satisfied when aE(i).And the path length inter-square connection is in the order of o(vm). et,∈Tv converges to Euclidean distance as network size goes to oo according to Lemma 7.So we have: 1/k E(LM)≤cvm Vf(x)dx (5-6) Jx∈0.1]2 Remark:We derive an upper bound for the Toward Source Tree,but it is not a tight bound.In Section VIlI,we will show that TST algorithm has even better empirical performance than our theoretical bound. Lemma 8:Suppose Xi,1<i<oo,are independent random variables with distribution u having compact support in Rd, d 2.If the monotone function satisfies ()~x as x0 for some 0<a<d,then with probability 1 1/2 1/k/2 imn(do)/4M()=c(a,d) f(x)(d-a)/ddx Fig.5.2:Inter-square edges between nodes in adjacent square cells (5-7) Here f denotes the density of the absolutely continuous part of u and c(o,d)denotes a strictly positive constant which Lemma 6:Let m nodes be independently distributed with density function f(x).The expectation for the total length of depends only on the power a and the dimension d [26]. temporary tree E[Lv]is smaller than cvm.We have E[Lv]< Given a graph with some nodes and edges,building a minimal length tree spanning a subset of nodes with relays cvm feto.f(x)dx.where c5.622. appropriately added is formulated as Steiner Tree Problem.If Proof:See Appendix A. no relay nodes are allowed,then the tree with minimal length
7 There are k α−1/2 pairs of nodes in adjacent grids, and we denote P as the probability that at least one pair exist. We have P = 1 − P k α−1/2 2 . Hence we have P = 1 − (1 − (1 − (1 − 1 k 2α−1 ) m1 k ) 2 ) k α−1/2 (5-1) In order to let k squares connected by inter-square edges, it should hold that P k → 1. Therefore, we need the following condition 1 − k − 1 kα−1/2 (1 − (1 − 1 k 2α−1 ) m1 k ) 2 . (5-2) The expression that r1(k) r2(k) means that r2(k)/r1(k) → 0 as k → ∞. Condition (5-2) is equivalent to condition (5-3). log k k−1/2+γ+α1 1 k 2α−1 , (5-3) Condition (5-3) can be satisfied when α < γ + 1 2 . With 1 2 < α < γ + 1 2 , we can evaluate P. By (5-1) it can be verified that P ∼ 1 − exp(−21k 1/2−α+γ − 1 log 2k 1/2−α ). (5-4) It is easy to show that P k → 1 with the expression (5-4), which means such pairs of nodes exist for all adjacent cells with high probability. Since (1−P) is exponentially decaying to zero, the expectation of total path length needed to connect k cells is k 1−αP k + k 1/2 k(1 − P) ∼ k 1−α = o(k 1/2 ) (5-5) Due to the fact that k = o(m), the expected path length for inter-square connection is in the order of o( √ m). 1/kα 1/k1/2 1/k1/2 Fig. 5.2: Inter-square edges between nodes in adjacent square cells Lemma 6: Let m nodes be independently distributed with density function f(x). The expectation for the total length of temporary tree E[LV ] is smaller than c √ m. We have E[LV ] ≤ c √ m ❘ x∈[0,1]2 ➮ f(x)dx, where c ≈ 5.622. Proof: See Appendix A. C. Path With Minimal Hops Receivers are connected by the minimal-hop path. In this part, we study the relationship between the path length and Euclidean distance between two nodes. Lemma 7: Let n nodes be independently and identically distributed over [0, 1] × [0, 1] with distribution function f(x). Suppose that the Euclidean distance between two nodes u and v is x. The following properties hold: (a) The expectation of fewest relays that are needed to connect u and v converges to x r as n approaches ∞; (b) The length expectation of the path connecting uv and involving the fewest relays converges to Euclidean distance x. Proof: See Appendix A. D. Multicast Tree We divide the [0, 1]×[0, 1] network region into k squares. In each square, we construct a tree and connect nodes with intrasquare edges. Adjacent squares are connected by the intersquare edges. All nodes are connected by intra- and intersquare edges, and they can be used to estimate the tree length. Theorem 1: Let n nodes be independently and identically distributed in a unit square and their distribution satisfies the density function f(x). We construct a multicast tree spanning m receivers as well as the source with TST algorithm. When m and n are both very large, the expected length of the tree is upper bounded by c √ m ❘ x∈[0,1]2 ➮ f(x)dx, where c = 5.622. Proof: Denote ei,j as the edge connecting Receivers i and j in the temporary tree TV , li,j as the length of the minimal-hop path between the two receivers. Since redundant edges will be eliminated, E(LM) ≤ P ei,j∈TV E(li,j ). And the path length converges to Euclidean distance as network size goes to ∞ according to Lemma 7. So we have: E(LM) ≤ c √ m ❩ x∈[0,1]2 ➮ f(x)dx. (5-6) Remark: We derive an upper bound for the Toward Source Tree, but it is not a tight bound. In Section VIII, we will show that TST algorithm has even better empirical performance than our theoretical bound. Lemma 8: Suppose Xi , 1 ≤ i < ∞, are independent random variables with distribution µ having compact support in R d , d ≥ 2. If the monotone function ψ satisfies ψ(x) ∼ x α as x → 0 for some 0 < α < d, then with probability 1 limn→∞ n −(d−α)/dM(X1X2, ..., Xn) = c(α, d) ❩ Rd f(x) (d−α)/ddx (5-7) Here f denotes the density of the absolutely continuous part of µ and c(α, d) denotes a strictly positive constant which depends only on the power α and the dimension d [26]. Given a graph with some nodes and edges, building a minimal length tree spanning a subset of nodes with relays appropriately added is formulated as Steiner Tree Problem. If no relay nodes are allowed, then the tree with minimal length
is called minimal spanning tree.However,Steiner tree can Theorem 2:Let n nodes be independently and identically only optimize tree length by a constant ratio compared with distributed in unit square.The running time of TST algorithm the minimal spanning tree. isO(√n log n). Lemma 9:Let P be a set of n points on the Euclidean plane. Proof:There are three serial phases in TST algorithm.so we Let l(P)and Im(P)denote the lengths of the Steiner mini- discuss the time cost of each phase one by one. mum tree and the minimum spanning tree on P respectively. In Phase 1,the messages containing location information of The inequality holds:[28] the source are broadcast in the network.The furthest distance 9≥9.m between the source and another node is O(1),so at most (5-8) (relays are needed for a message to reach one node. In expectation,there are mr2e2n =O(nr2)nodes within Combining the two lemmas above,we can conclude that transmission range of a node and hence a node has to wait for the length of Steiner tree spanning m receivers is: O(nr2)time slots to transmit a message.The time needed Ls≥3 for Phase 1 is: ivm f(x)dx (5-9) J0.12 E(t1)≤O(nr) (6-1) Here c is the constant equal to c(1,2)mentioned in Lemma 8.Roberts estimated that c1 0.656 [27]. In Phase 2,the dominant time cost is searching for neigh- From (5-6)and(5-9),we prove that the length of Toward boring receivers.In thesearch session,the coverage range Source Tree is in the same order as that of Steiner tree,and is 2kr.We need O(k)relays to forward request messages the difference between them is only a constant ratio no larger from one receiver to any other nodes within its search coverage than 10. range. Since t the coverage range does not exceed√2,the Remark 1:Many works have shown the lengths of any number of search sessions cannot be more than log2 well-designed spanning tree consisting of m nodes can reach the order of O(vm),but the order-optimality of our Toward Source Tree in tree length is not a trivial result.TST is not E(t2)≤ 2'nr ≤O(nr). (6-2) a simple spanning tree of m multicast group members,since other relay nodes must be selected carefully and be involved in multicasting due to the limited transmission range of wireless In Phase 3,the worst case is that relays on the path whose sensors.Hence minimum multicast tree is formulated as a length is O(1)form cycles.Time for cycle elimination is Steiner tree problem instead of minimum spanning tree prob- lem.Constructing a minimum spanning tree takes polynomial E(ta)=O(nr2 =O(nr) (6-3) time,while constructing a Steiner tree is proved to be NP- hard.Asymptotic tree length of TST is a result based on The total running time is E(t)= CE(ti),so we have quanlitative analysis of this hard problem in graph theory.As =1 far as we know,few works have given asymptotic length of their multicast trees. E(t)≤O(Vn log n) (6-4) One may also doubt that when the network is dense,we can choose infinitely many relay nodes such that the path which completes our proof. 1 length between two multicast members approaches their Eu- Remark:For any algorithm to construct a multicast tree clidean distance.In this way,the tree can also reach the among a group of nodes,broadcast in Phase 1 is necessary. order-optimality in length.However,we should notice that it Since no node has a knowledge of the multicast group except would bring terribly long delay and large energy consumption the source,such information has to be forwarded to every since too many extra nodes are involved in multicasting.The node in the network so that they can know whether they technique to balance the tree length and delay is that we choose should participate in multicasting.The lower bound of time minimum-hop shortest path between multicast members,as is for multicast tree construction is).Since TST described in Section IV. achieves the time complexity upper bounded by O(vn logn). The minimum length multicast tree has potential benefits of the minimal time cost to construct a multicast tree is also energy efficiency and delay reduction.The asymptotic length upper bounded by O(vn logn).Hence the time complexity of TST not only indicates its order-optimality,but also shows of TST algorithm shares the same upper and lower bounds as that TST is quite a good approximation of Steiner tree since the minimal time cost,and the ratio between these two bounds the approximation ratio is no larger than 10. is only O(logn). The length of multicast trees have a great influence on communication quality in terms of transmission delay and VI.TIME ANALYSIS wireless interference.Construction of minimum-length trees is Time efficiency is another important aspect to evaluate an NP-hard problem,and takes exponential time.Approximate the quality of multicast routing algorithms.In practice,it is algorithms achieve larger tree length with lower time com- expected that the multicast tree can be constructed with small plexity.Now we explore the relationship between tree length time costs.Now let us derive the time complexity of TST. and time complexity in Figure 6.1.Since the lower bound of
8 is called minimal spanning tree. However, Steiner tree can only optimize tree length by a constant ratio compared with the minimal spanning tree. Lemma 9: Let P be a set of n points on the Euclidean plane. Let ls(P) and lm(P) denote the lengths of the Steiner minimum tree and the minimum spanning tree on P respectively. The inequality holds: [28] ls(P) ≥ √ 3 2 lm(P) (5-8) Combining the two lemmas above, we can conclude that the length of Steiner tree spanning m receivers is: LST ≥ √ 3 2 c1 √ m ❩ [0,1]2 ➮ f(x)dx (5-9) Here c1 is the constant equal to c(1, 2) mentioned in Lemma 8. Roberts estimated that c1 = 0.656 [27]. From (5-6) and (5-9), we prove that the length of Toward Source Tree is in the same order as that of Steiner tree, and the difference between them is only a constant ratio no larger than 10. Remark 1: Many works have shown the lengths of any well-designed spanning tree consisting of m nodes can reach the order of O( √ m), but the order-optimality of our Toward Source Tree in tree length is not a trivial result. TST is not a simple spanning tree of m multicast group members, since other relay nodes must be selected carefully and be involved in multicasting due to the limited transmission range of wireless sensors. Hence minimum multicast tree is formulated as a Steiner tree problem instead of minimum spanning tree problem. Constructing a minimum spanning tree takes polynomial time, while constructing a Steiner tree is proved to be NPhard. Asymptotic tree length of TST is a result based on quanlitative analysis of this hard problem in graph theory. As far as we know, few works have given asymptotic length of their multicast trees. One may also doubt that when the network is dense, we can choose infinitely many relay nodes such that the path length between two multicast members approaches their Euclidean distance. In this way, the tree can also reach the order-optimality in length. However, we should notice that it would bring terribly long delay and large energy consumption since too many extra nodes are involved in multicasting. The technique to balance the tree length and delay is that we choose minimum-hop shortest path between multicast members, as is described in Section IV. The minimum length multicast tree has potential benefits of energy efficiency and delay reduction. The asymptotic length of TST not only indicates its order-optimality, but also shows that TST is quite a good approximation of Steiner tree since the approximation ratio is no larger than 10. VI. TIME ANALYSIS Time efficiency is another important aspect to evaluate the quality of multicast routing algorithms. In practice, it is expected that the multicast tree can be constructed with small time costs. Now let us derive the time complexity of TST. Theorem 2: Let n nodes be independently and identically distributed in unit square. The running time of TST algorithm is O( √ n log n). Proof: There are three serial phases in TST algorithm, so we discuss the time cost of each phase one by one. In Phase 1, the messages containing location information of the source are broadcast in the network. The furthest distance between the source and another node is O(1), so at most O ⑨ 1 r ❾ relays are needed for a message to reach one node. In expectation, there are πr2 2n = O nr2 ✁ nodes within transmission range of a node and hence a node has to wait for O nr2 ✁ time slots to transmit a message. The time needed for Phase 1 is: O ❶r 1 r ➀ ≤ E(t1) ≤ O (nr). (6-1) In Phase 2, the dominant time cost is searching for neighboring receivers. In the k th search session, the coverage range is 2 k r. We need O ⑨ 2 k ❾ relays to forward request messages from one receiver to any other nodes within its search coverage range. Since the coverage range does not exceed √ 2, the number of search sessions cannot be more than ➔ log2 √ 2 r ↔ . E(t2) ≤ O ❹✝ log2 √ 2 r ✞ ❳ i=0 2 inr2 ➃ ≤ O(nr). (6-2) In Phase 3, the worst case is that relays on the path whose length is O(1) form cycles. Time for cycle elimination is E(t3) = O ⑩ 1 r nr2 ❿ = O(nr). (6-3) The total running time is E(t) = i P =3 i=1 E(ti), so we have O ⑩➱ n log n ❿ ≤ E(t) ≤ O( ➮ n log n). (6-4) which completes our proof. Remark: For any algorithm to construct a multicast tree among a group of nodes, broadcast in Phase 1 is necessary. Since no node has a knowledge of the multicast group except the source, such information has to be forwarded to every node in the network so that they can know whether they should participate in multicasting. The lower bound of time for multicast tree construction is O ✏➮ n log n ✑ . Since TST achieves the time complexity upper bounded by O( √ n log n), the minimal time cost to construct a multicast tree is also upper bounded by O( √ n log n). Hence the time complexity of TST algorithm shares the same upper and lower bounds as the minimal time cost, and the ratio between these two bounds is only O(log n). The length of multicast trees have a great influence on communication quality in terms of transmission delay and wireless interference. Construction of minimum-length trees is an NP-hard problem, and takes exponential time. Approximate algorithms achieve larger tree length with lower time complexity. Now we explore the relationship between tree length and time complexity in Figure 6.1. Since the lower bound of
time needed for multicast tree construction is O Theorem 4:Let n nodes be independently and identically the region with time complexity smaller than distributed in the unit square.The number of forwarding nodes in the multicast tree is infeasible.Accurate solution to Steiner tree problem achieves the approximation ratio of 1 at the cost of exponential time, and our algorithm achieves the ratio of 10.The approximation (7-1) ratio of other algorithms like those in Table 2.1 approaches 1 Θ(m): m =W but they have larger time costs. When m=O(n/logn),the number of forwarding nodes is order-optimal. Approximation ratio Proof:Let Ty be the virtual tree,ei.j be an edge in the of tree length virtual tree connecting two receivers i and j,and di.j be the Euclidean distance between them.When m is small,relay nodes form the dominant part of the forwarding nodes in our multicast tree.The total number of transmitting nodes,NTsr 10- in the Toward Source Tree is:Nrsr=e \et,i∈Tv -()As m grows larger,receivers are close to each Infeasible other and thus fewer relay nodes are added.Therefore,re- region ceivers are dominant in the multicast tree,Nrsr =e(m). O(c)Complexity We should discuss the number of forwarding nodes in two O(Vn/log n O(Vn log n) cases,and there exists a critical value for m that determines in Fig.6.1:Relationship between tree length and time complexity which case it should be discussed.The critical value satisfies: 6(-)=6(mc),some=6() Denote Nmin as the minimal number of relay nodes that VII.ENERGY EFFICIENCY are engaged in propagating the messages from one source to m receivers.[3]gives the lower bound of Nmin under Energy is a primary consideration in wireless sensor net- the assumption that all nodes are uniformly distributed.Now works since sensors are battery-powered and their energy we use its method and explore Nmin in the case of gen- is limited.We consider the following factors:1)the energy eral distribution.When m is small,the distance between consumed to construct such multicast trees;and 2)the energy two receivers is large compared with the transmission range. needed to send messages along the tree constructed by this Nin.This lower bound is achievable with our algorithm.The former one is usually measured by the amount algorithm.so Nin().When m is very large,there of exchanged messages to run distributed routing algorithms; exist many receivers within the transmission range of one and the latter directly depends on the number of nodes node,so that one transmission can deliver messages to a large participating in the transmission.We focus on both aspects. number of receivers.In this case,we only need to choose a connected dominating set from m receivers,and Nmin is exactly the size of minimum connected dominating set.We A.Message Complexity will give the definitions of both connected dominating set and The following theorem quantifies the message complexity minimum connected dominating set. in TST. Definition I(Connected dominating set):D is the connected Theorem 3:Let n nodes be independently and identically dominating set of a graph G if and only it satisfies two distributed in the unit square.The message complexity of TST properties: algorithm is O(n). Proof:See Appendix A. Remark:Since each node needs a message telling them (a)Any node in D can reach any other node in D by a path whether they are chosen as receivers,the lower bound of that stays entirely within D. message complexity is O(n).Hence TST algorithm is an (b)Every vertex in G either belongs to D or it is adjacent to order-optimal solution in terms of message complexity. a vertex in D. Definition 2 (Minimum connected dominating set):MD is B.Number of Forwarding Nodes the minimum connected dominating set of graph G if MD is Since the transmission range is fixed,the number of trans- the connected dominating set containing the smallest number of nodes. mitters in the tree determines the energy consumption for in- formation propagation.We evaluate the number of forwarding We still need to discuss Nmin in two cases.There also nodes in this subsection. exists a critical value md,and (md)=(m).so ma=
9 time needed for multicast tree construction is O ✏➮ n log n ✑ , the region with time complexity smaller than O ✏➮ n log n ✑ is infeasible. Accurate solution to Steiner tree problem achieves the approximation ratio of 1 at the cost of exponential time, and our algorithm achieves the ratio of 10. The approximation ratio of other algorithms like those in Table 2.1 approaches 1 but they have larger time costs. O(√n/log n ) O(√n log n) O(c ) Complexity Approximation ratio of tree length 1 10 { Infeasible region n Fig. 6.1: Relationship between tree length and time complexity VII. ENERGY EFFICIENCY Energy is a primary consideration in wireless sensor networks since sensors are battery-powered and their energy is limited. We consider the following factors: 1) the energy consumed to construct such multicast trees; and 2) the energy needed to send messages along the tree constructed by this algorithm. The former one is usually measured by the amount of exchanged messages to run distributed routing algorithms; and the latter directly depends on the number of nodes participating in the transmission. We focus on both aspects. A. Message Complexity The following theorem quantifies the message complexity in TST. Theorem 3: Let n nodes be independently and identically distributed in the unit square. The message complexity of TST algorithm is O(n). Proof: See Appendix A. Remark: Since each node needs a message telling them whether they are chosen as receivers, the lower bound of message complexity is O(n). Hence TST algorithm is an order-optimal solution in terms of message complexity. B. Number of Forwarding Nodes Since the transmission range is fixed, the number of transmitters in the tree determines the energy consumption for information propagation. We evaluate the number of forwarding nodes in this subsection. Theorem 4: Let n nodes be independently and identically distributed in the unit square. The number of forwarding nodes in the multicast tree is NT ST = ✽ ❁ ✿ Θ ✏➮ mn log n ✑ , m = O ⑨ n log n ❾ ; Θ(m), m = ω ⑨ n log n ❾ . (7-1) When m = O (n/ log n), the number of forwarding nodes is order-optimal. Proof: Let TV be the virtual tree, ei,j be an edge in the virtual tree connecting two receivers i and j, and di,j be the Euclidean distance between them. When m is small, relay nodes form the dominant part of the forwarding nodes in our multicast tree. The total number of transmitting nodes, NT ST in the Toward Source Tree is: NT ST = Θ ❶ P ei,j∈TV dij r ➀ = Θ ⑨ √m r ❾ . As m grows larger, receivers are close to each other and thus fewer relay nodes are added. Therefore, receivers are dominant in the multicast tree, NT ST = Θ(m). We should discuss the number of forwarding nodes in two cases, and there exists a critical value for m that determines in which case it should be discussed. The critical value satisfies: Θ ⑨ √mc r ❾ = Θ(mc), so mc = Θ ⑨ 1 r 2 ❾ . Denote Nmin as the minimal number of relay nodes that are engaged in propagating the messages from one source to m receivers. [3] gives the lower bound of Nmin under the assumption that all nodes are uniformly distributed. Now we use its method and explore Nmin in the case of general distribution. When m is small, the distance between two receivers is large compared with the transmission range. Nmin = Ω ⑨ √m r ❾ . This lower bound is achievable with our algorithm, so Nmin = Θ ⑨ √m r ❾ . When m is very large, there exist many receivers within the transmission range of one node, so that one transmission can deliver messages to a large number of receivers. In this case, we only need to choose a connected dominating set from m receivers, and Nmin is exactly the size of minimum connected dominating set. We will give the definitions of both connected dominating set and minimum connected dominating set. Definition 1 (Connected dominating set): D is the connected dominating set of a graph G if and only it satisfies two properties: (a) Any node in D can reach any other node in D by a path that stays entirely within D. (b) Every vertex in G either belongs to D or it is adjacent to a vertex in D. Definition 2 (Minimum connected dominating set): MD is the minimum connected dominating set of graph G if MD is the connected dominating set containing the smallest number of nodes. We still need to discuss Nmin in two cases. There also exists a critical value md, and Θ(md) = Θ ⑨ √md r ❾ , so md =
10 We find that more messages are exchanged when we fix the network size and add more multicast group members.This m=0 result is not so intuitive.On the one hand,multicast group Nmin (7-2) members become closer to each other when the multicast m=w n group size increases,so they need to search for the appropriate neighbors in a smaller coverage range.Fewer nodes are in- From (7-1)and (7-2).we can find when m-(o),the number of forwarding nodes in the multicast tree is optimal quired within the coverage range,and fewer request messages are sent.On the other hand,when a multicast group member in order sense. looks for neighbors,more other members might find they are Remark:When m =w(on).the number of forwarding closer to the source.Hence more response messages might nodes in TST tree may not be order-optimal.However,in be sent back.Total messages increase as more nodes join the graph theory,finding the minimum connected dominating set multicast group. of a given graph is proved to be NP-complete [29].And it Finally,we consider the number of forwarding nodes en- also requires global information of network topology.So we gaged in multicasting.To derive the statistical properties of consider it an acceptable sacrifice of energy to achieve the feasibility and time-efficiency in practice. TST,we set the network size as 100,000.In Figure 1(a),when the multicast group is small,the first part of the curve indicates VIII.SIMULATIONS AND APPLICATIONS that the number of forwarding nodes is O(vm).As there We first perform extensive simulations to evaluate the are more multicast group members,the number of forwarding nodes grows linearly with the group size. empirical performance of Toward Source Tree algorithm.in 2)Non-uniform Distribution:We randomly choose the terms of the length of the multicast tree,message complexity and the number of forwarding nodes engaged in the tree, location of source,x,within the unit square first.For the case of non-uniform distribution,we consider that nodes satisfy the and then present concrete examples of how Toward Source Tree algorithm can be applied to realistic scenarios.In the normal distribution:f(x)=e where x-xsll simulations,we mainly consider two common distribution is the Euclidean distance between the node and the source. patterns:uniform distribution and normal distribution. It is possible that the nodes are scattered outside unit square during simulations.If this happens,we relocate these nodes A.Performance Evaluation until they are within this unit square. 1)Uniform Distribution:We first consider nodes are uni- We evaluate the TST algorithm still in terms of tree length, formly distributed in a unit square and transmission range is set message complexity and the number of forwarding nodes. to ber=We explore the effect of multicast group size We find the results are quite similar to those in the uniform m on the tree length.Assuming that the network size is fixed distribution.In Figure 1(b),the length of TST tree is only as 1000,we obtain the lengths of the Stenier tree and TST a little larger than the optimal length in our simulations. The statistics show that the ratio between them is 1.110 on tree when the value of m varies.The length of the Steiner tree average.As for the message complexity,Figure 2(b)shows can be obtained via NewBossa in [30].Two curves in Figure that more messages are exchanged among more multicast 1(a)describe the relationship between m and the length of the Toward Source Tree as well as the Steiner Tree.It is shown group members or in denser networks,and the quantity of messages is still O(n).As is shown in Figure 1(b),the number that the length of TST tree is larger than that of the Steiner Tree but quite close to it.According to simulation statistics, of transmitting nodes in the multicast tree is linear with vm, the ratio of the tree length achieved by the two algorithms and becomes linear with m when more nodes participate in is 1.114 on average.When nodes are uniformly distributed, the multicast group assuming that the network size keeps unchanged. Toward Source Tree is a good approximation of the Steiner Tree. Then we evaluate the message complexity in the construc- B.Practical Applications tion of TST tree,and explore the relationship among the network size n,the multicast group size m and message TST algorithm can be implemented in practical systems as complexity.We set the network size n =200,600,1000 well as be integrated into sensor network architecture and respectively,and record the quantity of exchanged messages protocol design to improve the network performance.Many when multicast group size m varies.Different curves corre- sensor network systems need multicast transmission as they spond to different network sizes in Figure 2(a).As can be seen involve different kinds of sensors and interactions between in the figure,the quantity of exchanged messages increases various modules such as the LED lighting system in [31], with the multicast group size as well as the network size.It is the localization system in [32]and the building monitoring quite intuitive that the larger network size can result in more system in [33].These make multicast groups naturally form exchanged messages.Since the transmission power necessary within the systems and multicasting through a minimum to maintain the connectivity is less in dense networks than in multicast tree is a desirable way for these energy-constrained sparse networks,more relays are engaged in multicasting as sensor applications.Therefore,we can apply TST algorithm the network size increases.Hence messages used to contact for multicast routing within the aforementioned systems.In nodes and inquire routing information become more. addition,during the tree construction process,TST algorithm
10 Θ ⑨ n log n ❾ . Nmin = ✽ ❁ ✿ Θ ✏➮ mn log n ✑ , m = O ⑨ n log n ❾ ; Ω ⑨ n log n ❾ , m = ω ⑨ n log n ❾ . (7-2) From (7-1) and (7-2), we can find when m = O ⑨ n log n ❾ , the number of forwarding nodes in the multicast tree is optimal in order sense. Remark:When m = ω ⑨ n log n ❾ , the number of forwarding nodes in TST tree may not be order-optimal. However, in graph theory, finding the minimum connected dominating set of a given graph is proved to be NP-complete [29]. And it also requires global information of network topology. So we consider it an acceptable sacrifice of energy to achieve the feasibility and time-efficiency in practice. VIII. SIMULATIONS AND APPLICATIONS We first perform extensive simulations to evaluate the empirical performance of Toward Source Tree algorithm, in terms of the length of the multicast tree, message complexity and the number of forwarding nodes engaged in the tree, and then present concrete examples of how Toward Source Tree algorithm can be applied to realistic scenarios. In the simulations, we mainly consider two common distribution patterns: uniform distribution and normal distribution. A. Performance Evaluation 1) Uniform Distribution: We first consider nodes are uniformly distributed in a unit square and transmission range is set to be r = ➮log n n . We explore the effect of multicast group size m on the tree length. Assuming that the network size is fixed as 1000, we obtain the lengths of the Stenier tree and TST tree when the value of m varies. The length of the Steiner tree can be obtained via NewBossa in [30]. Two curves in Figure 1(a) describe the relationship between m and the length of the Toward Source Tree as well as the Steiner Tree. It is shown that the length of TST tree is larger than that of the Steiner Tree but quite close to it. According to simulation statistics, the ratio of the tree length achieved by the two algorithms is 1.114 on average. When nodes are uniformly distributed, Toward Source Tree is a good approximation of the Steiner Tree. Then we evaluate the message complexity in the construction of TST tree, and explore the relationship among the network size n, the multicast group size m and message complexity. We set the network size n = 200, 600, 1000 respectively, and record the quantity of exchanged messages when multicast group size m varies. Different curves correspond to different network sizes in Figure 2(a). As can be seen in the figure, the quantity of exchanged messages increases with the multicast group size as well as the network size. It is quite intuitive that the larger network size can result in more exchanged messages. Since the transmission power necessary to maintain the connectivity is less in dense networks than in sparse networks, more relays are engaged in multicasting as the network size increases. Hence messages used to contact nodes and inquire routing information become more. We find that more messages are exchanged when we fix the network size and add more multicast group members. This result is not so intuitive. On the one hand, multicast group members become closer to each other when the multicast group size increases, so they need to search for the appropriate neighbors in a smaller coverage range. Fewer nodes are inquired within the coverage range, and fewer request messages are sent. On the other hand, when a multicast group member looks for neighbors, more other members might find they are closer to the source. Hence more response messages might be sent back. Total messages increase as more nodes join the multicast group. Finally, we consider the number of forwarding nodes engaged in multicasting. To derive the statistical properties of TST, we set the network size as 100, 000. In Figure 1(a), when the multicast group is small, the first part of the curve indicates that the number of forwarding nodes is O( √ m). As there are more multicast group members, the number of forwarding nodes grows linearly with the group size. 2) Non-uniform Distribution: We randomly choose the location of source, xs, within the unit square first. For the case of non-uniform distribution, we consider that nodes satisfy the normal distribution: f(x) = √ 1 2π e − kx−xsk 2 2 , where kx − xsk is the Euclidean distance between the node and the source. It is possible that the nodes are scattered outside unit square during simulations. If this happens, we relocate these nodes until they are within this unit square. We evaluate the TST algorithm still in terms of tree length, message complexity and the number of forwarding nodes. We find the results are quite similar to those in the uniform distribution. In Figure 1(b), the length of TST tree is only a little larger than the optimal length in our simulations. The statistics show that the ratio between them is 1.110 on average. As for the message complexity, Figure 2(b) shows that more messages are exchanged among more multicast group members or in denser networks, and the quantity of messages is still O(n). As is shown in Figure 1(b), the number of transmitting nodes in the multicast tree is linear with √ m, and becomes linear with m when more nodes participate in the multicast group assuming that the network size keeps unchanged. B. Practical Applications TST algorithm can be implemented in practical systems as well as be integrated into sensor network architecture and protocol design to improve the network performance. Many sensor network systems need multicast transmission as they involve different kinds of sensors and interactions between various modules such as the LED lighting system in [31], the localization system in [32] and the building monitoring system in [33]. These make multicast groups naturally form within the systems and multicasting through a minimum multicast tree is a desirable way for these energy-constrained sensor applications. Therefore, we can apply TST algorithm for multicast routing within the aforementioned systems. In addition, during the tree construction process, TST algorithm