Hash-Based IP Traceback

Alex C. Snoeren†, Craig Partridge, Luis A. Sanchez‡, Christine E. Jones, Fabrice Tchakountio, Stephen T. Kent, and W. Timothy Strayer

BBN Technologies
10 Moulton Street, Cambridge, MA 02138
{snoeren, craig, cej, ftchakou, kent, strayer}@bbn.com

ABSTRACT

The design of the IP protocol makes it difficult to reliably identify the originator of an IP packet. Even in the absence of any deliberate attempt to disguise a packet’s origin, wide-spread packet forwarding techniques such as NAT and encapsulation may obscure the packet’s true source. Techniques have been developed to determine the source of large packet flows, but, to date, no system has been presented to track individual packets in an efficient, scalable fashion.

We present a hash-based technique for IP traceback that generates audit trails for traffic within the network, and can trace the origin of a single IP packet delivered by the network in the recent past. We demonstrate that the system is effective, space-efficient (requiring approximately 0.5% of the link capacity per unit time in storage), and implementable in current or next-generation routing hardware. We present both analytic and simulation results showing the system’s effectiveness.

1 INTRODUCTION

Today’s Internet infrastructure is extremely vulnerable to motivated and well-equipped attackers. Tools are readily available, from covertly exchanged exploit programs to publicly released vulnerability assessment software, to degrade performance or even disable vital network services. The consequences are serious and, increasingly, financially disastrous, as can be seen by all-too-frequent headlines naming the most recent victim of an attack.

†Alex C. Snoeren is also with the MIT Laboratory for Computer Science (snoeren@lcs.mit.edu).
‡Luis A. Sanchez was with BBN Technologies; he is now with Megisto Systems, Inc. (lsanchez@megisto.com).
This work was sponsored by the Defense Advanced Research Projects Agency (DARPA) under contract No. N66001-00-C-8038. Views and conclusions contained in this document are those of the authors and should not be interpreted as representing official policies, either expressed or implied.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCOMM’01, August 27-31, 2001, San Diego, California, USA. Copyright 2001 ACM 1-58113-411-8/01/0008...$5.00

While distributed denial of service attacks, typically conducted by flooding network links with large amounts of traffic, are the most widely reported, there are other forms of network attacks. Many other classes of attacks can be conducted with significantly smaller packet flows. In fact, there are a number of widely-deployed operating systems and routers that can be disabled by a single well-targeted packet [13]. To institute accountability for these attacks, the source of individual packets must be identified.

Unfortunately, the anonymous nature of the IP protocol makes it difficult to accurately identify the true source of an IP datagram if the source wishes to conceal it. The network routing infrastructure is stateless and based largely on destination addresses; no entity in an IP network is officially responsible for ensuring the source address is correct.
Many routers employ a technique called ingress filtering [9] to limit source addresses of IP datagrams from a stub network to addresses belonging to that network, but not all routers have the resources necessary to examine the source address of each incoming packet, and ingress filtering provides no protection on transit networks. Furthermore, spoofed source addresses are legitimately used by network address translators (NATs), Mobile IP, and various unidirectional link technologies such as hybrid satellite architectures.

Accordingly, a well-placed attacker can generate offending IP packets that appear to have originated from almost anywhere. Furthermore, while techniques such as ingress filtering increase the difficulty of mounting an attack, transit networks are dependent upon their peers to perform the appropriate filtering. This interdependence is clearly unacceptable from a liability perspective; each motivated network must be able to secure itself independently.

Systems that can reliably trace individual packets back to their sources are a first and important step in making attackers (or, at least, the systems they use) accountable. There are a number of significant challenges in the construction of such a tracing system including determining which packets to trace, maintaining privacy (a tracing system should not adversely impact the privacy of legitimate users), and minimizing cost (both in router time spent tracking rather than forwarding packets, and in storage used to keep information).

We have developed a Source Path Isolation Engine (SPIE) to enable IP traceback, the ability to identify the source of a particular IP packet given a copy of the packet to be traced, its destination, and an approximate time of receipt. Historically, tracing individual packets has required prohibitive amounts of memory; one of SPIE’s key
innovations is to reduce the memory requirement (down to 0.5% of link bandwidth per unit time) through the use of Bloom filters. By storing only packet digests, and not the packets themselves, SPIE also does not increase a network’s vulnerability to eavesdropping. SPIE therefore allows routers to efficiently determine if they forwarded a particular packet within a specified time interval while maintaining the privacy of unrelated traffic.

The rest of this paper examines SPIE in detail. We begin by defining the problem of IP traceback in section 2, and articulate the desired features of a traceback system. We survey previous work in section 3, relating their feature sets against our requirements. Section 4 describes the digesting process in detail. Section 5 presents an overview of the SPIE architecture, while section 6 offers a practical implementation of the concepts. Section 7 provides both analytic and simulation results evaluating SPIE’s traceback success rates. We discuss the issues involved in deploying SPIE in section 8 before concluding in section 9 with a brief look at future work.

2 IP TRACEBACK

The concept of IP traceback is not yet well defined. In an attempt to clarify the context in which SPIE was developed, this section presents a detailed and rather formal definition of traceback. We hope that presenting a strawman definition of traceback will also help the community better evaluate different traceback schemes.

In order to remain consistent with the terminology in the literature, we will consider a packet of interest to be nefarious, and term it an attack packet; similarly, the destination of the packet is a victim. We note, however, that there are many reasons to trace the source of a packet; many packets of interest are sent with no ill intent whatsoever.
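As a rough illustration of the digest idea from the introduction (a minimal sketch under our own assumptions, not the implementation evaluated in this paper), the following Python fragment records packets in a Bloom filter and answers membership queries. The class name DigestTable, the use of salted SHA-256 as the family of hash functions, the table size, and the choice to hash the whole byte string are all assumptions made for illustration; section 4 describes the actual digesting process.

```python
import hashlib


class DigestTable:
    """Illustrative Bloom filter over packet digests (hypothetical helper)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per table slot; no packet contents are ever stored.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, packet: bytes):
        # Assumption: derive k hash functions by salting SHA-256 with an index.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(bytes([salt]) + packet).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def record(self, packet: bytes) -> None:
        """Set the k bits selected by the packet's digests."""
        for pos in self._positions(packet):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_have_forwarded(self, packet: bytes) -> bool:
        """True if every selected bit is set (false positives are possible)."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(packet))


table = DigestTable()
table.record(b"attack packet bytes")
print(table.may_have_forwarded(b"attack packet bytes"))  # a recorded packet is always found
print(table.may_have_forwarded(b"unrelated packet"))     # almost certainly False at this load
```

A Bloom filter may report a packet it never saw (a false positive), but it never misses one it recorded, and it holds only bits rather than packet contents. Taking the 0.5% figure at face value, a 1 Gb/s link would need roughly 5 Mb (625 kB) of digest storage per second of traffic.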
2.1 Assumptions

There are several important assumptions that a traceback system should make about a network and the traffic it carries:

• Packets may be addressed to more than one physical host
• Duplicate packets may exist in the network
• Routers may be subverted, but not often
• Attackers are aware they are being traced
• The routing behavior of the network may be unstable
• The packet size should not grow as a result of tracing
• End hosts may be resource constrained
• Traceback is an infrequent operation

The first two assumptions are simply characteristics of the Internet Protocol. IP packets may contain a multicast or broadcast address as their destination, causing the routing infrastructure to duplicate them internally. An attacker can also inject multiple, identical packets itself, possibly at multiple locations. A tracing system must be prepared for a situation where there are multiple sources of the same (identical) packet, or a single source of multiple (also typically identical) packets.

The next two assumptions speak to the capabilities of the attacker(s). An attacker may gain access to routers along (or adjacent to) the path from attacker to victim by a variety of means. Further, a sophisticated attacker is aware of the characteristics of the network, including the possibility that the network is capable of tracing an attack. The traceback system must not be confounded by a motivated attacker who subverts a router with the intent to subvert the tracing system.

The instability of Internet routing is well known [15] and its implications for tracing are important. Two packets sent by the same host to the same destination may traverse wildly different paths. As a result, any system that seeks to determine origins using multi-packet analysis techniques must be prepared to make sense of divergent path information.

The assumption that the packet size should not grow is probably the most controversial.
There are a number of protocols today that cause the packet size to grow, for example technologies that rely on IP tunnels, such as IPsec and Mobile IP. However, increasing the packet size causes MTU problems and increases overhead sharply (each byte of additional overhead reduces system bandwidth by about 1%, given the average packet size of about 128 bytes). It follows that an efficient traceback system should not cause packet size to grow.

We assume that an end host, and in particular the victim of an attack, may be resource-poor and unable to maintain substantial additional administrative state regarding the routing state or the packets it has previously received. This assumption comes from the observed rise in special purpose devices such as microscopes, cameras, and printers that are attached to the Internet but have few internal resources other than those devoted to performing their primary task.

The final assumption that traceback queries are infrequent has important design implications. It implies queries can be handled by a router’s control path, and need not be dealt with on the forwarding path at line speed. While there may be auditing tasks associated with packet forwarding to support traceback that must be executed while forwarding, the processing of the audit trails is infrequent with respect to their generation.

2.2 The goal

Ideally, a traceback system should be able to identify the source of any piece of data sent across the network. In an IP framework, the packet is the smallest atomic unit of data. Any smaller division of data (a byte, for instance) is contained within a unique packet. Hence an optimal IP traceback system would precisely identify the source of an arbitrary IP packet. Any larger data unit or stream can be isolated by searching for any particular packet containing data within the stream.¹

As with any auditing system, a traceback system can only be effective in networks in which it has been deployed.
Hence we consider the source of a packet to be one of:

• The ingress point to the traceback-enabled network
• The actual host or network of origin
• One or more compromised routers within the enabled network

¹Indeed, we would argue that it is desirable to trace the individual packets within a stream because the individual packets may have originated at different sites (meeting only at the victim) and are likely to have followed different paths through the network.

If one assumes that any router along the path may be co-opted to assist in concealing a packet’s source, it becomes obvious that one
must attempt to discern not only the packet’s source, but its entire path through the network. If a path can be traced through any number of non-subverted routers, then it must terminate at either the source of the flow or pass through a subverted router which can be considered to be a co-conspirator and treated appropriately. Hence, we are interested in constructing an attack path, where the path consists of each router traversed by the packet on its journey from source to the victim. Because conspiring routers can fabricate trace information, the path can only be guaranteed to be accurate on the portion from the victim to the first source; multiple sources may be identified if routers are subverted. Further, since multiple, indistinguishable packets may be injected into the network from different sources in the general case, a traceback system should construct an attack graph composed of the attack paths for every instance of the attack packet that arrived at the victim. Figure 1 depicts the network as viewed by the victim and a particular attack graph for that victim.

Figure 1: An attack graph containing attack paths for two identical packets injected by A1 and A2 and received by the victim, V. The arrows indicate links traversed by the packet; nodes on an attack path are shaded: {A1, R1, R4, R7, R9, V} and {A2, R2, R5, R7, R9, V}.

An attack graph may contain false positives in the presence of subverted routers; that is, the attack graph may identify sources that did not actually emit the packet. We argue this is an unavoidable consequence of admitting the possibility of subverted routers. An ideal traceback system, however, produces no false negatives while attempting to minimize false positives; it must never exonerate an attacker by not including the attacker in the attack graph.

Further, when a traceback system is deployed, it must not reduce the privacy of IP communications.
In particular, entities not involved in the generation, forwarding, or receipt of the original packet should not be able to gain access to packet contents by either utilizing or as part of participating in the IP traceback system. An ideal IP traceback system must not expand the eavesdropping capabilities of a malicious party.

2.3 Transformations

It is important to note that packets may be modified during the forwarding process. In addition to the standard decrementing of the time to live (TTL) field and checksum recomputation, IP packets may be further transformed by intermediate routers. Packet transformation may be the result of valid processing, router error, or malicious intent. A traceback system need not concern itself with packet transformations resulting from error or malicious behavior. Packets resulting from such transformations only need be traced to the point of transformation, as the transforming node either needs to be fixed or can be considered a co-conspirator. An optimum traceback system should trace packets through valid transformations, however, back to the source of the original packet.

Valid packet transformations are defined as a change of packet state that allows for or enhances network data delivery. Transformations occur due to such reasons as hardware needs, network management, protocol requirements, and source request. Based on the transform produced, packet transformations are categorized as follows:

1. Packet Encapsulation: A new packet is generated in which the original packet is encapsulated as the payload (e.g., IPsec). The new packet is forwarded to an intermediate destination for de-encapsulation.
2. Packet Generation: One or more packets are generated as a direct result of an action by the router on the original packet (e.g., an ICMP Echo Reply sent in response to an ICMP Echo Request). The new packets are forwarded and processed independent of the original packet.
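The two categories above can be modeled in a few lines of Python. This is a hypothetical sketch: the Packet type, its field names, and the helper functions are ours for illustration and do not correspond to any real packet format or to the system described later in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes
    # Set only when this packet carries another packet as its payload.
    encapsulated: Optional["Packet"] = None


def encapsulate(original: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Category 1: wrap the original in a new packet bound for the tunnel exit."""
    # The outer payload is modeled by the `encapsulated` field itself.
    return Packet(tunnel_src, tunnel_dst, b"", encapsulated=original)


def echo_reply(request: Packet) -> Packet:
    """Category 2: generate a new packet as a direct result of the original."""
    return Packet(request.dst, request.src, request.payload)


def trace_targets(observed: Packet) -> List[Packet]:
    """Packets a tracer would need to follow, outermost first."""
    chain = [observed]
    while chain[-1].encapsulated is not None:
        chain.append(chain[-1].encapsulated)
    return chain


original = Packet("A1", "V", b"probe")
tunneled = encapsulate(original, "R2", "R7")
print(len(trace_targets(tunneled)))              # outer packet plus the original
print(len(trace_targets(echo_reply(original))))  # the reply alone
```

In this model an encapsulated packet still carries its original, so a trace can continue with the inner packet once it reaches the tunnel entry. A generated packet carries no such link, so relating it to the packet that triggered it depends on information held by the router that produced it.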
Common packet transformations include those performed by RFC 1812-compliant routers [1] such as packet fragmentation, IP option processing, ICMP processing, and packet duplication. Network address translation (NAT) and both IP-in-IP and IPsec tunneling are also notable forms of packet transformation. Many of these transformations result in an irrecoverable loss of the original packet state due to the stateless nature of IP networks.

A study of wide-area traffic patterns conducted by the Cooperative Association for Internet Data Analysis (CAIDA) found less than 3% of IP traffic undergoes common transformation and IP tunneling [12]. While this study did not encompass all forms of transformation (NAT processing being a notable omission), it seems safe to assume that packet transformations account for a relatively small fraction of the overall IP traffic traversing the Internet today. However, attackers may transmit packets engineered to experience transformation. The ability to trace packets that undergo transformation is, therefore, an essential feature of any viable traceback system.

3 RELATED WORK

There are two approaches to the problem of determining the route of a packet flow: one can audit the flow as it traverses the network, or one can attempt to infer the route based upon its impact on the state of the network. Both approaches become increasingly difficult as the size of the flow decreases, but the latter becomes infeasible when flow sizes approach a single packet because small flows generally have no measurable impact on the network state.

Route inference was pioneered by Burch and Cheswick [5] who considered the restricted problem of large packet flows and proposed a novel technique that systematically floods candidate network links. By watching for variations in the received packet flow due to the restricted link bandwidth, they are able to infer the flow’s
route. This requires considerable knowledge of network topology and the ability to generate large packet floods on arbitrary network links.

One can categorize auditing techniques into two classes according to the way in which they balance resource requirements across the network components. Some techniques require resources at both the end host and the routing infrastructure; others require resources only within the network itself. Of those that require only infrastructure support, some add packet processing to the forwarding engine of the routers while others offload the computation to the control path of the routers.

3.1 End-host schemes

Some auditing approaches attempt to distribute the burden by storing state at the end hosts rather than in the network. Routers notify the packet destination of their presence on the route. Because IP packets cannot grow arbitrarily large, schemes have been developed to reduce the amount of space required to send such information. Recently proposed techniques by Savage et al. [21] and Bellovin [2] explore in-band and out-of-band signaling, respectively.

Because of the high overhead involved, neither Savage nor Bellovin attempts to provide audit information for every packet. Instead, each employs probabilistic methods that allow sufficiently large packet flows to be traced. By providing partial information on a subset of packets in a flow, auditing routers enable an end host to reconstruct the entire path traversed by the packet flow after receiving a sufficient number of packets belonging to the flow.

The two schemes diverge in the methods used to communicate the information to the end host. Savage et al. employ a packet marking scheme that encodes the information in rarely-used fields within the IP header itself. Song and Perrig refined this approach to improve the reconstruction of paths and authenticate the encodings [23].
In order to avoid the backwards compatibility issues and increased computation required by the sophisticated encoding schemes employed in the packet marking schemes, Bellovin’s scheme (and later extensions by Wu et al. [25]) simply sends the audit information in an ICMP message.

3.2 Infrastructure approaches

End-host schemes require the end hosts to log metadata in case an incoming packet proves to be offensive. Alternatively, the network itself can be charged with maintaining the audit trails.

The obvious approach to auditing packet flow is simply to log packets at various points throughout the network and then use appropriate extraction techniques to discover the packet’s path through the network. Logging requires no computation on the router’s fast path and, thus, can be implemented efficiently in today’s router architecture. Sager suggests such a monitoring approach [19]. However, the effectiveness of the logs is limited by the amount of space available to store them. Given today’s link speeds, packet logs quickly grow to intractable sizes, even over relatively short time frames. An OC-192 link is capable of transferring 1.25 GB per second. If one allows 60 seconds to conduct a query, a router with 16 links would require 1.2 TB of high-speed storage.

These requirements can be reduced by sampling techniques similar to those of the end-host schemes, but down-sampling reduces the probability of detecting small flows and does not alleviate the security issues raised by storing complete packets in the router. The ability of an attacker to break into a router and capture terabytes of actual traffic has severe privacy implications.

Alternatively, routers can be tasked to perform more sophisticated auditing in real time, extracting a smaller amount of information as packets are forwarded. Many currently available routers support input debugging, a feature that identifies on which incoming port a particular outgoing packet (or set of packets) of interest arrived.
Since no history is stored, however, this process must be activated before an attack packet passes by. Furthermore, due to the high overhead of this operation on many popular router architectures, activating it may have adverse effects on the traffic currently being serviced by the router.

3.3 Specialized routing

One of the main problems with the link testing or logging methods above is the large amount of repetition required. A trace is conducted in a hop-by-hop fashion, requiring a query at each router along the way. Once the incoming link or links have been identified, the process must be repeated at the upstream router.

Several techniques have been developed to streamline and automate this process. Some ISPs have developed their own ad hoc mechanisms for automatically conducting input debugging across their networks. Schnackenberg et al. [22] propose a special Intruder Detection and Isolation Protocol (IDIP) to facilitate interaction between routers involved in a traceback effort. IDIP does not specify how participating entities should track packet traffic; it simply requires that they be able to determine whether or not they have seen a component of an attack matching a certain description. Even with automated tools, however, each router in the ISP must support input debugging or logging, which are not common in today’s high-speed routers for the reasons discussed above.

In order to avoid this requirement, Stone [24] suggests constructing an overlay network connecting all the edge routers of an ISP. By using a deliberately simple topology of specialized routers, suspicious flows can be dynamically rerouted across the special tracking network for analysis. This approach has two major shortcomings. First, the attack must be sufficiently long-lived to allow the ISP to effect the rerouting before the relevant flow terminates.
Second, the routing change is perceptible by the attacker, and an especially motivated attacker may be able to escape detection by taking appropriate action. While techniques exist to hide precisely what changed about the route, changes in layer-three topology are hard to mask.

4 PACKET DIGESTING

SPIE, the Source Path Isolation Engine, uses auditing techniques to support the traceback of individual packets while reducing the storage requirements by several orders of magnitude over current log-based techniques [19]. Traffic auditing is accomplished by computing and storing 32-bit packet digests rather than storing the packets themselves. In addition to reducing storage requirements, storing packet digests instead of the actual packet contents preserves traffic confidentiality by preventing SPIE from being used as a tool for eavesdropping.
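
As a quick check of the storage argument, the full-packet logging estimate quoted in section 3.2 can be reproduced with a few lines of arithmetic. The constants below are the figures given in the text (decimal gigabytes are assumed); the function and constant names are illustrative only and do not come from SPIE.

```python
# Back-of-envelope check of the full-packet logging estimate (section 3.2).
OC192_BYTES_PER_SECOND = 1_250_000_000  # 1.25 GB/s, as quoted in the text
QUERY_WINDOW_SECONDS = 60               # time allowed to conduct a query
LINKS_PER_ROUTER = 16


def full_packet_log_bytes(rate=OC192_BYTES_PER_SECOND,
                          window=QUERY_WINDOW_SECONDS,
                          links=LINKS_PER_ROUTER):
    """Bytes needed to log every packet on every link for the whole window."""
    return rate * window * links


total = full_packet_log_bytes()
print(f"{total / 1e12:.1f} TB")  # 1.2 TB
```

Storing a fixed-size digest per packet instead of the packet itself is what removes the dependence on packet size in this product.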

[Figure 2 (diagram of the IP header; fields shown: Version, Header Length, Type of Service, Total Length, Identification, DF, MF, Fragment Offset, TTL, Protocol, Checksum, Source Address, Destination Address, Options, Payload)]

Figure 2: The fields of an IP packet. Fields in gray are masked out before digesting, including the Type of Service, Time to Live (TTL), IP checksum, and IP options fields.

4.1 Hash input

The packet content used as input to the hash function must uniquely represent an IP packet and enable the identification of the packet across hops in the forwarding path. At the same time, it is desirable to limit the size of the hash input both for performance and for reasons discussed below (cf. section 5.3). Duffield and Grossglauser encountered similar requirements while sampling a subset of forwarded packets in an attempt to measure traffic flows [7]. We use a similar approach, masking variant packet content and selecting an appropriate-length prefix of the packet to use as input to the digesting function. Our choice of invariant fields and prefix length is slightly different, however.

Figure 2 shows an IP packet and the fields included by the SPIE digesting function. SPIE computes digests over the invariant portion of the IP header and the first 8 bytes of the payload. Frequently modified header fields are masked prior to digesting. Note that beyond the obvious fields (TTL, TOS, and checksum), certain IP options cause routers to rewrite the option field at various intervals. To ensure a packet appears identical at all steps along its route, SPIE masks or compensates for these fields when computing the packet digests. It is important to note that the invariant IP fields used for SPIE digesting may occasionally be modified by a packet transform (cf. section 5.3).

Our research indicates that the first 28 invariant bytes of a packet (masked IP header plus the first 8 bytes of payload) are sufficient to differentiate almost all non-identical packets.
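
The masking step described above can be sketched as follows. This is a minimal illustration, not SPIE's implementation: it assumes that only the four fields named in the Figure 2 caption are masked, that IP options are omitted entirely rather than compensated for, and that payloads shorter than 8 bytes are zero-padded. The names `spie_digest_input` and `make_packet` are hypothetical.

```python
def spie_digest_input(packet: bytes, payload_len: int = 8) -> bytes:
    """Return the masked 20-byte header plus a payload prefix (a sketch)."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("expected an IPv4 packet with at least a 20-byte header")
    header_len = (packet[0] & 0x0F) * 4       # IHL field, converted to bytes
    if header_len < 20 or len(packet) < header_len:
        raise ValueError("malformed IPv4 header length")

    header = bytearray(packet[:20])           # assumption: options are omitted
    header[1] = 0                             # Type of Service
    header[8] = 0                             # Time to Live
    header[10:12] = b"\x00\x00"               # header checksum

    # Assumption: short payloads are zero-padded so the input is always 28 bytes.
    payload = packet[header_len:header_len + payload_len].ljust(payload_len, b"\x00")
    return bytes(header) + payload


def make_packet(ttl, tos, checksum, ident=1, payload=b"ABCDEFGHIJ"):
    """Build a toy option-less IPv4 packet (hypothetical helper; checksum not validated)."""
    total_len = 20 + len(payload)
    return (bytes([0x45, tos]) + total_len.to_bytes(2, "big")
            + ident.to_bytes(2, "big") + b"\x00\x00"
            + bytes([ttl, 6]) + checksum.to_bytes(2, "big")
            + bytes([10, 0, 0, 1]) + bytes([10, 0, 0, 2]) + payload)


# The same packet seen at two hops differs only in masked fields,
# so both routers would feed identical bytes to their hash functions.
at_hop_1 = make_packet(ttl=64, tos=0x00, checksum=0xBEEF)
at_hop_2 = make_packet(ttl=63, tos=0x10, checksum=0x1234)
assert spie_digest_input(at_hop_1) == spie_digest_input(at_hop_2)
```

Because the Identification field is left unmasked, two otherwise identical packets with different IP IDs produce different inputs, which is consistent with the collision analysis that follows.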
Figure 3 presents the rate of packet collisions for an increasing prefix length for two representative traces: a WAN trace from an OC-3 gateway router, and a LAN trace from an active 100Mb Ethernet segment. (Results were similar for traces across a number of sites.) Two unique packets which are identical up to the specified prefix length are termed a collision. A 28-byte prefix results in a collision rate of approximately 0.00092% in the wide area and 0.139% on the LAN.

[Figure 3 (plot; x-axis: prefix length in bytes, 20 to 40; y-axis: fraction of collided packets, 1e-06 to 1; series: WAN, LAN)]

Figure 3: The fraction of packets that collide as a function of prefix length. The WAN trace represents 985,150 packets (with 5,801 duplicates removed) collected on July 20, 2000 at the University of Florida OC-3 gateway [14]. The LAN trace consists of one million packets (317 duplicates removed) observed on an Ethernet segment at the MIT Lab for Computer Science.

Unlike similar results reported by Duffield and Grossglauser [7, fig. 4], our results include only unique packets; exact duplicates were removed from the packet trace. Close inspection of packets in the wide area with identical prefixes indicates that packets with matching prefix lengths of 22 and 23 bytes are ICMP Time Exceeded error packets with the IP identification field set to zero. Similarly, packets with matching prefixes between 24 and 31 bytes in length are TCP packets with IP identifications also set to zero which are first differentiated by the TCP sequence number or acknowledgment fields (see footnote 2).

The markedly higher collision rate in the local area is due to the lack of address and traffic diversity. This expected result does not significantly impact SPIE’s performance, however. LANs are likely to exist at only two points in an attack graph: immediately surrounding the victim and the attacker(s).
False positives on the victim’s local network can be easily eliminated from the attack graph, since they likely share the same gateway router in any event. False positives at the source are unlikely if the attacker is using spoofed source addresses, as this provides the missing diversity in attack traffic, and remain in the immediate vicinity of the true attacker by definition. Hence, for the purposes of SPIE, IP packets are effectively distinguished by the first 28 invariant bytes of the packet.

Footnote 2: Further investigation indicates that a number of current operating systems, including recent versions of Linux, frequently set the IP ID to zero.

4.2 Bloom filters

Storing the set of digests for the traffic forwarded by the router would require massive amounts of storage. Instead, SPIE uses a space-efficient data structure known as a Bloom filter to record packet digests [4]. A Bloom filter computes k distinct packet digests for each packet using independent uniform hash functions, and uses the n-bit results to index into a $2^n$-sized bit array. The array is initialized to all zeros, and bits are set to one as packets are received. Figure 4 depicts a Bloom filter with k hash functions.

Membership tests can be conducted simply by computing the k digests on the packet in question and checking the indicated bit positions. If any one of them is zero, the packet was not stored in the table. If, however, all the bits are one, it is highly likely the packet was stored. It is possible that some set of other insertions caused all the bits to be set, creating a false positive, but the rate of such false positives can be controlled [8].

[Figure 4 (diagram: hash functions H1(P), H2(P), H3(P), ..., Hk(P) each produce an n-bit index into a $2^n$-bit array)]

Figure 4: For each packet received, SPIE computes k independent n-bit digests, and sets the corresponding bits in the $2^n$-bit digest table.

4.3 Hash functions

SPIE places three major restrictions on the family of hash functions, $F$, used in its Bloom filters. First, each member function must distribute a highly correlated set of input values (IP packet prefixes), $P$, as uniformly as possible over the hash’s result value space. That is, for a hash function $H : P \to 2^m$ in $F$, and distinct packets $x \neq y \in P$, $\Pr[H(x) = H(y)] = 1/2^m$. This is a standard property of good hash functions.

SPIE further requires that the event that two packets collide in one hash function ($H(x) = H(y)$ for some $H$) be independent of collision events in any other functions ($H'(x) = H'(y)$, $H' \neq H$). Intuitively, this implies false positives at one router are independent of false positives at neighboring routers. Formally, for any function $H \in F$ chosen at random independently of the input packets $x$ and $y$, $\Pr[H(x) = H(y)] = 2^{-m}$ with high probability. Such hash families, called universal hash families, were first defined by Carter and Wegman [6] and can be implemented in a variety of fashions [3, 10, 11].

Finally, member functions must be straightforward to compute at high link speeds. This requirement is not impractical because SPIE hash functions do not require any cryptographic “hardness” properties. That is, it does not have to be difficult to generate a valid input packet given a particular hash value. Being able to create a packet with a particular hash value enables three classes of attacks, all of which are fairly benign.
One attack would ensure that all attack packets have the same fingerprint in the Bloom filter at some router (which is very difficult since there are multiple, independent hashes at each router), but this merely elicits a packet trace that reveals a larger set of systems from which the attacker can attack. Another attack is to ensure all attack packets have different fingerprints, but that is the common case already. The third, and most difficult, attack is to create an attack packet with the same fingerprint as another, non-attack packet. In general, this attack simply yields one more false-positive path, usually only for one hop (as the hash functions change at each hop).

[Figure 5 (diagram of an ISP's network: routers, some with attached DGAs, a SCAR, and an STM)]

Figure 5: The SPIE network infrastructure, consisting of Data Generation Agents (DGAs), SPIE Collection and Reduction Agents (SCARs), and a SPIE Traceback Manager (STM).

5 SOURCE PATH ISOLATION ENGINE

SPIE-enhanced routers maintain a cache of packet digests for recently forwarded traffic. If a packet is determined to be offensive by some intrusion detection system (or judged interesting by some other metric), a query is dispatched to SPIE, which in turn queries routers for packet digests of the relevant time periods. The results of this query are used in a simulated reverse-path flooding (RPF) algorithm to build an attack graph that indicates the packet’s source(s).

5.1 Architecture

The tasks of packet auditing, query processing, and attack graph generation are dispersed among separate components in the SPIE system. Figure 5 shows the three major architectural components of the SPIE system. Each SPIE-enhanced router has a Data Generation Agent (DGA) associated with it.
Depending upon the type of router in question, the DGA can be implemented and deployed as a software agent, an interface card that plugs into the switching background bus, or a separate auxiliary box connected to the router through some auxiliary interface.

The DGA produces packet digests of each packet as it departs the router, and stores the digests in bit-mapped digest tables. The tables are paged periodically, and each represents the set of traffic forwarded by the router for a particular interval of time. Each table is annotated with the time interval and the set of hash functions used to compute the packet digests over that interval. The digest tables are stored locally at the DGA for some period of time, depending on the resource constraints of the router. If interest is expressed in the traffic data for a particular time interval, the tables are transferred to a SPIE Collection and Reduction (SCAR) agent for longer-term storage and analysis.
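
To make the digest tables concrete, the sketch below combines the Bloom filter of section 4.2 with the paging behaviour just described. It is an illustration under stated assumptions, not the SPIE implementation: the hash family is the classic Carter and Wegman construction $h_{a,b}(x) = ((ax + b) \bmod p) \bmod 2^n$ chosen here for simplicity, the input is assumed to be a fixed-length byte string of at most 65 bytes, and the class names, parameters, and paging policy (`DigestTable`, `DataGenerationAgent`, `page_seconds`, `tables_kept`) are all hypothetical.

```python
import random
from collections import deque

PRIME = 2**521 - 1  # a Mersenne prime; exceeds any input of up to 65 bytes


class DigestTable:
    """Bloom filter with k hash functions over a 2**n_bits bit array."""

    def __init__(self, n_bits, k, interval, rng):
        self.mask = (1 << n_bits) - 1
        self.interval = interval  # (start, end) time annotation
        # The table carries its own hash functions as (a, b) pairs defining
        # h(x) = ((a*x + b) mod PRIME) mod 2**n_bits.
        self.hashes = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
                       for _ in range(k)]
        self.bits = bytearray(((1 << n_bits) + 7) // 8)  # all zeros initially

    def _positions(self, digest_input):
        x = int.from_bytes(digest_input, "big")
        if x >= PRIME:
            raise ValueError("input too long for this hash family")
        return [((a * x + b) % PRIME) & self.mask for a, b in self.hashes]

    def insert(self, digest_input):
        for pos in self._positions(digest_input):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def may_contain(self, digest_input):
        # False means "definitely not stored"; True may be a false positive.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(digest_input))


class DataGenerationAgent:
    """Keeps the most recent digest tables; older ones are overwritten."""

    def __init__(self, n_bits, k, page_seconds, tables_kept, rng=None):
        self.n_bits, self.k, self.page_seconds = n_bits, k, page_seconds
        self.rng = rng or random.Random()
        self.tables = deque(maxlen=tables_kept)

    def record(self, digest_input, now):
        # Page in a fresh table (with fresh hash functions) when the
        # current interval has expired.
        if not self.tables or now >= self.tables[-1].interval[1]:
            self.tables.append(DigestTable(
                self.n_bits, self.k, (now, now + self.page_seconds), self.rng))
        self.tables[-1].insert(digest_input)

    def query(self, digest_input, when):
        return any(t.interval[0] <= when < t.interval[1]
                   and t.may_contain(digest_input) for t in self.tables)
```

A query for a time whose table has already been overwritten returns no match, which is the reason a traceback must be initiated while the relevant tables are still resident at the DGA.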
SCARs are responsible for a particular region of the network, serving as data concentration points for several routers. SCARs monitor and record the topology of their area and facilitate traceback of any packets that traverse the region. Due to the complex topologies of today’s ISPs, there will typically be several SCARs distributed over an entire network. Upon request, each SCAR produces an attack graph for its particular region. The attack graphs from each SCAR are grafted together to form a complete attack graph by the SPIE Traceback Manager (STM).

The STM controls the whole SPIE system. The STM is the interface to the intrusion detection system or other entity requesting a packet trace. When a request is presented to the STM, it verifies the authenticity of the request, dispatches the request to the appropriate SCARs, gathers the resulting attack graphs, and assembles them into a complete attack graph. Upon completion of the traceback process, the STM replies to the intrusion detection system with the final attack graph.

5.2 Traceback processing

Before the traceback process can begin, an attack packet must be identified. Most likely, an intrusion detection system (IDS) will determine that an exceptional event has occurred and provide the STM with a packet, P, victim, V, and time of attack, T. SPIE places two constraints on the IDS: the victim must be expressed in terms of the last-hop router, not the end host itself, and the attack packet must be identified in a timely fashion. The first requirement provides the query process with a starting point; the latter stems from the fact that traceback must be initiated before the appropriate digest tables are overwritten by the DGAs. This time constraint is directly related to the amount of resources dedicated to the storage of traffic digests. (We discuss timing and resource tradeoffs in section 7.)

Upon receipt of a traceback request, the STM cryptographically verifies its authenticity and integrity.
Any entity wishing to employ SPIE to perform a traceback operation must be properly authorized in order to prevent denial of service attacks. Upon successful verification, the STM immediately asks all SCARs in its domain to poll their respective DGAs for the relevant traffic digests. Time is critical because this poll must happen while the appropriate digest tables are still resident at the DGAs. Once the digest tables are safely transferred to SCARs, the traceback process is no longer under real-time constraints.

Beginning at the SCAR responsible for the victim’s region of the network, the STM sends a query message consisting of the packet, egress point, and time of receipt. The SCAR responds with a partial attack graph and the packet as it entered the region (it may have been transformed, possibly multiple times, within the region). The attack graph either terminates within the region managed by the SCAR, in which case a source has been identified, or it contains nodes at the edge of the SCAR’s network region, in which case the STM sends a query (with the possibly-transformed packet) to the SCAR abutting that edge node. This process continues until all branches of the attack graph terminate, either at a source within the network, or at the edge of the SPIE system. The STM then constructs a composite attack graph which it returns to the intrusion detection system.

Figure 6: A Transform Lookup Table (TLT) stores sufficient information to invert packet transformations at SPIE routers. The table is indexed by packet digest, specifies the type of transformation, and stores any irrecoverable packet data. Entry layout: Digest (29 bits), Type including the I flag (3 bits), Packet Data (32 bits).

5.3 Transformation processing

IP packets may undergo valid transformations while traversing the network, and SPIE must be capable of tracing through such transformations. In particular, SPIE must be able to reconstruct the original packet from the transformed packet.
Unfortunately, many transformations are not invertible without additional information due to the stateless nature of IP networks. Consequently, sufficient packet data must be recorded by SPIE at the time of transformation so that the original packet can be reconstructed.

The packet data chosen as input to the digesting function determines the set of packet transformations SPIE must handle: SPIE need only consider transformations that modify fields used as input to the digest function. SPIE computes digests over the IP header and the first eight bytes of the packet payload but masks out (or omits, in the case of IP options) several frequently updated fields before digesting, as shown in figure 2 of section 4. This hides most hop-by-hop transformations from the digesting function, but forces SPIE to explicitly handle each of the following transformations: fragmentation, network address translation (NAT), ICMP messages, IP-in-IP tunneling, and IP security (IPsec).

Recording the information necessary to reconstruct the original packet from a transformed packet requires additional resources. Fortunately for SPIE, the circumstances that cause a packet to undergo a transformation will generally take that packet off the fast path of the router and put it onto the control path, relaxing the timing requirements. The router’s memory constraints remain unchanged, however; hence, transformation information must be stored in a scalable and space-efficient manner.

5.3.1 Transform lookup table

Along with each packet digest table collected at a DGA, SPIE maintains a corresponding transform table for the same interval of time, called a transform lookup table, or TLT. Each entry in the TLT contains three fields. The first field stores a digest of the transformed packet. The second field specifies the type of transformation; three bits are sufficient to uniquely identify the transformation type among those supported by SPIE.
The last field contains a variable amount of packet data, the length of which depends upon the type of transformation being recorded.

For space efficiency, the data field is limited to 32 bits. Some transformations, such as network address translation, may require more space. These transformations utilize a level of indirection: one bit of the transformation type field is reserved as an indirect flag. If the indirect, or I, flag is set, the third field of the TLT is treated as a pointer to an external data structure which contains the information necessary to reconstruct the packet.
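As a rough illustration of the entry layout in Figure 6, the sketch below packs the three fields into a single 64-bit integer. The bit ordering, the choice of the high bit of the type field as the I flag, and the function names are assumptions; the text fixes only the field widths (29, 3, and 32 bits).

```python
# Assumption: the high bit of the 3-bit type field serves as the indirect (I) flag.
INDIRECT_FLAG = 0b100


def pack_tlt_entry(digest, transform_type, data):
    """Pack a 29-bit digest, 3-bit type, and 32-bit data word into 64 bits."""
    if not 0 <= digest < (1 << 29):
        raise ValueError("digest must fit in 29 bits")
    if not 0 <= transform_type < (1 << 3):
        raise ValueError("transform_type must fit in 3 bits")
    if not 0 <= data < (1 << 32):
        raise ValueError("data must fit in 32 bits")
    return (digest << 35) | (transform_type << 32) | data


def unpack_tlt_entry(entry):
    """Split a 64-bit entry back into (digest, transform_type, data)."""
    digest = entry >> 35
    transform_type = (entry >> 32) & 0b111
    data = entry & 0xFFFFFFFF
    return digest, transform_type, data


def is_indirect(transform_type):
    """True if the data word should be read as a pointer to external storage."""
    return bool(transform_type & INDIRECT_FLAG)
```

With this layout, a reader of the table checks `is_indirect` first and only then decides whether the 32-bit word holds packet data or a reference to a larger record.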
The indirect flag can also be used for flow caching. In many cases, packets undergoing a particular transformation are related. In such cases, it is possible to reduce the storage requirements by suppressing duplicate packet data, instead referencing a single copy of the required data that can be used to reconstruct any packet in the flow. Such a scheme requires, however, that the SPIE-enabled router itself be capable of flow caching, or at least flow identification, so that the packets within the flow can be correlated and stored appropriately.

In order to preserve alignment, it is likely that efficient implementations would store only 29 bits of the packet digest, resulting in 64-bit wide TLT entries. This width implies eight distinct packet digests will map to the same TLT entry. The relative rarity of packet transformations [12], the sparsity of the digest table, and the uniformity of the digesting function combine to make collisions extremely rare in practice. Assuming a digest table capacity of roughly 3.2 Mpkts (16 Mb SRAM, see section 7.2) and a transformation rate of 3%, the expected collision rate is approximately 1:5333 packets. Even if a collision occurs, it simply results in an additional possible transformation of the queried packet. Each transformation is computed (including the null transformation) and traceback continues. Incorrectly transformed packets likely will not exist at neighboring routers and, thus, will not contribute any false nodes to the attack graph.

5.3.2 Special-purpose gateways

Some classes of packet transformations, notably NAT and tunneling, are often performed on a large fraction of packets passing through a particular gateway. The transform lookup table would quickly grow to an unmanageable size in such instances; hence, SPIE considers the security gateway or NAT functionality of routers as a separate entity.
Standard routing transformations are handled as above, but special-purpose gateway transformations require a different approach to transformation handling. Transformations in these types of gateways are generally computed in a stateful way (usually based on a static rule set); hence, they can be inverted in a similar fashion. While the details are implementation-specific, inverting such transformations is straightforward; we do not consider it here.

5.3.3 Sample transformations

A good example of transformation is packet fragmentation. To avoid needing to store any of the packet payload, SPIE supports traceback of only the first packet fragment. Non-first fragments may be traced to the point of fragmentation which, for fragment-based attacks [13], is the attacker. (If only a subset of the fragments is received by the victim, the packet cannot be reassembled; hence, the only viable attack is a denial of service attack on the reassembly engine. But, if the fragmentation occurs within the network itself, an attacker cannot control which fragments are received by the victim, so the victim will eventually receive a first fragment to use in traceback.) Packet data to be recorded includes the total length, fragment offset, and more fragments (MF) field. Since properly-behaving IP routers cannot create fragments with less than 8 bytes of payload information [17], SPIE is always able to invert fragmentation and construct the header and at least 64 bits of payload of the pre-fragmented packet, which is sufficient to continue traceback.

Observe that SPIE never needs to record any packet payload information. ICMP transformations can be inverted because ICMP error messages always include at least the first 64 bits of the offending packet [16]. Careful readers may be concerned that encapsulation cannot be inverted if the encapsulated packet is subsequently fragmented and the fragments containing the encapsulated IP header and first 64 bits of payload are not available. While this is strictly true, such transformations need to be inverted only in extreme cases, as it takes a very sophisticated attacker to cause a packet to be first encapsulated, then fragmented, and then ensure fragment loss. If all the fragments are received, the original header can be extracted from the reassembled payload. It seems extremely difficult for an attacker to ensure that packet fragments are lost. The attacker can cause packet loss by flooding the link, but doing so requires sending such a large number of packets that it is extremely likely that all the fragments for at least one packet will be successfully received by the decapsulator for use in traceback.

5.4 Graph construction

Each SCAR constructs a subgraph using topology information about its particular region of the network. After collecting the digest tables from all of the routers in its region, a SCAR simulates reverse-path flooding (RPF) by examining the digest tables in the order they would be queried if an actual reverse path flood was conducted on the topology that existed at the time the packet was forwarded. Figure 7 shows how reverse-path flooding would discover the attack path from V to A, querying routers R8, R9, R7, R4, S5, R5, and R2 along the way. It is important to note that the routers are not actually queried; the SCAR has already cached all the relevant hash digests locally.

Figure 7: Reverse path flooding, starting at the victim’s router, V, and proceeding backwards toward the attacker, A. Solid arrows represent the attack path; dashed arrows are SPIE queries. Queries are dropped by routers that did not forward the packet in question.

In order to query each router, a SCAR computes the appropriate set of digests as indicated by the table, and then consults the table for membership. If an entry exists for the packet in question, the router is considered to have forwarded the packet.
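The simulated reverse-path flood of this section can be sketched in Python as a breadth-first search over cached tables. This is a simplified illustration under stated assumptions: `topology` and `tables` are hypothetical inputs, plain Python sets stand in for digest tables (so there are no false positives), and packet transformations and time ordering between routers are ignored.

```python
from collections import deque


def simulate_rpf(topology, tables, victim_router, packet):
    """Return an attack graph as {router: [upstream routers that forwarded packet]}.

    topology: dict mapping each router to its neighboring routers.
    tables:   dict mapping each router to a list of cached digest tables;
              anything supporting `packet in table` works.
    """
    def forwarded(router):
        # Check every cached interval, since the packet may have been seen
        # in an earlier table than the one first consulted.
        return any(packet in table for table in tables.get(router, []))

    if not forwarded(victim_router):
        return {}

    graph = {victim_router: []}
    visited = {victim_router}  # never query a router twice, so cycles are pruned
    queue = deque([victim_router])
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, ()):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            if forwarded(neighbor):
                graph[node].append(neighbor)
                graph[neighbor] = []
                queue.append(neighbor)
            # Otherwise this branch of the search terminates here.
    return graph
```

Because no router is contacted during the search, the only cost is table lookups at the SCAR, which is what lets the procedure run after the real-time collection step has finished.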
The SCAR adds the current node to the attack graph and moves on to each of its neighbors (except, of course, the neighbor already queried). If, however, the digest is not found in the table, it may be necessary to search the digest table for the previous time period. Depending on the link latency between routers, SCARs may need to request multiple digest tables from each router in order to ensure they have the digest for the appropriate time frame. Once a digest is located, the packet arrival time is always considered to be the latest possible time in the interval. This ensures the packet must have been seen at an earlier time at adjacent routers.

If the packet is not found in any of the digest tables for the relevant time period, that particular branch of the search tree is terminated and searching continues at the remaining routers. A list of previously visited nodes is kept at all times, and cycles are pruned to ensure termination.

The result of this procedure is a connected graph containing the set of nodes believed to have forwarded the packet toward the victim. Assuming correct operation of the routers, this graph is guaranteed to be a superset of the actual attack graph. But due to digest collisions, there may be nodes in the returned graph that are not in the actual attack graph. We call these nodes false positives and base the success of SPIE on its ability to limit the number of false positives contained in a returned attack graph.

6 PRACTICAL IMPLEMENTATION

For our FreeBSD SPIE prototype, we simulate a universal hash family using MD5 [18]. A random member is defined by selecting a random input vector to prepend to each packet. The properties of MD5 ensure that the digests of identical packets with different input vectors are independent. The 128-bit output of MD5 is then considered as four independent 32-bit digests which can support Bloom filters of dimension up to four. Router implementations requiring higher performance are likely to prefer other universal hash families specifically tailored to hardware implementation [11]. A simple family amenable to fast hardware implementation could be constructed by computing a CRC modulo a random member of the set of indivisible polynomials over Z_{2^k}.
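The MD5-based construction described for the prototype can be sketched as follows. This is an illustrative reading of the text, not the prototype's code: the function name `make_hash_member`, the 16-byte vector length, and the big-endian split of the MD5 output are assumptions.

```python
import hashlib
import os


def make_hash_member(input_vector):
    """Select one member of the simulated hash family by fixing an input vector.

    The returned function prepends the vector to a packet, computes MD5, and
    splits the 128-bit result into four 32-bit digests.
    """
    def digests(packet):
        out = hashlib.md5(input_vector + packet).digest()  # 16 bytes
        return [int.from_bytes(out[i:i + 4], "big") for i in range(0, 16, 4)]
    return digests


# Hypothetical usage: a fresh random vector selects a new family member.
member = make_hash_member(os.urandom(16))
four_digests = member(b"masked header and first 8 payload bytes")
```

Each of the four values can index a separate bit position in a digest table, which is how a single MD5 computation supports a Bloom filter of dimension up to four.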
In order to ensure hash independence, each router periodically generates a set of k independent input vectors and uses them to select the k digest functions needed for the Bloom filter from the family of universal hashes. These input vectors are computed using a pseudo-random number generator which is independently seeded at each router. For increased robustness against adversarial traffic, the input vectors are changed each time the digest table is paged, resulting in independence not only across routers but also across time periods.

The size of the digest bit vector, or digest table, varies with the total traffic capacity of the router; faster routers need larger vectors for the same time period. Similarly, the optimum number of hash functions varies with the size of the bit vector. Routers with tight memory constraints can compute additional digest functions and provide the same false-positive rates as those that compute fewer digests but provide a larger bit vector.

Figure 8 depicts a possible implementation of a SPIE Data Generation Agent in hardware for use on high-speed routers. A full discussion of the details of the architecture and an analysis of its performance were presented previously [20]. Briefly, each interface card in the router is outfitted with an Interface Tap which computes multiple independent digests of each packet as it is forwarded. These digests are passed to a separate SPIE processor (implemented either in a line card form factor or as an external unit) which stores them as described above in digest tables for specific time periods.

Figure 8: A sample SPIE DGA hardware implementation for high-speed routers. (Diagram stages: Signature Taps on the line cards; Signature Aggregation and History Memory on the SPIE card or box, with a DRAM ring buffer read out by the control processor.)
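The tradeoff between memory and the number of digest functions noted above can be made concrete with the standard Bloom-filter approximation for the false-positive rate, (1 - e^(-kn/m))^k for n packets, m bits, and k hash functions. That formula is textbook background rather than something stated in this section, and the function name and sample sizes below are illustrative.

```python
import math


def bloom_false_positive_rate(num_packets, num_bits, num_hashes):
    """Approximate false-positive rate of a Bloom filter after num_packets insertions."""
    return (1.0 - math.exp(-num_hashes * num_packets / num_bits)) ** num_hashes


n = 1_000_000
larger_table_fewer_hashes = bloom_false_positive_rate(n, 8 * n, 2)
smaller_table_more_hashes = bloom_false_positive_rate(n, 7 * n, 3)
```

Under this approximation, the table with 8 bits per packet and two hash functions has a rate of roughly 4.9%, while the table with 7 bits per packet and three hash functions has a rate of roughly 4.2%, so the memory-constrained configuration is no worse.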
As time passes, the forwarded traffic will begin to fill the digest tables, and they must be paged out before they become over-saturated and yield unacceptable false-positive rates. The tables are stored in a history buffer implemented as a large ring buffer. Digest tables can then be transferred by a separate control processor to SCARs while they are stored in the ring buffer.

7 ANALYSIS

There are several tradeoffs involved when determining the optimum amount of resources to dedicate to SPIE on an individual router or the network as a whole. SPIE’s resource requirements can be expressed in terms of two quantities: the number of packet digest functions used by the Bloom filter, and the amount of memory used to store packet digests. Similarly, SPIE’s performance can be characterized in two orthogonal dimensions. The first is the length of time for which packet digests are kept. Queries can only be issued while the digests are cached; unless requested by a SCAR within a reasonable amount of time, the DGAs will discard the digest tables in order to make room for more recent ones. The second is the accuracy of the candidate attack graphs, which can be measured by the number of false positives in the graph returned by SPIE.

Both of these metrics can be controlled by adjusting operational parameters. In particular, the more memory available for storing packet digests, the longer the period during which queries can be issued. Similarly, digest tables with lower false-positive rates yield more accurate attack graphs. Hence, we wish to characterize the performance of SPIE in terms of the amount of available memory and digest table performance.

7.1 False positives

We first relate the rate of false positives in an attack graph to the rate of false positives in an individual digest table. This relationship depends on the actual network topology and traffic being forwarded at the time. We can, however, make some simplifying assumptions in order to derive an upper bound on the number of false positives as a function of digest table performance.
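The "digest table performance" referred to here is the false-positive rate of a single Bloom filter. As a rough illustration, the sketch below uses the standard Bloom filter approximation, P ≈ (1 − e^(−k/r))^k, where r is the number of table bits per recorded packet and k the number of digest functions. The function names are hypothetical, and the approximation assumes ideal, independent digest functions.

```python
import math


def false_positive_rate(bits_per_packet, num_hashes):
    """Approximate Bloom filter false-positive rate, P = (1 - e^(-k/r))^k."""
    return (1.0 - math.exp(-num_hashes / bits_per_packet)) ** num_hashes


def bits_per_packet_for(target_rate, num_hashes):
    """Invert the approximation: table bits per packet needed to reach target_rate."""
    return -num_hashes / math.log(1.0 - target_rate ** (1.0 / num_hashes))


if __name__ == "__main__":
    k = 3
    for target in (1 / 8, 1 / 64):
        r = bits_per_packet_for(target, k)
        print(f"P = {target:.4f}, k = {k}: about {r:.1f} bits per packet")
```

Under this approximation, with three digest functions, a table rate of 1/8 needs about 4.3 bits per packet, while a rate of 1/64 needs about 10.4. This is the memory cost that a tighter per-table false-positive rate implies.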
7.1.1 Theoretical bounds

Suppose, for example, each router whose neighbors have degree at most d ensures its digest tables have a false-positive rate of at most P = p/d, where 0 ≤ p/d ≤ 1 (p is just an arbitrary tuning factor). A simplistic analysis shows that for any true attack graph G with n nodes, the attack graph returned by SPIE will have at most np/(1 − p) extra nodes in expectation.

The false-positive rate of a digest table varies over time, depending on the traffic load at the router and the amount of time since it was paged. Similarly, if the tables are paged on a strict schedule based on maximum link capacity, and the actual traffic load is less, digest tables will never reach their rated capacity. Hence, the analytic result is a worst-case bound, since the digest table performs strictly better while it is only partially full. Furthermore, our analysis assumes that the neighbor sets of distinct nodes are disjoint, which is not true in real networks. It seems reasonable to expect, therefore, that the false-positive rate over real topologies with actual utilization rates would be substantially lower.

For the purposes of this discussion, we arbitrarily select an expected false-positive count of n/7, resulting in no more than 5 additional nodes in expectation for a path length of over 32 nodes (approaching the diameter of the Internet) according to our theoretical model. Setting np/(1 − p) = n/7 in the bound above gives p = 1/8, which seems a reasonable starting point, and we turn to considering its effectiveness in practice.

7.1.2 Simulation results

In order to relate false-positive rate to digest table performance in real topologies, we have run extensive simulations using the actual network topology of a national tier-one ISP made up of roughly 70 backbone routers with links ranging from T-1 to OC-3. We obtained a topology snapshot and average link utilization data for the ISP’s network backbone for a week-long period toward the end of 2000, sampled using periodic SNMP queries, and averaged over the week.
We simulated an attack by randomly selecting a source and victim, and sending 1000 attack packets at a constant rate between them. Each packet is recorded by every intermediate router along the path from source to destination. A traceback is then simulated starting at the victim router and (hopefully) proceeding toward the source. Uniformly distributed background traffic is simulated by selecting a fixed maximum false-positive rate, P, for the digest table at each off-path router. (Real background traffic is not uniform, which would result in slight dependencies in the false-positive rates between routers, but we believe that this represents a reasonable starting point.) In order to accurately model performance with real traffic loads, the effective false-positive rate is scaled by the observed traffic load at each router.

For clarity, we consider a non-transformed packet with only one source and one destination. Preliminary experiments with multiple sources (as might be expected in a distributed denial of service (DDoS) attack) indicate false positives scale linearly with respect to the size of the attack graph, which is the union of the attack paths for each copy of the packet. We do not, however, consider this case in the experiments presented here. (A DDoS attack sending identical packets from multiple sources only aids SPIE in its task. A wise attacker would instead send distinct packets from each source, forcing the victim to trace each packet individually.)

[Figure 9 plot. x-axis: Length of Attack Path (in hops); y-axis: Avg. Number of False Positives; curves: Theoretical bound, [100% util.] P=1/(8*degree), P=1/8, P=1/(8*degree).]

Figure 9: The number of false positives in a SPIE-generated attack graph as a function of the length of the attack path, for p = 1/8. The theoretical bound is plotted against three simulation results, two with false-positive rates conditioned on router degree, one without. For the two degree-dependent runs, one considered actual link utilization, while the other assumed full utilization. Each simulation represents the average of 5000 runs using actual topology and utilization data from a national tier-one ISP.

In order to validate our theoretical bound, we have plotted the expected number of false positives as a function of attack path length and digest table performance, np/(1 − p) as computed above, and compare it against the results of three simulations on our ISP backbone topology. In the first, we set the maximum digest table false-positive probability to P = p/d, as prescribed above. Figure 9 shows a false-positive rate significantly lower than the analytic bound. A significant portion of the disparity results from the relatively low link utilizations maintained by operational backbones (77% of the links in our data set had utilization rates of less than 25%), as can be seen by comparing the results of a second simulation assuming full link utilization. There remains, however, a considerable gap between the analytic bound and simulated performance in network backbones.

The non-linearity of the simulation results indicates there is a strong damping factor due to the topological structure of the network. Intuitively, routers with many neighbors are found at the core of the network (or at peering points), and routers with fewer neighbors are found toward the edge of the network. This suggests false positives induced by core routers may quickly die out as the attack graph proceeds toward less well-connected routers at the edge.
To examine the dependence upon vertex degree, we conducted another simulation. This time, we removed the false-positive rate’s dependence upon the degree of the router’s neighbors, setting the digest table performance to simply P = p (and returning to actual utilization data). While there is a marked increase in the number of false positives, it remains well below the analytic bound. This somewhat surprising result indicates that despite the analytic bound’s dependence on router degree, the hierarchical structure of ISP backbones may permit a relaxation of the coupling, allowing the false-positive rate of the digest tables, P, to be set independently of the degree, d, resulting in significant space savings.
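The comparison between the analytic bound and simulated tracebacks can be reproduced in miniature. The sketch below is not the simulator used for Figure 9 and does not use the ISP topology or utilization data. It assumes a hypothetical toy network (a 64 × 64 torus in which every router has degree d = 4), a straight attack path, fully utilized digest tables with P = p/d, and one fixed answer per queried router. All function names are illustrative.

```python
import random


def expected_bound(n, p):
    """Analytic bound from the text: at most n*p/(1-p) extra nodes in expectation."""
    return n * p / (1.0 - p)


def simulate_traceback(path_len, p, side=64, rng=random):
    """Count false-positive routers in one traceback on a toy side x side torus.

    Assumed setup: the attack path runs along row 0 starting at the victim
    (0, 0); every router has degree 4, so each digest table uses P = p / 4.
    """
    table_rate = p / 4.0
    path = {(x, 0) for x in range(path_len)}
    answers = {}          # each router gives one fixed answer for this packet
    graph = {(0, 0)}      # reconstructed attack graph, rooted at the victim
    frontier = [(0, 0)]
    while frontier:
        x, y = frontier.pop()
        neighbors = (((x + 1) % side, y), ((x - 1) % side, y),
                     (x, (y + 1) % side), (x, (y - 1) % side))
        for nbr in neighbors:
            if nbr not in answers:
                # On-path routers really saw the packet; others err with rate P.
                answers[nbr] = nbr in path or rng.random() < table_rate
            if answers[nbr] and nbr not in graph:
                graph.add(nbr)
                frontier.append(nbr)
    return len(graph - path)


def average_false_positives(path_len, p, runs=2000, seed=0):
    """Average false-positive count over several independent tracebacks."""
    rng = random.Random(seed)
    total = sum(simulate_traceback(path_len, p, rng=rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    p = 1 / 8
    for n in (8, 16, 32):
        avg = average_false_positives(n, p)
        print(f"path length {n}: simulated {avg:.2f}, bound {expected_bound(n, p):.2f}")
```

Even on this regular toy topology the simulated averages should fall below the bound, because on-path routers have fewer than d off-path neighbors left to query. The damping attributed to hierarchical ISP topologies in the text is a separate effect that this sketch does not model.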