Middleboxes No Longer Considered Harmful

Michael Walfish, Jeremy Stribling, Maxwell Krohn, Hari Balakrishnan, Robert Morris, and Scott Shenker∗
MIT Computer Science and Artificial Intelligence Laboratory
http://nms.csail.mit.edu/doa

Abstract

Intermediate network elements, such as network address translators (NATs), firewalls, and transparent caches are now commonplace. The usual reaction in the network architecture community to these so-called middleboxes is a combination of scorn (because they violate important architectural principles) and dismay (because these violations make the Internet less flexible). While we acknowledge these concerns, we also recognize that middleboxes have become an Internet fact of life for important reasons. To retain their functions while eliminating their dangerous side-effects, we propose an extension to the Internet architecture, called the Delegation-Oriented Architecture (DOA), that not only allows, but also facilitates, the deployment of middleboxes. DOA involves two relatively modest changes to the current architecture: (a) a set of references that are carried in packets and serve as persistent host identifiers and (b) a way to resolve these references to delegates chosen by the referenced host.

1 Introduction

The Internet’s architecture is defined not just by a set of protocol specifications but also by a collection of general design guidelines. Among the architecture’s original principles [12] are two tenets at the network layer (i.e., IP layer) that are still widely valued, but are nonetheless often disobeyed in the current Internet:

#1: Every Internet entity has a unique network-layer identifier that allows others to reach it. During the Internet’s youth, every network entity had a globally unique, fixed IP address. However, the emergence of private networks and host mobility, among other things, ended the halcyon days of unique identity and transparent reachability.
Now, many Internet hosts have no globally unique handle that serves to direct packets to them.

#2: Network elements should not process packets that are not addressed to them. We call this tenet “network-level layering”, and it implies that only a network element identified by an IP packet’s destination field should inspect the packet’s higher-layer fields. This decades-old guideline has become an empty commandment, as firewalls, network address translators (NATs), transparent caches, and other widely deployed network elements use higher-layer fields to perform their functions.

∗Affiliation: UC Berkeley and ICSI.

That these tenets are routinely violated is not merely an Internet legalism. The inability of hosts in private address realms to pass handles allowing other hosts to communicate with them has hindered or halted the spread of newer protocols, such as SIP [24] and various peer-to-peer systems [18]. Layer violations lead to rigidity in the network infrastructure, as the transgressing network elements may not accommodate new traffic classes. The hundreds of IETF proposals for working around problems introduced by NATs [54], firewalls, and other layer-violating boxes are compelling evidence that middleboxes (as such hosts are collectively known) and the Internet architecture are not in harmony [8]. Indeed, because middleboxes violate one or both tenets above, Internet architects have traditionally reacted to them with disdain and despair.

We take a different view. Rather than seeing middleboxes as a blight on the Internet architecture, we see the current Internet architecture as an impediment to middleboxes. We believe such intermediaries, as we will call them, exist for important and permanent reasons, and we think the future will have more, not fewer, of them.

The market will continue to demand intermediaries for various reasons.
NATs maintain and bridge between different IP spaces.1 Firewalls and other boxes that intercept unwanted packets will be increasingly needed as attacks on end-hosts grow in rate and severity. Since even sophisticated users have difficulty configuring PCs to be impervious to attack, we believe that users would want to outsource this protection to a professionally managed host (one not physically interposed in front of the user) that would vet incoming packets. Under the current architecture, such outsourcing to “off-path” hosts requires special-purpose machinery and extensive manual configuration.

1Even if the move to IPv6 accelerates, some IPv4 networks will remain. Moreover, private address realms give some protection against certain types of network attacks. Hence, we do not think private IP spaces are a temporary inconvenience that will soon end.

Intermediaries can also increase
performance through, for example, caching and load balancing. Commercial service providers will continue to take advantage of such performance-enhancing intermediaries, disregarding architectural purity.

Thus, we have a fundamental conflict: although intermediaries offer clear advantages, they run afoul of the two tenets above, which causes harm and makes deploying and using intermediaries unnecessarily hard. Our goal, therefore, is an architecture hospitable to intermediaries, specifically one that allows intermediaries to abide by the two tenets, to avoid other architectural infractions, and to retain the same functions as today. Such an architecture would let a variety of middleboxes be deployed while also letting end-system protocols evolve independently and quickly.

Our architecture, which we call the Delegation-Oriented Architecture (DOA), is based on two main ideas. First, all entities have a globally unique identifier in a flat namespace, and packets carry these identifiers. Second, DOA allows senders and receivers to express that one or more intermediaries should process packets en route to a destination. This delegation lets the resulting architecture embrace intermediaries as first-class citizens that are explicitly invoked and need not be physically interposed in front of the hosts they service. Globally unique identifiers and delegation have existed in previous work describing different architectures (e.g., i3 [57]); see §9 for details. This paper’s contribution is defining a relatively incremental extension to the Internet architecture, DOA, that coherently accommodates network-level intermediaries like NATs and firewalls. DOA requires no changes to IP or IP routers but does require changes to host and intermediary software. However, these changes are modular, and current applications can be easily ported.
We illustrate DOA with two examples: first, “network-extension boxes”, which are analogous to today’s NATs in their establishment of private addressing realms but do not obscure hosts’ identities, and second, “network filtering boxes”, which are analogous to today’s firewalls but do not violate network-level layering and need not be topologically in front of the hosts they service. Our goal is to show how our architecture easily accommodates these boxes.

Scope and Limitations

DOA is based on a subset of the architecture in a previous paper [3]. That position paper touches on various issues (including mobility, multi-homing, network-level middleboxes, and application-level middleboxes) with scant attention to design details or implementations. In an attempt to bring some of those nebulous architectural mutterings into focus, this paper concentrates exclusively on network-level intermediaries and ignores their application-level counterparts.2 This paper does not discuss mobility and multi-homing scenarios either (though DOA, because it separates location and identity, could, with modest extensions, handle those scenarios). Given our limited focus, DOA should be viewed not as a comprehensive architecture but rather as an architectural component designed to address network-layer middleboxes. Its design presumes IPv4 at the network layer, but DOA is also compatible with, and useful for, IPv6.

The final limitation we mention is that some people want to deploy tenet-violating middleboxes (e.g., a censorious government that silently filters packets entering and exiting national borders) and that DOA can neither prevent such architecturally suspect middleboxes nor mitigate their damage.

2 Background

This review of common network-layer middleboxes is limited to the two we build under DOA (NATs and firewalls) and to a subset of their drawbacks; for a complete review, see [8, 18, 23, 38, 55].
Although NAT and firewalling are often combined in one box, these two functions are logically separate.

2.1 NAT and NAPT

Network Address Translation (NAT) and Network Address Port Translation (NAPT) [54] have several uses. For the purposes of this paper, we assume the following common scenario: a NAT or NAPT box bridges two address realms, at least one of which is private. Private addresses are unique within an address realm but ambiguous between address realms [46]; public addresses are globally unique and reachable from all Internet hosts. The hosts in the private realm are said to be behind the box. Packets destined for hosts behind the box are said to be inbound. The difference between NAT and NAPT is that NATs do not look at fields beyond the IP header. We adopt the convention of referring to both NAT and NAPT as “NAT”, though our description focuses on NAPT (the more common of the two today); for simplicity, we assume that NAPTs have only one external IP address.

2The basic architectural ideas can be illustrated with network-level intermediaries. At the application level, one must consider how applications are structured and named, a topic outside this paper’s scope [3].

People deploy NATs for two reasons:

• Convenience and Flexibility: Private addressing realms allow people to administer a set of hosts without having to obtain public IP addresses for each.

• Security: Since hosts behind the NAT do not have global identities, a host outside the private realm cannot address the hosts in the private realm or express that traffic should go to them, which protects them from unwanted traffic. Also, by default (i.e., without manual configuration), a NAT allows only inbound
traffic that is part of a connection initiated by a host behind the NAT.

The main operations performed by a NAT are: (1) dynamically allocating a source port at its public IP address when a host behind it initiates a TCP connection or sends a UDP packet; and (2) rewriting IP address and transport-layer port fields to demultiplex inbound packets to the hosts behind the NAT and to multiplex outbound packets over the same source IP address. NATs violate both tenets in §1. First, a NATed host’s conception of its identity (namely its IP address) is a private address and thus is not a handle that it can pass around to allow other network entities to reach it. Second, NATs’ modification of port fields violates tenet #2.

NATs cause the following additional problems:

• In order for a server behind a NAT to receive unsolicited inbound packets sent to a given destination port, one must statically configure the NAT with instructions about packets with that destination port. This manual step is called hole punching and requires administrative control over the NAT. The amount of manual configuration increases when a series of NATs separate a server from the public Internet, creating a tree of private address spaces.3 In this case, one must not only configure each of the NATs in the tree but also coordinate among them; e.g., each globally reachable Web server in a given tree of NATs must get traffic on a different port on the outermost NAT’s public IP address. (By outermost, we mean “connected to the globally reachable Internet”.)

• Hosts behind the same NAT cannot simultaneously receive traffic sent to the same TCP port number on the NAT’s public IP address. However, some applications require traffic on a specific port; e.g., IPSEC expects traffic on port 500 [44], so only one host within a tree of NATs can receive Virtual Private Network (VPN) [21] service.
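The two NAT operations and the port-conflict problem described above can be illustrated with a minimal sketch. It is not taken from the paper: the class name `Napt`, its methods, the starting port 40000, and the example addresses are all hypothetical, and the model ignores protocols, timeouts, and remote endpoints. It assumes a NAPT with a single external IP address, as the text does.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class Napt:
    """Toy NAPT with one external IP address (hypothetical, for illustration only)."""
    public_ip: str
    next_port: int = 40000  # arbitrary starting point for dynamically allocated ports
    out_map: Dict[Endpoint, int] = field(default_factory=dict)  # private endpoint -> public port
    in_map: Dict[int, Endpoint] = field(default_factory=dict)   # public port -> private endpoint

    def outbound(self, src: Endpoint) -> Endpoint:
        """Allocate a public port on first use, then rewrite the source endpoint."""
        if src not in self.out_map:
            # Skip ports that are already taken by a static (hole-punched) mapping.
            while self.next_port in self.in_map:
                self.next_port += 1
            port = self.next_port
            self.next_port += 1
            self.out_map[src] = port
            self.in_map[port] = src
        return (self.public_ip, self.out_map[src])

    def inbound(self, dst_port: int) -> Optional[Endpoint]:
        """Demultiplex an inbound packet by destination port; None means 'drop'."""
        return self.in_map.get(dst_port)

    def punch_hole(self, public_port: int, private: Endpoint) -> None:
        """Static configuration that admits unsolicited inbound traffic on one port."""
        if public_port in self.in_map:
            owner = self.in_map[public_port]
            raise ValueError(f"port {public_port} already forwards to {owner}")
        self.in_map[public_port] = private


if __name__ == "__main__":
    nat = Napt("203.0.113.1")
    print(nat.outbound(("10.0.0.2", 5555)))  # source rewritten to the public address
    print(nat.inbound(80))                   # unsolicited inbound traffic has no mapping
    nat.punch_hole(500, ("10.0.0.2", 500))
    try:
        nat.punch_hole(500, ("10.0.0.3", 500))
    except ValueError as err:
        print("conflict:", err)
```

In this model, the second `punch_hole` call fails because the public port is the only demultiplexing key, which is the reason two hosts behind the same NAT cannot both receive traffic on port 500.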
2.2 Firewalls

A firewall blocks certain traffic classes on behalf of a host by examining IP-, transport-, and sometimes application-level fields and then applying a set of “firewall rules”. It must be on the topological path between the host and the host’s Internet provider, which we argue is unnecessarily restrictive. Today’s firewalls disobey tenet #2 because, by design, they must inspect many non-IP fields in each packet. Since firewalls by default distrust that which they do not recognize, they may block novel but benign traffic, even if the intended recipient wants to allow the traffic.

3Such series of NATs are not artificial; see §5.4 and Figure 4.

3 Architectural Overview of DOA

This section gives an overview of DOA; we defer design details to §4. We first list desired architectural properties that aid in gracefully accommodating intermediaries and then describe mechanisms to achieve those properties. The remainder of the section discusses how DOA extends the Internet architecture.

3.1 Desired Architectural Properties

(1) Global identifiers in packets: Each packet should contain an identifier that unambiguously specifies the ultimate destination. The Internet architecture, as originally conceived, did provide global identifiers in packets, but IPv4 addresses no longer meet the “global identifier” requirement. (IPv6 addresses, because they reflect network topology, are also unsuitable for us, as we elaborate below.) The purpose of a global identifier is to uniquely identify the packet’s ultimate destination to intermediaries in a way that is application-independent.

(2) Delegation as a primitive: Hosts should have an application-independent way to express to others that, to reach the host, packets should be sent to an intermediary or series of intermediaries. This primitive, called delegation, allows end-hosts or their administrators to explicitly invoke (and revoke) intermediaries. These intermediaries need not be “on the topological path”.
3.2 Mechanisms

EIDs: To achieve property (1), each host has an unambiguous endpoint identifier picked from a large namespace. Our design imposes the following additional requirements:

(a) The identifier is independent of network topology (ruling out IPv6 addresses and other identifiers with topology-dependent components, as in [42, 43]). With this requirement, hosts can change locations while keeping the same identifiers.

(b) The identifier can carry cryptographic meaning (ruling out human-friendly DNS names). We explain the purpose of this requirement later in this section.

To satisfy these requirements, we choose flat 160-bit endpoint identifiers (EIDs). A DOA header between the IP and TCP headers carries source and destination EIDs. Transport connections are bound to source and destination EIDs (instead of to source and destination IP addresses as in the status quo). DOA borrows the idea of topology-independent EIDs from previous work, including Nimrod [34], HIP [39], and UIP [17].

EIDs are resolved . . .: DOA provides for delegation as a primitive by resolving EIDs. We presume a mapping service, accessible to Internet hosts, that maps
EIDs to some target specified by the EID owner. This resolution has two flavors:

• ... to IP addresses: In order to communicate with an end-host identified by an EID, a prospective peer uses the mapping service to resolve the EID to an IP address. This indirection creates a way for a host to specify that prospective peers should direct their packets to a given delegate: the host has its EID resolve to the IP address of the delegate.

• ... to other EIDs: More generally, an EID can resolve to another EID, allowing an end-host to map its EID to a delegate's identity; if an end-host's EID had to map to the delegate's IP address (or any other topology-dependent identifier), the end-host would have to update the mapping whenever the delegate's location changed. An EID can also resolve to a sequence of EIDs, each of which identifies an intermediary specified by the host. This sequence is carried in packets, yielding a loose source route in identifier space.⁴ This option is reminiscent of i3's stacked identifiers.

Thus, our design requires an EID resolution infrastructure. We wish the management of this infrastructure to be as automated as possible, which is why we had requirement (b), above: automated management is easier if the EIDs are vested with cryptographic meaning [36]. The resolution infrastructure must scalably support a put()/get() interface over a large, sparse, and flat namespace. Distributed hash tables (DHTs) [2, 14, 49, 62] give exactly this capability, but any other technology that offers this capability would also suffice. DNS's "resolve-your-own-namespace" economic model cannot be used here, but there are plausible scenarios for the economic viability of a DHT-based resolution infrastructure [61].

We have not yet mentioned sender-invoked intermediaries. Under DOA, senders invoke intermediaries by putting into packets additional identifiers beyond the identifiers that resulted from resolving the receiver's EID.
Sender-invoked intermediaries receive little attention in this paper but are part of DOA's design.

⁴ In this case, transport connections are bound to the ultimate endpoint, which is identified by the last EID in the sequence.

3.3 DOA and the Two Tenets

We elaborate on our earlier claim that DOA allows intermediaries to abide by the two tenets in §1. Because they are location-independent and drawn from a massive namespace, EIDs can globally and unambiguously identify hosts, satisfying tenet #1. As a result, a network element can reply to the source of a packet by sending to the location given by the resolution of the source EID.

To obey network-level layering (tenet #2), network elements need only follow normal IP layering rules, as follows. If an IP packet arrives at a network element and the packet's destination IP address is not the network element's, then the element may change nothing in the packet besides per-hop fields. (However, elements may drop packets based on information in the IP header, which permits functions such as ingress and egress filtering.) If, on the other hand, the packet's destination IP address matches the network element's, there are two cases: (1) the destination EID in the DOA header matches the network element's EID (i.e., the packet has reached its destination); or (2) these EIDs do not match, which means the element is a delegate. In the latter case, network-level layering implies that the allowed packet operations are up to the entities in the delegation relationship. Note that this last claim satisfies network-level layering but allows violations of higher-level equivalents, e.g., an explicitly addressed firewall that looks at application payloads upholds the rules just given but flouts application-level layering. In general, this paper claims that DOA improves on the status quo by restoring network-level layering but does not insist that intermediaries adhere to higher-level layering. Why not?
Higher-level layers define how to organize host software, and one can imagine splitting host software among boxes using exotic decompositions. Defining both higher-level layering and an architecture that respects these higher layers is a problem that requires care and one we have left to future work. In the meantime, we believe that hosts invoking intermediaries should decide how best to split functions between them and their intermediaries.

We now discuss how the IP layering rules given above apply to specific intermediaries. Under DOA, NATs, which exist to bridge address realms, need not obscure host identity: as we describe in more detail in §5, DOA-based NATs may rewrite IP fields but will neither touch DOA fields that carry host identities nor overload transport-layer fields. Also, firewalls could be explicitly invoked, meaning that packets ending up at the firewalls would be addressed to the firewall. While these new firewalls (which we cover in §6) could certainly have outmoded policies, causing them to drop novel traffic classes just as today's firewalls do, they are not violating network-level layering because packets are addressed to them. One result of this explicit addressing is that the firewall's invocation is under users' (or their administrators') control, so the user (or administrator) could decide to have packets destined for it sent to another firewall, one with better-suited policies.

3.4 DOA and Internet Evolvability

The preceding point is more general than firewalls and is important for the Internet's flexibility and evolvability. Today, there is only one easy way to deploy a middlebox: putting it "on the path". Of course, under DOA, some
boxes would have to be on the topological path to enforce physical security (e.g., for denial-of-service protection); §6.4 describes how DOA accommodates these on-path boxes. However, DOA, with its flexible and application-independent invocation primitive, also gives users or their administrators the option to outsource functionality. Thus, under DOA, fewer intermediaries would need to be physically interposed, and users, no longer limited to the capabilities of the boxes in front of them, could avail themselves of a menu of services.

As a result, we believe that DOA could permit the rise of a competitive market in professionally managed intermediary services such as firewalls. Delegation and resolution are precisely what is necessary for such a market: the ability for users to select a provider and to switch providers. Because users would have a choice, they could seek the intermediary service that best suited their needs, and because these services would be professionally managed, they could keep up with the rapid pace of application innovation. Thus, we see DOA as contributing to the Internet's ability to evolve.

While we believe in its benefits, it is not clear that DOA is necessary for these new functions. In fact, we conjecture that even for those applications and intermediaries that one can seemingly build only under DOA, someone with enough imagination and fortitude could achieve equivalent functionality under the status quo, but not without running afoul of a basic tenet of the Internet architecture. We do suspect that the mechanisms of DOA will help new Internet functionality to evolve, but ultimately we believe our contribution is not novel function but rather novel architecture: making a class of network intermediary functions easier to build and reason about, and less likely to cause harm.

Figure 1: High-level view of DOA design.
A natural question is how DOA relates to the canonical end-to-end argument [51], which is often interpreted as a warning against intermediaries. The central claim of the end-to-end argument is that application intelligence is best implemented on end-hosts and not "in the network" because intelligence in the network leads to inflexibility and because end-hosts know best what functions they need. At a high level, DOA upholds this vision: the explicit invocation of intermediaries means that intelligence is not stuck in the network and that end-hosts can invoke the intermediaries that best serve them.

4 Detailed DOA Design

Given the preceding general description of DOA, we now present details of the design. Figure 1 shows the DOA components and the interfaces between them.

4.1 Header Format

DOA packets are delivered over IP, with the IP protocol field set to a well-known value. An example DOA header, with no extensions, is shown in Figure 2; the header length is measured in four-byte words, the protocol field is the transport-level protocol (e.g., TCP, UDP) used by the packet, and the length field gives the DOA packet's total length (including the DOA header but not the IP header) in bytes. TCP and UDP pseudo-checksums include the EIDs where IP addresses are used today (since transport logically occurs between two entities, each identified by an EID). The DOA header is extensible (e.g., the remote packet filter presented in §6 extends the basic DOA header).

Figure 2: Example DOA header with no stacked identifiers. Fields: 4-bit version, 4-bit header length, 8-bit protocol, 16-bit total length, 160-bit source EID, and 160-bit destination EID (44 bytes in total).

4.2 Resolution and Invoking Intermediaries

A DOA host wishing to send a packet to a recipient obtains the recipient's EID e out-of-band (e.g., by resolving the recipient's DNS name to e).
The sender then uses the EID resolution infrastructure (which is discussed in §3.2 and which we base on DHTs) to retrieve an erecord, depicted in Figure 3. An erecord's fields are as follows: the EID field is the EID being resolved; the Target field is described in the next paragraph; the Hint field is optional information whose use we illustrate in §5; and the TTL field, like DNS's TTL, is a hint indicating how long entities should cache the erecord. DOA presumes that EID owners (or administrators acting on their behalf) maintain and possibly periodically refresh⁵ the DHT's copy of their erecord.

⁵ Some DHTs, like OpenDHT [29], store only soft state, requiring EID owners to do refreshes.
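To make the erecord fields and the TTL caching hint concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the class and method names (`ERecord`, `ERecordCache`, `store`, `lookup`) are hypothetical, the TTL is assumed to be in seconds, and an IP address is represented as a plain string.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class ERecord:
    """The four erecord fields named in the text (types are assumptions)."""
    eid: bytes                             # the EID being resolved (160 bits)
    target: Union[str, Tuple[bytes, ...]]  # an IP address, or one or more EIDs (EID+)
    hint: Optional[str]                    # optional information (see §5)
    ttl: float                             # caching hint; seconds are an assumption


class ERecordCache:
    """TTL-based cache keyed by EID (hypothetical helper, not from the paper)."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, Tuple[ERecord, float]] = {}

    def store(self, record: ERecord, now: float) -> None:
        # Remember the record together with its absolute expiry time.
        self._entries[record.eid] = (record, now + record.ttl)

    def lookup(self, eid: bytes, now: float) -> Optional[ERecord]:
        entry = self._entries.get(eid)
        if entry is None:
            return None
        record, expires_at = entry
        if now >= expires_at:
            # Expired: drop it so the caller falls back to a DHT get().
            del self._entries[eid]
            return None
        return record
```

Passing `now` explicitly keeps the sketch deterministic; a real host would presumably read a monotonic clock instead.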
EID:    0x345ba4d...
Target: EID+ or IP address
Hint:   e.g., IP address
TTL:    time-to-live, a caching hint

Figure 3: The erecord.

Recall from §3.2 that EIDs can either resolve to IP addresses (inducing what we call EID-to-IP mappings) or to one or more EIDs (inducing EID-to-EID+ mappings). If the Target field of the erecord contains only an IP address i, then, as described in §3.2, the sender simply transmits a packet whose destination IP address is i and whose destination EID is e. In this case, the EID owner may or may not be directing potential senders to a delegate, but the semantics are the same: the EID owner is saying "to reach me, send your packet there".

If, on the other hand, the Target field of the erecord contains one or more EIDs, then the recipient is expressing its wish that the packet transit one or more intermediaries before reaching the recipient. In this case, the semantics are "to reach me, send your packet to these intermediaries in sequence". The sender would resolve the first EID in the series to an IP address j (perhaps via intermediate resolutions to other series of EIDs, each of which would be injected into the original series in the logical order) and send the packet to j. This stack of EIDs is carried in the DOA header; transport connections are bound to the last EID, which identifies the ultimate destination. (The design, but not our implementation, lets an EID resolve to multiple IP addresses; the multiplicity reflects a multi-homed host or an anycast situation in which a set of hosts are equivalent for the erecord owner's purposes. Similarly, each EID in the Target field could really be a set of EIDs, again representing equivalent hosts.)

To send a packet back to the source, the receiver executes the steps just described to resolve the sender's EID, f.
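The resolution steps just described can be sketched as a short loop. This is only an illustration under stated assumptions: `resolve_first_hop`, the simplified `ERecord`, and the `get` callable are hypothetical names; EIDs are shown as short strings for readability; and the splice rule (the series named by an erecord is placed in front of the EID that named it, so the ultimate destination stays last) is one reading of "injected into the original series in the logical order", not something the text pins down. The lookup bound is likewise an added safeguard against cyclic mappings.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass(frozen=True)
class ERecord:
    eid: str
    # Either an IP address (str) or a series of one or more EIDs (tuple).
    target: Union[str, Tuple[str, ...]]


def resolve_first_hop(
    dest_eid: str,
    get: Callable[[str], ERecord],
    max_lookups: int = 16,
) -> Tuple[str, List[str]]:
    """Return (first-hop IP address, EID stack to carry in the DOA header)."""
    # The last element is always the ultimate destination.
    stack = [dest_eid]
    for _ in range(max_lookups):
        record = get(stack[0])
        if isinstance(record.target, str):
            # EID-to-IP mapping: send the packet to this address.
            return record.target, stack
        # EID-to-EID+ mapping: splice the series in front (assumed rule).
        stack = list(record.target) + stack
    raise RuntimeError("resolution did not reach an IP address")
```

With a mapping in which `e` resolves to the series `(a, b)`, `a` resolves to `(c,)`, and `c` resolves to an IP address, the function returns that address together with the stack `[c, a, b, e]`. In a deployment, `get` would stand in for a cache lookup backed by the DHT's get() operation.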
The receiver cannot simply use the source IP address in the original packet as the destination IP address in the reply packet because f may resolve to a different IP address (e.g., f's host sends packets directly but wants packets to it sent through an intermediary).

To spare the server the burden of a DHT lookup, the client can send its erecord as an optimization. (The client may have to send more than one erecord since the client's EID may resolve to a chain of EIDs before being resolved to the IP address needed by the server.) Also, DOA hosts use the erecord's TTL to maintain a TTL-based cache of EID-to-IP and EID-to-EID+ values, thus avoiding a DHT lookup for every packet.

The erecord and accompanying machinery exist to support receiver-invoked intermediaries. Senders invoke additional intermediaries by pushing the EIDs of the intermediaries onto an identifier stack in the DOA header.

4.3 Security and Integrity

Because identities (namely, EIDs) are separate from locations (namely, IP addresses), the following requirement arises under DOA: the mapping from a given EID to its target must be correct, i.e., either resolving an EID, or using an erecord directly sent by a host, must yield the IP address intended by the EID owner or by the EID owner's delegates. Specifically, DOA must satisfy the following properties:

1. Anyone fetching an erecord must be able to verify that the EID owner created it.
2. Only the owner of an EID may update the corresponding erecord in the DHT.
3. When a host sends its erecord to another host without using the DHT, the sending host must not be able to forge the erecord.

To uphold these properties, DOA uses self-certification [36]: EIDs must be the hash of a public key, and the erecord is signed with the corresponding private key.
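The self-certifying binding can be illustrated with a small sketch. Several things here are assumptions rather than details from the text: SHA-1 is used only because its digest is 160 bits, matching the EID width; the verifier is assumed to have the owner's public key available alongside the erecord; and `verify_signature` is a caller-supplied, hypothetical callable standing in for a real public-key signature check, which Python's standard library does not provide.

```python
import hashlib
from typing import Callable

# (public_key, signed_bytes, signature) -> True if the signature is valid.
Verifier = Callable[[bytes, bytes, bytes], bool]


def eid_from_public_key(public_key: bytes) -> bytes:
    # SHA-1 is an assumption, picked only because its digest is 160 bits.
    return hashlib.sha1(public_key).digest()


def erecord_is_authentic(
    eid: bytes,
    public_key: bytes,
    signed_bytes: bytes,
    signature: bytes,
    verify_signature: Verifier,
) -> bool:
    # Step 1: the public key must hash to the EID in question.
    if eid_from_public_key(public_key) != eid:
        return False
    # Step 2: the erecord must be signed with the matching private key.
    return verify_signature(public_key, signed_bytes, signature)
```

The two steps are deliberately separate: the hash check ties the key to the identifier without any certificate authority, and the signature check ties the record contents to the key.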
When a host either performs a get() operation on the DHT, resulting in an erecord, or else receives an erecord directly from a purported EID owner, the host must check that the erecord is signed with the private key whose corresponding public key was hashed to create the EID in question. DHT nodes also perform this check before accepting erecords. For more details, including how EID owners may update their public keys without changing their EIDs, see [61]; we adopt the mechanisms described there.

With the above properties satisfied, erecords cannot be forged, but senders can still spoof source EIDs (i.e., put the wrong source EID field in the packet). This attack is like spoofing a source IP address today (except that ingress and egress filtering, which help guard against IP address spoofing, are not applicable to EIDs): successful attacks do the same damage, and both attacks are detectable under two-way communication. For example, if a TCP client tries to spoof a source EID to a TCP server, when the server looks up the source EID (or uses the signed erecord supplied by the client), the server gets the correct (not fake) IP address for that EID, so when the server replies to the IP address, the host at that address will not complete the 3-way handshake.

Security of the DHT itself is a topic outside the scope of this paper. We briefly observe that DHT nodes cannot forge erecords but can return old versions of erecords. A way to guard against this attack by consulting multiple DHT nodes, instead of one, is mentioned in [14].

Also, we note that while IP source routing creates security problems, DOA's loose source routes of EIDs do not inherit these difficulties. With IP source routing, receivers reverse the source route to reply to a sender, which allows an adversary to carry out a man-in-the-middle attack by placing its IP address in a forged source route. Under DOA, however, hosts do not reverse the loose source route to reply to a sender.

4.4 Host Software

We now describe the software interface that a production DOA deployment would expose. Our prototype implementation differs from this description; see §7.1.

DOA software would run in the kernel and be exposed to applications with the Berkeley sockets API [37], which can extend to EID-based identification. Applications would open a new socket type, PF_DOA (in analogy with PF_INET, used by today's IPv4-based applications), and pass to the API a new data structure, the sockaddr_eid, which holds an EID and port (just as the sockaddr_in, which today's IP-based applications use, holds an IP address and port). Some of the socket calls, such as connect() and sendto(), might cause the DOA software, depending on the state of its caches, to issue one or more DHT lookups to resolve the EID into potentially intermediate EIDs and also an IP address. One could port today's applications by substituting sockaddr_eid for sockaddr_in in the code, though client applications would need additional logic to get a server's EID, perhaps via a DNS lookup.

For example, client TCP applications would call connect(), supplying a sockaddr_eid that contained an EID and port, both of which the application had obtained out of band. Similarly, TCP server applications would call accept(), getting back the EID and port of the initiating client. To reply to the client, the server's DOA software would resolve the client's EID to an IP address i and address reply packets to i at the IP layer.

For bootstrapping, DOA hosts would be configured with the EIDs and IP addresses of one or more of the DHT nodes, in analogy with how today's hosts learn the IP address of a DNS resolver (via hardcoding or DHCP).
On boot up, the DOA software would insert into the DHT the host’s erecord (which could contain an EID-to-EID+ or EID-to-IP mapping, depending on the host’s configuration) and would refresh the mapping periodically or in response to host configuration changes.

4.5 Limitations

DOA hosts cache erecords, so hosts may have stale information about prospective peers. Also, two DOA peers in a TCP session resolve each other’s EIDs only once, at the start of the session, so hosts cannot change locations without breaking existing connections. DOA could overcome this limitation if it were extended with a signaling mechanism, as in [39, 53], that allows hosts to notify peers of IP address changes. Finally, an EID owner cannot change which intermediaries are invoked based on who is trying to communicate with it.

5 Network Extension Boxes Under DOA

This section and the next describe example intermediaries under DOA. In the next section (§6), our focus is on filtering packets and how to move this function “off-path”. In this section, we show how the DOA framework accommodates boxes that bridge between different IP address spaces and also simplifies the use of these boxes. Under the status quo, these boxes are known as NATs but would be reincarnated under DOA as tenet-upholding Network Extension Boxes (NEBs).

We first consider three usage scenarios for NEBs (§5.1), then give our general approach, including a short discussion of architectural coherence (§5.2), and then discuss the benefits of this approach (§5.3). One of the benefits, automatically exposing hosts behind NEBs, is particularly useful when NEBs are cascaded (§5.4). We present several mechanisms for achieving automatic configuration (§5.5) and require that they work when there are multiple levels of NEB. We conclude the section with a discussion (§5.6).
5.1 Scope

The following NEB scenarios reflect reasons for deploying NATs today (§2.1) and are ordered by the degree of access control:

(a) A host behind the NEB is accessible on all ports. The NEB creates a separate addressing realm but does not control access. Under this scenario, which corresponds to the “Convenience and Flexibility” reason for deploying a NAT today (§2.1), many hosts within an organization can be reachable as first-class members of the Internet, even if the organization has only one IP address.

(b) A host behind the NEB is accessible on configured ports, and the NEB blocks unsolicited traffic to the host on the other ports. This scenario, which reflects both reasons for deploying a NAT (§2.1), is analogous to, e.g., today setting up a Web server behind a NAT and configuring the NAT to send all packets with destination port 80 to the Web server.

(c) A host behind the NEB is accessible on no ports, i.e., the host can only receive packets associated with connections it has initiated. This scenario, which is principally driven by the “Security” reason for deploying a NAT (§2.1), is the default under NAT today.

We expect that under DOA, scenario (b), a mix of access control and exposure, would be most common. However, for clarity, we focus on scenario (a) and return to scenarios (b) and (c) at the end of the section (§5.6).

5.2 Approach

NEBs preserve packets’ DOA headers and use the destination EID field as a demultiplexing token. For example,
the NEB could maintain an EID-to-IP table, look up the destination EIDs of incoming packets, and then use the results of these lookups to rewrite the destination IP addresses. There are other ways to demultiplex; we cover them in §5.5.

This approach upholds the two tenets stated earlier. Tenet #1 holds because an end-host behind a NEB can pass its EID to others, who can then use this handle to direct packets to the given host. As mentioned in §3.3, to obey network-level layering (tenet #2), NEBs may only rewrite fields in a packet if the packet is addressed to the NEB. Since NEBs, like today’s NATs, have to rewrite both the destination IP addresses of inbound packets (to demultiplex them) and the source IP addresses of outbound packets (to make them appear as if they originated at the NEB), the discussion in §3.3 implies that both inbound and outbound packets must be addressed to the NEB at the IP layer.

However, this approach, in pure form, makes the NEB resolve the destination EIDs of outbound packets. As a practical matter, sources of outbound packets could do the resolution and put the resulting IP address somewhere in the packet, thereby sparing the NEB this resolution burden. The source could even put the resulting IP address in the destination IP address field; at the IP layer, then, outbound packets would look alike under NEB and NAT. This modified approach, which technically violates the rules in §3.3 but is consistent with the spirit of the tenets because the violation is under the control of the end-host, is what we adopt.

5.3 Benefits

Upholding the two tenets results in the following benefits, some of which solve the problems stated in §2.1.

End-to-end communication. Communication is logically between two EIDs. Thus, protocols can uniquely identify hosts.

Ports are not overloaded. Not using the destination port as a demultiplexing token lets multiple hosts behind a NEB receive packets sent to the same destination port.

VPNs.
Getting VPNs to work through NATs is cumbersome and complicated [44]. The difficulties under the status quo result from NATs rewriting both ports and IP addresses. Under DOA, NEBs do not rewrite ports, and the state associated with encrypted tunnels could be bound to EIDs, not IP addresses (see footnote 6).

Automatic configuration. Under DOA, the process of exposing a host behind a NEB can be automated. When NEBs are cascaded, a scenario covered in the next section, this automation is particularly useful, since that scenario is particularly problematic under the status quo (§2.1).

Footnote 6: Much of the HIP work [40] focuses on such binding of IPsec state to cryptographically imbued EIDs.

5.4 Cascaded NEBs

The scenario of multiple address realms between a given host and the rest of the Internet is becoming more frequent. Consider the following example, depicted in Figure 4: an individual runs a virtual host (using, e.g., VMWare [60]) that runs behind a NAT on the physical host (such NATing of virtual hosts is common). The physical host is in turn a member of a home network that is all behind a single NAT, which is connected to a broadband provider. The link from the broadband provider, owing to the provider’s operations, is itself NATed, making, altogether, three levels of NAT between the virtual host and the global Internet.

Figure 4: A tree of NATs.

We now cover protocols for automatically configuring NEBs to expose servers; we require the protocols to work when servers are behind multiple levels of NEB.

5.5 Secure Automatic Configuration

A protocol for configuring NEBs to expose servers must satisfy three requirements. First, the protocol must tell the end-host what to put in its erecord, since an end-host separated from the global Internet by levels of NEB has no a priori knowledge about the IP addresses of NEBs between that end-host and the Internet.
Second, the protocol must establish state, either in NEBs or in the EID resolution infrastructure, that allows NEBs to use the destination EID field in packets as a demultiplexing token for rewriting the destination IP address field. Third, this state must correspond to the wishes of the actual EID owner, rather than of an impostor trying to divert the EID owner’s traffic. This focus on authenticity is warranted because passing unprotected protocol messages through levels of NEB could be problematic. For example, an upstream provider cannot trust NEBs administered by its customers, and end-users cannot trust each other’s NEBs to correctly propagate control or data messages. Also, NEB networks, like today’s NATs, would often be constructed over wireless links, which are susceptible to eavesdropping and tampering. In what follows, we assume that a NEB trusts only the NEB directly upstream of it (called its parent); that NEBs
and end-hosts know the EID of their parent; and that all links in the NEB network are vulnerable to eavesdropping, tampering, and arbitrary data injection.

We now give three mechanisms, each using a different kind of EID resolution, that meet the requirements above. We implemented the third one; see §7.2.

5.5.1 EID maps to EID

Each NEB and end-host creates a mapping in the global EID resolution infrastructure from its EID to its parent’s EID; in other words, NEBs and end-hosts use the delegation primitive to say, “to reach me, send your packet to my parent’s EID”. Also, each NEB holds a mapping from its children’s EIDs to its children’s internal IP addresses.

Control plane. Assume an end-host with EID e_0 must traverse NEBs with EIDs e_1 through e_n before reaching the Internet. The end-host inserts a mapping from its EID (e_0) to its parent’s EID (e_1) into the global EID resolution service. The end-host also sends a message to e_1 informing it of a mapping between its EID (e_0) and its IP address (i_0). All other internal NEBs in the chain (e_1 through e_{n-1}) use the same protocol. The outermost NEB uses the global EID resolution infrastructure to map its EID (e_n) to its IP address (i_n), which is globally reachable. A NEB with EID e_{j+1} should only accept an EID-to-IP mapping of the form ⟨e_j, i_j⟩ if the mapping is authentic, i.e., if it is signed by the private key corresponding to e_j; performing this check might require e_j to send e_{j+1} its public key (which should hash to e_j).

This approach, as just described, is vulnerable to replays of ⟨e_j, i_j⟩ mappings. Such replays would allow the wrong end-host (one that is later assigned IP address i_j) to redirect e_j’s traffic to it. We show how one might protect against these attacks in §5.5.3.

Data plane.
Assuming the end-host and intermediate NEBs all initialize successfully, a remote client can send data packets to the end-host (with EID e_0) by using the EID resolution infrastructure to map e_0 to e_1, e_1 to e_2, and so on, up the NEB chain. The last EID lookup maps e_n to the IP address i_n. The client then stacks the identifiers e_0 through e_n in its packets and sends the packet to IP address i_n. Once the packet reaches the outermost NEB (e_n), the NEB pops the top EID off the stack to find that e_{n-1} is the packet’s next hop. The NEB then consults its routing table to map EID e_{n-1} to IP address i_{n-1}, rewrites the packet’s destination IP address to i_{n-1}, and forwards the packet. This process continues until the packet reaches its eventual destination, e_0.

Figure 5: NEB and DHT state after each DOA-RIP round. (The figure shows an end-host with EID e_0 and IP i_0 behind NEB 1 (EID e_1, IP i_1) and NEB 2 (EID e_2, IP i_2), with the messages ⟨e_2, i_2, r_2⟩ and ⟨e_1, i_1, r_1⟩ in round 1 and the mappings ⟨e_0 → i_0⟩, ⟨e_0 → i_1⟩, and ⟨e_0 → i_2⟩ after round 2.)

5.5.2 EID maps to EID and a Hint

Another approach uses the erecord’s Hint field, mentioned in §4.2, to relieve NEBs of state.

Control plane. The end-host inserts into the EID resolution infrastructure a mapping from its EID, e_0, to the EID, e_1, of its parent NEB; the erecord holding this mapping has in its Hint field the end-host’s internal IP address, i_0. The NEB e_1 likewise creates a mapping in the EID resolution infrastructure from its own EID to the EID, e_2, of its parent and puts its “outer” IP address, i_1, into the Hint field of the erecord. This process continues until the outermost NEB inserts a mapping from its EID, e_n, to its “outer” IP address, i_n.

Data plane. A remote host wishing to communicate with e_0 resolves e_0 to e_1, e_1 to e_2, ..., e_{n-1} to e_n, while remembering the Hints i_0, i_1, ..., i_n.
As with the previous mechanism, the remote host stacks the identifiers e_0 through e_n in its packets (and in this case also includes in the DOA header the IP addresses i_0 through i_n), then sends the packet to IP address i_n. Once the packet reaches the outermost NEB (e_n), the NEB pops the top EID and IP address off the stack to find that e_{n-1} with IP address i_{n-1} is the packet’s next hop, and the process continues.

5.5.3 EID maps to IP address

The previous two mechanisms require a prospective sender to do as many EID resolution infrastructure lookups as there are levels of NEB. An alternative, which we call DOA-RIP, allows senders to do a single resolution: from the EID, e_0, of the end-host to the IP address, i_n, of the outermost NEB.

Control plane. End-hosts and NEBs follow a two-round protocol, depicted in Figure 5. In the first round, the end-host (with EID e_0) sends an initialization message to its parent in the NEB tree; intermediate NEBs forward the message until it reaches the outermost NEB (with EID e_n). The outermost NEB creates a message
x_n = ⟨e_n, i_n, r_n⟩ (r_n is a random nonce to prevent replay attacks), signs x_n, and sends it to the NEB with EID e_{n-1}. Each NEB e_k (k < n) follows suit, appending the message x_k = ⟨e_k, i_k, r_k⟩ to x_{k+1}. When the end-host receives x_1, it verifies the message using e_1’s public key. This message is a route to the global Internet.

In the second round, the end-host creates a series of requests y_k = ⟨e_0, i_{k-1}, r_k⟩ for 1 ≤ k ≤ n; signs each y_k individually; concatenates all the y_k’s and appends its public key; and sends this message up the NEB chain. Each NEB e_k verifies y_k using e_0’s public key and signature. Each NEB further checks that r_k is in its cache and that r_k is the nonce it issued in the first round for EID e_0 (the NEB flushes r_k from its cache within a fixed number of seconds of issuing it; our implementation uses 10). If these checks succeed, the NEB flushes r_k, establishes a mapping ⟨e_0, i_{k-1}⟩, and propagates the request up the NEB tree. If all NEBs successfully establish the mapping, the end-host inserts into the EID resolution infrastructure a map from e_0 to i_n.

Data plane. To communicate with the end-host, remote clients first resolve e_0 to i_n and then send packets with destination IP address i_n and destination EID e_0, at which point the outermost NEB, and all succeeding NEBs in the chain, use their internal state to forward the packet to the end-host.

5.6 Discussion

Other scenarios. Though we focused on scenario (a) (from §5.1), the benefits noted above (in §5.3) apply equally to scenario (b). Two of the three mechanisms for automatic configuration also apply (the stateless NEB from §5.5.2 does not) with the one change that end-hosts, when making signed requests of parent or ancestor NEBs to add EID-to-IP mappings, need to add requests to open (or block) specific ports.
This type of automatic hole punching works under DOA, in contrast to the status quo, for three reasons: (1) DOA has a persistent notion of host identity, which allows NEBs to associate policies with hosts and remote network entities to identify hosts behind the NEB; (2) port fields are not overloaded under DOA, so internal nodes in the NEB tree do not have to coordinate among themselves, in contrast to the status quo wherein only one server in a tree of NATs can receive, e.g., traffic destined to port 80 on the outermost NAT’s public IP address; and (3) hosts can leverage the cryptographic properties of their identities to create signed messages saying “handle my packets like this”.

The benefits above, except automatic configuration, also apply to scenario (c). Although this scenario is the strictest access control NEBs offer, network administrators may still prefer NATs, since NATs, unlike NEBs, obscure the identities of the organizations’ end-hosts. Our response is that organizations today use NATs in part because they hide internal network topology. Since EIDs are independent of network internals, organizations might be looser about exposing EIDs than IP addresses.

Comparison of the mechanisms. Observe that the three mechanisms above are different ways to perform routing that offer different trade-offs between state held in the NEB and the degree of fate-sharing. With one of the mechanisms (§5.5.2), all information about EID-to-IP mappings is in the EID resolution infrastructure, which frees the NEB of state but makes correct routing depend on the availability of the resolution infrastructure. In contrast, DOA-RIP pushes nearly all state into the NEBs along the path between two communicating entities.
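To make the per-NEB state in DOA-RIP concrete, the second-round checks of §5.5.3 at a single NEB can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of §7.2: the class and method names are hypothetical, signature verification is abstracted into a caller-supplied verify function, the clock is passed in explicitly, and propagating the request to the parent NEB is omitted.

```python
import secrets

NONCE_LIFETIME = 10.0  # seconds; the value the text reports for the implementation


class DoaRipNeb:
    """State held by one NEB e_k under DOA-RIP (hypothetical sketch)."""

    def __init__(self, eid, verify):
        self.eid = eid
        self.verify = verify  # assumed: (request, signature, public_key) -> bool
        self.nonces = {}      # nonce r_k -> (end-host EID, time of issue)
        self.eid_to_ip = {}   # end-host EID e_0 -> next-hop IP i_{k-1}

    def issue_nonce(self, host_eid, now):
        """Round 1: issue a fresh random nonce for this end-host."""
        nonce = secrets.token_hex(16)
        self.nonces[nonce] = (host_eid, now)
        return nonce

    def handle_request(self, request, signature, public_key, now):
        """Round 2: check y_k = (e_0, i_{k-1}, r_k) and establish the mapping."""
        host_eid, next_hop_ip, nonce = request
        if not self.verify(request, signature, public_key):
            return False
        issued = self.nonces.get(nonce)
        if issued is None:            # unknown or already-used nonce (a replay)
            return False
        issued_for, issued_at = issued
        if now - issued_at > NONCE_LIFETIME:
            del self.nonces[nonce]    # expired: flush and reject
            return False
        if issued_for != host_eid:    # nonce was issued for a different EID
            return False
        del self.nonces[nonce]        # flush r_k so the request cannot be replayed
        self.eid_to_ip[host_eid] = next_hop_ip
        return True

    def next_hop(self, dst_eid):
        """Data plane: look up where to forward a packet destined to dst_eid."""
        return self.eid_to_ip.get(dst_eid)


# Placeholder verifier for illustration only; a real one would check the signature.
accept_all = lambda request, signature, public_key: True
neb = DoaRipNeb("e1", accept_all)
r1 = neb.issue_nonce("e0", now=0.0)
request = ("e0", "10.0.0.2", r1)
accepted = neb.handle_request(request, b"sig", b"key", now=1.0)
replayed = neb.handle_request(request, b"sig", b"key", now=2.0)
```

Here the first request is accepted and installs the mapping, while the identical second request is rejected because the nonce has been flushed. The eid_to_ip table is exactly the state that the §5.5.2 mechanism avoids by carrying hints in the packet.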
6 Network Filtering Boxes Under DOA

In this section, we demonstrate DOA’s delegation primitive with a simple remote packet filter (RPF) box that yields functionality similar to today’s firewalls but need not be interposed between a host receiving firewall service and that host’s link to the Internet. One can certainly get similar functionality today with special-purpose machinery (e.g., VPN software, though their interfaces differ across solution providers). However, we believe that decoupling services from topology is best done with architectural, rather than application, support because: (1) users should be able to compose intermediaries and (2) users should be able to change their delegates easily (see §3.4), both of which imply that the architecture support a standard, application-independent invocation method.

6.1 Approach and Design

The RPF is a basic application of DOA’s mechanisms; it is depicted in Figure 6. A user (or representative of the user, e.g., corporate IT staff) wanting remote firewall service creates a mapping in the EID resolution infrastructure from the end-host’s EID, e, to the RPF’s EID, f (or to the RPF’s IP address, but then if that IP address changes, the resolution of e will be incorrect). This end-host expresses its actual network location either by putting its IP address, i, in the Hint field of the erecord to which e resolves, or by communicating directly with the RPF and telling it about the association between e and i. (Our implementation, described in §7.3, uses the second option.)

When a sender attempts to contact e, it first looks up e in the EID resolution infrastructure, sees that e maps to f, and then further resolves f to an IP address (which might involve intermediate resolution steps, depending on whether the RPF itself has delegates). In the simple case in which f resolves directly to an IP address j, the sender forms IP packets with destination address j and destination EID e.
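The simple case just described, combined with the second registration option, can be sketched from the RPF’s side as follows. Everything here is illustrative: the RemotePacketFilter class, the packet dictionary, the port-based filter predicate, and the assumption that the RPF forwards an accepted packet by rewriting its destination IP address to i are not taken from the text.

```python
class RemotePacketFilter:
    """Hypothetical RPF: filters packets for registered EIDs, then forwards them."""

    def __init__(self, allow):
        self.allow = allow    # assumed filter predicate: packet dict -> bool
        self.locations = {}   # protected EID e -> current IP address i

    def register(self, eid, ip):
        """The end-host tells the RPF about the association between e and i."""
        self.locations[eid] = ip

    def handle(self, packet):
        """Return the packet redirected to the end-host, or None if it is dropped."""
        ip = self.locations.get(packet["dst_eid"])
        if ip is None or not self.allow(packet):
            return None
        return {**packet, "dst_ip": ip}  # assumed forwarding step


# The sender resolved e -> f -> j, so the packet arrives at the RPF's address j
# with destination EID e and with f present in the identifier stack.
rpf = RemotePacketFilter(allow=lambda p: p["dst_port"] == 80)
rpf.register("e", "203.0.113.5")
packet = {"dst_ip": "198.51.100.7", "dst_eid": "e",
          "eid_stack": ["e", "f"], "dst_port": 80}
forwarded = rpf.handle(packet)
dropped = rpf.handle({**packet, "dst_port": 22})
```

Because the sender addresses the packet to j while naming e as the destination EID, the RPF can sit anywhere in the network rather than on the end-host’s access link.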
Note that f must be in the stack of identifiers since the host given by j may actually be the RPF’s delegate rather than the RPF itself (e.g., if the RPF