The upcoming World IPv6 Launch is stimulating a lot of conversation around IPv6 deployment and common deployment scenarios. People regularly ask “where’s my NAT,” which is something we have tried to address in architectural discussions in RFC 2993, RFC 4864, and RFC 6269. Margaret Wasserman and I have worried specifically about the multiplication of provider-independent addresses at the edge and about the issues of multihoming. We described a model for IPv6 network prefix translation that we think addresses most of those issues and yet facilitates scalable multihoming without provider-independent addressing and the route table bloat it implies. Per-residential-customer multihoming is currently in use for NTT BFLETS in Japan.
My colleague Andrew Yourtchenko, whom many of you may know from IPv6 events, has a very different opinion about network address translation. If anything, he would like to get rid of it. Andrew has contributed to some 14 RFCs on the topic of transition and has much of value to say.
While I agree with Andrew on a number of issues, I don’t agree about the model in which one deploys a prefix allocated by each of one’s upstream providers on each of the LANs in a network. I think that while we have reduced costs for ISPs through the smaller route table, we have significantly expanded the complexity faced by edge networks without giving them a benefit that they readily recognize. I agree with the end-to-end model and the ability to deploy new applications anywhere in the network, but I think that stateless prefix translation can meet those requirements and help in managing the size of the route table. Andrew and I recently weighed the pros and cons of our different opinions and included our thoughts in this blog. What is your opinion on this topic?
Andrew: Most people already know what we mean by NPTv6, but just in case there are different interpretations, will you quickly explain what NPTv6 is and why it’s important or controversial?
Fred: “NPT” refers to Network Prefix Translation. If you’d like more details, refer to RFC 6296, which is the definition of the service. Basically, we consider the IPv6 prefix to be a locator and the endpoint identifier to be an identifier, and we translate statelessly between local and global locators in a manner that facilitates communication between any two systems in the network using existing transport and application protocols.
Andrew: Why do you think one would need Network Prefix Translation (NPTv6)?
Fred: Well, in writing it, there were a few things we wanted to address.
Renumbering (RFC 4192): The real issues in renumbering aren’t so much the process of distributing prefixes (IPv6 networks, route maps, ACLs, and so on) as much as dealing with broken software that makes silly assumptions about addresses. Some examples of assumptions include: an address that is meaningful to me is meaningful to you, addresses once assigned never change, a host or service has exactly one address, one doesn’t need to worry about names, and so on.
Margaret Wasserman’s motivating issue was edge network independence. The model that the IPv6 community has espoused has been that the transit network – ISPs of all stripes – get independent prefixes for reasons related to graph theory, and edge networks get their prefix from their upstream provider. From a technology perspective, that makes a lot of sense. From an operational or business perspective, that means the edge network has a strong disincentive to change upstream networks – it’s a market lock-in.
Edge networks have responded by becoming members of their various Regional Internet Registries and obtaining provider-independent addresses rather than permitting themselves to be captives of their upstream networks. The AS65000 BGP Routing Table Analysis Report of 4 April 2012 tells me that of 40694 Autonomous Systems in the routing system, 17049 originate exactly one prefix, 27182 are visible in only one AS path, and 34906 are origin-only ASes; a mere 5788 (14%) are there for a reason other than multihoming. Provide translation, and voilà, they are independent.
My motivating issue was the resulting implication for the route table. If we enumerate edge networks, we presumably have every small business, and in some markets every home, as a multihomed network. That pushes the count of provider-independent prefixes much harder than IPv4 issues push the route table today. But, if we provide translation, the edge networks are independent of their upstreams AND the upstreams can aggressively aggregate routes into what they view as provider-allocated addresses. So the route table is much more manageable.
Andrew: What if the registries were to make the process of getting the PI addresses a one-click operation, would you think there would be a need for NPTv6?
Fred: Absolutely. The easier it is for an edge network to get a prefix its upstream has to individually route, the more pressure there is on the route table. There’s a little more to it, of course. Having the upstream route my prefix rather than allocating its own is a different contract. But ask yourself: do you really want every mom-and-pop grocery operating a BGP exchange? No, for a number of good reasons. Translation that results in PI-like independence is better for the transit core in a variety of ways.
Andrew: But I heard LISP aimed to solve the routing table scalability and L-I separation?
Fred: Well, it does, and it doesn’t. Mostly, it moves the complexity of managing routing from the transit networks to the edge networks, which I will argue is moving it from the experts to the newbies. I’m not sure you want to go there. The edge still has the same route table with whatever scalability issues it has, and the transit domain has a different addressing plan.
As to locators and identifiers, I’ll argue that the term and the debate started with a definition from RFC 1992: a “locator” is something mutable that indicates where I am, and an “identifier” is something that identifies an instance of an application serving a given use. John Day would have some comments here; I refer to RFC 1992’s identifier as a Transport Connection Endpoint, which might be instantiated as a socket in use by an application. We have a number of degenerate forms of that – identifiers for interfaces, identifiers for physical or virtual machines, and so on. But LISP starts out by capitalizing on the fact that folks want to talk about locators and identifiers and dramatically redefines the terms. A “locator” is not a tag that tells me where a target is, rather it is the IP address of a “tunnel” endpoint, and an “identifier” doesn’t identify an application serving a purpose such as a socket or a virtual machine dedicated to an application, it is simply an IP address.
Seems like marketing, to me.
Andrew: Network Address Translation (NAT) as a technology is a double-edged sword. During more than 10 years of my work in the TAC, I’ve seen a lot of networks burned by excessive use of it. I tend to call NAT technology a prescription drug. Do you think it’s going to be different with NPTv6?
Fred: I’d like to very carefully define terms here.
I agree that Network Address Translation, by which I mean stateful translation of a relatively large address space through overloading into a smaller address space, has been a real problem. RFCs 2993 and 6269 detail the issues quite well. The biggest reason is an issue of coupling; applications like FTP, HTTP, and SIP make the assumption that the address they are using at the network layer identifies them (what if they are mobile? what if they have multiple addresses?), and that said address is meaningful to another system. NATs break both of those assumptions dramatically. The address I use may not be meaningful to my peer, and may not be the only or even the best way to get to me.
Network Prefix Translation (NPTv6) uses a stateless algorithmic translation between two address spaces of the same size. That means that every host “inside” has a predictable address “outside,” and apart from firewall rules is therefore reachable from “outside.” It is far more scalable, and eliminates a lot of the issues. I’ll still argue that one wants to use application layer identifiers (names) in redirects and the like, and when you really want to talk about addresses you’re going to have to include both inside and outside addresses. The coupling issue between address spaces (“is an address that is meaningful to me meaningful to you?”) still exists. But since we have eliminated overloading and state, we can load-share or fail over between translators between the same two address spaces, to pick one example.
On inside/outside addresses, you might want to read, “Using PCP to Find an External Address in an NPTv6 Network.”
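To make the idea of a “predictable outside address” concrete, here is a minimal Python sketch of a stateless prefix swap. It is an illustration only: the function name `translate_prefix`, the host address, and the use of a /48 on both sides are assumptions made for this example, and it deliberately omits the checksum-neutral adjustment that RFC 6296 specifies.

```python
import ipaddress


def translate_prefix(addr, from_prefix, to_prefix):
    """Swap from_prefix for to_prefix, leaving the remaining bits unchanged.

    Hypothetical helper for illustration; not the full RFC 6296 algorithm.
    """
    address = ipaddress.IPv6Address(addr)
    src = ipaddress.IPv6Network(from_prefix)
    dst = ipaddress.IPv6Network(to_prefix)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("the two address spaces must be the same size")
    if address not in src:
        raise ValueError(f"{address} is not inside {src}")
    # Keep everything to the right of the prefix, replace the prefix itself.
    host_bits = int(address) & int(src.hostmask)
    return ipaddress.IPv6Address(int(dst.network_address) | host_bits)


# Assumed example prefixes (a ULA inside, a documentation prefix outside).
INSIDE = "fd00:1234::/48"
OUTSIDE = "2001:db8:2::/48"

inside_host = "fd00:1234:0:5::a1"
outside_host = translate_prefix(inside_host, INSIDE, OUTSIDE)
round_trip = translate_prefix(str(outside_host), OUTSIDE, INSIDE)

print(outside_host)  # 2001:db8:2:5::a1
print(round_trip)    # fd00:1234:0:5::a1
```

Because the mapping is a pure function of the address and the two prefixes, any number of translators configured with the same prefix pair compute the same result. No per-session table has to be shared between them, which is what allows load sharing and failover.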
Andrew: NAT and security is my favourite soapbox. While it is true that having a stable interface ID provides a trivial way to correlate the different sessions by the same user, the upper-level protocols are already providing way too much information. Take HTTP, for example. Even by itself, the browser leaks a huge amount of information about itself. Go to panopticlick.eff.org and check. Is your browser unique?
Fred: We could also discuss the email envelope. I had a lengthy discussion on this topic with a friend at another company who was very certain that the several layers of NAT he was behind hid his topology and provided security, until I told him what his 169.254.0.0/16 address was in his local environment. Yes, the security benefits of NAT are mostly illusory.
The benefit of NAT that is not illusory is that it cannot forward a packet for which it doesn’t have saved state. But even that is readily over-estimated. When I see a packet go by, I know that a session is still in progress and its state needs to be created or maintained a little longer. I don’t know when the session will end, however, unless I happen to intercept something fairly draconian like a TCP RST. As a result, a pinhole that is open usually stays open past the time I really need it, and provides a means of attacking someone.
Andrew: And we haven’t started talking about using ETags or supercookies for persistently tracking users from third-party sites. Anyway, as I suspected, I am going off on a tangent with this topic. Let’s get back to NPTv6. So, we have the PA addresses on the outside and do not need to inject PI addresses, but what about connection survivability?
Fred: Well, the downside of anything that changes the address is that it changes the address; if the transport is address agile (RFC 5061), that’s OK, but if not (TCP), the session breaks. So if I have translators between my edge network and two different upstreams, neither NAT nor NPTv6 gives you session survivability up front. NPTv6 does provide it, though, if you have two translators to the same upstream: since the address translation is algorithmic, you can load share and fail over between them.
Ron Bonica has a suggestion in “Multihoming with IPv6-to-IPv6 Network Prefix Translation (NPTv6)” in which he provides a way that one could provide fail-over to a different ISP by enabling the ISPs to hold hands. It may turn out to be useful.
Andrew: But isn’t this the same approach as we had with IPv4 NAT back some 9 years ago? In practice I saw it just promote the use of stateful translation – and therefore, it did not help with respect to end-to-end addressability. What do you think?
Fred: I assume you are talking about the failover question. Since the translation is stateless, it’s much more scalable and predictable. I’ll argue that since every interface has a predictable outside address, NPTv6 promotes end-to-end addressability. In the stateful case the translation didn’t always exist, and in the NPT case it does.
Andrew: By the way, speaking of end-to-end addressability, how is this achieved with NPTv6?
Fred: It is achieved by providing a predictable outside address corresponding to each inside address through each upstream network.
Let me give you a worked example. Suppose I am using my link-local address fe80::/64 as the inside address and 2001:db8:1::/64 outside, and you are using a ULA fd00:1234::/48 inside and 2001:db8:2::/48 outside; in each case the inside and outside prefixes are the same length. When I want to send a packet to you, I need your outside address, which I presumably get from DNS. So my packet might be fe80::<me> to 2001:db8:2::<you> as I send it. My NPTv6 translator changes my source address to 2001:db8:1::<me> with some twiddling to handle the TCP checksum, and your NPTv6 translator changes the destination address to fd00:1234::<you>. When you reply, you reply from your address (the ULA address) to my outside address, and both get translated. So while neither your application nor my application is using the global address directly, packets can be unambiguously directed between them without having to go through a proxy.
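The “twiddling” for the checksum can also be sketched in code. The following Python is a simplified illustration of the checksum-neutral mapping described in RFC 6296, restricted to /48 prefixes: the prefix is swapped, and the 16-bit subnet word is adjusted so that the ones'-complement sum of the address, and therefore the TCP or UDP checksum that covers it, does not change. The function names (`words`, `ones_sum`, `npt_translate`) and the host address are invented for this example, a /48 is assumed on both sides, and the rules RFC 6296 gives for longer prefixes are not handled.

```python
import ipaddress


def words(value):
    """Split a 128-bit integer into eight 16-bit words, most significant first."""
    return [(value >> shift) & 0xFFFF for shift in range(112, -1, -16)]


def ones_sum(values):
    """16-bit ones'-complement sum, the arithmetic behind TCP/UDP checksums."""
    total = 0
    for v in values:
        total += v
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def npt_translate(addr, from_prefix, to_prefix):
    """Checksum-neutral prefix translation; this sketch handles /48 only."""
    src = ipaddress.IPv6Network(from_prefix)
    dst = ipaddress.IPv6Network(to_prefix)
    if src.prefixlen != 48 or dst.prefixlen != 48:
        raise ValueError("this sketch only handles /48 prefixes")
    address = ipaddress.IPv6Address(addr)
    if address not in src:
        raise ValueError(f"{address} is not inside {src}")

    w = words(int(address))
    src_w = words(int(src.network_address))[:3]
    dst_w = words(int(dst.network_address))[:3]

    # adjustment = sum(old prefix) - sum(new prefix), in ones' complement
    adjustment = ones_sum([ones_sum(src_w), ones_sum(dst_w) ^ 0xFFFF])
    subnet = ones_sum([w[3], adjustment])
    if subnet == 0xFFFF:  # use the 0x0000 representation of zero
        subnet = 0x0000

    value = 0
    for word in dst_w + [subnet] + w[4:]:
        value = (value << 16) | word
    return ipaddress.IPv6Address(value)


# Assumed prefixes for the far side of the worked example, both taken as /48.
inside = "fd00:1234:0:5::a1"
outside = npt_translate(inside, "fd00:1234::/48", "2001:db8:2::/48")
back = npt_translate(str(outside), "2001:db8:2::/48", "fd00:1234::/48")

print(outside)  # 2001:db8:2:e17e::a1
print(back)     # fd00:1234:0:5::a1
```

The outside address differs from the inside address in the subnet word as well as the prefix, but the change is deterministic and reversible, so the reply path recovers the original inside address without any stored state.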
Andrew: Technologies like multipath TCP are already solving the problem of multihoming at transport level. Why do you think L3 is the best place to do it?
Fred: Well, please don’t get me wrong on the value of transport layer solutions that enable the use of multiple paths or address agility; I think they are important. I especially think that Happy Eyeballs is important. But these solutions address multihoming only in the sense of enabling applications to communicate over, and make sense of, multiple paths in the network. They do not address the complexities of network operations (do you really want a /64 from each of your upstreams on each of your LANs?), nor do they consider the impact of PI on the transit core’s route table.
Andrew: But I heard there was this approach called ILNP?
Fred: ILNP (and related drafts) is a great solution, and in some ways I would prefer it to NPTv6. Like NPTv6, it considers the IPv6 address to have two parts, which it defines as:
- Identifier: a non-topological name for uniquely identifying a node
- Locator: a topologically-bound name for an IP subnetwork
Note that it’s a node address, not an interface address; the locator in effect says how to instantiate the node on an interface in a subnet.
The locator model allows for translation in flight similar to NPTv6, although it’s not required. Thus, the kinds of arguments that can be made for no longer needing provider-independent addresses to be independent of the upstream network in a business and operational sense apply equally well. This much is Really Appealing to me.
Where I have a problem with it is that, where NPTv6 makes a compensating TCP/UDP checksum adjustment within the IPv6 address itself, ILNP excludes the locator (what you and I call a prefix in the source and destination addresses) from the TCP/UDP checksum. That requires me to change TCP and UDP, and to change them simultaneously on all equipment whose traffic I expect to apply a prefix translation against. This is kind of like speeding a particle up from sub-light speed to faster than light; there is no problem flying slower than light, and apart from complex arithmetic, no issue flying faster than the speed of light. The singularity at the speed of light (calculating my mass involves dividing by zero) is a real kink in the process, though. If I thought I could deploy ILNP, I might strongly push it. I don’t think I can deploy it.
Andrew: So, for myself I conclude that NPTv6 is a useful technology, and is less “evil” than NATs in IPv4. Given its effects, would one still need to think long-term before embarking on the journey of running it?
Fred: Personally, no. Your mileage, of course, may vary. I think it is a pragmatic solution to a real problem.