Getting your website up and running on IPv6 with ZXTM


IPv6 has been hailed as the remedy for many of the internet's current limitations ever since the mid-1990s. Nevertheless, even today in 2008 only a small fraction of the traffic on the internet is actually flowing over IPv6. One of the reasons many websites are still available only over IPv4 is concern about interoperability. In this article we'll describe how ZXTM 5.0 lets you hook up your company's web presence to tomorrow's internet without causing any nuisance for today's customers accessing your pages over IPv4.

IPv4 and the world's problems

The most urgent reason for pushing IPv6 out of its perceived niche in research and academia is the simple fact that the internet's growth will stop as early as 2010 if we stick to IPv4 only. The reason for this has to do with the number of available addresses. In order to be reachable over IPv4 a computer must have been assigned a unique 32-bit number. This means that at most about four billion different machines can be connected to the internet at any given instant in time (one of the clever mechanisms to work around that limitation is network address translation - NAT - which allows many computers to 'share' the same IP address).

Four billion is, however, only the theoretical upper limit; the practical limit is much lower. Due to the lax distribution policies in the early days of the internet, large parts of the IPv4 address space are lost forever. There are companies and institutions in the United States that have more addresses at their disposal than whole countries. The Halliburton Company, for instance, has the subnet 34.0.0.0/8, corresponding to 2^24 or well over 16 million addresses. A country like Senegal, on the other hand, has to make do with a paltry 67866 IPv4 numbers. Yes, that's sixty-seven thousand eight hundred and sixty-six IPv4 addresses for more than eleven million people (for more details, please consult AfriNIC's database).
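The arithmetic behind these figures is easy to check. The following Python sketch is our own illustration, not part of any ZXTM configuration; it uses the standard ipaddress module and takes "eleven million" as a round population figure:

```python
import ipaddress

# Total size of the IPv4 address space: one address per 32-bit value.
ipv4_total = 2 ** 32

# A /8 network fixes the first 8 bits and leaves 24 bits free.
slash8 = ipaddress.ip_network("34.0.0.0/8")

# Figures quoted in the text; the population is a round assumption.
senegal_addresses = 67866
senegal_population = 11_000_000

print(ipv4_total)                              # 4294967296
print(slash8.num_addresses)                    # 16777216
print(slash8.num_addresses / ipv4_total)       # 0.00390625, i.e. 1/256 of all IPv4 addresses
print(senegal_population / senegal_addresses)  # about 162 people per address
```

In other words, a single /8 holds 1/256 of everything IPv4 can ever offer.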

This discrepancy of course has to do with the fact that the internet originated in the US and that back in those days nobody could have foreseen the global spread of what was then an arcane technology. However, apart from limiting IT growth in those parts of the world where it is most needed, this imbalance amounts to a gross injustice and it is easy to foresee a kind of "black market" for internet numbers in which those who have IP addresses in excess take advantage of those who came late to the table where addresses were being served. The following pictures give you an idea of the situation:


[Pictures: a lone Caribbean beach; a packed pool in Tokyo]

But rest assured, there is hope: the world is not yet lost completely.

Tomorrow's internet: IPv6

IPv6 is coming to the rescue with a whopping 2^128 addresses. While the 4×10^9 IPv4 addresses are at least vaguely imaginable in that they roughly correspond to the number of people on earth, the total number of IPv6 addresses (3.4×10^38) is well beyond our grasp and falls somewhere between the numbers we hear from our politicians when they speak about the public debt (10^13 US$ for the United States) and those astronomers tend to juggle with (there are roughly 10^80 atoms in the universe).
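For readers who like to see the digits, here is a short Python sketch (again just an illustration of the arithmetic) that spells out the size of the IPv6 address space and places it between the other two magnitudes mentioned above:

```python
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(ipv6_total)           # 340282366920938463463374607431768211456
print(f"{ipv6_total:.1e}")  # 3.4e+38

# Every single IPv4 address could be replaced by 2^96 IPv6 addresses.
print(ipv6_total // ipv4_total == 2 ** 96)  # True

# Public debt < IPv6 addresses < atoms in the universe (orders of magnitude).
print(10 ** 13 < ipv6_total < 10 ** 80)     # True
```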

Ironically, IPv6 has already arrived: many sites are already reachable via this layer 3 protocol. The big problem is that IPv6 is not backwards-compatible with IPv4. Hence, just connecting a computer to the internet over IPv6 does not in itself make available any of the good stuff on all those wonderful IPv4 sites like ebay.com and knowledgehub.zeus.com. Instead, those sites have to be assigned IPv6 addresses as well (in addition, the software serving the pages has to be extended to handle IPv6 traffic).

Users of course access the internet using domain names like www.zeus.com; they do not usually type in addresses like 65.2.137.23 manually. One prerequisite for IPv6 reachability is therefore assigning one or more IPv6 addresses like 2001:8b0:3e6::4¹ to a domain name in addition to any IPv4 addresses it already has.

The straightforward solution

Increasingly, ISPs offer IPv6 connectivity to their customers. Given the advantages over IPv4, one would expect network administrators everywhere to rush to the new technology. One obstacle is legacy software that is restricted to IPv4. ZXTM 5.0 can help in many cases because if you use ZXTM the back-ends do not have to be changed. Let ZXTM handle all the front-end IPv6 (and of course IPv4) traffic; the back-ends will only ever have to deal with IPv4 connections originating from ZXTM. The following diagram illustrates the traffic flow (IPv4 traffic is shown in blue, IPv6 traffic in green):

Website with IPv4 and IPv6 addresses assigned to the same domain name

The drawback

Now why do we not see more companies doing exactly this? Even Google has not added an IPv6 address to www.google.com. To understand this reluctance, we have to look in a bit more detail at how programs actually connect to each other.

A network application that wants to connect to the service in a sense has the choice: it can use either the IPv4 or the IPv6 connection. In fact, modern browsers like Firefox try IPv6 first and only resort to IPv4 if necessary². The application queries IPv6 addresses assigned to the host from a name server and will then try to use the returned IPv6 address to establish a connection. However, because most homes and even businesses do not have an IPv6 connection, this attempt will fail, most likely with a timeout. Only then will the application try any IPv4 addresses that might be associated with the hostname.
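The fallback behaviour just described can be sketched in a few lines of Python. This is a simplified model of our own and not how any particular browser is implemented: the function name connect_with_fallback, the stand-in client ipv4_only_client and the documentation address 2001:db8::4 are all assumptions made for the illustration, and no real network connection is opened.

```python
import ipaddress
from typing import Callable, Iterable, Optional


def connect_with_fallback(
    addresses: Iterable[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first address that accepts a connection, trying IPv6 first."""
    # Sort so that IPv6 addresses (version 6) come before IPv4 ones (version 4).
    ordered = sorted(
        addresses,
        key=lambda a: ipaddress.ip_address(a).version,
        reverse=True,
    )
    for address in ordered:
        if try_connect(address):
            return address
    return None


# Hypothetical client without IPv6 connectivity: every IPv6 attempt fails
# (in reality often only after a timeout), every IPv4 attempt succeeds.
attempts = []


def ipv4_only_client(address: str) -> bool:
    attempts.append(address)
    return ipaddress.ip_address(address).version == 4


chosen = connect_with_fallback(["65.2.137.23", "2001:db8::4"], ipv4_only_client)
print(attempts)  # ['2001:db8::4', '65.2.137.23']
print(chosen)    # 65.2.137.23
```

The IPv4 address is only tried after the IPv6 attempt has failed, and that failed first attempt is exactly where the user-visible delay comes from.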

Finally, the user gets a connection and can start to surf around the site. Most users are, of course, unaware of all these technical details and at the slightest delay will assume that the site "doesn't work" and immediately go somewhere else. In effect, a company will be punished for having gone through the pain and expense of enabling IPv6 on its website by losing customers. Nobody is happy with that, so system administrators are told to put off IPv6 until everybody else has it - a chicken-and-egg situation.

How do you work around this problem? Many companies choose not to assign their IPv6 addresses to their main site, say www.zeus.com, but instead create a new DNS entry, typically www6.zeus.com, and assign the IPv6 addresses to that name. Google, for instance, has made its search functionality available over IPv6 on ipv6.google.com. That way, when users with an IPv4-only connection attempt to resolve the regular hostname to an IPv6 address, they immediately get a negative response, the software switches to IPv4, and no delay or timeout occurs. Customers connected via IPv6 will have to go to www6.zeus.com (or ipv6.google.com), but apart from that everything should work the same.

The network topology

In the following it will be assumed that the back-end web servers have been configured to serve requests with host header www.zeus.com, support.zeus.com and knowledgehub.zeus.com. Much work has gone into the website and its application servers. Therefore, the admins are reluctant to change anything about it. Several ZXTMs with external-facing NICs are accepting connections from the internet in an active-active configuration and balancing the requests between the back-ends.

The existing internet connection has recently been upgraded to IPv6 and IPv6 addresses have been allocated for www6.zeus.com, support6.zeus.com and knowledgehub6.zeus.com. It's our task to make the existing website reachable on these domain names over IPv6. Version 5.0 of ZXTM is fully IPv6-enabled and will help us get this job done. The traffic flow will be as follows: a request from an IPv6 client reaches the ZXTM hosting the IPv6 addresses. The request is not passed on directly to the back-ends. Instead, a TrafficScript™ rule checks whether the connection comes from an IPv6 client. If so, the host header is rewritten before the back-ends are contacted.

IPv4 and IPv6 addresses assigned to different domain names

Setting up the virtual server

First you have to create a traffic IP group containing all the IPv6 and IPv4 addresses you have assigned to your domain. Then set up a virtual server with protocol 'http' listening on that traffic IP group on port 80. In order to morph the incoming requests into a shape acceptable to the back-ends, the host headers have to be rewritten: the '6' before '.zeus.com' has to be removed. Furthermore, in all HTML pages sent back to the client, any links to *.zeus.com have to be changed to point to *6.zeus.com. Were it not for this last change, IPv6-only users could only ever see our starting page but not follow any links deeper into our website.

The TrafficScript™ rule for incoming requests

Let's start with the request rule which maps the host header of the incoming requests before they are sent on to the web servers:

$ip = request.getremoteip();
$ipv = string.validipaddress( $ip );
if( $ipv != 6 ) {
  break;
}

$hh = http.getHostHeader();
if( $hh ) {
   $hh = string.regexsub(
    $hh,
    "(www|support|knowledgehub)6\\" .
    ".zeus\\.com",
    "$1.zeus.com",
    "i"
   );
   http.setHeader( "Host", $hh );
} else {
   http.sendResponse(
      "400 Bad request",
      "text/plain",
      "No Host-header provided\n",
      "Server: ZXTM/5.0" );
}

First of all, the rule checks whether the connection is coming from an IPv6 client. If it is not, no further action is needed and the rule ends. For IPv6 connections, the rule then tries to retrieve the host header from the request. If there is a host header in the request, the TrafficScript function string.regexsub is used to remove the literal '6' after 'www', 'support' and 'knowledgehub' wherever those words appear before '6.zeus.com'.

Two things deserve explanation: First, we have to treat the period ('.') specially because it has the meaning of a wild card in regular expressions. By prefixing it with a backslash ('\'), we tell ZXTM to use it as a normal character and not as a wild card (since a backslash in turn is interpreted as a special character when TrafficScript creates strings, we need two backslashes to obtain the desired effect). Second, we specify the 'i' modifier to regexsub because host names are case-insensitive. If the request doesn't contain a host header we send back an error page.
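If you want to experiment with the pattern outside ZXTM, the same substitution can be tried in Python. This is only an analogue for playing with the regular expression, not TrafficScript; the helper name to_backend_host is made up for this sketch, and the second pattern deliberately leaves one dot unescaped to show what the backslash prevents:

```python
import re

# Same pattern as in the rule: the dots are escaped so they match literally.
HOST_PATTERN = re.compile(r"(www|support|knowledgehub)6\.zeus\.com", re.IGNORECASE)

# Deliberately sloppy variant: the first dot is a wild card here.
SLOPPY_PATTERN = re.compile(r"(www|support|knowledgehub)6.zeus\.com", re.IGNORECASE)


def to_backend_host(host: str) -> str:
    """Strip the '6' so the back-ends see the host name they expect."""
    # count=1 mirrors regexsub without the 'g' flag: only the first match.
    return HOST_PATTERN.sub(r"\1.zeus.com", host, count=1)


print(to_backend_host("www6.zeus.com"))      # www.zeus.com
print(to_backend_host("Support6.Zeus.COM"))  # Support.zeus.com
print(to_backend_host("www6xzeus.com"))      # www6xzeus.com (no match, unchanged)

# The unescaped dot happily matches the 'x':
print(SLOPPY_PATTERN.sub(r"\1.zeus.com", "www6xzeus.com"))  # www.zeus.com
```

Note that the replacement always writes '.zeus.com' in lowercase, which is harmless because host names are case-insensitive.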

Cooking the response for IPv6 clients

With the manipulations of the previous rule the back-ends were able to process the request and are now sending the response to us. Much of the value of websites comes from the links to other websites but also to other pages on the same site. If we were to forward the response to the client unchanged, it would contain lots of links like http://knowledgehub.zeus.com/ which are not accessible over IPv6. Therefore, we have to do the inverse of the substitution carried out in the request rule: convert those links to http://knowledgehub6.zeus.com/. Again, this only needs to be done if the original request came from an IPv6 client (the code to check this is identical to what was shown for the request rule and is therefore not explained again here).

We have to be a lot more careful, however. After all, the response can be anything from a JPEG image to a Flash video or an audio stream. Not only would it be wrong to manipulate such files (any regular expression matches would be pure coincidence and would result in corrupted data), but they might be too large to keep in memory. The first thing to do, therefore, is to inspect the Content-Type response header:

$ct = http.getResponseHeader( "Content-Type" );
$ctype = string.lowercase( $ct );
if( $ctype && string.startsWith( $ctype, "text/html" ) ) {

Since content types are case-insensitive, we convert the header value to lowercase before checking it. This saves us an expensive regular expression match. So now we have limited the processing to HTML pages, but that is not paranoid enough. To keep the load on the server low, we only process pages that are at most one megabyte in size:

   $clen = http.getResponseHeader( "Content-Length" );
   $clen = lang.toInt( $clen );
   $max_len = 1024*1024;
   if( $clen > 0 && $clen <= $max_len ) {

Note the check whether $clen is larger than zero. This check covers the case that a page came back without the Content-Length header. In this case the value of $clen would be zero because that's the value TrafficScript assigns when converting an empty string or a non-numeric string to an integer. Obviously, you might want to adapt the value of the upper limit to the actual amount of memory on your server; one megabyte is a very low limit. Also, it might be useful to convert links in other text files, for example plain text or XML files. After all these precautions it should now be safe to use the http.getResponseBody() function without putting our server at risk:

     $body = http.getResponseBody( 0 ); # need full response

     $result = string.regexsub(
      $body,
      "(src|href)=\"http://" .
      "(www|support|knowledgehub)\\" .
      ".zeus\\.com",
      "$1=\"http://$26.zeus.com",
      "gi"
     );
     http.setResponseBody( $result );

Note that the regular expression handles both links and image source specifications and, thanks to the 'g' modifier, replaces all matches in the string. To round this section off, here's the rule in full:

$ip = request.getremoteip();
$ipv = string.validipaddress( $ip );
if( $ipv != 6 ) {
  break;
}

$ct = http.getResponseHeader( "Content-Type" );
$ctype = string.lowercase( $ct );
if( $ctype && string.startsWith( $ctype, "text/html" ) ) {
   $clen = http.getResponseHeader( "Content-Length" );
   $clen = lang.toInt( $clen );
   # don't do a regexreplace on responses larger than 1MB
   $max_len = 1024*1024;
   if( $clen > 0 && $clen <= $max_len ) {
     $body = http.getResponseBody( 0 ); # need full response
     $result = string.regexsub(
      $body,
      "(src|href)=\"http://" .
      "(www|support|knowledgehub)\\" .
      ".zeus\\.com",
      "$1=\"http://$26.zeus.com",
      "gi"
     );
     http.setResponseBody( $result );
   } else {
      log.warn( "Not replacing: content-length was ". $clen );
   }
}

We've added a line logging a warning when a page was not processed, either because it was too large or because it came without a usable Content-Length header. This gives you an indication of whether you need to adjust the upper limit on the files you process.
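To test the guards and the link pattern against sample pages before deploying the rule, a rough Python equivalent can be handy. The sketch below is an approximation under our own assumptions, not the rule itself: should_rewrite and rewrite_links are names invented for this example, and Python's int() is stricter than TrafficScript's integer conversion, so anything it cannot parse is simply treated as zero here.

```python
import re
from typing import Optional

MAX_LEN = 1024 * 1024  # same 1 MB limit as in the rule

LINK_PATTERN = re.compile(
    r'(src|href)="http://(www|support|knowledgehub)\.zeus\.com',
    re.IGNORECASE,
)


def should_rewrite(content_type: Optional[str], content_length: Optional[str]) -> bool:
    """Mirror the rule's guards: HTML only, known length, at most MAX_LEN."""
    if not content_type or not content_type.lower().startswith("text/html"):
        return False
    try:
        length = int(content_length or "")
    except ValueError:
        length = 0  # missing or non-numeric header is treated as zero
    return 0 < length <= MAX_LEN


def rewrite_links(body: str) -> str:
    # \g<2>6 means "group 2 followed by a literal 6"; a bare \26 would be
    # read by Python as a reference to group 26.
    return LINK_PATTERN.sub(r'\g<1>="http://\g<2>6.zeus.com', body)


page = '<a href="http://www.zeus.com/a">A</a> <img SRC="http://support.zeus.com/i.png">'
if should_rewrite("Text/HTML; charset=utf-8", str(len(page))):
    print(rewrite_links(page))
# <a href="http://www6.zeus.com/a">A</a> <img SRC="http://support6.zeus.com/i.png">
```

Links to hosts outside the three listed names, and responses that are not HTML, pass through untouched.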

Location header rewriting

There is one final thing to consider: Sometimes the web servers don't respond with a page, but instead send a so-called redirect (HTTP response codes from 300 to 399) with a location header specifying the URL where the content can be found. ZXTM allows you to modify such redirects in the 'Connection Management' tab of the virtual server configuration. Expand the 'Location Header Settings' section and use the following values:

location!regex     http://(www|support|knowledgehub)\.zeus\.com
location!replace   http://$16.zeus.com

An imperfect world

The TrafficScript rules described in this article will make your site reachable for internet users with an IPv6-only connection. But of course the solution is not perfect. After all, only the links referring to other pages on your own website are undergoing the manipulation to turn IPv4 host names into IPv6 host names. What about other sites? If their host names resolve to IPv4 only, IPv6 users won't be able to reach them. Many sites won't have IPv6 connectivity at all, while others might have IPv4 and IPv6 addresses assigned to the same host name. Sites that employ the workaround we've discussed in this article won't be reachable through links to their regular host names either. But still, some IPv6 presence is better than none and once enough people use IPv6, we can add the IPv6 addresses to www.zeus.com.



¹RFC 3849 reserves the subnet 2001:db8::/32 for IPv6 addresses mentioned explicitly in documentation and advises not to use 'real' IP addresses when documenting IPv6. If you think that's a good recommendation, please replace all IPv6 addresses in this article with such documentation addresses.
²IPv6 name resolution can be switched off in Firefox by going to 'about:config' and setting 'network.dns.disableIPv6' to 'true'.

michael [Zeus Dev Team] 04 July 2008