Managing Website Growth

As websites grow, the structure of their URLs can change dramatically. This makes things much more manageable from an operational point of view, but what can be done about all of the links people already have to that website? Ideally, we would like to ensure that people with an old link are presented with the most relevant content on the new website. When the web application is large or under the control of a different department, it may be difficult to change things. Well, TrafficScript™ comes to the rescue once again.

Consider an example company whose old website had the following pages:

www.example.com/news.html
www.example.com/about.html
www.example.com/careers.html
www.example.com/demos.html

The example company's website grew and they now have news about their products and about their share prices. They have multiple offices located around the world in exotic locations, with job listings per location. Their new website is very large, but here is a selection of pages that are related to pages on their old site:

www.example.com/corporate/news.html
www.example.com/product/news.html
www.example.com/product/demos.html
www.example.com/about/cambridge.html
www.example.com/about/honalulu.html
www.example.com/careers/overview.html
www.example.com/careers/honalulu/engineering.html
www.example.com/careers/cambridge/research.html

The challenge now is how to serve the best content to people who make requests to the old URLs. Ideally, we would like to issue an HTTP redirect to the most appropriate page. This is, of course, simple using TrafficScript:

http.redirect( "http://www.example.com/corporate/news.html" );

You could then go on to build up a rule that looks similar to this:
# Compare the requested URL against each old page in turn
$rawurl = http.getRawURL();
if( $rawurl == "/news.html" ){
   http.redirect( "http://www.example.com/product/news.html" );
} else if( $rawurl == "/about.html" ) {
   http.redirect( "http://www.example.com/about/cambridge.html" );
} else if ...
}
But, as you can see, this becomes quite cumbersome and difficult to manage. Another problem is that you require access to ZXTM to make any changes, which may be difficult if the people wanting to make changes are application or website developers rather than those responsible for managing the ZXTMs.

Ideally, you would like this information to live outside ZXTM, and there are two ways TrafficScript can provide this. The first approach is to have an external service that provides the URL-to-URL mappings. The second approach is to have a local resource which ZXTM can load and use. I'll be discussing the second approach, as it is more efficient and easier to set up.

TrafficScript provides the ability to load files from disk, and we can use this to load a file that contains mappings from one URL to another. The structure of the file can be anything you like, but for this example it will just be single-line entries with the old path and the new URL separated by a tab character:

/news.html	http://www.example.com/product/news.html
/about.html	http://www.example.com/about/cambridge.html
...

We can then load this file and store the information efficiently so that it can be used to issue the redirect later. Note that the mapping file must be placed in the ZEUSHOME/zxtm/conf/extra/ directory on each of your ZXTM machines. Here is what the loading and parsing code might look like:
# Empty our current mappings if we have any
data.reset();
# Load the file
$rawfile = resource.get( "exampledotcom_mappings" );
# Process each line
while( string.regexmatch( $rawfile, "^([^\t]*)\t([^\n]*)\n(.*)$" ) ){
   # Store the mapping for later use
   data.set( $1, $2 );
   $rawfile = $3;
}
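Since the pattern in the loop expects every entry to be an old path, a tab, a new URL and a terminating newline, it can be worth checking a mapping file before copying it to the ZXTM machines. The following is a minimal Python sketch of such a check. It is not TrafficScript and not part of ZXTM; the function name `check_mappings`, the specific checks and the sample text are assumptions made purely for illustration.

```python
def check_mappings(text):
    """Return (mappings, problems) for 'old path<TAB>new URL' lines.

    Hypothetical pre-deployment check; the rules below are assumptions
    based on the file format described in the article.
    """
    mappings = {}
    problems = []
    # The TrafficScript pattern requires a newline after every entry,
    # so a final line without one would not be matched.
    if text and not text.endswith("\n"):
        problems.append("file does not end with a newline")
    for number, line in enumerate(text.splitlines(), start=1):
        if not line:
            problems.append(f"line {number}: empty line")
            continue
        fields = line.split("\t")
        if len(fields) != 2 or not fields[0] or not fields[1]:
            problems.append(f"line {number}: expected old path, one tab, new URL")
            continue
        old_path, new_url = fields
        if not old_path.startswith("/"):
            problems.append(f"line {number}: old path should start with '/'")
            continue
        if old_path in mappings:
            problems.append(f"line {number}: duplicate entry for {old_path}")
            continue
        mappings[old_path] = new_url
    return mappings, problems


sample = (
    "/news.html\thttp://www.example.com/product/news.html\n"
    "/about.html\thttp://www.example.com/about/cambridge.html\n"
)
mappings, problems = check_mappings(sample)
print(mappings)
print(problems)
```

A check like this could be run by the website developers who maintain the file, so that malformed entries are caught before the file reaches conf/extra.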
The loading code is very simple. However, we would not want to load the file for every request, so we want to cache its contents and reload them only when the file changes. To do this, we wrap the loading code like this:
$md5 = resource.getMD5( "exampledotcom_mappings" );
if( $md5 != data.get( "Mappings MD5" ) ){
   # Empty data then load and process file here
   ...
   data.set( "Mappings MD5", $md5 );
}
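The digest comparison in the rule above is a general caching pattern: remember the digest of the data you last parsed, and reparse only when the digest changes. As a rough illustration outside ZXTM, here is a small Python model of that pattern. The class name `MappingCache` and its `reloads` counter are hypothetical, and hashing the text with `hashlib.md5` merely stands in for `resource.getMD5`.

```python
import hashlib


class MappingCache:
    """Reparse the mapping text only when its MD5 digest changes."""

    def __init__(self):
        self.digest = None   # digest of the text parsed most recently
        self.mappings = {}   # old path -> new URL
        self.reloads = 0     # how many times the text was actually reparsed

    def refresh(self, text):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest != self.digest:
            # Digest differs: throw away the old mappings and parse again.
            self.mappings = dict(
                line.split("\t", 1) for line in text.splitlines() if "\t" in line
            )
            self.digest = digest
            self.reloads += 1
        return self.mappings


cache = MappingCache()
v1 = "/news.html\thttp://www.example.com/product/news.html\n"
v2 = v1 + "/about.html\thttp://www.example.com/about/cambridge.html\n"
cache.refresh(v1)  # first call: nothing cached yet, so the text is parsed
cache.refresh(v1)  # unchanged text: digest matches, no reparse
cache.refresh(v2)  # changed text: digest differs, reparse
print(cache.reloads)
```

The stored digest plays the same role as the "Mappings MD5" entry in the TrafficScript rule: it is updated only after a successful reload.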
Comparing the MD5 in this way caches the loaded data for as long as the file on disk remains unchanged. Now that we have an efficient way of loading the mappings, all that remains is to use the loaded information to issue the redirect.

Up to this point, I have not really discussed how to run this rule, be it as a request rule or a response rule. As a request rule, we incur the very small cost of checking the MD5 of the mapping file and looking up a mapping in our data for every request that is made. However, if we move all the processing to a response rule, we get several benefits.

The first benefit is that we can limit any checking to those requests that could possibly benefit from it. By simply checking the response code, we can try to rectify only those requests that yield a 404 response. This means that for the vast majority of requests we will not be doing any processing, and only requests for the 'old' pages will incur the cost of loading the file from disk.

This has another advantage: any repaired or replaced page will be served in preference to a specified redirect. As a response rule, the rule in its entirety will look like this:
# If we get a response other than 404 break out of rule
if( http.getResponseCode() != 404 ) break;

$md5 = resource.getMD5( "exampledotcom_mappings" );
if( $md5 != data.get( "Mappings MD5" ) ){
   # Empty our current mappings if we have any
   data.reset();
   # Load the file
   $rawfile = resource.get( "exampledotcom_mappings" );
   # Process each line
   while( string.regexmatch( $rawfile, "^([^\t]*)\t([^\n]*)\n(.*)$" ) ){
      # Store the mapping for later use
      data.set( $1, $2 );
      $rawfile = $3;
   }
   data.set( "Mappings MD5", $md5 );
}

# Do the redirect from the loaded data
$redirected_url = data.get( http.getPath() );
if( $redirected_url ){
   http.redirect( $redirected_url );
} # else fall through and return the 404
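The decision the response rule makes can be summarised in a few lines. The Python sketch below is only a model of that logic for experimenting away from a traffic manager; the function name `decide` and the sample mapping are assumptions for illustration, and returning `None` stands for "leave the response as it is".

```python
def decide(status_code, path, mappings):
    """Return the redirect target, or None to pass the response through."""
    if status_code != 404:
        # The back end served something for this URL, so it is left alone.
        return None
    # No mapping (or an empty one) means the original 404 is returned.
    return mappings.get(path) or None


mappings = {"/news.html": "http://www.example.com/product/news.html"}
print(decide(200, "/news.html", mappings))     # None: a page that exists wins
print(decide(404, "/news.html", mappings))     # the mapped URL
print(decide(404, "/missing.html", mappings))  # None: the 404 stands
```

The first call shows why a repaired or replaced page takes priority: the mapping is never consulted unless the back end has already answered with a 404.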
We now have a scalable and yet remarkably simple solution for providing a list of redirects that can ease the transition from one version of a website to another. If you found this article useful, you may also wish to read "No more 404 Not Found...?".
[Zeus Dev Team] 07 January 2008


