Adding meta-tags to a website with ZXTM
One of the things that didn’t quite make it in time for the initial relaunch was meta-tags: descriptions and keywords. Google doesn’t use meta-keywords, after all, but Yahoo! does, and Yahoo! accounts for roughly one third of US web searches. Google uses meta-descriptions too. So, as an experiment in drinking our own Kool-Aid, and to take a bit of weight off our overworked marketing chums just before Christmas, I decided to try to generate meta-tags using ZXTM’s traffic inspection and response rewriting capabilities.

Rewriting a response

First, I had to decide what to use to generate a list of related keywords. It would have been nice to slurp up all the text on the page and calculate the most commonly occurring unusual words. Surely that would have been the über-geek thing to do? Well, not really: TrafficScript just isn’t designed for this kind of use; it (currently) has no arrays or linked lists, and unless I was careful I could end up slowing down each response. Plus, there would be the danger that I produced a strange list of keywords that didn’t accurately represent what the page is trying to say (and could also be wildly “off-message”). So I instead turned to the two repositories of on-message page summaries - the <title> tag and the big <h1> tag found on each page. They should already contain everything we need.

The script

In truth, I took a detour. Maybe I had REST on my mind (cough), but I first looked at the URL path. After all, it has raw keywords in it, separated by slashes: easy to grep with a regular expression. So, I set up a development machine with a virtual HTTP server on port 80 and a pool containing www.zeus.com:80, and added a request rule to spoof the host header:

http.setHeader( "Host", "www.zeus.com" );

Next, I set about writing a response rule to inject meta-tags into the HTML head of each page served.
First I had to get the response body:

$body = http.getResponseBody( 0 );

This will be grepped for keywords, mangled to add the meta-tags and then returned by setting the response body:

http.setResponseBody( $body );

Next I had to make a list of keywords. As I mentioned before, my first plan was to look at the path: by converting slashes to commas I should be able to generate some correct keywords, something like this:

$path = http.getPath();
$path = string.regexsub( $path, "/+", ", ", "g" );

After adding a few lines to first tidy up the path (removing slashes at the beginning and end, and replacing underscores with spaces), it worked pretty well. And, for solely aesthetic reasons, I added:

$path = string.uppercase( $path );

Then, I took a look at the <title> tag. Something like this ought to do the trick:

if( string.regexmatch( $body, "<title>\\s*(.*?)\\s*</title>", "i" ) ) {
    $title_tag_text = $1;
}
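The path tidy-up and the title match are easy to try out away from a live traffic manager. The following is only a rough Python sketch of the same two steps, not TrafficScript and not the rule that was deployed; the function names and the example path are made up for illustration, and Python's str.strip removes every leading and trailing slash rather than just one.

```python
import re
from typing import Optional


def path_to_keywords(path: str) -> str:
    """Turn a URL path into an uppercase, comma-separated keyword string."""
    path = path.strip("/")            # drop leading and trailing slashes
    path = path.replace("_", " ")     # underscores read better as spaces
    path = re.sub(r"/+", ", ", path)  # remaining slashes become commas
    return path.upper()               # purely aesthetic, as in the article


def extract_title(body: str) -> Optional[str]:
    """Return the trimmed text of the first <title> element, or None."""
    match = re.search(r"<title>\s*(.*?)\s*</title>", body, re.IGNORECASE)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical path and page, chosen only to exercise the functions.
    print(path_to_keywords("/products/zxtm_lb/"))
    print(extract_title("<html><TITLE>  Zeus Home </TITLE></html>"))
```

Running both functions over a handful of saved pages is a quick way to spot paths or titles that produce odd keywords before any rule goes near real traffic.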
The “i” flag in the regexmatch call makes the search case-insensitive, just in case. This, indeed, worked fine. With a little cleaning up, I was able to generate a meta-description similarly: I just stuck the tidied path and the title together after adding some punctuation (solely to make it nicer to read: search engines often show the meta-description in the search result).

After playing with this for a while I wasn’t completely satisfied with the results: the meta-keywords looked great, but the meta-description was a little lacking in the real-English department. So, instead I turned my attention to the <h1> tag on each page: it should already be a mini-description of each page. I grepped it in a similar fashion to the <title> tag and the generated description looked vastly improved.

Lastly, I added some code to check whether a page already has a meta-description or meta-keywords, and to skip inserting the automatic tags in that case. This allows us to gradually add meta-tags by hand to our pages - and it means we always have a backup should we forget to add metas to a new page in the future.

The finished script looked like this:

$body = http.getResponseBody( 0 );
$path = http.getPath();

# remove the first and last slashes
$path = string.regexsub( $path, "^/?(.*?)/?$", "$1" );
$path = string.iReplace( $path, "_", " " );

# convert slashes to something nicer, i.e. " - "
$path = string.regexsub( $path, "/+", " - ", "g" );

# remember the contents of the <h1> tag, if the page has one
if( string.regexmatch( $body, "<h1>\\s*(.*?)\\s*</h1>", "i" ) ) {
    $h1 = $1;
}

# build a description and keywords
if( string.regexmatch( $body, "<title>\\s*(.*?)\\s*</title>", "i" ) ) {
    # $1 now holds the title tag's contents; put it after the <h1> text
    # (or after the tidied path if there is no <h1>) to make a description
    if( $h1 ) {
        $description = "Zeus Technology - " . $h1 . ": " . $1;
        $keywords = "Zeus - Zeus Technology - " . $path . ": " . $h1 . ": " . $1;
    } else {
        $description = "Zeus Technology - " . $path . ": " . $1;
        $keywords = "Zeus - Zeus Technology - " . $path . ": " . $1;
    }

    # make an uppercase, comma-separated version to be the keywords
    $keywords = string.uppercase(
        string.regexsub( $keywords, "\\s*[\\-:]\\s+", ", ", "g" )
    );

    # only rewrite the meta-keywords if we don't already have some
    if( ! string.regexmatch( $body, "<meta\\s+name='keywords'", "i" ) ) {
        $meta_keywords = " <meta name='keywords' content='" . $keywords . "'/>\n";
    }

    # only rewrite the meta-description if we don't already have one
    if( ! string.regexmatch( $body, "<meta\\s+name='description'", "i" ) ) {
        $meta_description = " <meta name='description' content='" . $description . "'/>";
    }

    # find the title and stick the new meta tags in afterwards
    $body = string.regexsub( $body,
        "(<title>.*</title>)", "$1\n"
        . $meta_keywords
        . $meta_description );

    http.setResponseBody( $body );
}

It should be fairly easy to adapt it to another site, assuming the pages are built consistently.

The Future

I now need to convince our marketing department to use my nifty meta-generator script on the live Zeus.com site. It is, naturally, already being managed by a pair of ZXTMs. However, this may involve a little more effort than it took to create the script: TrafficScript is super-quick-and-easy for this sort of application; corporate wheels, even in a nimble organization such as Zeus, run somewhat slower. And, I guess, we should just-maybe add a meta-description injection rule to KnowledgeHub: I’ve just checked and they seem to be missing! Owen?…

Comments:
Comment from:
Owen Garrett [Zeus Dev Team]
You're twisting my arm... meta tags are now live on knowledgehub and www.zeus.com.
There were a couple of changes before we deployed the rule:

Don't waste time trying to process non-HTML responses. Add this to the start of the rule:

$ct = http.getResponseHeader( "Content-Type" );
if( ! string.startsWith( $ct, "text/html" ) ) break;

Strip any HTML from the descriptions. Some of our headings had anchors and other markup in them:

# get rid of any inline HTML markup
$keywords = string.regexsub( $keywords, "<.*?>", "", "g" );
$description = string.regexsub( $description, "<.*?>", "", "g" );

Improve the check for existing meta tags. Use '['\"]?' to match quote marks:

# only rewrite the meta-keywords if we don't already have some
if( ! string.regexmatch( $body, "<meta\\s+name=['\"]?keywords['\"]?", "i" ) ) {
    $meta_keywords = " <meta name=\"keywords\" content=\"" . $keywords . "\"/>\n";
}

# only rewrite the meta-description if we don't already have one
if( ! string.regexmatch( $body, "<meta\\s+name=['\"]?description['\"]?", "i" ) ) {
    $meta_description = " <meta name=\"description\" content=\"" . $description . "\"/>";
}
All things that you only find when you make the rule live, of course!
Cheers Sam!
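For anyone who wants to experiment with the combined logic (the original rule plus the three changes above) without touching a traffic manager, here is a rough Python sketch. It is an approximation for offline testing only, not TrafficScript and not the rule deployed on zeus.com; the function names, the sample page and the sample path are all hypothetical. The title is matched case-insensitively when inserting the tags, which is an assumption on my part so that the insertion step finds the same titles as the extraction step.

```python
import re

SITE = "Zeus Technology"


def first_group(pattern, body):
    """Return group 1 of the first case-insensitive match, or None."""
    match = re.search(pattern, body, re.IGNORECASE)
    return match.group(1) if match else None


def has_meta(body, name):
    """True if the page already has a <meta name=...> tag, with or without quotes."""
    pattern = r"<meta\s+name=['\"]?" + name + r"['\"]?"
    return re.search(pattern, body, re.IGNORECASE) is not None


def strip_tags(text):
    """Get rid of any inline HTML markup."""
    return re.sub(r"<.*?>", "", text)


def inject_meta(body, path, content_type):
    """Return body with generated meta tags inserted after the <title> element."""
    if not content_type.startswith("text/html"):
        return body  # don't waste time on non-HTML responses
    title = first_group(r"<title>\s*(.*?)\s*</title>", body)
    if title is None:
        return body  # nothing to build the tags from
    h1 = first_group(r"<h1>\s*(.*?)\s*</h1>", body)

    # tidy the path: trim slashes, underscores to spaces, slashes to " - "
    path = re.sub(r"/+", " - ", path.strip("/").replace("_", " "))

    if h1:
        description = f"{SITE} - {h1}: {title}"
        keywords = f"Zeus - {SITE} - {path}: {h1}: {title}"
    else:
        description = f"{SITE} - {path}: {title}"
        keywords = f"Zeus - {SITE} - {path}: {title}"

    description = strip_tags(description)
    # uppercase, comma-separated version for the keywords
    keywords = re.sub(r"\s*[-:]\s+", ", ", strip_tags(keywords)).upper()

    new_tags = ""
    if not has_meta(body, "keywords"):
        new_tags += f' <meta name="keywords" content="{keywords}"/>\n'
    if not has_meta(body, "description"):
        new_tags += f' <meta name="description" content="{description}"/>'

    # insert after the first <title> element only
    return re.sub(r"(<title>.*</title>)",
                  lambda m: m.group(1) + "\n" + new_tags,
                  body, count=1, flags=re.IGNORECASE)


if __name__ == "__main__":
    page = ("<html><head><title>ZXTM Overview</title></head>"
            "<body><h1>Traffic <a href='/x'>Management</a></h1></body></html>")
    print(inject_meta(page, "/products/zxtm_overview/", "text/html; charset=utf-8"))
```

Feeding saved copies of real pages through inject_meta is a cheap way to catch surprises such as markup inside headings before a rule goes live.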

You may have noticed that we performed a major refresh on our corporate web site earlier this year. It was a lot of work for our shiny, small-but-super-effective marketing department and (as you may also have noticed) it is still somewhat of a work-in-process.
