Geographic location detection using TrafficScript

Sometimes you may want to determine the country of origin of each remote client so that you can act on this information using TrafficScript™. This article describes how to use Maxmind's free GeoLite Country database within ZXTM to determine a client's geographic location from their source IP address.

For example, you may wish to present two different versions of your home page: one with national news stories and advertisements for local visitors, and one with international content for everyone else. If you are hosting content that is restricted under export regulations, you may wish to redirect users from restricted locations.

This article is quite lengthy. It describes:

  • How to obtain a source IP-to-country database and create a version suitable for use in ZXTM;
  • How to import the database into ZXTM and search it with a TrafficScript rule;
  • A simple example that tests the TrafficScript code;
  • Another example showing how to ban users from particular locations.

 

The Source Database

In this article, we'll use Maxmind's free GeoLite Country database. If it's not sufficiently accurate, you can use their commercial GeoIP Country database instead.

The database contains ranges of IP addresses and their corresponding countries. For the purposes of reading the database within TrafficScript, it's easiest to convert the database into a compact array of IP ranges that can be searched quickly using a binary search:

+------+-----------+------+-----------+------+-----------+ ... +------+-----------+ 
| IP 1 | Country 1 | IP 2 | Country 2 | IP 3 | Country 3 | ... | IP n | Country n | 
+------+-----------+------+-----------+------+-----------+ ... +------+-----------+ 

This array will be constructed as a string: each IP address is stored as a 4-byte integer, and each country as its two-letter country code (so the string is 6*n bytes long). An IP address in the range IP_i <= IP < IP_(i+1) is located in Country_i. Where there are gaps in the database, these are identified with the two-letter country code '??'.

The following Perl script reads the CSV version of the GeoLite Country database and outputs the array:

Save this file in the same folder as the GeoIpCountryWhois.csv CSV database file, and run it as follows:

  • Linux/Unix
    $ chmod +x ./ip2array.pl
    $ ./ip2array.pl GeoIpCountryWhois.csv geoip.dat
    
  • Windows

    Install ActivePerl (the free standard version is sufficient), then run the perl script:

    C:\Folder\Name> ip2array.pl GeoIpCountryWhois.csv geoip.dat
    

This operation will write the array to a file named 'geoip.dat'. Using the September 2007 version of the database, this file is 579,432 bytes long. It contains entries for 232 ISO 3166 country codes, along with codes for proxies and satellite providers.

Copy this file to your ZEUSHOME/zxtm/conf/extra folder on your ZXTM machine using scp or an equivalent. For example, to copy it to a ZXTM appliance at 192.168.1.1 from a Windows system, you can use the free pscp utility:

C:\Folder\Name> pscp geoip.dat root@192.168.1.1:/opt/zeus/zxtm/conf/extra
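
If you want to cross-check the array layout described above, the conversion can be sketched in Python. This is only an illustration, not the ip2array.pl script: the function name build_array is hypothetical, and the sketch assumes that each CSV row has the form "start IP","end IP","start integer","end integer","code","name", that the rows are sorted and do not overlap, and that each integer is stored big-endian after being shifted right by one bit.

```python
import csv
import io
import struct

def build_array(csv_text):
    """Pack CSV ranges into 6-byte entries: 4-byte (start IP >> 1) + 2-letter code.

    Assumes rows are sorted by start address and do not overlap.
    """
    entries = []      # (shifted start address, country code)
    next_free = 0     # first 32-bit address not yet covered by any range
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue  # skip blank or malformed rows
        start, end, code = int(row[2]), int(row[3]), row[4]
        if start > next_free:
            # gap before this range: mark it as unknown
            entries.append((next_free >> 1, "??"))
        entries.append((start >> 1, code))
        next_free = end + 1
    if next_free <= 0xFFFFFFFF:
        # address space after the last range is unknown too
        entries.append((next_free >> 1, "??"))
    # big-endian byte order is an assumption about how the array is read back
    return b"".join(struct.pack(">I", ip) + code.encode("ascii")
                    for ip, code in entries)

# One row in the GeoLite CSV layout
row = '"3.0.0.0","4.17.135.31","50331648","68257567","US","United States"\n'
packed = build_array(row)
print(len(packed))  # 18: three 6-byte entries ('??', 'US', '??')
```

Because the bottom bit is dropped, a range that starts on an odd address shares its shifted value with the last address of the previous range; the sketch does not try to resolve that case.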

The TrafficScript code

Now that you've copied the geoip.dat file to your zxtm/conf/extra folder, you can read it using the resource.get() function in TrafficScript. This function reads the file once from disk and caches it until it changes, so even though the file is roughly 0.5 MB, individual requests do not pay the cost of re-reading it.

The following code looks up the client's remote address and converts it to integer form (shifted right by one bit, for the reason given after the code). It then binary-searches the array and determines the two-character country code:

$ipaddr = request.getRemoteIP();

# Integer representation of $ipaddr >> 1
string.regexmatch( $ipaddr, "(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)" );
$ip = ((($1*256+$2)*256+$3)*128+$4/2);

$arr = resource.get( "geoip.dat" );

# initialize indices
$i = 0; $j = string.len( $arr )/6-1;

# $arr[$i] <= $ip < $arr[$j]
# iteratively halve the distance between $i and $j until they are adjacent

while( $j-$i > 1 ) {
   # midpoint between $i and $j
   $k = ($i+$j)/2;

   # compare $ip with $arr[$k]
   if( string.bytesToInt( string.subString( $arr, $k*6, $k*6+3 ) ) > $ip ) {
      $j = $k;
   } else {
      $i = $k;
   }
}

# Now, $arr[$i] <= $ip < $arr[$j] and $j == $i+1
# Look up the 2-character country code (returns '??' if unknown)
$ccode = string.subString( $arr, $i*6+4, $i*6+5 );

One point to note: all integers in TrafficScript are stored as 32-bit signed integers. To avoid integer overflows and keep the code simple, all IP addresses are converted to 31-bit integers (shifting right to drop the lowest bit); the array already contains the 31-bit values. As a consequence, two addresses that differ only in their lowest bit are treated as the same address.
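
To experiment with the search outside ZXTM, the same logic can be sketched in Python. This is an illustration under stated assumptions rather than a port of the rule: the names ip_to_int31 and lookup are hypothetical, the sample array is made up for the example, and the 4-byte values are assumed to be big-endian. One deliberate difference is that the upper index starts one past the last entry, so the final entry is searchable as well; the TrafficScript above starts it at the last entry.

```python
import struct

def ip_to_int31(dotted):
    """Dotted-quad IPv4 string -> 32-bit value shifted right by one bit."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    # same arithmetic as the TrafficScript rule; the result fits in 31 bits
    return ((a * 256 + b) * 256 + c) * 128 + d // 2

def lookup(arr, dotted):
    """Binary-search the packed array and return the two-letter code."""
    ip = ip_to_int31(dotted)
    i, j = 0, len(arr) // 6          # j is one past the last entry
    while j - i > 1:
        k = (i + j) // 2             # midpoint between i and j
        (start,) = struct.unpack(">I", arr[k * 6:k * 6 + 4])
        if start > ip:
            j = k
        else:
            i = k
    # entry i is the last one whose start is <= ip
    return arr[i * 6 + 4:i * 6 + 6].decode("ascii")

# Hypothetical three-entry array: unknown, US from 3.0.0.0, unknown from 4.17.135.32
sample = b"".join(
    struct.pack(">I", ip_to_int31(start)) + code
    for start, code in [("0.0.0.0", b"??"),
                        ("3.0.0.0", b"US"),
                        ("4.17.135.32", b"??")]
)

print(lookup(sample, "3.1.2.3"))            # US
print(ip_to_int31("255.255.255.255"))       # 2147483647, the largest signed 32-bit value
```

The last line shows why the shift is enough: the highest possible address maps to exactly 2^31 - 1, so no shifted address can overflow a signed 32-bit integer.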

Testing the code

It's quite easy to test this conversion. Create a request rule using the following TrafficScript code and assign it to an HTTP virtual server (you can create a new one with no back-end nodes if you like, or use an existing one).

if( !string.startsWith( http.getPath(), "/geo" ) ) break;

$ipaddr = http.getFormParam( "ip" );

# Integer representation of $ipaddr >> 1
string.regexmatch( $ipaddr, "(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)" );
$ip = ((($1*256+$2)*256+$3)*128+$4/2);

$arr = resource.get( "geoip.dat" );

# initialize indices
$i = 0; $j = string.len( $arr )/6-1;

# $arr[$i] <= $ip < $arr[$j]
# iteratively halve the distance between $i and $j until they are adjacent

while( $j-$i > 1 ) {
   # midpoint between $i and $j
   $k = ($i+$j)/2;

   # compare $ip with $arr[$k]
   if( string.bytesToInt( string.subString( $arr, $k*6, $k*6+3 ) ) > $ip ) {
      $j = $k;
   } else {
      $i = $k;
   }
}

# Now, $arr[$i] <= $ip < $arr[$j] and $j == $i+1
# Look up the 2-character country code (returns '??' if unknown)
$ccode = string.subString( $arr, $i*6+4, $i*6+5 );

http.sendResponse( 200, "text/plain", 
   "IP = ".$ipaddr."\nCountry = ".$ccode."\n",
   "" );

You can call this script by requesting the URL "/geo" with a querystring containing a parameter named 'ip' which contains an IP address. For example, type the following into your browser location bar:

  • http://zxtm.server:port/geo?ip=131.111.131.1


Another example

Suppose that access to part of your website is restricted by export controls. You can limit access and redirect users within the restricted geographies using TrafficScript as follows:

# Don't worry about pages that don't start "/restricted"
if( !string.startsWith( http.getPath(), "/restricted" ) ) break;

# calculate origin country (from code above):
$ipaddr = request.getRemoteIP();
.....
$ccode = string.subString( $arr, $i*6+4, $i*6+5 );

# Restrict access to Cuba, Iran, N.Korea, Sudan, Syria, 
# proxies and other unknown locations
$banned = "CU,IR,KP,SD,SY,A1,A2,??,";

if( string.contains( $banned, $ccode."," ) ) {
   http.redirect( "/accessdenied.html" );
}

Obviously, this method is only as accurate as the source database and it's just advisory; it does not satisfy any legal obligations you may have under export regulations.

More...

This technique is useful if you want to handle site visitors from various locations in different ways. You can use the http.setResponseCookie() function in TrafficScript to set a client-side 'location' cookie that the destination webserver can read.

If you want to perform global request distribution, do take a look at Zeus' ZXTM Global Load Balancer product for a capable and complete solution.

Owen Garrett [Zeus Dev Team] 18 September 2007

Comments:

This public messageboard is not a forum for technical support. To report technical support problems, please contact our dedicated Support team using the instructions at the bottom of this page.

Comment from: binky [Visitor]
The pscp step doesn't work for me.
I'm getting:
Fatal: Network error: Connection refused
The ZXTM server is working correctly otherwise, and I can ping the same ip address I'm using in pscp from the same dos prompt.
Am I meant to do something on the ZXTM admin interface to enable this?
Permalink 09 November 2007 @ 09:54
Comment from: Owen Garrett [Zeus Dev Team]
With a ZXTM appliance (virtual or hardware), you can ssh and scp files using the user 'admin', and your admin password.

The Desktop Edition prior to 5.0r1 hadn't got an SSH server running; if you're using 5.0r1, follow these instructions to enable SSH.
Permalink 15 November 2007 @ 17:36
Comment from: SteveF [Visitor]
Hi Owen

Thanks for the code, it works well. However, would it be possible for you to provide a couple more snippets to get the full country name and continent name?

Thanks

Permalink 23 September 2008 @ 13:19
Comment from: Owen Garrett [Zeus Dev Team]
Hi Steve,

The perl code generates a flat list (geoip.dat) with fixed-width fields. The current version generates a 2-character field for the country data, but you could expand it to a larger fixed width so that you can fill it with the country and continent names (for example).

Then you need to modify the TrafficScript code so that it uses the correct field width (i.e. not the 6 characters currently).
Permalink 26 September 2008 @ 10:53
Comment from: SteveF [Visitor]
Hi Owen

We have another issue I hope you may be able to shed some light on. We use ZXTM and are trying to incorporate openads into one of our sites. Unfortunately we cannot seem to pass IP information through ZXTM to openads. Obviously if we run the page from the backend node directly the IP info is picked up because we do not go through the TM, e.g.:

GEOIP_CONTINENT_CODE: EU
GEOIP_COUNTRY_CODE: GB
GEOIP_COUNTRY_NAME: United Kingdom

However, if I try to go through the TM the variables do not get passed through:

GEOIP_CONTINENT_CODE: --
GEOIP_COUNTRY_CODE: --
GEOIP_COUNTRY_NAME: N/A

I have tried using the request.getRemoteIP() function to pass the IP info through but this does not work. Am I missing something obvious or can you suggest anything? Any help would be most appreciated.

Regards

Steve.
Permalink 17 March 2009 @ 07:50
Comment from: Owen Garrett [Zeus Dev Team]
Hi Steve,

If you're running the GeoIP code on a backend node, you'll need to ensure that the node receives the correct IP address. By default, the node will see the connection originating from the local ZXTM, typically a private address (10.x.x.x for example).

You have a few options for pushing the correct IP address through to the backend: use IP Transparency (requires Linux or Appliance), pass the IP address through in a header (X-Cluster-IP or a custom header), or use one of the solutions described here.

Alternatively, you can use the code in this article to calculate the country, etc. and then pass this data through to the back-end server by adding a header or custom cookie to the HTTP request:

http.setHeader( "Country-Code", $ccode );

... or...
http.setCookie( "Country-Code", $ccode );


Then you should be able to read the value of the HTTP request header or cookie in your back-end application code.

Owen
Permalink 17 March 2009 @ 09:34
Comment from: mrz@mozilla.com [Member]
Same rule in ZXTM 6.0 syntax:

$ipaddr = request.getRemoteIP();

$country = geo.getCountryCode( $ipaddr );

http.sendResponse( 200, "text/plain",
"IP = ".$ipaddr."\nCountry = ".$country."\n","" );
Permalink 22 October 2009 @ 23:54
Comment from: Pratap [Visitor] · http://www.bt.com
Hi Team,

I have a list of IPs in the UK with the site name, but without any masks or the corresponding integer numbers.

Generally to have this functionality we need to have the info in the below format

3.0.0.0 4.17.135.31 50331648 68257567 US United States
but at the moment I have only below info -

10.229.202 Aberdeen
10.229.203 Aberdeen
10.231.209 Aberdeen
10.232.130 Alness
10.232.131 Alness
10.232.133 Alness
10.232.134 Alness
10.232.135 Alness
10.232.136 Alness
10.232.137 Alness
10.225.52 Bangor
10.228.243 Barrow

Can someone please help the way of getting rest of the values

Thanks
Pratap
spratap@techmahindra.com
Permalink 11 November 2009 @ 08:59