Network Side Scripting - Modifying HTTP responses with Netscaler VPX, F5 BigIP and Zeus Traffic Manager

In a previous article, we discussed a simple application of Network Side Scripting, i.e. modifying an HTTP request. In this post, we’ll consider a more complex application, inspecting HTTP responses and masking out any sensitive information within. We’ll compare solutions for three leading traffic management systems – Citrix’s Netscaler, F5’s BigIP Local Traffic Manager and Zeus Traffic Manager.

We'll conclude by looking at some of the 'gotchas': idiosyncrasies of HTTP that can catch out even the most knowledgeable admin and render their Network Side Scripting rules useless.

The original goal of this exercise was to develop solutions to obfuscate email addresses in HTML data, replacing name@host.com with name(at)host(dot)com. This was not possible to achieve on the Netscaler system because of limitations in AppExpert’s templated actions. Consequently, this exercise uses a simpler example where target text is replaced by a fixed string.

The challenge – removing sensitive data

Network-side scripting solutions are often used as the first line of defence to quickly patch over application errors and faults. For example, imagine that a vulnerability has been discovered that can cause the application to leak sensitive information – social security numbers (SSNs) for example. A network-side scripting solution could be used to replace any SSNs in outgoing data with alternate text, isolating and sealing off the problem while the more lengthy process of debugging and fixing the application takes place. In this exercise, we’ll apply the following logic:

  1. Check the URL and Content-Type of each request and response
  2. If the URL begins /myapp and the content-type is text/html, then inspect the response, replacing anything that resembles an SSN with the string "<b>Removed</b>".
  3. Send the modified response to the client

This example is based on F5's "Social Security Number Scrubbing" and Zeus' "Masking SSNs in HTTP responses" solutions.

Implementation note: Social security numbers are nine-digit numbers, formatted as 987-65-4321. All of the examples below use the regular expression "\d{3}-\d{2}-\d{4}" to match SSNs.
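As a rough illustration of the logic above, here is a minimal sketch in Python (chosen only because it can be run standalone; it is not one of the three vendor languages). The names should_scrub and scrub_ssns are made up for this sketch. It also shows a property of the pattern as given: it has no boundary checks, so it matches inside longer runs of digits too.

```python
import re

# The pattern used throughout this article; it has no anchors or boundaries.
SSN_RE = re.compile(r"\d{3}-\d{2}-\d{4}")
REPLACEMENT = "<b>Removed</b>"


def should_scrub(path: str, content_type: str) -> bool:
    """Steps 1 and 2: only inspect text/html responses for URLs under /myapp.

    The comparison is exact, so in this sketch a value such as
    "text/html; charset=utf-8" would not qualify.
    """
    return path.startswith("/myapp") and content_type == "text/html"


def scrub_ssns(body: str) -> str:
    """Step 2: replace anything that resembles an SSN with fixed text."""
    return SSN_RE.sub(REPLACEMENT, body)


if __name__ == "__main__":
    page = "<p>SSN: 987-65-4321, order 1987-65-43210</p>"
    if should_scrub("/myapp/account", "text/html"):
        # The order number is partly rewritten as well, because the
        # pattern matches the SSN-shaped run in the middle of it.
        print(scrub_ssns(page))
```

Whether the looser match is acceptable depends on the application; for an emergency patch, over-matching is usually the safer failure.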

Citrix Netscaler VPX

This example uses Netscaler VPX Express version 9.1; the same solution applies to the Netscaler MPX hardware appliances. In the solution, we create a Policy with the appropriate condition (URL and Content-Type) and use the 'REPLACE_ALL' action to replace any text that matches our regular expression.

This solution is based in part on an example from the Citrix Developer Network.

Constructing the Rule

Create a Rewrite action (named 'ReplaceSSN') using the command line as follows:

add rewrite action ReplaceSSN replace_all "HTTP.RES.BODY( 2147483647 )" \
    "\"<b>Removed</b>\"" -pattern re/\d{3}-\d{2}-\d{4}/

Note: although the replacement text may be given as an expression, that expression cannot refer to the matched text, which rules out the email address substitution.

Create a Rewrite policy with the following condition:

add rewrite policy ReplaceSSN "HTTP.REQ.URL.PATH.STARTSWITH( \"/myapp\" ) && \
   HTTP.RES.HEADER( \"Content-Type\" ).EQ( \"text/html\" )" ReplaceSSN

Bind this policy to your virtual server:

bind lb vserver MyVirtualServer -policyName ReplaceSSN -priority 110 \
   -gotoPriorityExpression NEXT -type RESPONSE

This process configures the Netscaler to replace text resembling a social security number with the fixed string specified in the action.

F5 BigIP Local Traffic Manager

This example uses F5 BigIP version 9 and is based on the published DevCentral solution.

Constructing the Rule

The iRule to implement the solution is as follows:

when HTTP_REQUEST {
   if { [HTTP::uri] starts_with "/myapp" } {
      set scrub_content 1
      # Force downgrade to HTTP 1.0 to prevent chunking, but still allow
      # keep-alive connections.  Since 1.1 is keep-alive by default, and 1.0
      # isn't, we need make sure the headers reflect the keep-alive status.
      if { [HTTP::version] eq "1.1" } {
         if { [HTTP::header is_keepalive] } {
            HTTP::header replace "Connection" "Keep-Alive"
         }
         HTTP::version "1.0"
      }
   } else {
      set scrub_content 0
   }
}

when HTTP_RESPONSE {
   if { $scrub_content && [ HTTP::header "Content-Type" ] equals "text/html" } {
      if { [HTTP::header exists "Content-Length"] } {
         set content_length [HTTP::header "Content-Length"]
      } else {
         # Note: the suggested value of 4294967295 does not work 
         # because the comparison below fails
         set content_length 2147483647
      }
      if { $content_length > 0 } {
         HTTP::collect $content_length 
      }
   }
}

when HTTP_RESPONSE_DATA {
    if { [regsub -all {\d{3}-\d{2}-\d{4}} [HTTP::payload] \
            "<b>Removed</b>" newdata] } {
        HTTP::payload replace 0 [HTTP::payload length] $newdata
    }
}

iRules are not fully conversant with HTTP traffic, so the onus lies on the application administrator to write an iRule that parses HTTP responses correctly. For example, rather than dealing with chunked responses, the rule forcibly downgrades the connection to HTTP/1.0 and adds the necessary headers to enable keepalives. The rule reads the "Content-Length" response header, guessing at the length if the header is missing, so that it can instruct the BigIP system to collect sufficient response data.

The iRule involves three distinct iRules events. The $scrub_content state variable is necessary because the HTTP::uri method is not available in the HTTP_RESPONSE event.

Create the iRule and associate it with the virtual server; the BigIP system will now scrub social security numbers from HTTP responses.

Zeus Traffic Manager

This example uses Zeus Traffic Manager (previously 'ZXTM') version 5.1, and is based on the published KnowledgeHub solution.

Constructing the Rule

The TrafficScript rule to implement the solution is as follows:

$type = http.getResponseHeader( "Content-Type" );
$url  = http.getPath();

if( string.startsWith( $url, "/myapp" ) && $type == "text/html" ) {
   $body = http.getResponseBody();

   $body = string.regexsub( $body, "\\d{3}-\\d{2}-\\d{4}", "<b>Removed</b>", "g" );
   http.setResponseBody( $body );
}

TrafficScript’s single http.getResponseBody() function returns the HTTP response, handling missing Content-Length values and automatically dechunking chunked responses, so no modifications (i.e. de-optimizations) to the request are necessary.

Configure this rule as a Response Rule on the virtual server. The Zeus Traffic Manager system will now scrub social security numbers from HTTP responses.

Email Address Obfuscation

Obfuscating email addresses, rewriting name@host.com as name(at)host(dot)com, is a more complex replacement. A simple version can be achieved with relative ease in iRules and TrafficScript by changing the substitution used:

iRules

regsub -all {([\w\d]+)@([\w\d]+)\.([\w\d]+)} \
    [HTTP::payload] "\\1(at)\\2(dot)\\3" newdata

TrafficScript

$body = string.regexsub( $body, "([\\w\\d]+)@([\\w\\d]+)\\.([\\w\\d]+)",
    "$1(at)$2(dot)$3", "g" );

A more sophisticated replacement would be required to match all of the variants of email addresses, and this goes beyond the capabilities of a simple regex substitution. In both languages, the replacement could be achieved by iterating through the HTTP response and making replacements programmatically:

# From http://www.regular-expressions.info/email.html
$re = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@".
      "(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";

$body = http.getResponseBody();

$newbody = "";

while( string.regexmatch( $body, "(.*?)(".$re.")(.*)" ) ) {
   $start = $1;
   $email = $2;
   $remainder = $3;

   # rewrite the email address
   $email = string.replaceAll( $email, "@", "(at)" );
   $email = string.replaceAll( $email, ".", "(dot)" );

   $newbody .= $start . $email;
   $body = $remainder;
}

$newbody .= $body;

http.setResponseBody( $newbody );

TrafficScript example: Iterating through a response and making a complex replacement
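For comparison, the two approaches can be sketched side by side in Python (used here only because it runs standalone). The function names are illustrative. The first function mirrors the simple backreference substitution; the second rewrites each full match programmatically, as the loop above does, using the same pattern quoted from regular-expressions.info. That pattern is lower-case only, as in the original.

```python
import re

# The simple pattern from the iRules and TrafficScript one-liners above.
SIMPLE_RE = re.compile(r"([\w\d]+)@([\w\d]+)\.([\w\d]+)")

# The fuller pattern quoted above from regular-expressions.info.
EMAIL_RE = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@"
    r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
)


def obfuscate_simple(text: str) -> str:
    """Backreference substitution: copes only with name@host.tld shapes."""
    return SIMPLE_RE.sub(r"\1(at)\2(dot)\3", text)


def obfuscate_emails(text: str) -> str:
    """Rewrite every '@' and '.' inside each matched address."""
    def rewrite(match):
        return match.group(0).replace("@", "(at)").replace(".", "(dot)")
    return EMAIL_RE.sub(rewrite, text)


if __name__ == "__main__":
    sample = "Mail first.last@mail.host.com today."
    # The simple version leaves two of the three dots in the address as they were.
    print(obfuscate_simple(sample))
    # The programmatic version rewrites the whole address.
    print(obfuscate_emails(sample))
```

Passing a function as the replacement plays the same role as the explicit while loop in the TrafficScript version: the match is found by the regular expression, and the rewriting is done in ordinary code.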

Gotchas

A capable Network Side Scripting implementation should be fully conversant in the protocols that it manages. It should, by default, shield the administrator from the many variations of each protocol to reduce the possibility of error when creating policies. Otherwise, there is a real risk that a rule could fail or be subverted once it is deployed on a live production service.

Three such unanticipated 'gotchas' that affect this example are Chunked-Transfer Encoding, Compression and URL Encoding.

Chunked-Transfer Encoding

Chunked transfer encoding is used by web applications when the total size of the response cannot be calculated in advance. It allows a connection to be kept alive (reducing server load) and helps to ensure message integrity.

A rule that is not aware of chunked transfer encoding may pass testing but fail later, delivering corrupt responses, when it is placed in an environment where chunking is used.
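To make the failure concrete, here is a small Python sketch (not vendor code; all function names are invented for illustration, and trailers are ignored). It frames a body as chunks, then shows two ways a naive substitution on the raw chunked data can go wrong: the replacement changes the length of a chunk without updating its size line, and an SSN that straddles a chunk boundary is not matched at all.

```python
import re

SSN_RE = re.compile(rb"\d{3}-\d{2}-\d{4}")
REPLACEMENT = b"<b>Removed</b>"


def encode_chunked(body: bytes, chunk_size: int) -> bytes:
    """Frame a body as hex-size CRLF data CRLF chunks, ending with a 0 chunk."""
    out = b""
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body; raise ValueError if the framing is inconsistent."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)  # end of the chunk-size line
        size = int(data[pos:eol].split(b";")[0], 16)
        if size == 0:
            return body
        start = eol + 2
        if data[start + size:start + size + 2] != b"\r\n":
            raise ValueError("chunk size does not match chunk data")
        body += data[start:start + size]
        pos = start + size + 2


def is_valid_chunked(data: bytes) -> bool:
    try:
        decode_chunked(data)
        return True
    except ValueError:
        return False


def scrub_chunked(data: bytes, chunk_size: int = 4096) -> bytes:
    """Dechunk, scrub the reassembled body, then re-frame it."""
    return encode_chunked(SSN_RE.sub(REPLACEMENT, decode_chunked(data)), chunk_size)


if __name__ == "__main__":
    body = b"<p>SSN: 987-65-4321</p>"
    # Failure 1: substituting in place leaves a stale chunk size.
    print(is_valid_chunked(SSN_RE.sub(REPLACEMENT, encode_chunked(body, 64))))
    # Failure 2: the SSN is split across two chunks and is not found.
    print(SSN_RE.search(encode_chunked(body, 16)))
    # Dechunking first avoids both problems.
    print(decode_chunked(scrub_chunked(encode_chunked(body, 16))))
```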

The Netscaler solution outlined above does not cater for chunked responses. It is necessary to add a Request Rewrite policy that downgrades the request to HTTP/1.0:

add rewrite action downgrade_1.0 replace http.req.version "\"HTTP/1.0\""
add rewrite policy to_1.0 true downgrade_1.0
bind lb vserver your_vserver -policyName to_1.0 -priority 20 \
     -gotoPriorityExpression NEXT -type REQUEST

Creating a policy to downgrade to HTTP/1.0 using the Netscaler command line

The F5 example (taken from DevCentral) includes measures to explicitly de-optimize requests to disable chunked-transfer encoding. Otherwise, the iRule would need to process chunked responses, adding significantly to its complexity.

The Zeus example uses the high-level http.getResponseBody() function which handles chunked responses by dechunking them transparently. No downgrade is required.

Compression

Content Compression is used when a browser indicates that it can support compression (by way of an Accept-Encoding header – all modern browsers do this) and the server elects to return a compressed response. It reduces the bandwidth used for web traffic, and can improve performance over high-latency and low-bandwidth networks.

A rule that is not aware of HTTP compression will fail if servers, caches or intermediate proxies compress responses.

Neither the Netscaler nor F5 rules cater for compressed responses. In each case, the data returned by the HTTP.RES.BODY() (Netscaler) or HTTP::payload (F5) methods will be binary compressed data rather than HTML text, so the content manipulations will fail. The administrator must remember to add a policy or rule to delete the Accept-Encoding header in the request to disable content compression for all traffic.

Zeus' http.getResponseBody() function automatically decompresses HTTP responses when required. If Content Compression is enabled on the Virtual Server, Zeus Traffic Manager will then re-compress the modified response before returning it to the client.
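The decompress, modify, recompress sequence described above can be sketched in Python as follows. This is an illustration of the general idea only, not of how any of the three products implement it; scrub_response and its content_encoding parameter are names made up for the sketch, and only gzip is handled.

```python
import gzip
import re

SSN_RE = re.compile(rb"\d{3}-\d{2}-\d{4}")
REPLACEMENT = b"<b>Removed</b>"


def scrub_response(body: bytes, content_encoding: str = "") -> bytes:
    """Scrub SSNs, decompressing and recompressing gzip bodies as needed."""
    if content_encoding.lower() == "gzip":
        plain = gzip.decompress(body)
        return gzip.compress(SSN_RE.sub(REPLACEMENT, plain))
    return SSN_RE.sub(REPLACEMENT, body)


if __name__ == "__main__":
    page = b"<p>SSN: 987-65-4321</p>"
    compressed = gzip.compress(page)
    # Searching the compressed bytes directly: the text pattern is applied
    # to binary data, so the SSN is not expected to be visible.
    print(SSN_RE.search(compressed))
    # Decompressing first lets the pattern see the HTML text.
    print(gzip.decompress(scrub_response(compressed, "gzip")))
```

A rule that cannot do this has to fall back on the workaround mentioned above: removing the Accept-Encoding request header so that the server never compresses the response.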

URL Encoding

URL encoding is designed to allow for safe use of high-bit and Unicode characters, but it is commonly used by attackers to encode URL data in order to bypass security checks.

If a security rule is not aware of URL Encoding, it will be trivially possible to bypass the rule using simple 8-bit or Unicode encoding. Can the rules above be bypassed using the URLs "/my%61pp" or "/my%u0061pp"?

With Netscaler, the recommended way to obtain the URL in an expression is the method HTTP.REQ.URL.PATH. This method returns the raw (undecoded) URL, but when it is used in comparisons it decodes the URL (both 8-bit and Unicode encodings). The example rule above therefore cannot be bypassed by these URL encoding attacks.

In an F5 iRule, there are two ways to obtain the URL: the methods HTTP::path and HTTP::uri. Neither of these methods decodes the URL, so the administrator must remember to decode it explicitly using URI::decode. If that step is forgotten (as in the published example), the rule can be trivially bypassed.

F5's URI::decode method correctly handles simple %61 encoding, but does not support the more complex Unicode (%u0061) encoding, leaving a potential route for attack open.

In Zeus' TrafficScript, the recommended method to obtain the URL is the http.getPath() function. This function returns the decoded URL (managing both 8-bit and unicode encodings), so the rule cannot be bypassed. The function http.getRawURL() returns the raw URL data should it be required. This behavior is explained in the Hex-encoded URLs KnowledgeHub article.
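The bypass discussed in this section can be reproduced with a short Python sketch. It uses the standard library's urllib.parse.unquote for %XX decoding; decode_percent_u is a hypothetical helper written for this sketch to handle the non-standard %uXXXX form, and the three check functions are illustrative names, not vendor APIs.

```python
import re
from urllib.parse import unquote


def naive_check(raw_path: str) -> bool:
    """Compare the raw path, with no decoding at all."""
    return raw_path.startswith("/myapp")


def decoded_check(raw_path: str) -> bool:
    """Decode standard %XX escapes before comparing."""
    return unquote(raw_path).startswith("/myapp")


def decode_percent_u(path: str) -> str:
    """Decode %uXXXX escapes (hypothetical helper for this sketch)."""
    return re.sub(r"%u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), path)


def full_check(raw_path: str) -> bool:
    """Decode both %uXXXX and %XX escapes before comparing."""
    return unquote(decode_percent_u(raw_path)).startswith("/myapp")


if __name__ == "__main__":
    for path in ("/myapp", "/my%61pp", "/my%u0061pp"):
        print(path, naive_check(path), decoded_check(path), full_check(path))
```

The safest choice is for the rule to decode the URL in the same way as the server that will eventually handle the request; this sketch makes no claim about how any particular server behaves.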

Observations

This example shows the limitations of templated policies; it is often not possible to implement a solution to a problem that the vendor has not anticipated or allowed for. Free-form languages such as iRules (Tcl) and TrafficScript are clearly better able to implement sophisticated Network Side Scripting rules.

There are distinct differences in the qualities of the different solutions – how error-free the obvious solution is, how easily they can obtain an HTTP response, and how effectively they protect the administrator from the 'gotchas' of the underlying protocol. These differences will affect the reliability, security and time-to-fix of network-side scripting solutions.

In the final article in this series, I’ll describe a number of more advanced solutions to application problems that go beyond the capabilities of most traffic management systems.

Owen Garrett [Zeus Dev Team] 30 September 2009