Set Up Global Load Balancing for Your Site Using Open Source Nginx



Nginx, pronounced “engine-x”, is a high-performance HTTP server and reverse proxy, with proxy capabilities for IMAP/POP3/SMTP. Nginx is the creation of Russian developer Igor Sysoev and has been running in production for over two years. The latest stable release at the time of writing is Nginx 0.5.30, which is the focus of this article. While Nginx is capable of proxying non-HTTP protocols, we’re going to focus on HTTP and HTTPS.


High Performance, Yet Lightweight

Nginx uses a master process and N+1 worker process model. The number of workers is controlled by the configuration, yet the memory footprint and resources used by Nginx are far smaller than Apache’s. Nginx uses epoll() on Linux. In our lab, Nginx was handling hundreds of requests per second while using about 16MB of RAM and a consistent load average of about 1.00. This is considerably better than Apache 2.2, and Pound doesn’t scale well with this type of usage (high memory usage, lots of threads). In general, Nginx offers a very cost-effective solution.



Lighttpd is a great lightweight option, but it has a couple of drawbacks. Nginx has very good reverse proxy capabilities with integrated basic load balancing. This makes it a very good option as a front end to dynamic web applications, such as those running under Rails and using Mongrel. Lighttpd, on the other hand, has an old and unmaintained proxy module. It does have a new proxy module in Lighttpd 1.5.x, but that points to the other problem with Lighttpd: where it’s going. Lighttpd 1.4 is lightweight, relies on very few external libraries and is fast. Lighttpd 1.5.x, on the other hand, requires many more external libraries, including glib. Now I don’t know about you, but anything using glib is far from “lightweight”.


Basic Configuration

The basic configuration of Nginx specifies the unprivileged user to run as, the number of worker processes, the error log, the pid file and the events block. After this basic configuration block, you have per-protocol blocks (http, for example).



  • user nobody;
  • worker_processes 4;
  • error_log logs/error.log;
  • pid logs/;
  • events {
  • worker_connections 1024;
  • }



Basic HTTP server

Nginx is relatively easy to configure as a basic web server. It supports IP-based and name-based virtual hosts, and it uses a PCRE-based URI processing system. Configuring static hosting is very easy; you just specify a new server block:



  • server {
  • listen;
  • server_name;
  • access_log logs/ main;
  • location / {
  • index index.html index.htm;
  • root /var/www/static/;
  • }
  • }



Here we are listening on port 80 on a single IP address, with name-based virtual hosting for two host names. The server_name option also supports wildcards, so you can specify a name with a leading * and have every matching subdomain handled by this configuration. The access_log directive sets up the usual access log, and root specifies the document root (htdocs). If you have a large number of name-based virtual hosts, you’ll need to increase the size of the hash bucket with server_names_hash_bucket_size 128;


Gzip compression

Nginx, like many other web servers, can compress content using gzip.



  • gzip on;
  • gzip_min_length 1100;
  • gzip_buffers 4 8k;
  • gzip_types text/plain text/css application/x-javascript text/javascript;



Here Nginx allows you to enable gzip, specify a minimum length to compress, buffers and the mime types that Nginx will compress. Gzip compression is supported by all modern browsers.
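To see why a minimum length is worth setting, here is a minimal Python sketch (not Nginx code) that uses the standard gzip module to compare payload sizes before and after compression. The sample payloads are made up for illustration. Very short bodies can actually grow once the gzip header and trailer are added, while larger repetitive text shrinks considerably.

```python
import gzip


def gzip_saving(body: bytes) -> int:
    """Return the number of bytes saved by gzip (negative if output is larger)."""
    return len(body) - len(gzip.compress(body))


# Illustrative payloads only: a tiny response and a repetitive HTML page.
tiny = b"ok"
page = b"<p>Hello from Nginx</p>\n" * 200

print("tiny response saves", gzip_saving(tiny), "bytes")
print("larger page saves", gzip_saving(page), "bytes")
```

The tiny response reports a negative saving, which is the case gzip_min_length is meant to skip.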


HTTP Load Balancing

Nginx can be used as a simple HTTP load balancer. In this configuration, you would place Nginx in front of your existing web servers. The existing web servers can be running Nginx as well. In HTTP load balancer mode, you simply need to add an upstream block to the configuration:



  • upstream {
  • server;
  • server;
  • server;
  • }
  • upstream {
  • server;
  • server;
  • server;
  • }



Then in the server block, you add the line:



  • proxy_pass;



Health Check Limitations

Nginx has only simple load balancing capabilities. It doesn’t have health checking capabilities and it uses a simple load balancing algorithm. However, Nginx is a relatively new project, so one would expect to see various load balancing algorithms and health checking support added over time. While it might not be wise to replace your commercial load balancer with Nginx anytime soon, Nginx is almost there as a very competitive solution. Monit and other monitoring applications offer good options to compensate for the lack of health checking capabilities in Nginx.
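As an illustration of what a simple rotation combined with an external health check looks like, here is a minimal Python sketch. It is not how Nginx is implemented. It assumes a round-robin rotation over the upstream servers, and the is_healthy callback stands in for whatever an outside monitor such as Monit would report. The backend addresses are hypothetical.

```python
from itertools import cycle
from typing import Callable, Iterable, Iterator, Optional


def round_robin(servers: Iterable[str],
                is_healthy: Callable[[str], bool]) -> Iterator[Optional[str]]:
    """Yield backends in rotation, skipping any the health check rejects."""
    pool = list(servers)
    ring = cycle(pool)
    while True:
        # Try each backend at most once per pick.
        for _ in range(len(pool)):
            candidate = next(ring)
            if is_healthy(candidate):
                yield candidate
                break
        else:
            yield None  # every backend failed its check


# Hypothetical backends; the second one has been marked down by a monitor.
down = {"10.0.0.2:80"}
picker = round_robin(["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"],
                     lambda server: server not in down)
print([next(picker) for _ in range(4)])
# ['10.0.0.1:80', '10.0.0.3:80', '10.0.0.1:80', '10.0.0.3:80']
```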


Global Server Load Balancing

Nginx has a very interesting capability: with a little configuration, it can provide Global Server Load Balancing. Global Server Load Balancing (GSLB) is a feature you’ll find on high-end load balancing switches such as those from F5, Radware, Nortel, Cisco etc. Typically GSLB is an additional license you have to purchase for a few thousand dollars, on top of a switch that typically starts around US$10,000.


GSLB works by having multiple sites distributed around the world, so you might have a site in Europe, a site in Asia and a site in North America. Normally, you would direct traffic by region by using different top level domains (TLD). So the .com domain might go to North America, a European domain to Europe, and an Asian domain to the server in Asia. This isn’t a very effective solution because it relies on the user to visit the proper domain. A user in Asia might see a print advertisement for the North American market; hitting the .com address means they aren’t visiting the closest and fastest server.


GSLB works by looking at the source IP address of the request and then determining which site is closest to that source address. The simplest method is to break the Internet address space down per region, then route traffic to the local site in that region. When we say region, we mean North America, South America, EMEA (Europe, Middle East and Africa) and APAC (Asia-Pacific).


Configuring Nginx for GSLB

The geo {} block is used to configure GSLB in Nginx. The geo block causes Nginx to look at the source IP and set a variable based on the configuration. The nice thing with Nginx is that you can set a default.



  • geo $gslb {
  • default na;
  • include conf/gslb.conf;
  • }



Here in our configuration, we’re setting the default to na (North America) and then including gslb.conf. The configuration file gslb.conf is a basic file consisting of subnet and value pairs, one per line. Here is an excerpt from gslb.conf:



  • emea;
  • emea;
  • apac;



When Nginx receives a request from a source IP in a listed subnet (for those of you unfamiliar with slash notation, a /8 prefix covers an entire Class A block), it sets the variable $gslb to that subnet’s value, such as emea. We then use that variable later in the configuration to redirect.
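The lookup can be sketched in Python with the standard ipaddress module. This is only an illustration of the idea, not Nginx’s implementation. The subnets below are documentation ranges (RFC 5737) standing in for real regional allocations, and parse_geo and lookup are hypothetical helper names.

```python
import ipaddress


def parse_geo(text: str) -> dict:
    """Parse 'subnet value;' lines into a {network: value} table."""
    table = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";")
        if not line or line.startswith("#"):
            continue
        subnet, value = line.split()
        table[ipaddress.ip_network(subnet)] = value
    return table


def lookup(table: dict, source_ip: str, default: str = "na") -> str:
    """Return the value of the subnet containing source_ip, else the default."""
    addr = ipaddress.ip_address(source_ip)
    for network, value in table.items():
        if addr in network:
            return value
    return default


# Placeholder subnets only; a real gslb.conf would list regional allocations.
GSLB_CONF = """
192.0.2.0/24 emea;
198.51.100.0/24 apac;
"""

table = parse_geo(GSLB_CONF)
print(lookup(table, "192.0.2.10"))   # emea
print(lookup(table, "203.0.113.5"))  # na (no subnet matched, so the default)
```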


Inside the location block of our server configuration in Nginx, we add a number of if statements before the proxy_pass statement (if used). These instruct the server to do an HTTP 302 redirect (temporary redirect).



  • if ($gslb = emea) {
  • rewrite ^(.*)$1 redirect;
  • }
  • if ($gslb = apac) {
  • rewrite ^(.*)$1 redirect;
  • }



These are configured under the named virtual server. If someone from North America hits the .com site, the request matches the default and simply loads from the same server. If the user is from Europe, the request should match one of the subnets listed in gslb.conf, which sets the gslb variable to emea. This causes the North American site hosting the .com domain to redirect the client to the server(s) at the site in Europe.


On the European server, the configuration is slightly different. Instead of the emea check, you check for na and redirect to the US site. This is to handle the situation when someone in North America hits one of the European domains, such as the .eu site.



  • if ($gslb = na) {
  • rewrite ^(.*)$1 redirect;
  • }
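The per-site redirect rules can be summarised as one decision function. The Python sketch below generalises the if statements above under some assumptions: the SITES map and its example.* host names are hypothetical, and every site is assumed to redirect to every other region, whereas the configurations shown only check selected regions.

```python
from typing import Optional

# Hypothetical site map: region label -> base URL of the site serving it.
SITES = {
    "na": "http://www.example.com",
    "emea": "http://www.example.eu",
    "apac": "http://www.example.asia",
}


def redirect_target(local_region: str, client_region: str, uri: str) -> Optional[str]:
    """Return the URL for a 302 redirect, or None to serve the request locally."""
    if client_region == local_region or client_region not in SITES:
        return None
    return SITES[client_region] + uri


# A European client hitting the North American site is sent to Europe.
print(redirect_target("na", "emea", "/index.html"))
# http://www.example.eu/index.html

# A European client on the European site is served locally.
print(redirect_target("emea", "emea", "/index.html"))
# None
```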



Traffic Control: In-region not always faster

The problem with commercial solutions is that they are too generalized. In our example configurations so far, we make some pretty wild assumptions. The problem with the Internet is that a user in Asia might not, for example, have a faster connection to servers in Asia. A good example of this is India and Pakistan. A server hosted in Hong Kong or Singapore is in Asia, and would be considered “in region” for customers in India and Pakistan. The reality, though, is that traffic from those countries to Hong Kong is actually routed through Europe, so packets from India to Hong Kong go from India thru Europe, across the United States and hit Hong Kong from the Pacific. However, customers in Australia, in the same subnet, are only a few hops away from Hong Kong.


In such a situation, with commercial solutions, you are just out of luck, but with Nginx you can fine tune how traffic is directed. Here we know that a large block is mainly APAC, but that two smaller subnets inside it have faster connections to Europe. So, we simply add these smaller subnets to the configuration with the emea value. Nginx will use the closest match to the source IP. A longer prefix is finer grained than a shorter one, so Nginx will use it.
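The “closest match” rule amounts to picking the matching subnet with the longest prefix. Here is a short Python sketch of that selection. The private 10.x ranges are placeholders rather than real APAC or EMEA allocations, and most_specific is a hypothetical helper name.

```python
import ipaddress


def most_specific(rules, source_ip, default="na"):
    """Pick the value of the matching subnet with the longest prefix."""
    addr = ipaddress.ip_address(source_ip)
    matches = []
    for subnet, value in rules:
        network = ipaddress.ip_network(subnet)
        if addr in network:
            matches.append((network.prefixlen, value))
    if not matches:
        return default
    # The longest prefix is the finest-grained match.
    return max(matches)[1]


# Placeholder ranges: a broad block tagged apac with a finer emea override.
RULES = [
    ("10.0.0.0/8", "apac"),
    ("10.20.0.0/16", "emea"),
]

print(most_specific(RULES, "10.20.1.1"))   # emea (the /16 beats the /8)
print(most_specific(RULES, "10.99.1.1"))   # apac
print(most_specific(RULES, "172.16.0.1"))  # na (default)
```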


Manual Tuning

The initial tuning can be done by using the whois command; for example, running whois against an IP address will give you an idea which registry, and therefore which region, it belongs to (ARIN, RIPE, etc.). ARIN, RIPE, APNIC, AFRINIC, and LACNIC are regional Internet registries (RIRs). An RIR is an organization overseeing the allocation and registration of Internet number resources within a particular region of the world. IP addresses, both IPv4 and IPv6, are managed by these RIRs. However, as in our previous example, you’re going to need to fine tune the gslb configuration with traceroute and ping information. Probably the best approach is to do a general configuration and then fine tune it based on feedback from customers.
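Once you have noted which registry each block belongs to, turning that into a first-pass gslb.conf is mechanical. The Python sketch below shows one way to do it. The RIR-to-region mapping is an assumption (LACNIC falls through to the default because this article’s setup has no South American site), the (subnet, registry) pairs are made-up documentation ranges, and gslb_lines is a hypothetical helper.

```python
# Assumed mapping from registry to the region labels used in this article.
RIR_TO_REGION = {
    "ARIN": "na",
    "RIPE": "emea",
    "AFRINIC": "emea",
    "APNIC": "apac",
}


def gslb_lines(allocations, default="na"):
    """Render 'subnet value;' lines, skipping subnets that map to the default."""
    lines = []
    for subnet, rir in allocations:
        region = RIR_TO_REGION.get(rir.upper(), default)
        if region != default:
            lines.append(f"{subnet} {region};")
    return lines


# Made-up (subnet, registry) pairs as you might note them from whois output.
allocations = [
    ("192.0.2.0/24", "RIPE"),
    ("198.51.100.0/24", "APNIC"),
    ("203.0.113.0/24", "ARIN"),
]

print("\n".join(gslb_lines(allocations)))
# 192.0.2.0/24 emea;
# 198.51.100.0/24 apac;
```

Subnets that resolve to the default region are left out, since the geo block’s default already covers them.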


Cost Savings vs. Features

Looking at a well known Layer 4-7 switching solution, you would need a minimum of $15k per site to purchase the necessary equipment and licensing. Commercial solutions do have some additional fault tolerant measures, such as the ability to measure load and availability of servers at remote sites. However, with Nginx offering a very close solution which is available for FREE with the source code, it is only a matter of time before such features are part of Nginx or available thru other projects.



The following is an initial example of gslb.conf; it should be sufficient for most users.



  • uk;
  • emea;
  • emea;
  • apac;
  • uk;
  • emea;
  • emea;
  • apac;
  • apac;
  • apac;
  • apac;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • uk;
  • uk;
  • uk;
  • emea;
  • emea;
  • apac;
  • apac;
  • uk;
  • uk;
  • apac;
  • apac;
  • emea;
  • emea;
  • emea;
  • apac;
  • emea;
  • emea;
  • emea;
  • emea;
  • apac;
  • emea;
  • apac;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • emea;
  • apac;
  • apac;
  • emea;
  • emea;
  • apac;
  • apac;
  • apac;
  • apac;



How to catch 500 errors in Apache

A. Enable CGI in Apache. Add the following to the Apache configuration.

1) LoadModule cgid_module modules/


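2) Allow CGI execution in the cgi-bin directory.

<Directory "/appl/apache2/cgi-bin">
    AllowOverride None
    Options ExecCGI
    Order allow,deny
    Allow from all
</Directory>

3) Map the CGI directory and send errors to the handler script.

ScriptAlias /cgi-bin/ "/appl/apache2/cgi-bin/"
AddHandler cgi-script .cgi
ErrorDocument 413 /cgi-bin/error.cgi
ErrorDocument 500 /cgi-bin/error.cgi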

4) Restart Apache.

B. Set up the following Python script to catch this error, send an email to the admin and show a custom message to users.


chmod +x /appl/apache2/cgi-bin/error.cgi

#!/usr/bin/env python3
# error.cgi: ErrorDocument handler. Mails the CGI environment to the admin,
# clears the visitor's cookies and sends them back to the requested page.
import html
import os

SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
MAIL_FROM = ""  # sender address (blank in the original; set your own)
MAIL_TO = ""    # admin address (blank in the original; set your own)

# JavaScript that expires every cookie for this domain before the retry.
COOKIE_CLEAR_JS = """<script language='JavaScript'>
var todate = new Date();
todate.setTime(todate.getTime() - 100000);
var domain_Name_del = window.location.hostname;
var cookieList = document.cookie.split(';');
for (var i = 0; i < cookieList.length; i++) {
    var name = cookieList[i].split('=')[0].trim();
    document.cookie = name + '=; path=//APPLICATION/PATH; domain=.' + domain_Name_del + '; expires=' + todate.toGMTString();
}
</script>"""

# CGI header followed by the blank line that ends the headers.
print("Content-Type: text/html\n")

status = os.environ.get("REDIRECT_STATUS", "")
if status in ("413", "500"):
    # Build an HTML table of the CGI environment for the notification mail.
    stats = "<table border=1><tr><td>Variable</td><td>Value</td></tr>"
    for name, value in os.environ.items():
        stats += "<tr><td>%s</td><td>%s</td></tr>" % (html.escape(name), html.escape(value))
    stats += "</table>"

    p = os.popen("%s -t" % SENDMAIL, "w")
    p.write("From: %s\n" % MAIL_FROM)
    p.write("To: %s\n" % MAIL_TO)
    p.write("Content-Type: text/html\n")
    p.write("Subject: Error %s in accessing \n" % status)
    p.write("\n")  # blank line separating headers from body
    p.write(stats)
    p.close()

    # print("<H3><center>Inconvenience Regretted.  Team has been notified of this issue</center></h3>")
    print(COOKIE_CLEAR_JS)
    print("<script language='JavaScript'>window.location='%s'</script>" % os.environ.get("REDIRECT_SCRIPT_URI", "/"))
else:
    print("<H3><center>What you are looking for, is not here</center></h3>")


tcpdump Command Line Options

-A              Print frame payload in ASCII
-c <count>      Exit after capturing count packets
-D              List available interfaces
-e              Print link-level headers
-F <file>       Use file as the filter expression
-G <n>          Rotate the dump file every n seconds
-i <iface>      Specifies the capture interface
-K              Don’t verify TCP checksums
-L              List data link types for the interface
-n              Don’t convert addresses to names
-p              Don’t capture in promiscuous mode
-q              Quick output
-r <file>       Read packets from file
-s <len>        Capture up to len bytes per packet
-S              Print absolute TCP sequence numbers
-t              Don’t print timestamps
-v[v[v]]        Print more verbose output
-w <file>       Write captured packets to file
-x              Print frame payload in hex
-X              Print frame payload in hex and ASCII
-y <type>       Specify the data link type
-Z <user>       Drop privileges from root to user

Capture Filter Primitives

[src|dst] host <host>                      Matches a host as the IP source, destination, or either
ether [src|dst] host <ehost>               Matches a host as the Ethernet source, destination, or either
gateway host <host>                        Matches packets which used host as a gateway
[src|dst] net <network>/<len>              Matches packets to or from an endpoint residing in network
[tcp|udp] [src|dst] port <port>            Matches TCP or UDP packets sent to/from port
[tcp|udp] [src|dst] portrange <p1>-<p2>    Matches TCP or UDP packets to/from a port in the given range
less <length>                              Matches packets less than or equal to length
greater <length>                           Matches packets greater than or equal to length
(ether|ip|ip6) proto <protocol>            Matches an Ethernet, IPv4, or IPv6 protocol
(ether|ip) broadcast                       Matches Ethernet or IPv4 broadcasts
(ether|ip|ip6) multicast                   Matches Ethernet, IPv4, or IPv6 multicasts
type (mgt|ctl|data) [subtype <subtype>]    Matches 802.11 frames based on type and optional subtype
vlan [<vlan>]                              Matches 802.1Q frames, optionally with a VLAN ID of vlan
mpls [<label>]                             Matches MPLS packets, optionally with a label of label
<expr> <relop> <expr>                      Matches packets by an arbitrary expression








TCP Flags

tcp-urg tcp-rst
tcp-ack tcp-syn
tcp-push tcp-fin
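Each flag name stands for one bit in the TCP flags byte, and capture filters test those bits with an expression of the form tcp[tcpflags] & (...) != 0. The Python sketch below lists the bit values and builds such an expression. Note that libpcap spells the push flag tcp-push. The helper names flag_names and flags_filter are hypothetical.

```python
# Bit values of the TCP flag names as they appear in the TCP header.
TCP_FLAGS = {
    "tcp-fin": 0x01,
    "tcp-syn": 0x02,
    "tcp-rst": 0x04,
    "tcp-push": 0x08,
    "tcp-ack": 0x10,
    "tcp-urg": 0x20,
}


def flag_names(flags_byte: int) -> list:
    """Decode a TCP flags byte into the names of the flags that are set."""
    return [name for name, bit in TCP_FLAGS.items() if flags_byte & bit]


def flags_filter(*names: str) -> str:
    """Build a capture filter matching packets with any of the named flags set."""
    for name in names:
        if name not in TCP_FLAGS:
            raise ValueError("unknown TCP flag: %s" % name)
    return "tcp[tcpflags] & (%s) != 0" % "|".join(names)


print(flag_names(0x12))                    # ['tcp-syn', 'tcp-ack']
print(flags_filter("tcp-syn", "tcp-fin"))  # tcp[tcpflags] & (tcp-syn|tcp-fin) != 0
```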


Logical Operators

! or not
&& or and
|| or or


Examples

udp dst port not 53        UDP not bound for port 53
host && host               Traffic between these hosts
tcp dst port 80 or 8080    Packets to either TCP port
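When scripting captures, it helps to assemble the tcpdump argument list in one place. The following Python sketch combines a few of the options listed earlier (-i, -n, -c, -w) with a filter expression. The function name tcpdump_command and the interface name eth0 are assumptions for illustration, and the sketch only builds the command; it does not run it.

```python
import shlex


def tcpdump_command(iface, filter_expr, count=None, outfile=None, resolve_names=False):
    """Build a tcpdump argument list from a few common options and a filter."""
    argv = ["tcpdump", "-i", iface]
    if not resolve_names:
        argv.append("-n")  # don't convert addresses to names
    if count is not None:
        argv += ["-c", str(count)]  # exit after capturing count packets
    if outfile is not None:
        argv += ["-w", outfile]  # write captured packets to file
    argv.append(filter_expr)
    return argv


# eth0 is a placeholder interface name.
cmd = tcpdump_command("eth0", "tcp dst port 80 or 8080", count=100)
print(shlex.join(cmd))
```

Passing the filter as a single list element keeps the expression intact if the list is later handed to subprocess.run.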

ICMP Types

icmp-echoreply icmp-routeradvert icmp-tstampreply
icmp-unreach icmp-routersolicit icmp-ireq
icmp-sourcequench icmp-timxceed icmp-ireqreply
icmp-redirect icmp-paramprob icmp-maskreq
icmp-echo icmp-tstamp icmp-maskreply
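Each of these names is shorthand for a numeric ICMP type, compared against the type field with icmp[icmptype]. As a rough reference, the Python sketch below maps the names to their standard type numbers and builds a filter from them. The helper icmp_filter is hypothetical.

```python
# Standard ICMP type numbers behind the symbolic names.
ICMP_TYPES = {
    "icmp-echoreply": 0,
    "icmp-unreach": 3,
    "icmp-sourcequench": 4,
    "icmp-redirect": 5,
    "icmp-echo": 8,
    "icmp-routeradvert": 9,
    "icmp-routersolicit": 10,
    "icmp-timxceed": 11,
    "icmp-paramprob": 12,
    "icmp-tstamp": 13,
    "icmp-tstampreply": 14,
    "icmp-ireq": 15,
    "icmp-ireqreply": 16,
    "icmp-maskreq": 17,
    "icmp-maskreply": 18,
}


def icmp_filter(*names: str) -> str:
    """Build a capture filter matching any of the named ICMP types."""
    for name in names:
        if name not in ICMP_TYPES:
            raise ValueError("unknown ICMP type: %s" % name)
    return " or ".join("icmp[icmptype] == %s" % name for name in names)


print(icmp_filter("icmp-echo", "icmp-echoreply"))
# icmp[icmptype] == icmp-echo or icmp[icmptype] == icmp-echoreply
```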