Memory Dump

Low cost load balancing with Nginx and HAProxy on Linux

Posted in Uncategorized by Chris on June 29, 2009

This post is more for future personal reference than anything, but maybe it will help someone else. It assumes nginx and haproxy running on the same front end box with nginx listening on port 80 and forwarding requests to haproxy on port 81. nginx could also be used as an SSL accelerator in this configuration, but the directives are not included in the example configuration shown here.

The basic idea is that nginx sits at the front doing rewrites, SSL decryption, etc., and then forwards the traffic to haproxy, which distributes it among the nodes on the backend. I’d love to have used just haproxy or just nginx, but nginx supports URL rewriting and SSL acceleration, among other features, while haproxy provides better load balancing algorithms. I haven’t yet had the chance to install this in an environment suitable for benchmarking, but when I do, I’ll post the results here.

All servers were running default “server” installs of CentOS 5.3 for this configuration. The only dependencies I had to install were:

  • gcc
  • openssl-devel
  • pcre-devel
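
Something like the following sketch would pull in whatever is missing from that list. It assumes rpm and yum are available (as on a stock CentOS install) and that it runs as root; the function names are only illustrative.

```shell
# Packages named in the list above.
DEPS="gcc openssl-devel pcre-devel"

# Print each package from $DEPS that rpm does not report as installed.
missing_deps() {
  for pkg in $DEPS; do
    rpm -q "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
  done
}

# Install only what is missing (run as root).
install_deps() {
  missing=$(missing_deps)
  if [ -z "$missing" ]; then
    echo "all dependencies present"
    return 0
  fi
  # $missing is left unquoted on purpose so each package is its own argument.
  yum -y install $missing
}
```

Calling install_deps does the work; missing_deps on its own only reports.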

Install HAProxy

cd /tmp
wget http://haproxy.1wt.eu/download/1.3/src/haproxy-1.3.18.tar.gz
tar -xvf haproxy*.gz
cd haproxy*
make TARGET=linux26 CPU=i686
mv haproxy /usr/sbin

Create or modify /etc/haproxy.cfg

global
        maxconn     25000
        daemon

defaults
        mode        http
        cookie      SERVERID insert nocache indirect
        clitimeout  60000
        srvtimeout  30000
        contimeout  4000
        option      httpclose
        maxconn     25000

listen  http_proxy  192.168.1.14:81
        balance     roundrobin
        option      httpchk
        option      forwardfor
        server      server1 192.168.1.15:80 weight 1 maxconn 5000 cookie SERVER1 check
        server      server2 192.168.1.16:80 weight 1 maxconn 5000 cookie SERVER2 check
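
As a quick sanity check before starting HAProxy, a sketch like this lists the backends declared in the file and requests / from each one directly. The function names are made up for illustration, it assumes curl is installed, and it uses a plain GET rather than the OPTIONS request that "option httpchk" sends by default.

```shell
# Print "name address" for every server line in an HAProxy config file.
list_backends() {
  awk '$1 == "server" { print $2, $3 }' "$1"
}

# Request / from each backend directly and print the HTTP status code.
# curl reports 000 when it cannot get a response at all.
probe_backends() {
  list_backends "$1" | while read -r name addr; do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$addr/")
    echo "$name $addr -> $code"
  done
}
```

Usage would be along the lines of: probe_backends /etc/haproxy.cfg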

Create or modify /etc/init.d/haproxy

#!/bin/sh
#
# chkconfig: 2345 85 15
# description: HA-Proxy is a TCP/HTTP reverse proxy which is particularly suited \
#              for high availability environments.
# processname: haproxy
# config: /etc/haproxy.cfg
# pidfile: /var/run/haproxy.pid

# Source function library.
if [ -f /etc/init.d/functions ]; then
  . /etc/init.d/functions
elif [ -f /etc/rc.d/init.d/functions ] ; then
  . /etc/rc.d/init.d/functions
else
  exit 0
fi

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0

[ -f /etc/haproxy.cfg ] || exit 1

RETVAL=0

start() {
  /usr/sbin/haproxy -c -q -f /etc/haproxy.cfg
  if [ $? -ne 0 ]; then
    echo "Errors found in configuration file."
    return 1
  fi

  echo -n "Starting HAproxy: "
  daemon /usr/sbin/haproxy -D -f /etc/haproxy.cfg -p /var/run/haproxy.pid
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && touch /var/lock/subsys/haproxy
  return $RETVAL
}

stop() {
  echo -n "Shutting down HAproxy: "
  killproc haproxy -USR1
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/haproxy
  [ $RETVAL -eq 0 ] && rm -f /var/run/haproxy.pid
  return $RETVAL
}

restart() {
  /usr/sbin/haproxy -c -q -f /etc/haproxy.cfg
  if [ $? -ne 0 ]; then
    echo "Errors found in configuration file, check it with 'service haproxy check'."
    return 1
  fi
  stop
  start
}

check() {
  /usr/sbin/haproxy -c -q -V -f /etc/haproxy.cfg
}

rhstatus() {
  status haproxy
}

condrestart() {
  [ -e /var/lock/subsys/haproxy ] && restart || :
}

# See how we were called.
case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  restart)
    restart
    ;;
  reload)
    restart
    ;;
  condrestart)
    condrestart
    ;;
  status)
    rhstatus
    ;;
  check)
    check
    ;;
  *)
    echo $"Usage: haproxy {start|stop|restart|reload|condrestart|status|check}"
    RETVAL=1
esac

exit $RETVAL

Set haproxy to start on boot (and start now)

chkconfig --add haproxy
chkconfig haproxy on
service haproxy start

Install Nginx

cd /tmp
wget http://sysoev.ru/nginx/nginx-0.7.61.tar.gz
tar -xvf nginx*.gz
cd nginx*
./configure
make
make install

Create or modify /usr/local/nginx/conf/nginx.conf

worker_processes  2;
worker_rlimit_nofile 10000;
pid /var/run/nginx.pid;

events {
    worker_connections  4000;
    use epoll;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    keepalive_timeout  65;
    gzip  on;

    server {
        listen       80;
        server_name  test1.com;
        proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
        location / {
            proxy_pass http://192.168.1.14:81/;
        }
    }

    server {
        listen       80 default;
        server_name_in_redirect off;
        # The trailing "?" stops nginx from appending the query string a second time.
        rewrite      ^ http://test1.com$request_uri?;
    }
}
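
To check the catch-all server block, a request carrying some other Host header should come back as a redirect to test1.com. A rough sketch, assuming curl is installed; location_of is a hypothetical helper and other.example is a placeholder host name.

```shell
# Read HTTP response headers on stdin and print the Location value, if any.
location_of() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }'
}

# Usage (against the front end box from this post):
#   curl -sI -H 'Host: other.example' http://192.168.1.14/some/page | location_of
```

If the redirect is working, the printed URL should point at test1.com with the original path.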

Create or modify /etc/init.d/nginx

#! /bin/sh
# chkconfig: 2345 87 13
# description: A HTTP and mail proxy server licensed under a \
#              2-clause BSD-like license.

# Description: Startup script for nginx webserver on Debian. Place in /etc/init.d and
# run 'sudo update-rc.d nginx defaults', or use the appropriate command on your
# distro.
#
# Author:       Ryan Norbauer 
# Modified:     Geoffrey Grosenbach http://topfunky.com

set -e

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DESC="nginx daemon"
NAME=nginx
DAEMON=/usr/local/nginx/sbin/$NAME
CONFIGFILE=/usr/local/nginx/conf/nginx.conf
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Gracefully exit if the package has been removed.
test -x $DAEMON || exit 0

d_start() {
  $DAEMON -c $CONFIGFILE || echo -n " already running"
}

d_stop() {
  kill -QUIT `cat $PIDFILE` || echo -n " not running"
}

d_reload() {
  kill -HUP `cat $PIDFILE` || echo -n " can't reload"
}

case "$1" in
  start)
        echo -n "Starting $DESC: $NAME"
        d_start
        echo "."
        ;;
  stop)
        echo -n "Stopping $DESC: $NAME"
        d_stop
        echo "."
        ;;
  reload)
        echo -n "Reloading $DESC configuration..."
        d_reload
        echo "reloaded."
        ;;
  restart)
        echo -n "Restarting $DESC: $NAME"
        d_stop
        # One second might not be time enough for a daemon to stop,
        # if this happens, d_start will fail (and dpkg will break if
        # the package is being upgraded). Change the timeout if needed
        # be, or change d_stop to have start-stop-daemon use --retry.
        # Notice that using --retry slows down the shutdown process somewhat.
        sleep 1
        d_start
        echo "."
        ;;
  *)
        echo "Usage: $SCRIPTNAME {start|stop|reload|restart}" >&2
        exit 3
        ;;
esac

exit 0

Set nginx to start on boot (and start now)

chkconfig --add nginx
chkconfig nginx on
service nginx start

If everything works, you should now be able to access your nginx box on port 80. This will proxy the requests through haproxy on port 81 to whatever back end nodes you have configured in haproxy.cfg and return the content. Spiffy, eh?
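
Since haproxy.cfg inserts a SERVERID cookie, the response headers show which backend answered. A small sketch, assuming curl is available and that the cookie arrives as a "Set-Cookie: SERVERID=..." header; served_by is a hypothetical helper name.

```shell
# Read HTTP response headers on stdin and print the SERVERID cookie value.
served_by() {
  tr -d '\r' | sed -n 's/^[Ss]et-[Cc]ookie: *SERVERID=\([^;]*\).*/\1/p'
}

# Usage: repeat a few times without sending the cookie back. With
# roundrobin, the value should alternate between SERVER1 and SERVER2.
#   curl -s -D - -o /dev/null -H 'Host: test1.com' http://192.168.1.14/ | served_by
```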

BitNami: Oh the wonderful things you do

Posted in Uncategorized by Chris on June 25, 2009

Open source software is great. I use a lot of it. I contribute to some of it, and I overwhelmingly support its use whenever possible (and reasonable). That said, a lot of open source software today is Linux-based. This is only natural because there are so many variations of Linux designed to suit every user and they’re almost all entirely free to download, install, and use on a daily basis.

However, for someone who works in a Microsoft shop, getting your hands on this great software can sometimes be a pain. A lot of open source web apps were written in PHP and designed for Apache, but have you tried installing Apache and configuring a web app designed for the LAMP stack recently? While not incredibly difficult, it’s mildly annoying to the point that I wouldn’t wish it upon anyone frequently. Furthermore, the added dependencies of PHP, MySQL, and the headaches associated with the various version requirements can make installing these apps on Windows a real pain at times.

Enter BitNami.

This group takes popular open source packages like MediaWiki and Drupal and packages them with their required components (Apache, PHP, MySQL, etc.) into an easy-to-install distribution for multiple platforms. Just the other day I installed the MediaWiki app on our Windows development server and I have to say, it was a pleasure doing so. The entire process was streamlined and my new wiki was up and running in less than 5 minutes. I didn’t have to worry about the latest version of Apache and whether the package I downloaded came with mod_php, nor did I have to worry about downloading PHP and enabling the MySQL support (two of the biggest hurdles I’ve had in the past). Everything installed right into my C:\Program Files\BitNami MediaWiki Stack folder just like I told it and functions perfectly.

I plan to experiment with some of the other packages BitNami has created, and I sincerely hope they (and encourage them to) continue to develop these packages for folks like me. BitNami, you rock!


Keeping your passwords safe and secure with KeePass

Posted in Uncategorized by Chris on June 25, 2009

Since I started using KeePass a couple of years ago, I’ve marveled at the simplicity and security of the software. I appreciate the knowledge that if I ever forget a password (and let’s face it, I have several hundred to remember), I can always look it up in my “little black book” aka KeePass. The program works like this:

  1. Supply a master password
  2. Enter your usernames and passwords in the database
  3. Never forget a password again!

Each time you open a KeePass database, you are prompted for your master password, and your database cannot be decrypted without it. If you choose, you can forgo a master password (or supplement it) by using a key file instead. This key file can be securely tucked away on a USB thumb drive and attached to your key chain. No key file? No access.



Areas in ASP.NET MVC

Posted in Uncategorized by Chris on June 3, 2009

It wasn’t long after I started using ASP.NET MVC that I realized I needed to be able to split functionality based on specific sections of my site (admin, user, etc.). I wanted a separate set of controllers, models, and views for each section, because each section would be working with the same data but in vastly different ways. A quick Google search led me to posts by Phil Haack and Steve Sanderson, and the resulting AreaViewEngine class derived from their code worked well, with one major issue: performance.

As I was developing a site for a sweepstakes for a nationally syndicated TV show by a very famous TV/media personality, the site needed to perform incredibly well. Profiling of the application during development revealed serious problems with the VirtualPathProviderViewEngine as documented here at stackoverflow.com. It seemed no matter what code existed in my app, the highest consumer of CPU time was the FindPartialView method.

Some research led me to the question of view resolution caching documented here but the problem was, I was profiling in release mode. How could it be that the view resolution was being cached and still causing such a problem? I dug into the code (downloaded from CodePlex) and confirmed that the cache should have been enabled, but still, the performance problem persisted. I then spent 3 days drilling into the problem and came up with two improvements that almost nullified the impact of partial view resolution.

  1. If the view name passed to the view engine starts with “~/”, assume an absolute path and skip area view resolution altogether.
  2. Avoid the slower resolution of views by the VirtualPathProviderViewEngine and resolve it myself.

The resulting code is below. Testing on my local machine revealed that the enhancements noted above (and commented in the code) increased my requests per second from ~30 to ~110 (roughly a 3.7x increase)! On our production environment, testing a static page with no database access, this code allowed us to approach 2000 requests per second using a Zeus ZXTM LB (v5.1) with 4 Windows 2003 web server nodes.

2009/06/19: Alexander reported a missing slash in the path name formatting. Good catch! The code has been updated to reflect the change.

using System;
using System.Collections.Generic;
using System.IO;
using System.Web;
using System.Web.Routing;
using System.Web.Mvc;

namespace ElserInteractive.Framework.Web.Mvc
{
    public class AreaViewEngine : WebFormViewEngine
    {
        public AreaViewEngine()
            : base()
        {
            ViewLocationFormats = new[]
            {
                "~/{0}.aspx",
                "~/{0}.ascx",
                "~/Views/{1}/{0}.aspx",
                "~/Views/{1}/{0}.ascx",
                "~/Views/Shared/{0}.aspx",
                "~/Views/Shared/{0}.ascx",
            };

            MasterLocationFormats = new[]
            {
                "~/{0}.master",
                "~/Shared/{0}.master",
                "~/Views/{1}/{0}.master",
                "~/Views/Shared/{0}.master",
            };

            PartialViewLocationFormats = ViewLocationFormats;
            base.ViewLocationCache = new DefaultViewLocationCache(TimeSpan.FromMinutes(30));
        }

        public override ViewEngineResult FindPartialView(ControllerContext controllerContext, string partialViewName, bool useCache)
        {
            string controller;
            string areaPartialName;
            ViewEngineResult result = null;

            // Performance enhancement #1:
            // Don't attempt to resolve absolute paths as area paths
            // ========================================================
            if (partialViewName.StartsWith("~"))
                return base.FindPartialView(controllerContext, partialViewName, useCache);

            if (controllerContext.RouteData.Values.ContainsKey("area"))
            {
                areaPartialName = FormatViewName(controllerContext, partialViewName, true);
                result = base.FindPartialView(controllerContext, areaPartialName, useCache);
                if (result != null && result.View != null)
                    return result;

                areaPartialName = FormatSharedViewName(controllerContext, partialViewName, true);
                result = base.FindPartialView(controllerContext, areaPartialName, useCache);
                if (result != null && result.View != null)
                    return result;
            }

            // Performance enhancement #2:
            // Resolve the view path internally, if possible. This avoids the
            // slower method of view path resolution used by the VirtualPathProviderViewEngine
            // ========================================================
            controller = controllerContext.RouteData.GetRequiredString("controller");
            foreach (string fmt in base.ViewLocationFormats)
            {
                var path = string.Format(fmt, partialViewName, controller);
                var path2 = controllerContext.HttpContext.Request.MapPath(path);
                if (File.Exists(path2))
                    return base.FindPartialView(controllerContext, path, useCache);
            }

            return base.FindPartialView(controllerContext, partialViewName, useCache);
        }

        public override ViewEngineResult FindView(ControllerContext controllerContext, string viewName, string masterName, bool useCache)
        {
            string controller;
            string areaViewName;
            ViewEngineResult result = null;

            // Performance enhancement #1:
            // Don't attempt to resolve absolute paths as area paths
            // ========================================================
            if (viewName.StartsWith("~"))
                return base.FindView(controllerContext, viewName, masterName, useCache);

            if (controllerContext.RouteData.Values.ContainsKey("area"))
            {
                areaViewName = FormatViewName(controllerContext, viewName, false);
                result = base.FindView(controllerContext, areaViewName, masterName, useCache);
                if (result != null && result.View != null)
                    return result;

                areaViewName = FormatSharedViewName(controllerContext, viewName, false);
                result = base.FindView(controllerContext, areaViewName, masterName, useCache);
                if (result != null && result.View != null)
                    return result;
            }

            // Performance enhancement #2:
            // Resolve the view path internally, if possible. This avoids the
            // slower method of view path resolution used by the VirtualPathProviderViewEngine
            // ========================================================
            controller = controllerContext.RouteData.GetRequiredString("controller");
            foreach (string fmt in base.ViewLocationFormats)
            {
                var path = string.Format(fmt, viewName, controller);
                var path2 = controllerContext.HttpContext.Request.MapPath(path);
                if (File.Exists(path2))
                    return base.FindView(controllerContext, path, masterName, useCache);
            }

            return base.FindView(controllerContext, viewName, masterName, useCache);
        }

        private static string FormatViewName(ControllerContext controllerContext, string viewName, bool isPartial)
        {
            string controllerName = controllerContext.RouteData.GetRequiredString("controller");
            string area = controllerContext.RouteData.Values["area"].ToString();
            return "~/Areas/" + area + "/Views/" + controllerName + "/" + viewName + (isPartial ? ".ascx" : ".aspx");
        }

        private static string FormatSharedViewName(ControllerContext controllerContext, string viewName, bool isPartial)
        {
            string area = controllerContext.RouteData.Values["area"].ToString();
            return "~/Areas/" + area + "/Views/Shared/" + viewName + (isPartial ? ".ascx" : ".aspx");
        }
    }
}

Attempt #8

Posted in Uncategorized by Chris on June 3, 2009

I like blogs. I read them every day. I often muse about keeping my own and on occasion, I actually try. For one reason or another, though, my attempts at joining the blogosphere have generally ended in failure. I simply can’t keep up after the first few articles. Sometimes I lose interest, sometimes I get busy, and sometimes, I just plain run out of things to write. Blogging isn’t easy for everyone, you know.

With that in mind, I’m going to try this again. Throughout my work week I invariably reference several mainstream programming blogs and a few not-so-work related as well, and a well written article often leaves me wanting to contribute rather than merely consume. It’s not like I don’t have things to write about, either. My recent adoption of the ASP.NET MVC framework and LINQ to SQL at work has led me into many dark places where documentation is scant and help is less than readily available. I’ve undertaken many experiments and gained much knowledge that should be shared. It wants to be shared. Let this be my attempt to share it (along with other random less relevant musings).
