Sunday, December 12, 2010

Killer Java applications server with nginx and memcached

Over the last few days I worked on setting up a new web serving structure for Wine, the largest wine e-commerce site in Latin America. After a lot of testing, studying and learning, we built a nice solution based on nginx and memcached. I will use a picture to describe the architecture:

As you can see, when a client makes a request to the nginx server, nginx first checks memcached to see whether the response is already cached. If the response is not found in the cache, nginx forwards the request to Tomcat, which processes the request, caches the response in memcached and returns it to nginx. Tomcat works only for the first client; all other clients requesting the same resource get the cached response straight from RAM, until the cached entry expires. My objective with this post is to show how we built this architecture.
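
To make the flow concrete, here is a minimal sketch of the lookup logic in plain Java. It is only an illustration: the class name CacheFlowSketch, the in-memory map standing in for memcached and the function standing in for Tomcat are all assumptions of mine, not part of the real setup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CacheFlowSketch {
    // Stand-in for memcached: a plain in-memory map (hypothetical, for illustration only).
    private final Map<String, String> cache = new HashMap<>();
    // Stand-in for Tomcat: any function that renders a response body for a URI.
    private final Function<String, String> backend;
    private int backendCalls = 0;

    public CacheFlowSketch(Function<String, String> backend) {
        this.backend = backend;
    }

    // Try the cache first and fall back to the backend on a miss.
    public String handle(String uri) {
        String cached = cache.get(uri);
        if (cached != null) {
            return cached; // cache hit: the backend is never touched
        }
        backendCalls++;
        String body = backend.apply(uri);
        cache.put(uri, body); // in the real setup the application does this write, not nginx
        return body;
    }

    public int getBackendCalls() {
        return backendCalls;
    }

    public static void main(String[] args) {
        CacheFlowSketch server = new CacheFlowSketch(uri -> "<h1>" + uri + "</h1>");
        server.handle("/users");
        server.handle("/users");
        server.handle("/users");
        // Three requests for the same URI reach the backend only once.
        System.out.println("backend calls: " + server.getBackendCalls());
    }
}
```

The sketch ignores expiration and request methods; it only shows why the backend does the work once per resource.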

nginx

nginx was compiled following the Linode instructions for installing nginx from source. The only difference is that we added the nginx memcached module: I first downloaded the memc_module source from Github and then built nginx with it. Here are the commands for compiling nginx with the memcached module:
% ./configure --prefix=/opt/nginx --user=nginx --group=nginx --with-http_ssl_module --add-module={your memc_module source path}
% make
% sudo make install
After installing nginx and creating an init script for it, we can work on its settings for the integration with Tomcat. To keep the site settings in separate files, we changed the nginx.conf file (located in the /opt/nginx/conf directory), and it now looks like this:
user  nginx;
worker_processes  1;

error_log  logs/error.log;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                                '$status $body_bytes_sent "$http_referer" '
                                '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    include /opt/nginx/sites-enabled/*;
}
See the last line inside the http section: it tells nginx to include all settings present in the /opt/nginx/sites-enabled directory. So now let’s create a default file in this directory, with this content:
server {
    listen       80;
    server_name  localhost;

    default_type  text/html;

    location / {
        proxy_set_header    X-Real-IP   $remote_addr;
        proxy_set_header    Host        $http_host;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;

        if ($request_method = POST) {
            proxy_pass      http://localhost:8080;
            break;
        }

        set $memcached_key   "$uri";
        memcached_pass      127.0.0.1:11211;

        error_page  501 404 502 = /fallback$uri;
    }

    location /fallback/ {
        internal;    

        proxy_set_header    X-Real-IP   $remote_addr;
        proxy_set_header    Host        $http_host;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_redirect      off;

        proxy_pass          http://localhost:8080;
    }

}
A few things must be explained here. The default_type directive is necessary for serving cached responses properly (if you cache other content types, like application/json or application/xml, you should take a look at the nginx documentation and deal with content types conditionally). The location / block defines some proxy headers, like the client IP and the host. We set them because we need to pass the right information to our backend (Tomcat or memcached). See more about proxy_set_header in the nginx documentation. After that, there is a simple check of the request method: we don’t want to cache POST requests, so they go straight to Tomcat.

Now we get to the magic: first we set $memcached_key, which is the request URI, and then we use the memcached_pass directive. memcached_pass is very similar to proxy_pass: nginx “proxies” the request to memcached, so the lookup ends in an HTTP status code like 200, 404 or 502. The error_page directive sends 501, 404 and 502 to the fallback; the two that matter most here are:
  • 404: the memcached module returns a 404 error when the key is not on the memcached server;
  • 502: the memcached module returns a 502 error when it can’t reach the memcached server.
So, when nginx gets any of those errors, it forwards the request to Tomcat through another proxy. We configured this in fallback, an internal location that proxies between nginx and Tomcat (listening on port 8080). That is everything on the nginx side. As you can see in the picture or in the nginx configuration file, nginx doesn’t write anything to memcached; it only reads from it. The application is responsible for writing to memcached. Let’s do it.
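
Because nginx only reads from memcached, the application has to store each page under exactly the key nginx will look up. nginx’s $uri does not include the query string, and neither does the getRequestURI() method used by the filter later in this post, so both sides agree on plain paths. The sketch below shows that derivation with java.net.URI; the class and the keyFor helper are hypothetical names used only for illustration.

```java
import java.net.URI;

public class CacheKeySketch {
    // Hypothetical helper: builds the cache key from the request path only.
    // The query string is deliberately not part of the key.
    static String keyFor(String url) {
        return URI.create(url).getRawPath();
    }

    public static void main(String[] args) {
        System.out.println(keyFor("http://localhost/users"));        // /users
        System.out.println(keyFor("http://localhost/users?page=2")); // /users
    }
}
```

One consequence worth keeping in mind: pages that differ only by query string share a single cache entry. Also, $uri is normalized and decoded by nginx while getRequestURI() is not decoded, so paths with encoded characters may not match on both sides.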

Java application

Now it is time to write some code. I chose an application written by a friend of mine: a very simple users CRUD, built by Washington Botelho with the goal of introducing VRaptor, a powerful web framework focused on fast development. Washington also wrote a blog post explaining the application; if you don’t know VRaptor or want to know how the application was built, check the blog post "Getting started with VRaptor 3". I forked the application, made some minor changes and added a magic filter for caching. The only Java code I want to show here is the filter code:
package com.franciscosouza.memcached.filter;

import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.net.InetSocketAddress;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

import net.spy.memcached.MemcachedClient;

/**
 * Servlet Filter implementation class MemcachedFilter
 */
public class MemcachedFilter implements Filter {

    private MemcachedClient mmc;

    static class MemcachedHttpServletResponseWrapper extends HttpServletResponseWrapper {

        private StringWriter sw = new StringWriter();

        public MemcachedHttpServletResponseWrapper(HttpServletResponse response) {
            super(response);
        }

        public PrintWriter getWriter() throws IOException {
            return new PrintWriter(sw);
        }

        public ServletOutputStream getOutputStream() throws IOException {
            throw new UnsupportedOperationException();
        }

        public String toString() {
            return sw.toString();
        }
    }

    /**
     * Default constructor.
     */
    public MemcachedFilter() {
    }

    /**
     * @see Filter#destroy()
     */
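    public void destroy() {
        // Release the memcached connection when the filter is taken out of service.
        if (mmc != null) {
            mmc.shutdown();
        }
    }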

    /**
     * @see Filter#doFilter(ServletRequest, ServletResponse, FilterChain)
     */
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        MemcachedHttpServletResponseWrapper wrapper = new MemcachedHttpServletResponseWrapper((HttpServletResponse) response);
        chain.doFilter(request, wrapper);

        HttpServletRequest inRequest = (HttpServletRequest) request;
        HttpServletResponse inResponse = (HttpServletResponse) response;

        String content = wrapper.toString();

        PrintWriter out = inResponse.getWriter();
        out.print(content);

        if (!inRequest.getMethod().equals("POST")) {
            String key = inRequest.getRequestURI();
            mmc.set(key, 5, content);
        }
    }

    /**
     * @see Filter#init(FilterConfig)
     */
    public void init(FilterConfig fConfig) throws ServletException {
        try {
            mmc = new MemcachedClient(new InetSocketAddress("localhost", 11211));
        } catch (IOException e) {
            e.printStackTrace();
            throw new ServletException(e);
        }
    }
}
First, the dependency: for memcached communication, we used the spymemcached client, a simple and easy-to-use memcached library. I won’t explain the code line by line, but I can describe the idea behind it: first, we call the doFilter method on the FilterChain, because we want to get the response and work with it. Take a look at the MemcachedHttpServletResponseWrapper instance: it encapsulates the response and makes it easier to play with the response content.
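
The capture trick used by the wrapper can be shown without any servlet classes. In this minimal sketch the render method is a hypothetical stand-in for the framework writing a page, and the class name is my own; the point is only that the page is written into a StringWriter first and then replayed to the real output.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ResponseCaptureSketch {
    // Hypothetical stand-in for the framework rendering a page into the writer it is given.
    static void render(PrintWriter out) {
        out.print("<h1>Users</h1>");
        out.flush();
    }

    // Renders into a buffer first, then replays the buffered text to the real output.
    static String captureAndReplay(PrintWriter realOut) {
        StringWriter buffer = new StringWriter();
        render(new PrintWriter(buffer));
        String content = buffer.toString();
        realOut.print(content);
        realOut.flush();
        return content; // this is the string the filter would hand to memcached
    }

    public static void main(String[] args) {
        String content = captureAndReplay(new PrintWriter(System.out));
        System.out.println();
        System.out.println("captured " + content.length() + " characters");
    }
}
```

Like the filter above, this only captures text written through a writer; binary output written through an output stream is out of scope.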

We get the content, write it to the response writer and put it in the cache using the MemcachedClient provided by spymemcached. The request URI is the key and the expiration time is 5 seconds.
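
To illustrate what the 5-second expiration means for nginx, here is a small in-memory sketch of a cache with a time to live. It is not how memcached is implemented; the class name is hypothetical, and the current time is passed in explicitly just to keep the example deterministic. The argument order of set mirrors the set(key, expiration, value) call used in the filter.

```java
import java.util.HashMap;
import java.util.Map;

public class ExpiringCacheSketch {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Stores a value that is considered gone once ttlSeconds have passed.
    public void set(String key, int ttlSeconds, String value, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + ttlSeconds * 1000L));
    }

    // Returns the value, or null when the key is missing or expired
    // (the case where nginx would get a 404 and fall back to Tomcat).
    public String get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null || nowMillis >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        ExpiringCacheSketch cache = new ExpiringCacheSketch();
        cache.set("/users", 5, "<h1>Users</h1>", 0L);
        System.out.println(cache.get("/users", 4000L)); // still cached after 4 seconds
        System.out.println(cache.get("/users", 5000L)); // null: expired after 5 seconds
    }
}
```

With a short expiration like this, Tomcat renders each page again at most once every 5 seconds, no matter how many clients ask for it.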

web.xml

The last step is to add the filter to the project’s web.xml file. Mapping it before the VRaptor filter is very important for it to work properly:
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
    <display-name>memcached sample</display-name>

    <filter>
        <filter-name>vraptor</filter-name>
        <filter-class>br.com.caelum.vraptor.VRaptor</filter-class>
    </filter>
    
    <filter>
        <filter-name>memcached</filter-name>
        <filter-class>com.franciscosouza.memcached.filter.MemcachedFilter</filter-class>
    </filter>
    
    <filter-mapping>
        <filter-name>memcached</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

    <filter-mapping>
        <filter-name>vraptor</filter-name>
        <url-pattern>/*</url-pattern>
        <dispatcher>FORWARD</dispatcher>
        <dispatcher>REQUEST</dispatcher>
    </filter-mapping>

</web-app>
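
Why does the mapping order matter? Filters mapped by URL pattern run in the order their mappings are declared, each one wrapping the ones after it. The sketch below simulates that with plain Java. The Step interface is a simplified, hypothetical stand-in for a servlet filter, and it assumes the framework step writes the page itself without passing the request further down the chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterOrderSketch {
    // Simplified stand-in for javax.servlet.Filter (hypothetical, stdlib only).
    interface Step {
        void run(StringBuilder out, Runnable next);
    }

    // Invokes the steps in declaration order; each step decides whether to call the next one.
    static void invoke(List<Step> steps, int index, StringBuilder out) {
        if (index < steps.size()) {
            steps.get(index).run(out, () -> invoke(steps, index + 1, out));
        }
    }

    // Returns what the caching step managed to capture for the given order.
    static List<String> captured(boolean cachingFirst) {
        List<String> cache = new ArrayList<>();
        // Caching step: lets the rest of the chain run, then stores the output.
        Step caching = (out, next) -> {
            next.run();
            cache.add(out.toString());
        };
        // Framework step: writes the page and, by assumption, ends the chain here.
        Step framework = (out, next) -> out.append("<html>users</html>");
        List<Step> steps = cachingFirst
                ? Arrays.asList(caching, framework)
                : Arrays.asList(framework, caching);
        invoke(steps, 0, new StringBuilder());
        return cache;
    }

    public static void main(String[] args) {
        System.out.println("caching first:   " + captured(true));
        System.out.println("framework first: " + captured(false));
    }
}
```

When the caching step comes first it sees the rendered page; when it comes second, under the assumption above, it never runs and nothing is cached.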
That is it! Now you can just run Tomcat on port 8080 and nginx on port 80, and access http://localhost in your browser. Try it out: raise the cache timeout, navigate through the application and turn off Tomcat. You will still be able to navigate through some pages that use the GET request method (users list, home and users form), as long as they are still in the cache.

Check the entire code out on Github: https://github.com/fsouza/starting-with-vraptor-3. If you have any questions, troubles or comments, please let me know! ;)