Killer Java applications server with nginx and memcached

Over the last few days I worked on setting up a new web-serving structure for Wine, the largest wine e-commerce company in Latin America. After a lot of testing, studying and learning, we built a nice solution based on nginx and memcached. I will use a picture to describe the architecture (sorry, I’m not so good with pictures =P):

nginx, tomcat and memcached

As you can see, when a client makes a request to the nginx server, nginx first checks memcached to see whether the response is already cached. If the response is not found on the cache server, nginx forwards the request to Tomcat, which processes the request, caches the response in memcached and returns it to nginx. Tomcat does the work only for the first client; all other clients requesting the same resource get the cached response straight from RAM until the cache entry expires. My goal with this post is to show how we built this architecture.
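To make the flow above concrete, here is a minimal sketch in plain Java. It is only a toy model, not part of the real setup: a map stands in for memcached, a function stands in for the application running on Tomcat, and the names (RequestFlowSketch, handle, backendCalls) are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Toy model of the request flow: the front end only reads the cache, the backend fills it. */
class RequestFlowSketch {
    // Stand-in for memcached (assumption: the real setup talks to a memcached server).
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the application running on Tomcat.
    private final Function<String, String> application;
    private int backendCalls = 0;

    RequestFlowSketch(Function<String, String> application) {
        this.application = application;
    }

    /** Plays the role of nginx: look in the cache first, forward to the backend on a miss. */
    String handle(String uri) {
        String cached = cache.get(uri);
        if (cached != null) {
            return cached;
        }
        return backend(uri);
    }

    /** Plays the role of Tomcat: build the page and store it in the cache before returning. */
    private String backend(String uri) {
        backendCalls++;
        String body = application.apply(uri);
        cache.put(uri, body);
        return body;
    }

    int backendCalls() {
        return backendCalls;
    }

    public static void main(String[] args) {
        RequestFlowSketch flow = new RequestFlowSketch(uri -> "<html>" + uri + "</html>");
        flow.handle("/users"); // miss: the backend builds and caches the page
        flow.handle("/users"); // hit: served from the cache
        System.out.println("backend calls: " + flow.backendCalls());
    }
}
```

Note that the read path never writes to the cache. That mirrors the split described in this post: nginx only reads, and the application is the one that stores responses.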

nginx

nginx was compiled following Linode’s instructions for installing nginx from source. The only difference is that we added the nginx memcached module. So, first I downloaded the memc_module source from Github and then built nginx with it. Here are the commands for compiling nginx with the memcached module:

$ ./configure --prefix=/opt/nginx --user=nginx --group=nginx --with-http_ssl_module --add-module={your memc_module source path}
$ make
$ sudo make install

After installing nginx and creating an init script for it, we can work on its settings for integration with Tomcat. To keep the site settings in separate files, we changed the nginx.conf file (located in the /opt/nginx/conf directory), and it now looks like this:

user  nginx;
worker_processes  1;

error_log  logs/error.log;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                                '$status $body_bytes_sent "$http_referer" '
                                '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    include /opt/nginx/sites-enabled/*;
}

See the last line of the http block: it tells nginx to include all the settings files present in the /opt/nginx/sites-enabled directory. So now let’s create a default file in this directory, with this content:

server {
    listen       80;
    server_name  localhost;

    default_type  text/html;

    location / {
        proxy_set_header    X-Real-IP   $remote_addr;
        proxy_set_header    Host        $http_host;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;

        if ($request_method = POST) {
            proxy_pass      http://localhost:8080;
            break;
        }

        set $memcached_key   "$uri";
        memcached_pass      127.0.0.1:11211;

        error_page  501 404 502 = /fallback$uri;
    }

    location /fallback/ {
        internal;

        proxy_set_header    X-Real-IP   $remote_addr;
        proxy_set_header    Host        $http_host;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_redirect      off;

        proxy_pass          http://localhost:8080;
    }

}

A few things must be explained here. The default_type directive is necessary for serving cached responses properly, because memcached stores only the response body and not its headers (if you cache other content types, like application/json or application/xml, you should take a look at the nginx documentation and handle content types conditionally). The location / block defines some proxy settings, like the IP and host headers. We do this because we need to pass the right information to our backend (Tomcat or memcached). See more about proxy_set_header in the nginx documentation. After that, there is a simple check on the request method: we don’t want to cache POST requests, so they go straight to Tomcat.

Now we get to the magic: first we set $memcached_key, which is the request URI, and then we use the memcached_pass directive. memcached_pass is very similar to proxy_pass: nginx “proxies” memcached to get the response, so the lookup ends with an HTTP status code such as 200, 404 or 502. The error_page line lists 501, 404 and 502; the two that matter here are:

  • 404: the memcached module returns a 404 error when the key is not on the memcached server
  • 502: the memcached module returns a 502 error when it can’t reach the memcached server (it is a bad gateway error, the same one you get if you start nginx without starting Tomcat ;D)

So, when nginx gets any of those errors, it should forward the request to Tomcat through another proxy. We configured this in fallback, an internal location that proxies between nginx and Tomcat (listening on port 8080). That is everything we need on the nginx side. As you can see in the picture or in the nginx configuration file, nginx doesn’t put anything in the cache; it only reads cached items. The application is responsible for putting everything in the cache. Let’s do it :)
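If it helps, the routing rules described above can be written down as a small decision function. This is only an illustration in plain Java of what the nginx configuration expresses, not code that runs anywhere in the stack; the class, enum and method names are invented for the sketch, and treating every other status as "serve what memcached returned" is a simplifying assumption.

```java
/** Sketch of the routing decisions expressed by the nginx configuration (illustrative only). */
class RoutingSketch {
    enum Target { TOMCAT, MEMCACHED_RESPONSE }

    /**
     * @param method          HTTP method of the incoming request
     * @param memcachedStatus status the memcached lookup would produce
     *                        (200 = key found, 404 = key missing, 502 = memcached unreachable)
     */
    static Target route(String method, int memcachedStatus) {
        // POST requests are never looked up in the cache; they go straight to Tomcat.
        if ("POST".equals(method)) {
            return Target.TOMCAT;
        }
        switch (memcachedStatus) {
            case 404: // key not in memcached
            case 502: // memcached server not reachable
            case 501: // also listed in the error_page line of the configuration
                return Target.TOMCAT; // handled by the internal fallback location
            default:
                return Target.MEMCACHED_RESPONSE; // assumption: anything else is served as-is
        }
    }

    public static void main(String[] args) {
        System.out.println(route("GET", 200));
        System.out.println(route("GET", 404));
        System.out.println(route("POST", 200));
    }
}
```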

Java application

Now it is time to write some code :) I chose an application written by a friend. It’s a very simple user CRUD, written by Washington Botelho with the goal of introducing VRaptor, a powerful web framework focused on fast development. Washington also wrote a blog post explaining the application; if you don’t know VRaptor or want to know how the application was built, check the blog post “Getting started with VRaptor 3”. I forked the application, made some minor changes and added a magic filter for caching. The only Java code I want to show here is the filter code:

package com.franciscosouza.memcached.filter;

import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.net.InetSocketAddress;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

import net.spy.memcached.MemcachedClient;

/**
 * Servlet Filter implementation class MemcachedFilter
 */

public class MemcachedFilter implements Filter {

    private MemcachedClient mmc;

    static class MemcachedHttpServletResponseWrapper extends HttpServletResponseWrapper {

        private StringWriter sw = new StringWriter();

        public MemcachedHttpServletResponseWrapper(HttpServletResponse response) {
            super(response);
        }

        public PrintWriter getWriter() throws IOException {
            return new PrintWriter(sw);
        }

        public ServletOutputStream getOutputStream() throws IOException {
            throw new UnsupportedOperationException();
        }

        public String toString() {
            return sw.toString();
        }
    }

    /**
     * Default constructor.
     */

    public MemcachedFilter() {
    }

    /**
     * @see Filter#destroy()
     */

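    public void destroy() {
        // Close the memcached connection when the filter is taken out of service.
        if (mmc != null) {
            mmc.shutdown();
        }
    }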

    /**
     * @see Filter#doFilter(ServletRequest, ServletResponse, FilterChain)
     */

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        // Let the rest of the chain (VRaptor) write into an in-memory wrapper instead of the real response.
        MemcachedHttpServletResponseWrapper wrapper = new MemcachedHttpServletResponseWrapper((HttpServletResponse) response);
        chain.doFilter(request, wrapper);

        HttpServletRequest inRequest = (HttpServletRequest) request;
        HttpServletResponse inResponse = (HttpServletResponse) response;

        // The wrapper's toString() returns everything that was written to it.
        String content = wrapper.toString();

        // Send the captured content to the client.
        PrintWriter out = inResponse.getWriter();
        out.print(content);

        // Cache everything except POST responses, keyed by the request URI, for 5 seconds.
        if (!inRequest.getMethod().equals("POST")) {
            String key = inRequest.getRequestURI();
            mmc.set(key, 5, content);
        }
    }

    /**
     * @see Filter#init(FilterConfig)
     */

    public void init(FilterConfig fConfig) throws ServletException {
        try {
            mmc = new MemcachedClient(new InetSocketAddress("localhost", 11211));
        } catch (IOException e) {
            e.printStackTrace();
            throw new ServletException(e);
        }
    }

}

First, the dependency: for memcached communication, we used the spymemcached client. It is a simple, easy-to-use memcached library. I won’t explain the code line by line, but I can explain the idea behind it: first, we call the doFilter method on the FilterChain, because we want to get the response and work with it. Look at the MemcachedHttpServletResponseWrapper object: it wraps the response and makes it easier to work with the response content.
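The capture trick used by the wrapper can be shown without any servlet classes. The sketch below is a simplified stand-in, not the filter itself: Renderer is a hypothetical interface playing the role of whatever writes the page, and capture is a made-up helper name.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Sketch of capturing written output in memory, the same idea the response wrapper relies on. */
class CaptureSketch {
    /** Hypothetical stand-in for the code that renders a page into a writer. */
    interface Renderer {
        void render(PrintWriter out);
    }

    /** Runs the renderer against an in-memory buffer and returns everything it wrote. */
    static String capture(Renderer renderer) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        renderer.render(writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String content = capture(out -> out.print("<h1>Users</h1>"));
        // The captured text can now be sent to the client and also stored in the cache.
        System.out.println(content);
    }
}
```

Once the content is available as a plain String, it can be written to the real response and handed to the cache client in the same place.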

We get the content, write it to the response writer and put it in the cache using the MemcachedClient provided by spymemcached. The request URI is the key and the expiration time is 5 seconds.
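To get a feeling for what a 5 second expiration means for readers of the cache, here is a rough in-memory sketch of "set with a time to live". It is not how memcached or spymemcached work internally; ExpiringCacheSketch and its injected clock are assumptions made so the behaviour can be tried without a memcached server and without sleeping.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Rough in-memory model of a cache whose entries expire after a time to live (illustrative only). */
class ExpiringCacheSketch {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final LongSupplier clockMillis; // injected so tests can control time

    ExpiringCacheSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Stores a value that stays readable for ttlSeconds. */
    void set(String key, int ttlSeconds, String value) {
        entries.put(key, new Entry(value, clockMillis.getAsLong() + ttlSeconds * 1000L));
    }

    /** Returns the value, or null when the key is missing or has expired. */
    String get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clockMillis.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock, in milliseconds
        ExpiringCacheSketch cache = new ExpiringCacheSketch(() -> now[0]);
        cache.set("/users", 5, "<html>users</html>");
        now[0] = 4_000L;
        System.out.println(cache.get("/users")); // still cached
        now[0] = 5_000L;
        System.out.println(cache.get("/users")); // expired
    }
}
```

In the setup from this post, an expired key simply means the next request gets a 404 from memcached and falls back to Tomcat, which stores a fresh copy.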

web.xml

The last step is to add the filter to the project’s web.xml file. Mapping it before the VRaptor filter is very important for it to work properly:

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
    <display-name>memcached sample</display-name>

    <filter>
        <filter-name>vraptor</filter-name>
        <filter-class>br.com.caelum.vraptor.VRaptor</filter-class>
    </filter>
   
    <filter>
        <filter-name>memcached</filter-name>
        <filter-class>com.franciscosouza.memcached.filter.MemcachedFilter</filter-class>
    </filter>
   
    <filter-mapping>
        <filter-name>memcached</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

    <filter-mapping>
        <filter-name>vraptor</filter-name>
        <url-pattern>/*</url-pattern>
        <dispatcher>FORWARD</dispatcher>
        <dispatcher>REQUEST</dispatcher>
    </filter-mapping>

</web-app>

That is it! Now you can just run Tomcat on port 8080 and nginx on port 80, and access http://localhost in your browser. Try it out: raise the cache timeout, navigate through the application and turn off Tomcat. You will still be able to open the pages you already visited that use the GET request method (users list, home and users form).

Check the entire code out on Github: https://github.com/fsouza/starting-with-vraptor-3. If you have any questions, troubles or comments, please let me know! ;)

20 thoughts on “Killer Java applications server with nginx and memcached”

  1. Nice article – clear, clean and simple! I like that you access memcached directly from nginx instead of having Java code to check if an item is cached.

  2. “I spent last days setting …”

    I see a lot of misusage of ‘last’ and ‘since’ in non-native English speakers.

    In this post, you would have wanted to start it as, “I spent the last x days setting …”

    Great article, regardless. Have a great Christmas!

  3. Isn’t EHCache (or another Java-based cache) a better choice than memcached for Java apps? I’ve seen some benchmarks that showed EHCache performs better (too lazy to find them now). Did you review Java caching options as well?

  5. In your doFilter, I would check for POST first. If it is then you don’t need to use your wrapper response and can just call chain.doFilter. This avoids the overhead of writing the response to an intermediate string and then writing the response again in POSTs.

    Check out TeeOutputStream in Commons IO. This would allow the response to be streamed directly back to the browser and to your temporary buffer to write to memcached. Should be a smoother experience all round.

  6. How many QPS (queries per second) did you achieve with this architecture?

    I am using Amazon EC2 instance with following configuration.
    7 GB of memory
    20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
    1690 GB of instance storage
    OS: Linux, 64-bit platform

    With your architecture, I am not able to reach more than 600 qps.

    I tried by changing worker_processes to 5, worker_connections to 1000 and on Memcache side I changed MAXCONN to 20000.
    But still it’s not going above 600 qps.

    If I instead change my Java code and let it handle setting and getting data from memcache (instead of nginx), I am able to reach 900 qps.

  8. …just stumbled over your article while being redirected from page about page regarding the nginx topic – actually i forgot how i came initially about to start browsing on nginx at all :-) but just wanted to say thanks for the shared information. great read!

    • Hi Simone,
      you’d need to invalidate the cache from the application, or use another plugin for memcached invalidation. It is not related to nginx’s cache, but memcached itself.

      You can use the memc-nginx-module, which provides delete, so you can have your purge url that purges from memcached instead of nginx’s cache.

      Regards,
      Francisco
