Monday, May 21, 2012

Apache HTTP Commons and Cookie support for REST Services

Apache HTTP Commons (HttpClient) is a framework for managing client-side HTTP connections. Compared to standard Java sockets, it offers an easier way to handle persistent connections and pooling, plus a higher-level API. It also supports cookies, and this is the subject I would like to concentrate on now, especially in a REST context.

This is the typical way most Java applications use the HTTP client:
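import java.net.URI;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;

public class RestClient {
    static HttpClient httpClient = null; // single instance shared by all threads
    static {
        ThreadSafeClientConnManager cm = new ThreadSafeClientConnManager();
        httpClient = new DefaultHttpClient(cm);
    }

    public HttpResponse executeGet(URI url) throws Exception {
        return httpClient.execute(new HttpGet(url));
    }
}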
In most cases this is also the preferred way - HttpClient is thread safe, and keeping a single instance saves resources and improves performance.

But there is a catch if you have a multi-threaded server application that shares a single instance of HttpClient and also has to manage cookies.
Multiple threads use the single HttpClient to connect to different URLs at the same time - HttpClient is thread safe, so this is allowed - but have you checked how DefaultHttpClient manages cookies? It simulates a browser, but in our case this browser is being used simultaneously by several users...

HttpClient executes a call against a remote REST service and receives a cookie in the response. The cookie is valid for a given domain and path, and our request was executed against a concrete URL, so everything looks normal. When we use the same HttpClient instance to execute another call on the same URL, it will send back the cookies that were returned by the previous response. If cookies were issued for a whole domain (covering its sub-domains) or for the root path, they will also be sent with every matching request. This is how a normal browser would behave. And here is the catch! Our HttpClient behaves like a browser, but we are using it in a multi-threaded environment, where we communicate with many different services in different scopes. This can have advantages, but it can also be a source of serious problems.

Let's consider a few scenarios. As an example we will use a small application that executes calls for different users, each identified by an integer ID (part of the URI).

Scenario A - cookies with root path
HttpClient sends a request to a URL that contains a User ID; in response it receives some data and also a cookie:

Set-Cookie: USER_ID=234234; Expires=Thu, 01 Jan 2049 00:00:01 GMT; Path=/;

The request above stored the cookie in the HttpClient instance - in its CookieStore, to be exact. Some time later the same HttpClient instance is used to execute another request, this time with a different User ID in the URL. Since the domain and path of this request match the cookie, the request will carry the cookie that was set by the previous call.
The question is: did you really want to send this cookie back with this request (different User ID in the URL)?
Actually we have just used a standard HttpClient, and the rest was caused by the service that we called - it set the cookie on the wrong path, one that does not contain the User ID. But still, this might be an issue, depending on how the cookie is interpreted - it might be used as the User ID instead of the ID from the URL (nasty, but hey - it's REST - there are no rules).
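The matching rule behind Scenario A can be tried out without any network call. The sketch below does not use HttpClient's CookieStore; it swaps in the JDK's own java.net.CookieManager, which applies a comparable domain and path match, purely to illustrate the effect. The host rest.example.com, both user URLs, and the class and method names are assumptions made up for this example, and the Expires attribute from the header above is left out for brevity.

```java
import java.net.CookieManager;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SharedCookieJarDemo {

    // Hypothetical host and user URLs, for illustration only.
    static final String FIRST_USER = "http://rest.example.com/users/234234";
    static final String SECOND_USER = "http://rest.example.com/users/555";

    // Stores a USER_ID cookie from the first user's response, using the given
    // Path attribute, and returns the Cookie header values that the same jar
    // would attach to a request made for the second user.
    static List<String> cookiesForSecondUser(String cookiePath) throws Exception {
        CookieManager jar = new CookieManager(); // one jar shared by all callers

        Map<String, List<String>> responseHeaders = Collections.singletonMap(
                "Set-Cookie", Arrays.asList("USER_ID=234234; Path=" + cookiePath));
        jar.put(URI.create(FIRST_USER), responseHeaders);

        Map<String, List<String>> noRequestHeaders = Collections.emptyMap();
        List<String> cookies = jar.get(URI.create(SECOND_USER), noRequestHeaders).get("Cookie");
        return cookies == null ? Collections.<String>emptyList() : cookies;
    }

    public static void main(String[] args) throws Exception {
        // Root path: the first user's cookie leaks into the second user's request.
        System.out.println(cookiesForSecondUser("/"));             // [USER_ID=234234]
        // Path scoped to the first user: nothing is sent for the second user.
        System.out.println(cookiesForSecondUser("/users/234234")); // []
    }
}
```

With Path=/ the cookie from the first call travels with the request for the other user; with a path that includes the User ID it stays where it belongs.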

Scenario B - identical URL for different contexts (users)
We call a REST service that uses the same URL for all methods; User IDs are part of the PUT/POST request body. In this case our multi-threaded application will use the same URL to execute calls for different users, and HttpClient will send back all stored cookies with each request! The message body contains a different User ID, but this is not relevant for the CookieStore - only the URL matters. Now imagine what happens when the REST interface tries to use a cookie to open an HTTP session - yes - we would mix up sessions.

Using a single HttpClient is in general a good idea, but we have to be careful when it comes to cookies. Most REST services do not use them, and in that case we should simply disable cookie support, just to be sure.
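One way to switch cookies off in HttpClient 4.1 or later is to set the cookie policy to CookiePolicy.IGNORE_COOKIES on the client's parameters. The following is a minimal sketch, assuming the same DefaultHttpClient and ThreadSafeClientConnManager setup as in the RestClient example; the class and method names (CookieFreeRestClient, newCookieFreeClient) are made up for illustration.

```java
import org.apache.http.client.HttpClient;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.params.CookiePolicy;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;

class CookieFreeRestClient {

    // Hypothetical factory: builds a shareable client that ignores cookies.
    static HttpClient newCookieFreeClient() {
        DefaultHttpClient client = new DefaultHttpClient(new ThreadSafeClientConnManager());
        // With this policy the client neither stores cookies from responses
        // nor adds Cookie headers to outgoing requests.
        client.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.IGNORE_COOKIES);
        return client;
    }
}
```

The client created this way can be shared between threads just like before; the only difference is that Set-Cookie headers from any service are ignored.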

We can also create a new instance of HttpClient for each request (or for each conversational state). This is just fine, as long as you use one shared ThreadSafeClientConnManager instance for all of the new HttpClient instances.
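A minimal sketch of that idea, assuming HttpClient 4.1 or later: every conversation gets its own DefaultHttpClient, and with it its own cookie store, while all clients borrow connections from one pool. The names ConversationClients, SHARED_POOL and newConversationClient are hypothetical.

```java
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;

class ConversationClients {

    // One connection pool for the whole application.
    private static final ThreadSafeClientConnManager SHARED_POOL =
            new ThreadSafeClientConnManager();

    // Hypothetical factory: a new client per conversation. Each client lazily
    // creates its own cookie store, so cookies are not shared between
    // conversations, while connections still come from the shared pool.
    static DefaultHttpClient newConversationClient() {
        return new DefaultHttpClient(SHARED_POOL);
    }
}
```

One thing to watch with this approach: calling getConnectionManager().shutdown() on one of these clients would shut down the pool for all of them, so the shared manager should only be shut down when the application stops.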
