The H Half-Hour: Cache Back
Greg Luck and the return of JSR 107
by Dj Walker-Morgan
The Java Community Process sees expert groups form and propose Java Specification Requests (JSRs) to create new standards for the Java language and API ecosystem. Some JSRs move forward with great speed, but others just drift off, abandoned by the community and the process. JSR 107 was one of the latter: proposed in 2001 and created by Oracle, it aimed to standardise how Java applications temporarily cached data. Now, more than ever, caching is a vital element in helping applications scale. JSR 107 got past a review ballot in 2001 but never progressed any further in the process beyond seeing various drafts up to 2005. But, over ten years after its creation, JSR 107 is back and being actively developed. We talked to Greg Luck, creator of Ehcache and a specification lead for the re-energised "107".
The H: What changed to bring JSR 107 back from the dead zone of the JCP?
Greg Luck: JSR 107 was originally created by Oracle, but work on it stalled. About three years ago, while I was looking for funding to work on the spec, I was commissioned to create an Ehcache implementation of the draft, and that is how I got involved. Ehcache was acquired by Terracotta two and a half years ago, and we have since been busy with eight major releases, which aimed to add more enterprise and scale features to Ehcache.
By March 2011, the user interest in having a standard had grown significantly, and I finally had the bandwidth to pick my work on the spec back up. Additionally, the JCP were in talks to kill off any inactive specs. After I spoke with Cameron Purdy, Vice President of Development at Oracle, we both committed our resources to get it completed. We have been working on it ever since, putting out four versions of the spec to Maven central, so far.
The H: Does what's in JSR 107 now bear any relationship to what was in 2001's JSR 107?
GL: Yes, it is the same idea.
What is new is that there is now a mature caching industry which we need to take into account. There are the open source in-process caches, then distributed caches. The latter are based around a variety of architectures and support a range of features. Examples of enterprise features not supported by all include search, transactions, tunable consistency, and multi data centre support.
So in designing a useful spec we need to make sure it works with the different architectures and feature sets. Given that we are aimed at enterprise use with inclusion in Java EE 7 and Spring, we want to make sure we add some common enterprise features and more importantly don't get in the way of those features being used while using 107.
The H: How is 107 progressing? From the mailing list, we get the impression that there's still plenty to shake down before we get close to a final draft?
GL: We have a complete API, which is now at release 0.4. As the number of people reviewing the API increases, we are going back over parts of it and re-discussing them. This is proving to be a lengthy process because each of the caching companies has 10 years of precedent behind it.
So, for example, how should read-through work? Should it just be read-through on get() calls? Should the cache in essence act like a proxy to the thing it is caching? Or should there be some in-between set of calls that read-through? Well, Ehcache and Infinispan do it the first way, as does the 0.4 version of the spec.
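The first option, read-through on get() calls only, can be illustrated with a minimal sketch. This is not the JSR 107 API or the Ehcache or Infinispan implementation: the ReadThroughCache class, its loader function, and the loadCount() helper are hypothetical names chosen for illustration, and the sketch is deliberately single-threaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ReadThroughSketch {

    /** Hypothetical cache in which only get() reads through to a loader. */
    static class ReadThroughCache<K, V> {
        private final Map<K, V> store = new HashMap<>();
        private final Function<K, V> loader; // stands in for the backing system
        private int loads = 0;

        ReadThroughCache(Function<K, V> loader) {
            this.loader = loader;
        }

        /** On a miss, ask the loader for the value and remember it. */
        V get(K key) {
            V value = store.get(key);
            if (value == null) {
                value = loader.apply(key);
                loads++;
                if (value != null) {
                    store.put(key, value);
                }
            }
            return value;
        }

        /** Does not read through: it only reports what is already cached. */
        boolean containsKey(K key) {
            return store.containsKey(key);
        }

        /** Number of times the loader was consulted. */
        int loadCount() {
            return loads;
        }
    }

    public static void main(String[] args) {
        // Assumption for the demo: the "backing system" just returns the key length.
        ReadThroughCache<String, Integer> cache = new ReadThroughCache<>(String::length);

        System.out.println(cache.containsKey("ehcache")); // false: nothing loaded yet
        System.out.println(cache.get("ehcache"));         // 7: loaded on the miss
        System.out.println(cache.get("ehcache"));         // 7: served from the cache
        System.out.println(cache.loadCount());            // 1: the loader ran once
    }
}
```

In the proxy-style alternative, calls such as containsKey() would also consult the loader, so the cache would appear to hold everything the backing system holds.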
So, to answer your question, yes there is plenty of discussion still to shake down, but it is driven by review of a feature complete specification.
We will be submitting an early draft to the JCP in February 2012. We submitted an early draft last year, but it was delayed over our intent to license the TCK under Apache 2. On a side note, Terracotta and Oracle are finalising a TCK license, which will be done in time for the resubmission.
The H: When JSR 107 is finalised, who's already committed to implementing it?
GL: The list is:
- Terracotta – Ehcache
- Oracle – Coherence
- JBoss – Infinispan
- IBM – eXtreme Scale
- SpringSource – GemFire
- GridGain
- TMax
- Google App Engine Java [1]
[1] Google are already using a version based on the older draft spec.
We also hope that spymemcached will implement the core feature set for use with memcached, and that some of the NoSQL solutions will implement it for their caching use cases.
The H: And is there anyone you want to get on board who currently isn't?
GL: In terms of implementations, I think we are in pretty good shape. Personally, I would like to see IBM, who is a member of the expert group, contribute in addition to SpringSource.
While IBM did indicate it will implement the spec, SpringSource publicly stated in its Spring 3.1 announcement that it would implement the spec, in addition to adding the JSR 107 caching annotations to Spring alongside its own.
The H: Will JSR 107 be a complementary API for cache providers or is there a commitment to adopt JSR 107 as the core API for caching?
GL: At this point, we are all still figuring it out. We already have an ehcache-jcache module which implements the 0.4 version of the spec and wraps Ehcache – which we will keep for older versions of Ehcache. However, the spec has some nice optimisations for caches that are distributed, which we cannot get the benefit of unless we expose 107 natively. In a subsequent Ehcache version, we will probably introduce a native 107 API, which does not wrap the existing Ehcache API. Of course we will also include our Ehcache API in there as well and provide access to it for 107 caches via the unwrap method that is in the spec.
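The idea of reaching a provider's own API through an unwrap method can be sketched as follows. This is an illustration only: StandardCache and VendorCache are hypothetical stand-ins, not the JSR 107 interfaces or the Ehcache classes, and the unwrap signature shown here is an assumption that may differ from the one in the 0.4 spec.

```java
import java.util.HashMap;
import java.util.Map;

public class UnwrapSketch {

    /** Hypothetical stand-in for a vendor-neutral cache interface. */
    interface StandardCache<K, V> {
        V get(K key);

        void put(K key, V value);

        /** Assumed shape of unwrap: hand back the provider's own type on request. */
        <T> T unwrap(Class<T> type);
    }

    /** Hypothetical provider cache that offers more than the standard interface. */
    static class VendorCache<K, V> implements StandardCache<K, V> {
        private final Map<K, V> store = new HashMap<>();

        @Override
        public V get(K key) {
            return store.get(key);
        }

        @Override
        public void put(K key, V value) {
            store.put(key, value);
        }

        /** Vendor-only feature that the standard interface does not expose. */
        int size() {
            return store.size();
        }

        @Override
        public <T> T unwrap(Class<T> type) {
            if (type.isInstance(this)) {
                return type.cast(this);
            }
            throw new IllegalArgumentException("Unsupported type: " + type.getName());
        }
    }

    public static void main(String[] args) {
        // Application code is written against the standard interface...
        StandardCache<String, String> cache = new VendorCache<>();
        cache.put("spec", "107");

        // ...and drops down to the provider API only where it needs to.
        VendorCache<?, ?> vendor = cache.unwrap(VendorCache.class);
        System.out.println(vendor.size()); // 1
    }
}
```

The design choice is that portable code depends only on the standard interface, while code that needs a provider-specific feature asks for the provider type explicitly and fails loudly if a different provider is in use.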
I am talking to Manik Surtani from JBoss and I think he has similar ideas.
The H: How are you finding the JCP as a process? I note you are using GitHub as a transparent collaboration tool; how’s that working out?
GL: We are deliberately running our JSR as close to the conventions for an open source project as possible. It is completely transparent; you see the code and the spec as a work in progress with regular releases. GitHub has been brilliant for that. Also, we have been using CloudBees for running our builds and Sonatype for hosting our Maven artifacts along with syncing them with Maven central. In my opinion, GitHub has encouraged a lot more contribution, which is always fantastic. The only issue is that we cannot accept pull requests from people who are not on the JSR, as that would create imperfections in the IP.
The JCP has been alright; however, I do find it very difficult to figure out what is acceptable. Because Oracle is a co-spec lead, we also need to do things the Oracle way, which is a more stringent standard than that of the JCP, which is a little more permissive.
In the end, we went down and spent a half day with the JCP to get all of our questions answered. My other challenge is that Oracle wears two hats: one as the JCP Office and the other as the commercial company. I think proper separation of the JCP from Oracle would be helpful.