QCon: Open Terracotta

Cool ... but in a good way!

There have been numerous presentations at QCon that I've taken things from; presentations I've thought it might be worth distilling and replaying to others. However, Ari Zilka's talk on Open Terracotta got me excited (and I know I'm not alone) - possibly because I might be able to realise and demonstrate some pretty clear benefit without having to do much work at all. Ideal!

Open Terracotta (www.terracotta.org) is essentially an open source clustering solution for Java. However, rather than presenting an API for sharing data between nodes, Terracotta works transparently against the heap. This makes it more like sharing memory between VMs, and that sounds a lot like a silver bullet for many clustering requirements.

Ok, it won't solve everything as there are problems inherent in distribution or common resources that exist regardless of the solution. But as a proposition it's very interesting: if VMs share memory (or, more accurately, transparently simulate shared memory, even down to the semantics of locking), that frees the developer from worrying about numerous distribution concerns. Indeed, it permits very simple paradigms that are employed in-process (such as locking and trivial object identity) to be leveraged regardless of the target environment. That's got to be a good thing!
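To make "simple in-process paradigms" concrete, here's a minimal sketch in plain Java. It doesn't use Terracotta at all; SharedRegistry and runDemo are names I've made up for illustration. It only shows the two things mentioned above, monitor locking and trivial object identity, which Terracotta is said to extend across VMs through configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class: plain in-process Java, no clustering API anywhere.
class SharedRegistry {
    private final Map<String, StringBuilder> entries = new HashMap<>();

    // Ordinary monitor lock: only one thread at a time may touch the map.
    synchronized StringBuilder lookup(String key) {
        return entries.computeIfAbsent(key, k -> new StringBuilder());
    }

    // Two threads append to the same entry; returns the final length.
    static int runDemo() {
        SharedRegistry registry = new SharedRegistry();
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                StringBuilder log = registry.lookup("log");
                // StringBuilder is not thread-safe, so guard it with its own monitor.
                synchronized (log) {
                    log.append('x');
                }
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.lookup("log").length();
    }

    public static void main(String[] args) {
        SharedRegistry registry = new SharedRegistry();
        // Trivial object identity: both lookups return the very same object.
        System.out.println(registry.lookup("a") == registry.lookup("a"));
        System.out.println(runDemo());
    }
}
```

Both threads see the same StringBuilder because they share a heap. The proposition is that code shaped like this could keep those semantics when the two threads live in different VMs, although nothing in this sketch demonstrates that.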

I had to think about it for a bit because it sounded too good to be true - is this analogous to having a distributed VM? In theory it could be; in practice there's obviously a choice to be made about what to cluster. Ari was ready to admit that this sort of idea isn't new - Sun and IBM apparently tried using shared memory for multiple VMs without much success. This, however, seems to be real. Runtime optimisation and monitoring just make it sound even better.

There are several examples of where this sort of technology would have been valuable on real projects I've worked on in the past few years. At the moment I can't wait to take it for a test drive on something stupid like a clustered Swing application before looking forward to an enterprise project that presents an acid test.

If anyone else has any opinions or concerns about this then it'd be really great to hear about them.



Re: QCon: Open Terracotta

For those interested, there's a video of Ari talking about Terracotta here: http://video.google.com/videoplay?docid=7660457673499305140

Personally, I'm not sure that I'm as enthused. Some reasons:

1) APIs are bad?
Java's concurrency constructs and memory model are poorly understood and error-prone - APIs at least give us a chance to think at a higher level (e.g. process-oriented like CSP). In Terracotta’s case, the missing API complexity is just pushed out to config.

2) Transparency is good?
The level of transparency offered by Terracotta would make it very difficult for application developers to address the usual 'fallacies of distributed computing'. Any field/method access could potentially be remote; any synchronized block could potentially be a distributed transaction (and cause a distributed deadlock). Network failures also seem to be unrecoverable at runtime.
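As a sketch of the deadlock concern, here is the classic lock-ordering hazard in a single JVM. This is plain java.util.concurrent code, not Terracotta; LockOrdering and its method names are hypothetical. I've used ReentrantLock.tryLock with a timeout so the demo reports the stall rather than hanging, which a plain synchronized block would not let you do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: two threads take the same two locks in opposite order.
class LockOrdering {

    // Returns how many threads failed to get their second lock.
    static int countTimedOutAcquisitions() {
        ReentrantLock accounts = new ReentrantLock();
        ReentrantLock audit = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger timedOut = new AtomicInteger();

        Thread a = new Thread(() ->
                acquireInOrder(accounts, audit, bothHoldFirst, bothAttempted, timedOut));
        Thread b = new Thread(() ->
                acquireInOrder(audit, accounts, bothHoldFirst, bothAttempted, timedOut));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return timedOut.get();
    }

    private static void acquireInOrder(ReentrantLock first, ReentrantLock second,
                                       CountDownLatch bothHoldFirst,
                                       CountDownLatch bothAttempted,
                                       AtomicInteger timedOut) {
        first.lock();
        try {
            // Wait until each thread holds its first lock, so the conflict is certain.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                timedOut.incrementAndGet(); // with synchronized this would block forever
            }
            // Keep holding the first lock until the other thread has also tried.
            bothAttempted.countDown();
            bothAttempted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(countTimedOutAcquisitions());
    }
}
```

In-process this is a well-known bug with well-known tooling. The worry is that if any synchronized block can transparently become a cluster-wide lock, the same mistake spans machines and is much harder to see.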

Re: QCon: Open Terracotta

I agree with all your points, Tim. Distribution and memory models are not trivial matters.

Sure there are risks, but APIs can be misused too. Pushing the clustering behaviour into configuration also doesn't sound like a bad idea to me - it's arguably more of a characteristic of the target environment than of the system type. It also follows a common theme from QCon, that of reducing the cost of change.

I certainly wouldn't advocate letting a development team run wild on it just because it looks like vanilla Java. Similarly, I wouldn't consider configuring java.util.concurrent.CyclicBarrier as clustered. However, I like the idea of being able to write my own abstraction for these essential items without having to learn (and hide) someone else's abstraction.

Re: QCon: Open Terracotta

While I agree with some of this, like not letting people run amok and write bad code just because they think clustering can be done transparently, I did have one small issue. The thing about your choice of CyclicBarrier as an example is that I find it to be one of the most useful things to cluster. When we wrote a distributed testing framework, we clustered the metadata for the tests and coordinated the starts and stops using a clustered CyclicBarrier. Thinking back to all the times I had to coordinate processes before using Terracotta, and all the hoops I jumped through, makes me really love the Terracotta model of development. Cheers, Steve (I work at Terracotta)

Re: QCon: Open Terracotta

Thanks for your post Steve.

The CyclicBarrier example comes from what Ari said during his QCon talk so I can't take any credit for it!

Presumably when you talk about clustering CyclicBarrier you don't simply instrument java.util.concurrent.CyclicBarrier but rather you configure a particular instance/pattern which contains a CyclicBarrier, eg com.mycompany.loadtest.Manager (imagine Manager contains such a barrier).
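Roughly what I have in mind, as a sketch only: a Manager that owns its barrier, with threads standing in for the separate test nodes. The method names (awaitStart, startedCount, runDemo) are my own invention and nothing here touches Terracotta; whatever configuration would mark a Manager instance as clustered is deliberately left out because I don't want to guess at it.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for com.mycompany.loadtest.Manager.
class Manager {
    private final CyclicBarrier startGate;
    private final AtomicInteger started = new AtomicInteger();

    Manager(int participants) {
        this.startGate = new CyclicBarrier(participants);
    }

    // Blocks until every participant has arrived, then lets them all go at once.
    void awaitStart() {
        try {
            startGate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        started.incrementAndGet();
    }

    int startedCount() {
        return started.get();
    }

    // Threads play the role of test nodes here; returns how many got past the gate.
    static int runDemo(int participants) {
        Manager manager = new Manager(participants);
        Thread[] workers = new Thread[participants];
        for (int i = 0; i < participants; i++) {
            workers[i] = new Thread(manager::awaitStart);
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return manager.startedCount();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(3));
    }
}
```

The point of wrapping the barrier is that the unit you'd configure for clustering is your own abstraction (the Manager) rather than the JDK class itself.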

Re: QCon: Open Terracotta

Yep, I think that's right. I wrote an old blog entry on this (warning: it is really dated and our product has changed a bit since it was written, but the idea is still right): http://blog.terracottatech.com/archive/2005/08/fun_with_dso_an.html We also have a sample in our kit showing a super simple use of CyclicBarrier and DSO. Cheers, Steve

Re: QCon: Open Terracotta

When Sun described their fallacies of distributed computing in 1994, they went into detail.  It is an interesting read because it assumes something like Terracotta cannot exist.  Why?  Because Sun engineers assumed the entire heap would be replicated and, as I said in my talk, we do not replicate everything as much as we can replicate almost anything.  Specifically, they say transparency will suffer from 4 problems:
1. Network latency
2. Memory address spaces cannot transparently translate
3. Concurrency semantics won't translate
4. Failure cannot be efficiently dealt with in a cluster

1. The paper says to ignore #1 since the network will just get faster (10 Mbit/s when they wrote the paper is now 10 Gbit/s, 12 years later...).
2. We can ignore #2 since object identity is preserved with Terracotta, so we don't get strange copies of objects appearing where the developer hadn't expected references to end up generating clones on deserialization.
3. Sun didn't assert that making locks clustered is a bad idea. They only asserted that synchronization does not become clustered and thus clustered programming cannot be transparent. Again, Terracotta did not exist at the time.
4. The paper asserts that #1-3 can be quite easily dealt with (as Terracotta has done--arguably not transparently but with config). #4 is the one they say is insurmountable. But #4 assumes a peer-to-peer cluster, and the Sun engineers are talking about network partitioning. Terracotta's server makes recovery from failure as easy as waiting for another Terracotta Server; there is no such thing as a partition. It's not perfect by any means (Oracle Active / Passive clusters fail to fail over all the time). But it is O(1) to find a backup place to write your data. O(1) to find a backup server is what Amazon, eBay, and Google do with their decentralized architectures and these seem to work, no?

I don't mean to come off as defensive.  I just want to get the facts about "fallacies of distributed computing" written down somewhere.

The config is the key.  Without it, Sun is correct.  With it, we are still programming clustering into our app, just not through saying "implements Serializable" and "get()" or "put()."
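For anyone who hasn't hit the identity problem from #2, here is a small sketch of what "implements Serializable" gives you in plain Java. It uses only standard java.io serialization, not Terracotta; CopySemantics, Account, and roundTrip are made-up names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo: a serialization round-trip produces a clone, not the same object.
class CopySemantics {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        int balance;

        Account(int balance) {
            this.balance = balance;
        }
    }

    // Serialize to bytes and back, as an API-based replication layer might do.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account original = new Account(100);
        Account copy = roundTrip(original);
        copy.balance = 50;                      // change the copy only
        System.out.println(copy == original);   // false: a different object
        System.out.println(original.balance);   // 100: the original never saw the change
    }
}
```

With copy semantics the application has to put() the changed object back and every holder has to get() it again. Preserving identity is what removes that step.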

--Ari

Not so cool...

Well, after having spent several fruitless train journeys trying to get Terracotta to cluster an existing application I've finally given up, coming to the conclusion that Tim was probably right; clustering really isn't something that you want to implement transparently.

I could probably have managed to achieve clustering of my app's state but it would have involved rewriting portions of my code - it just wasn't designed for this sort of thing. I've endured running out of heap space (why?!) and the JVM terminating whenever it doesn't like something (excusable?), messages instructing me that my loggers are being distributed (ah.) and even a cold-call offering professional services support (in response to a blog entry?!).

Admittedly, Terracotta makes no claims that clustering should be bolted on post-hoc or taken on lightly; it simply provides a familiar programming paradigm for achieving it. While it's transparent to your Java code, it's not transparent to your Java development.

I guess what I've learned is that if you want distribution, you still need to think long and hard about it. The cost of implementation may be reduced by using something like Terracotta, but that doesn't mean you can leave the implementation to "cheaper" developers - the complexity of the problem remains.

Re: QCon: Open Terracotta

Hi there, "At the moment I can't wait to take it for a test drive on something stupid like a clustered Swing application" - interesting to know about your plans for a clustered Swing app. I am interested to know about the memory and performance characteristics of this app. Are you clustering the Swing app or the server that serves the Swing app? Thank you, BR, ~A

Re: QCon: Open Terracotta

I was hoping to take a fairly serious standalone Swing desktop application (no server involved) and use Terracotta to replicate a simple user interaction between two instances.

For example, I'd start up two instances of the application and update something in the model via one instance and watch it get updated in the other instance.

The application I had in mind is substantial although I was only interested in distributing one model from it (albeit a very large and pervasive one).

This was an artificial requirement and had no practical use for the application in question. The hope was to find out how easy it was to use Terracotta with something that wasn't designed with this behaviour in mind, and whether a noticeable performance overhead was incurred. As it turns out I never got it to work - the model was far too complex to be a successful candidate for clustering. Terracotta demanded that I distribute more and more components until I finally came to the conclusion that, while clustering may be relatively trivial to implement with Terracotta, it's still not something you can retrofit into an unaccommodating design. Kinda obvious in hindsight.

I hope to give it another go someday when I get the time, but perhaps on something a little more amenable!

If you want to get a feel for the overhead and performance, though, you could do a lot worse than trying out the examples that come with Terracotta - they're basic but pretty good.

Re: QCon: Open Terracotta

Thanks for blogging about QCon! I just wanted to let you know that we quoted and linked from this entry in the overall QCon 2007 bloggers' key takeaway points and lessons learned article:

http://www.infoq.com/articles/qcon-2007-bloggers-summary

Feel free to link to it, and of course blogging about this article's existence would help even more people learn from your and other bloggers' takeaways.

Thanks again!

Diana
InfoQ/QCon
