Monday, 14 February 2011

Write performant code: keep some fundamental figures in mind

It's funny to see how much time some developers spend (I would say waste) optimizing parts of their code where the ROI will be peanuts at the end of the day ;-(

Worse: premature optimizations. The real ones, I mean (I heard you Joe, and I agree ;-). The ones where people sacrifice readability and the ability to evolve the code on the altar of performance (really? did you actually measure it?). In most cases, a simple but wise choice of types and data structures within our code saves us lots of time and energy without creating maintainability nightmares.
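
As an illustration of such a "simple but wise choice", here is a minimal Python sketch. It is my own example, not taken from any real code base, and the names is_known_slow, is_known_fast and KNOWN_IDS_* are hypothetical. Both functions are equally readable; only the container type differs. Membership testing scans a list item by item, whereas a set uses a hash lookup.

```python
import timeit

# Hypothetical data: 100,000 known identifiers.
KNOWN_IDS_LIST = list(range(100_000))
KNOWN_IDS_SET = set(KNOWN_IDS_LIST)


def is_known_slow(identifier: int) -> bool:
    # 'in' on a list scans the elements one by one: O(n).
    return identifier in KNOWN_IDS_LIST


def is_known_fast(identifier: int) -> bool:
    # 'in' on a set is a hash lookup: O(1) on average.
    return identifier in KNOWN_IDS_SET


if __name__ == "__main__":
    # Measure rather than guess; absolute numbers depend on the machine.
    worst_case = 99_999  # last element of the list
    for fn in (is_known_slow, is_known_fast):
        seconds = timeit.timeit(lambda: fn(worst_case), number=1_000)
        print(f"{fn.__name__}: {seconds:.4f} s for 1,000 lookups")
```

The point is not the exact timings (they vary from one machine to another) but that the faster version costs nothing in readability.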

If you want to develop low-latency and scalable solutions, it is obvious that you should know the core mechanisms of your platform (.NET, Java, C++, Windows, Linux, but also processors, RAM, network stacks and adapters...): how the GC works, how memory and processor cache lines are kept in sync, etc.

But do you keep in mind the cost of some elementary operations (in terms of time and CPU cycles) when you are coding? How long does a classic network hop take? How long does a typical I/O read take? How long does a memory access take, depending on its current state?

As a reminder, here are some figures (some borrowed from Joe Duffy's blog) that you should definitely stick on a Post-it in front of your development desk. If you don't want to improve your code blindly, it's important to know what things cost.

  • a register read/write (nanoseconds, single-digit cycles)
  • a cache hit (nanoseconds, tens of cycles)
  • a cache miss to main memory (nanoseconds, hundreds of cycles)
  • a disk access including page faults (micro- or milliseconds, millions of cycles)
  • a local network hop with kernel-bypassing RDMA & 10GigEth (sub 10 microseconds)
  • a LAN network hop (100-500 microseconds)
  • a WAN network roundtrip (milliseconds or seconds, many millions of cycles)
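
To make these orders of magnitude easier to compare, here is a minimal Python sketch that converts each latency into CPU cycles. The 3 GHz clock and the single representative value picked for each line are my own assumptions (the list above only gives ranges), so treat the output as ballpark figures, not measurements.

```python
# Ballpark conversion of the latencies above into CPU cycles.
# ASSUMPTIONS (not from the list): a 3 GHz clock, and one representative
# latency per operation picked inside the range given above.
CLOCK_HZ = 3_000_000_000  # assumed 3 GHz core

# Representative latencies in nanoseconds (illustrative picks only).
LATENCY_NS = {
    "register read/write": 1,
    "cache hit": 5,
    "cache miss to main memory": 100,
    "RDMA / 10GigE local hop": 10_000,   # "sub 10 microseconds"
    "LAN network hop": 300_000,          # within 100-500 microseconds
    "disk access": 5_000_000,            # milliseconds
    "WAN roundtrip": 50_000_000,         # milliseconds or seconds
}


def to_cycles(latency_ns: int, clock_hz: int = CLOCK_HZ) -> int:
    """Number of clock cycles elapsed during latency_ns nanoseconds."""
    return latency_ns * clock_hz // 1_000_000_000


def how_many_fit(slow: str, fast: str) -> int:
    """How many 'fast' operations fit in the time of one 'slow' one."""
    return LATENCY_NS[slow] // LATENCY_NS[fast]


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name:<28} {ns:>12,} ns  ~{to_cycles(ns):>13,} cycles")
    print(how_many_fit("LAN network hop", "cache miss to main memory"),
          "main-memory accesses fit in one LAN hop")
```

With these assumed values, a single LAN hop costs about as much as thousands of cache misses, which is a good reason to look at the network calls before micro-tuning a loop.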

Wednesday, 9 February 2011

The ultimate MOM?

I've never had faith in silver bullets ;-) but I have to admit that Solace's appliance-based solution is very attractive for a pre-trade (but also post-trade) financial enterprise messaging system.

Because Solace uses silicon instead of software for its "hub and spoke" messaging solution (fully compliant with the JMS standard, but with many more features), there are no OS interrupts, no context switches and no data copies between kernel and user space for the "hub" part.

I haven't had the chance to evaluate it yet, but on paper, and according to some of my ex-colleagues who have evaluated it, Solace looks like a kind of ultimate solution for financial enterprise messaging.

"Reliable delivery with average latency of 22 microseconds at 1M msgs/sec", "Guaranteed delivery with average latency of 98 microseconds at 150,000 msgs/sec",  "10 million topics with support for multi-level, wild carded topics", "9,000 client connections"... ( Such figures make me dream and remind us that software can't win over hardware for those message-oriented middleware (MOM) use cases...

Their white papers are very relevant and informative. In particular, the one that explains how to build a single-dealer platform (meaning: a web-based application offered by an investment bank so that all its clients can deal with it directly). Oh yes, because the future version of their appliance will also be able to bridge with HTTP clients (one more killer feature ;-)

This particular white paper is available from here:

Beyond the (high) price of such a solution, and even though it is increasingly used in some banks, it is perhaps the risks regarding the sustainability of such a hardware-based solution that need to be addressed.

Indeed, what would happen if their (unique?) appliance factory burned down? How long would it take Solace to restore production and fulfill its contracts, etc.?