Legacy systems revisited

This post is actually not inspired by this year’s GOTO Conference, but by the one two years ago in Copenhagen. I can’t remember if it was in a talk or during my discussion with Dave Thomas in one of the breaks, but he started talking about “legacy systems”. I was kind of star-struck, so I didn’t question anything he said – well, not until I got home anyway. When I was looking through my old posts, I found this entry, where I asked how others would define the term “legacy system”. Nobody really nailed it down, and I’ve been meaning to follow up on that post for two years now, but never got around to it before now. So here goes. This is my definition.

Technology needs to be dead or dying
If the system is built on a brand new technology, it is hard to argue that it has any kind of legacy to it. It has to be something that has been: something that was state of the art, but is not any longer. The technology was either adequate or simply the best available at the time the system was built, but the world/company moved on and the chosen technology had a hard time keeping up with new demands.

The system has to have substantial value to the company
If the system does not have any value to the company, it can simply be turned off: the revenue stays the same and nobody will miss it. Therefore, the cost of keeping it running needs to be less than the loss from shutting it down.

The rate of new features added is going towards zero
If features are still added to the system at a high rate, I would argue that it is a system still under development. To me this indicates that its technology can keep up with demands, and thus it is not dying. The system needs to be in a state where it either sufficiently solves the business needs, or the pain and cost of adding extra features outweigh the revenue from adding them.

The time spent on new features is less than the time spent on bug fixing and maintenance
Even though the rate of adding new features is going towards zero, I will still argue that if more time is spent adding features than maintaining the system, it is still under development. Every time a feature is added, the value of the system increases, and if it increases by more than the cost of running it, the company is investing; the system is still under development, and therefore it is not a legacy system. It has to be in a state where the company invests less in adding new features than it spends on running the system.

* * *

So this basically sums up to: an old system that has substantial value to the company, a system that no longer grows in value but needs to be maintained so it does not lose value.

What is “simple”

So I was all high on simplicity yesterday, and of course Frank had to ask me the question that made me crash-land again: “So, what do you mean by ‘simple’?” I almost didn’t hear this morning’s talk from Brian Goetz about lambdas in Java, since my head was crunching that question. What is “simple”, what does it all mean?

It struck me that I need to be able to measure complexity to answer that question. I need some way of comparing two things and saying that one is more complex than the other. Looking at code, at a system, at a framework, we have an intuitive understanding of simple. But it is not only intuitive, it is subjective. If something is easy to do or easy to understand, does that make it simple?

I’ve been doing karate for years, and it is pretty easy for me to roundhouse-kick someone and kill them three times before they hit the ground. Does that make it easy, and thus simple? Not really. Karate is hard and complex; I have just practiced a lot. So “easy to do” does not mean simple.

What about easy to understand? I understood most of Brian Goetz’s talk about lambdas in Java without much effort. Of course I had to think really hard about some of the stuff, but overall it was pretty straightforward. But I’ve been programming for 25 years now, in anything from assembler to Prolog, and I’ve been a full-time Clojure programmer for almost two years. I should know about lambdas and low-level stuff by now. That gives me an edge. It is only easy because of all the other stuff I already know about the subject.

That still leaves the question unanswered. Should complexity then be measured by lines of code (LOC)? That is at least an objective measurement, and maybe a better one. If I can express the same algorithm in half the lines of code, it is simpler, right?

Here’s the Adler-32 checksum algorithm in C. It keeps two running sums, a and b, both modulo 65521 (the largest prime below 2^16), and packs them into one 32-bit value as (b << 16) | a:

#include <stdint.h>

const int MOD_ADLER = 65521;
uint32_t adler32(unsigned char *data, int32_t len) 
{
    uint32_t a = 1, b = 0;
    int32_t index;
    for (index = 0; index < len; ++index)
    {
        a = (a + data[index]) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    return (b << 16) | a;
}

12 lines of code (not counting the include)

and here it is in Clojure:

(def base 65521)
(defn cumulate [[a b] x]
  (let [a-prim (rem (+ a (bit-and x 255)) base)]
    [a-prim (rem (+ b a-prim) base)]))
(derive clojure.lang.LazySeq ::collection)
(defmulti checksum class)
(defmethod checksum String [data]
  (checksum (lazy-seq (.getBytes data))))
(defmethod checksum ::collection [data]
  (let [[a b] (reduce cumulate [1 0] data)]
    (bit-or (bit-shift-left b 16) a)))

11 lines of code
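
Both versions should agree, of course. As a quick sanity check – a minimal sketch, assuming the Clojure definitions above are loaded in a REPL – checksumming the string “Wikipedia” should give the well-known Adler-32 test vector 0x11E60398:

user=> (format "0x%X" (checksum "Wikipedia"))
"0x11E60398"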

The latter should then be less complex, right?

I’ll leave this one open and hope for a discussion in the comments.