Programming as closing doors

As you program a solution you make a gigantic number of tiny decisions. Each one is a choice between several possible approaches. At some point you have made enough of these choices that you have some working software, according to your current understanding of how the software should work. You are choosing a path through a graph with many, many nodes.

Then your understanding of how the software should work changes. Some (perhaps many) of the decisions you made are now incompatible with your new understanding. Not only that, but some of those decisions sit near the root of the path you chose through the graph. You now have to go back to those points and make a different decision. Sometimes it's easy: you can snip the branch at the new decision point and reattach it to your new decision node. More often than not, though, the whole branch is incompatible with your new decision, and you have to find a new path through the graph all the way to working software.

This is why refactoring is often difficult. You are not just changing some tiny decision you made a while ago; you have to completely re-evaluate large chunks of your software in light of your new decision.

Object Oriented Programming is not a solution to this problem. You may find that your new decision requires changing something fundamental about an object's behaviour. You must then evaluate whether that change in behaviour is compatible with every usage of that object.

The skill of programming is the ability to "look ahead" and make decisions that leave a decent number of options open in the future. You must play out in your mind how you think the software may change in the future and make decisions that point towards that goal. It is also a skill to write as little code as possible (not D.R.Y., just less code). Less code means fewer decisions that need to be unwound.

You must also frequently go back, revisit decisions made in the past, and actively refactor them according to your new understanding. If you don't, you must make brand new decisions that "go against" previous decisions. Programmers call the result "spaghetti code". It makes your newer code more complex (a larger amount of code) and limits future decisions. If you find yourself writing brand new code that is difficult to understand, you are likely writing on top of older code that is in need of refactoring. You are also making it more difficult to add code in the future, and increasing the difficulty of refactoring.

The more code you have, the more decisions have been made, thus the smaller the range of possible future decisions. More code == less flexibility.

This is how projects grind to a halt. It becomes increasingly expensive to add new code. It also becomes increasingly difficult to refactor old code, because the total amount of code keeps growing.

If you want to write high quality code:

  • Try to write as little code as possible. Try to do things in the most obvious way.
  • Refactor aggressively, with the specific goal of having less code after the refactor.

If you are refactoring and you find that you can delete chunks of code, it is a very good sign that you are increasing the number of possible future decisions. Unless you are abusing D.R.Y.

D.R.Y.

At first glance this appears to advocate for D.R.Y. However, this is not always the case. If you have code that repeats, those repetitions represent a single decision. If you need to change that decision later on, it can be relatively easy to make the same change in multiple locations, provided they are visually similar enough for your brain to identify them as a pattern. It is even better if the text of the code is easily searchable. The key is to get them looking the same: same indentation style, same variable names if possible, and so on.
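
As a minimal sketch of this (the ApiClient type, URLs, and option names below are invented for illustration), three repeated call sites kept deliberately identical are easy to grep for and change in bulk:

    from dataclasses import dataclass

    @dataclass
    class ApiClient:
        # Hypothetical client type, only here to make the repetition concrete.
        base_url: str
        timeout: int
        retries: int

    # The three call sites below are one decision ("timeout 30, 3 retries")
    # written three times. Because they are formatted identically, a search
    # for "timeout=30, retries=3" finds every copy, and changing the decision
    # is a mechanical find-and-replace rather than unwinding an abstraction.
    orders_client  = ApiClient(base_url="https://orders.example.com",  timeout=30, retries=3)
    billing_client = ApiClient(base_url="https://billing.example.com", timeout=30, retries=3)
    search_client  = ApiClient(base_url="https://search.example.com",  timeout=30, retries=3)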

With D.R.Y. you would abstract them away inside a single chunk of code. However, this creates a focal point for complexity. If your repetitions turn out to be less similar than you thought as the project progresses, you must increase the complexity of your abstraction. Programmers tend to keep piling complexity onto the existing abstraction rather than, say, creating a second abstraction that shares some duplicated code with the original.
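
A sketch of how this tends to play out, reusing the hypothetical ApiClient from above (the helper and its parameters are invented):

    # Version 1: the repetition is abstracted into a single helper.
    def make_client_v1(base_url: str) -> ApiClient:
        return ApiClient(base_url=base_url, timeout=30, retries=3)

    # Version N: after a few divergences, every difference between call sites
    # has become a parameter, and the helper re-grows the option surface it
    # was supposed to hide.
    def make_client_vN(base_url: str, timeout: int = 30, retries: int = 3,
                       long_poll: bool = False) -> ApiClient:
        if long_poll:                      # one call site needed long polling
            timeout, retries = 300, 0
        return ApiClient(base_url=base_url, timeout=timeout, retries=retries)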

You should create a second abstraction if you have good reason to believe there will be more behavioural differences in the future. Likewise, you should consider not creating an abstraction in the first place if the cognitive load of the repetitions is not high - eg: they all look the same and can be searched and identified easily.

An example of this is abstracting an HTTP client. Because an HTTP client is a highly configurable object, it is very tempting to see repetition in this configuration and wrap the creation of the client in an abstraction. Eg: "we use a proxy" or "we use https". Over time these repetitions will deviate: someone will need to bypass the proxy, someone will need to use http, etc. The abstraction can start to collect as many configuration options as the HTTP client it wraps. It may become better to split the abstraction, even if there is still repeated code in each half. Or to never create the abstraction in the first place, if it's easy to search for all usages of the HTTP client (and thus perform bulk changes with confidence).
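
A rough sketch of where such a wrapper tends to end up, here using Python's requests library (the function name, parameters, and proxy URL are invented):

    import requests

    # A hypothetical wrapper that began life as "we use a proxy and https".
    # Each exception to those decisions has since become another parameter,
    # until the wrapper mirrors much of the configuration of the client itself.
    def make_session(use_proxy: bool = True,
                     verify_tls: bool = True,
                     proxy_url: str = "http://proxy.internal:3128") -> requests.Session:
        session = requests.Session()
        if use_proxy:
            session.proxies = {"http": proxy_url, "https": proxy_url}
        session.verify = verify_tls
        return session

At this point each new parameter just relays a decision back to the call sites, which is what constructing the client directly would have done anyway.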

Notice that "we use a proxy" and "we use https" are decisions that limit possible future options. By abusing D.R.Y. you have reduced the amount of code, but you have turned these decisions into "we always use a proxy" and "we always use https". Less code, but less options in the future.