Michael Eriksson
A Swede in Germany

Measure twice, cut once

Note: This page will probably be heavily revised at some future date. The ideas and individual discussions are sound, but the text as a whole is not satisfactory from the second heading and on (and not entirely complete in the first place).


Addendum/Excursion on Scrum and other agile approaches:

Since writing the first version of this text in 2008, I have had considerable exposure to agile techniques, in particular Scrum. In many ways, they might seem to preach the exact opposite of what I advocate below. I would argue, however, that the contradiction is only partial—and that both approaches address the same underlying problem: The narrow limits of the human mind.

Broadly speaking, I say below that future problems should be reduced by thinking first and acting later—lamenting that lack of forethought often leads development down the wrong road. Scrum, e.g., says that future problems should be reduced by not going too far down any particular road before it is truly known that this road should be followed, and by making course alterations easy—claiming that it simply is not possible to think everything through in advance. At the same time, Scrum books warn against the mistake of naively confusing Scrum with not planning, and I warn against the danger of over-thinking in a caveat below—pointing to concerns that come quite close to Scrum's arguments against excessive and premature planning.

The true difference, in the end, is that my (half-completed) article was written with the purpose of combating problems within the framework of traditional software development, whereas Scrum builds a new framework (and requires a considerable organisational change to work well). Changing the framework is a better approach, but one which requires much more momentum and credibility than the urgings of a single developer. (I leave unstated whether a 2008 attempt by me to change the framework would have borne any great resemblance to Scrum.)

Seeing that software development has increasingly moved away from its traditional forms and towards more agile ones, I will likely leave the article approximately as is (despite the above note): The principles are sound, but developing them further would likely not be productive. For an agile framework, the principles can be left as they are (after some mental adjustment for e.g. the shortness of a Scrum sprint relative to a full classical project), even if the text as a whole might be incompatible. For a traditional framework, the text might benefit from further development, but this feels like a waste of effort, with traditional development losing ground.

I might also speculate that agile methods are becoming relatively more beneficial as time goes by, because (a) the size and complexity of software are increasing, which makes long-term detail planning harder, division into smaller tasks more beneficial, etc., and (b) competence levels are dropping, which makes the planners less suited for long-term planning, etc.


Introduction

One of the most common errors in software development (and in very many other areas) is to begin a particular type of work, especially implementation, too early. This problem is particularly prevalent where superiors or project managers lack insight into the work at hand, and fail to understand that thinking is the most important part of the work of a good software developer (irrespective of his exact role; although there are variations). Below I am going to discuss some of the issues involved in the form of a set of phases. Beware that these phases should not be taken too seriously: There will always be some overlap between them, sometimes a phase might need re-visiting, not all phases will always be present, ... It is further quite possible that I left out phases that would be vital, were I to attempt an extension to an academic treatment of problem solving; however, this will not affect the main points.


Side-note:

Bear in mind that the amount of planning, the way that planning is done, etc., will be highly dependent on the circumstances: For a ten-line use-once shell script, the below can typically be condensed into a brief, one-step contemplation; for a million-line enterprise application, a highly formalized process spanning several months and two dozen persons might be optimistic. (Incidentally, a good reason to avoid writing such applications...)


Solve the right problem

This starts at the very beginning: Writing a perfect, easy-to-maintain, bug-free, and super-fast program or module counts for nothing—if it solves the wrong problem! Thus, the first step should always be to establish what the problem to solve is.

Depending on what kind of work the developer does, this will entail different things: In one extreme, he might have to interview users and stakeholders to find out the overall purpose of an application; in another, he can look at a pre-existing interface definition with pre- and post-conditions, prescribed input types and function names, and so on. In any case, it is almost always better to ask “stupid” questions than to make assumptions. Further, even when a clear-cut description is present, time might be needed to actually comprehend it. Care should in particular be taken with words that might have different meanings in different contexts or are part of company jargon, with connections and interactions between different parts of the description or the involved components, with references to unfamiliar technologies, and similar.
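
To give a flavor of the latter extreme, a pre-existing interface definition might look something like the following sketch (the names, types, and conditions are invented for illustration, not taken from any real project); the developer then “merely” has to understand and respect it:

    class OutOfStockError(Exception):
        """Raised when a reservation cannot be fulfilled."""

    def reserve_stock(article_id: int, quantity: int) -> int:
        """Reserve `quantity` units of the article `article_id`.

        Pre-conditions:
          - `article_id` refers to an existing article.
          - `quantity` is greater than zero.
        Post-conditions:
          - The available stock of the article is reduced by `quantity`.
          - The return value is a reservation id greater than zero.
        Raises:
          - ValueError if a pre-condition is violated.
          - OutOfStockError if fewer than `quantity` units are available.
        """
        raise NotImplementedError("to be implemented against the real back-end")

Even with such a definition, questions can arise (e.g. whether a failed reservation may leave the stock partially reduced), and asking them early is cheaper than guessing.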


Addendum:

As time has gone by, it has become less and less likely for a developer to have a non-trivial influence on decisions immediately relating to users and stakeholders, at least in even medium-sized businesses and for even medium-sized applications.

However, a switch from e.g. “developer” to “product manager” does not change the underlying principle. Moreover, chances are that the developer will still need to put in effort with the product manager in order to ensure a correct understanding of the requirements. Similar potential complications will be ignored in the rest of the text.

(I do suspect that this move of influence is a partial reason for the low quality of most modern user interfaces, but that is a different topic entirely.)


Notably, even when the task does not involve anyone else, e.g. when a developer wants to write a shell script for his own use, it is easy to attack the wrong problem. For instance, at first glance a task might seem to be “Back up and then delete all files older than two weeks.”; however, it might eventually turn out to be “Back up and then delete all files that were older than two weeks last midnight; excepting the administrative files from Subversion, which are left entirely untouched; and excepting files with the suffixes ~, .swp, .bak and .old, which are deleted without a back-up being made.”—a very different task. Sometimes this can be compensated for by incremental development and refactoring, but in many cases the additional effort spent grows disproportionately; further, refactoring takes discipline and, in my experience, is not something that most developers do to a sufficiently high degree—in particular not when a project manager is pressing for an early completion.
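
To make the difference concrete, here is a minimal sketch of the “real” task (in Python rather than a shell script, and with hypothetical directory names); note how much of the code deals with the special cases that the naive one-line description left out:

    import shutil
    from datetime import datetime, timedelta
    from pathlib import Path

    SOURCE = Path("data")      # hypothetical directory to clean up
    BACKUP = Path("backup")    # hypothetical back-up destination
    NO_BACKUP_SUFFIXES = ("~", ".swp", ".bak", ".old")

    # "older than two weeks last midnight", not "older than two weeks right now"
    midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    cutoff = (midnight - timedelta(days=14)).timestamp()

    for path in SOURCE.rglob("*"):
        if ".svn" in path.parts:    # leave Subversion's administrative files untouched
            continue
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue
        if not path.name.endswith(NO_BACKUP_SUFFIXES):
            target = BACKUP / path.relative_to(SOURCE)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)    # back up first ...
        path.unlink()                     # ... then delete

Little of this is difficult; the point is that none of the special cases could have been guessed from the original one-line description.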

The above is also a good example of how special cases are often left undefined by whoever provides the description. It can pay to keep an eye out for such special cases and bring them to his attention; or, if appropriate, to make sure that one has the right to decide on the handling of unspecified special cases (and similar). Notably, there is no guarantee that the provider of the specification actually has a deep understanding of the problem—more often than not, he will only have a few high-level characteristics on his mind, and will not have bothered to think the subtleties through. (This need not be incompetence, although it sometimes is: Often there is a lot of room for negotiation, there are many things that he is willing to leave to the discretion of the developer, or there is a need for feedback from the developer before final decisions are made.)

If a description of the problem does not exist in writing, it can pay to correct this—possibly, even to have it officially signed off. This will depend on the circumstances of the task and the attitude of the other persons involved. Regrettably, it is not uncommon that a stakeholder incorrectly claims that feature X was agreed upon after implementation is already completed, that a project manager tries to use a developer as a scapegoat for his own mistakes, that another developer forgets the details of a discussion about an interface (resulting in him developing something incompatible), etc. Further, a written document can help prevent both misunderstandings and honest mistakes. Generally, the more important the task, the more persons/management/money/... involved, the vaguer the character of the task, etc.; the more beneficial a written documentation is.

(Note that the documentation can take many forms, ranging from a requirement document of several hundred pages to a one-line email, or from a natural language description to one that uses a formalized, special purpose IDL of some kind.)

Caveat: It is possible to over-think a problem; in particular, it is easy to slip into a mode of thinking where every conceivable eventuality is considered and, later, reflected in the actual work—with half the eventualities never occurring after the software has been deployed. There are also many cases where the eventual problem might not be predictable (in particular on the level of requirements): It can, for instance, be that a first version or prototype is shown to an end-user, and it is then discovered that the original thoughts were not practical. The key is balance—and, in my experience, most err on the side of too little thinking.

Think the problem through

(The border to the previous section can be very blurred, and the phases can often overlap; however, for illustrative reasons, I have made the distinction.)

Once the problem is identified and its meaning understood, its deeper character and implications might need further investigation. Skipping this step can lead to sub-optimal solutions, the need to start the implementation over from the beginning, and similar complications.

Example: Assume that the task is a web shop with an estimated number of users x, an estimated average number of purchases per user of y, and an average amount of data per purchase of 1 KB. This implies that the back-end (e.g. database and file system) must be able to handle xy KB of data, both with regards to space and speed. (This is a minimum; other data, e.g. user information, bills, product information, ..., are needed too. For the sake of demonstration, these are left out of the discussion.) For small values of x and y, this is a non-issue: With e.g. 100 users with 10 purchases each, a total of 1 MB is needed—this is so negligible that even a small server should be able to keep the entire database in memory (the ideal situation).

Now raise the numbers to 10 million users and 100 purchases: We now need to handle one TB of data. Currently (2009), no ordinary server has this much RAM, and even many RAID systems fail to provide this much hard-drive space. Correspondingly, we have an entirely new set of complications that have to be investigated, e.g. the possibilities for and need of caching, what kind of storage back-end is needed (RAID, NAS, SAN, ...), whether archiving of older data is an option, etc.
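
The underlying arithmetic is trivial, but a small sketch (using the figures assumed above) makes the jump in orders of magnitude hard to overlook:

    def purchase_storage_bytes(users: int, purchases_per_user: int,
                               bytes_per_purchase: int = 1024) -> int:
        """Rough lower bound for the purchase data alone (no users, bills, products, ...)."""
        return users * purchases_per_user * bytes_per_purchase

    print(purchase_storage_bytes(100, 10))           # 1_024_000 bytes, roughly 1 MB
    print(purchase_storage_bytes(10_000_000, 100))   # 1_024_000_000_000 bytes, roughly 1 TB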

If someone had just rushed ahead without a thought on these issues, troubles would soon have occurred—or, in a worst case scenario, occurred after deployment...

TODO: Write a better example that focuses on more qualitative characteristics, and deals mainly with choices in the software development.

Consider possible solutions and possible tools/technologies/...

Again, it is important not to jump straight into implementation without a plan. Further, and here comes one of the differences between a good problem solver and a poor one: If the problem is not very easy, more than one solution should be thought through. Subsequently, a choice is made between these solutions. (For very large and/or important tasks, it might even pay to implement prototypes of several solutions, and only make a choice based on these prototypes; however, few situations will justify this.)

In many cases, it will now make sense to consider what tools and technologies (and other helping factors) best fit the chosen solution; however, this order is not always appropriate. It can make sense to consider tools first and solutions second (or, obviously, both conjointly). Assume, for instance, that a central choice of solution is whether an application should have its business logic on the DBMS or the application level. If the former is chosen, then any DBMS without sufficient support is ruled out in advance; however, external circumstances might force the use of a DBMS without this support—and now the latter solution is a necessity. Then again, the choice of DBMS and solution might be a complex pro-and-con situation where various combinations have to be evaluated, and the decision made simultaneously.
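
As an illustration of that choice (the table and the rule are invented for this sketch, and sqlite3 merely stands in for whatever DBMS the project would actually use), the same business rule, “a purchase must have a positive quantity”, can live in either place:

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Variant 1: the rule lives in the DBMS. This rules out, in advance, any DBMS
    # that cannot express such constraints.
    conn.execute("CREATE TABLE purchase ("
                 "  id INTEGER PRIMARY KEY,"
                 "  quantity INTEGER NOT NULL CHECK (quantity > 0))")

    # Variant 2: the rule lives in the application. This works with any DBMS, but
    # every code path that writes purchases must remember to enforce it.
    def insert_purchase(connection, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        connection.execute("INSERT INTO purchase (quantity) VALUES (?)", (quantity,))

    insert_purchase(conn, 3)      # accepted
    # insert_purchase(conn, -1)   # rejected by the application-level check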

An interesting issue here (and in many other situations) is the possibility of challenging presuppositions that unnecessarily limit the options available. It often turns out that what seems to be a given (e.g. through company policy or a superior’s preference) can be highly negotiable. Consider e.g. the use of Java vs. C++: In the early days of Java, it had considerable performance problems, and an application that needed to be both performant and object-oriented would typically be written in C++ (other alternatives did, of course, exist, but they did not have even nearly the same popularity). Today, however, Java comes with highly optimized VMs, better compilers, better native libraries, enhancements like NIO, etc.; further, hardware has become much faster, with the implication that more overhead is tolerable. The result is that performance is only rarely a valid argument against Java today—and that a no-Java policy from 1998 could be over-turned in 2008.

(Then again, in other cases policies are set in stone, often even after changing circumstances have made them obsolete.)

And so on

From here, the general gist of the correct procedure should be obvious; in particular, we eventually land back at the first step with a set of smaller tasks. Depending on the exact circumstances, the next steps can, among many other things, include researching the chosen solution and working out the high-level details, investigating problems to be expected in the implementation, looking into and/or requesting resources, and the implementation itself.

Addendum on the trickiness of requirements

An early incident in my own career provides a good illustration of the potential issues around requirements (cf. above), including how one person’s clear requirement can be someone else’s non-requirement (obscure requirement, whatnot) and how tricky requirements provided by someone else can be:

In one of my first tasks in the workforce (maybe even the very first task, short of setting up my computer), I was given a simulated screenshot detailing how a particular web page displaying a data table should look, and was told to implement this page. I duly did so, reading data from a DBMS, ensuring that the display was correct (including a weird triangle on the table header), ensuring that all buttons did what they were supposed to do, etc.

When I showed the result to my boss for acceptance testing (this was a very small and not very professional company), he first seemed pleased, but then clicked on a column header and complained that the table was not re-sorted based on that column. I had never seen a table with such functionality, and no such functionality was mentioned anywhere in the (minimal) requirements; he, however, considered it an absolutely standard behavior and a given part of the requirements—the weird triangle should indicate which column was currently the sort criterion.

I went back to work, implemented such a sorting, and went back to the boss. Now, he complained that the sort order was not reversed when he clicked a second time on the column header—something that he had not mentioned with one word during our last talk (and which, obviously, had not been mentioned in the requirements either).
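
For what it is worth, the behavior that was “obvious” to him, but absent from the written requirements, amounts to something like the following sketch (the names and the table representation are purely illustrative):

    class SortableTable:
        """Click a column header to sort by that column; click it again to reverse the order."""

        def __init__(self, rows):
            self.rows = list(rows)     # each row is a dict: column name -> value
            self.sort_column = None    # the column marked by the "weird triangle"
            self.descending = False

        def click_header(self, column):
            if column == self.sort_column:
                self.descending = not self.descending    # second click: reverse the order
            else:
                self.sort_column, self.descending = column, False
            self.rows.sort(key=lambda row: row[column], reverse=self.descending)

    table = SortableTable([{"name": "B", "price": 2}, {"name": "A", "price": 1}])
    table.click_header("price")    # ascending by price
    table.click_header("price")    # descending by price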

The problem? Between leaving high school in 1994 and my first day at this company (likely January 1999), I had worked exclusively with Unix computers, where GUIs were rarer and GUI conventions originally radically different from those under Windows. (Far less so today.) Moreover, I am far from certain that this convention existed even under Windows in 1994 (and/or before Windows 95)—and if it did, I had no recollection of it. While the functionality is reasonable with hindsight (and very familiar to me today), it was not something that I could recall ever having seen, and there was no clue beyond that triangle that something might have been wanted.

Imagine instead if the requirements had been more explicit, if I had asked whether that weird triangle had some purpose, if the boss had mentioned the reverse sorting during our first discussion, or any other improvement.

(A secondary lesson is that it is dangerous to apply conventions from a particular OS to e.g. a web interface, as not every user will be familiar with these conventions. Here, we could also imagine a scenario where some other Windows convention was implied that I was not aware of and that he did not check.)