Software Development: Every Error Is an Opportunity

Software development teams can improve productivity and quality in a targeted way by cyclically repeating measurement, evaluation and optimization. Error analysis is an essential part of this process.

When an error is processed, the focus is naturally on correcting it so that users can work with the system without restrictions again. In many cases, agreed response or resolution times demand a quick fix, even if it is only a workaround that does not prevent the error from recurring in the long term.

Defect analyses can sustainably improve quality

A defect fix that only deals with the effects causes nothing but costs and problems. By contrast, treating the root cause so that defects do not recur improves quality sustainably. It also saves cash, because the correction costs never arise. The effort involved in analyzing the cause of an error is therefore not an additional cost, but an investment that pays off quickly.

The search for the root cause

When a defect is reported, the report initially describes only the effects, i.e. what the user perceives as deviating from the expected result or system behavior. This is an important piece of information, and it must be checked carefully to see whether it is actually a deviation from the original requirements, a misuse or a new requirement. The defect description, however, is only the effect at the end of a cause-and-effect chain. The crucial step is to work back from it to the exact cause at the beginning of that chain. Once a cause has been found, it is necessary to check whether it was itself caused by another problem or by negligence. This backward reconstruction of the cause-and-effect chain, also known as root cause analysis, usually ends with a problem that could cause further defects. Resolving this problem eliminates all defects it has already caused and avoids future ones.

In practice, it helps to follow the 5-Why method, in which several (not necessarily five) "why" questions are asked in succession. Only when the question can no longer be answered meaningfully has the cause of the defect been found with high probability, and this cause can then point to an effective improvement measure. As an example, take an error report stating that a user cannot change the content of a specific field in a form, although the user's role should allow it:

Question | Answer
Why can the field content not be edited by a specific user? | The field is read-only for him/her.
Why is it read-only? | The user’s role does not have the appropriate right.
Why is the user’s role missing this right? | Because it was not assigned when the user roles were set up.
Why was it not assigned? | It was not included in the specification of the user roles.

Sample root cause analysis inspired by the 5-Why method
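The question-and-answer chain above can also be written down as a small data structure. The following Python sketch is only an illustration: the names WhyStep and root_cause are assumptions made for this example and do not come from any particular ticketing tool. It treats the answer to the last "why" as the deepest cause found so far.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhyStep:
    """One step of a why-chain: a question and the answer that was found."""
    question: str
    answer: str


# The chain from the example, ordered from the observed effect
# down to the deepest cause that was identified.
chain = [
    WhyStep("Why can the field content not be edited by a specific user?",
            "The field is read-only for him/her."),
    WhyStep("Why is it read-only?",
            "The user's role does not have the appropriate right."),
    WhyStep("Why is the user's role missing this right?",
            "Because it was not assigned when the user roles were set up."),
    WhyStep("Why was it not assigned?",
            "It was not included in the specification of the user roles."),
]


def root_cause(steps):
    """Return the answer of the last step, i.e. the deepest cause reached."""
    if not steps:
        raise ValueError("a why-chain needs at least one step")
    return steps[-1].answer


print(root_cause(chain))
```

If someone later answers a further "why", the chain simply grows by one step and the root cause moves with it.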

Of course, one could go on to ask why the specification of the user roles was incomplete. In any case, this was not a ‘programming error’ or a ‘GUI error’, but an ‘incorrect specification’, possibly even a new requirement (and thus no error at all). Possible measures to prevent errors of the category ‘incorrect specification’ in the future are greater accuracy in requirements management and specification, appropriate quality assurance, and an approval process for the specification prior to its implementation.

The importance of standardized defect classes

For the practical application of root cause analyses, it is advisable to standardize the error causes company-wide and to ensure that only root causes from this schema are selected. Technically, this can be done with a picklist in the ticketing system or error tracking tool in use. Standardization is crucial for easily evaluating the frequency of errors by root cause category. These evaluations show in which areas improvement measures are likely to be most effective, because that is where most errors originate. To stay with the example above: if 25% of all detected errors fall into the category ‘incorrect specification’, it should be clear to those responsible that they have a problem with the creation and quality assurance of specifications. Put differently, they also have the chance to reduce their number of errors by up to 25% with what are probably a few targeted measures, and to save high bug-fixing costs.
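As a minimal sketch of such an evaluation, the picklist can be modeled as a fixed enumeration and the frequency per category counted from the tickets. Everything in the following Python example is hypothetical: the category names, the function share_by_root_cause and the sample tickets are made up for illustration and are not taken from a real tracking tool.

```python
from collections import Counter
from enum import Enum


class RootCause(Enum):
    """Hypothetical picklist of standardized root cause categories."""
    INCORRECT_SPECIFICATION = "incorrect specification"
    PROGRAMMING_ERROR = "programming error"
    GUI_ERROR = "GUI error"
    MISUSE = "misuse"


def share_by_root_cause(tickets):
    """Return (category, percentage) pairs, most frequent category first."""
    counts = Counter(tickets)
    total = sum(counts.values())
    return [(cause, 100 * n / total) for cause, n in counts.most_common()]


# Made-up sample: the root cause selected for each of eight closed tickets.
tickets = (
    [RootCause.INCORRECT_SPECIFICATION] * 2
    + [RootCause.PROGRAMMING_ERROR] * 4
    + [RootCause.GUI_ERROR]
    + [RootCause.MISUSE]
)

for cause, pct in share_by_root_cause(tickets):
    print(f"{cause.value}: {pct:.1f}%")
```

Because the root cause can only be one of the enumeration members, free-text variants such as "spec wrong" or "wrong spec" cannot split one category into several, which is what keeps the percentages comparable over time.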

In the management model described in our book series “Increasing Productivity of Software Development”, the results of error cause analyses are, alongside KPIs, important inputs to the evaluation phase. The results of this phase in turn form the basis for the optimization phase, which looks for the most effective improvement measures within the Key Performance Areas (KPAs). To make this easier, the top level of the defect class schema should already be grouped according to the relevant fields of action.

Schema of standardized error causes (extract, sample)

At first glance, regular root cause analysis of errors appears to be an additional effort compared with quickly eliminating the effects. However, if the management model works as intended, the organization saves money through root cause analyses. The next article in this series shows how this can be made transparent and controlled by regularly monitoring productivity and defect density.

Picture credit: Shutterstock / cunaplus 
