How good is the firmware?

May 5th 2018

When organizations cannot answer how good the firmware needs to be, they will often ask instead: How good is the firmware? How many bugs are in the firmware? What is the quality of the firmware?

So how do you measure firmware quality?

Many people resort to a "good firmware is working firmware" mentality. Others say "if it passes MISRA it is good".

The reality is that I can write unstable firmware that works, and I can write MISRA-compliant code that does not work.

A better place to start is to ask what quality means to you.

For example, some organizations rate how easily the code can be understood by others as part of their quality metric, while small shops where there is only one developer may not consider ease of understanding, even if they should. So quality really means different things to different people. However, one universal is the fit for purpose requirement.

Fit for Purpose

Everyone seems to agree that if the firmware does not work as intended, it is of poor quality. That is, no matter how well the actual code is constructed, if it does not do what they want it is poor quality. To visualize this, take a brand new top-of-the-line Mercedes and all the fine German engineering that went into it. Now visualize the need to haul a load of gravel to a construction site, and then say "The Mercedes is of poor quality because it cannot haul the gravel."

The problem with using fit for purpose to define quality is that organizations fail to define the purpose. That is, they asked for a quality vehicle and got a Mercedes when they wanted a dump truck. To use fit for purpose metrics in analyzing code, you need to have some requirements written down. More specifically, the requirements should be good requirements (i.e. testable).

Bugs

When it comes to code quality, people often ask "how many bugs does the firmware have?" I love this question, as I can always say "a lot". No code ships without bugs, and if you have not documented the requirements, then the firmware is just one big bug.

There are documented requirements and assumed requirements. A customer might assume the product should operate a certain way, and if it does not, then for them it is a bug.

Stability 

Code can be perfect, pass all testing, and never present the end customer with a problem, and still be of bad quality. Stable firmware is firmware that is clear, easy to maintain over time, and can be reused on future products.

So how can we judge the quality of firmware? Let's start with a list of questions:

Are the requirements documented and testable?

If the answer is no, then the firmware is of poor quality. That is, you cannot meet fit for purpose without the purpose!

Do you have tests for each requirement, and has the firmware passed them?

If there are no tests, or the firmware fails testing, you have problems.

Is the firmware in version control?

The firmware can be perfect and work as intended; however, without version control I generally find that this is a temporary state at best. To put it another way, you would not hire a bankrupt financial planner, so you do not want to trust a developer who does not use version control.

Does the firmware have a version change log?

It is frustrating to try to determine which version of the code a bug was fixed in. You cannot trust code where the changes are not documented.

Does the firmware have any documentation, example test commands or interface specs? 

The last thing you need is to find out that the test group has been testing a product the wrong way for the last several weeks or months. Hence the firmware must include documentation and interface specs as needed so you can trust your test results.

Does the firmware monitor heap and stack usage?

One of my first checks in code is to look for memory allocation. I find that code which uses memory allocation incorrectly means the developer does not understand firmware development, and as a result the code will be full of countless other bugs. If the firmware has stack and heap tracking, then the developer is experienced and the code will usually be of much higher quality.
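As a rough illustration of what stack tracking can look like, here is a minimal sketch of the common "stack painting" approach. It assumes a descending stack and covers stack usage only, not the heap. The fill pattern and the function names stack_paint and stack_words_used are made up for this example, and on real hardware the region would come from the linker script rather than a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fill pattern; any value unlikely to appear by chance works. */
#define STACK_FILL 0xA5A5A5A5u

/* Fill the whole stack region with the pattern. On a target this is done
 * once at startup, before the region is in use. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Assuming a descending stack, the words nearest the base are used last.
 * Count the untouched words from the base upward; everything above the
 * first overwritten word is treated as the high-water mark. */
size_t stack_words_used(const uint32_t *base, size_t words)
{
    size_t unused = 0;
    while (unused < words && base[unused] == STACK_FILL) {
        unused++;
    }
    return words - unused;
}
```

Firmware that reports this number periodically, or logs it when it crosses a threshold, gives you a measured stack margin instead of a guess.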

Does the team use bug tracking on firmware? 

I have used a compiler in the past where I submitted a bug and they fixed it. However, the same bug appeared in the next release; needless to say, I do not use that compiler anymore. It is said that those who do not learn from history are bound to repeat it, and bug tracking is part of that historical documentation. Mature organizations will go further and create test cases for the bug and add them to their regression test suite to make sure they never see it again.

Does the firmware have regression testing and/or nightly builds? 

If an organization has regression testing, then they have version control, bug tracking, requirement specification, etc. Hence the odds are their firmware will be of higher quality.

Does the firmware have error logging to non-volatile memory?

Suppose something goes wrong in the firmware, for example a driver fails to initialize properly. How does this get back to the developers to be fixed? If there is no error logging or handling, then the firmware will almost certainly be full of bugs, if not now then in future releases for sure.
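To make the idea concrete, here is a minimal sketch of an error log kept as a ring buffer of fixed-size records. Everything in it is an assumption for illustration: the record layout, the slot count, and the names err_log, err_log_total and err_log_recent are hypothetical, and the nvm_log array is only a stand-in for a flash or EEPROM region that real firmware would write through its own driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERR_LOG_SLOTS 8u   /* hypothetical log depth */

typedef struct {
    uint32_t timestamp;    /* e.g. uptime in seconds */
    uint16_t code;         /* application-defined error code */
    uint16_t line;         /* source line that reported the error */
} err_record_t;

/* Stand-in for a non-volatile region; on hardware this would be flash or
 * EEPROM accessed through a driver, not a RAM array. */
static err_record_t nvm_log[ERR_LOG_SLOTS];
static uint32_t err_count;

/* Store one record, overwriting the oldest entry once the log is full. */
void err_log(uint32_t timestamp, uint16_t code, uint16_t line)
{
    err_record_t rec = { timestamp, code, line };
    memcpy(&nvm_log[err_count % ERR_LOG_SLOTS], &rec, sizeof rec);
    err_count++;
}

/* Total number of errors reported, including overwritten ones. */
uint32_t err_log_total(void)
{
    return err_count;
}

/* Return the n-th most recent record (0 is the newest), or NULL if that
 * record was never written or has already been overwritten. */
const err_record_t *err_log_recent(uint32_t n)
{
    uint32_t stored = (err_count < ERR_LOG_SLOTS) ? err_count : ERR_LOG_SLOTS;
    if (n >= stored) {
        return NULL;
    }
    return &nvm_log[(err_count - 1u - n) % ERR_LOG_SLOTS];
}
```

Even a log this small answers the question above: when a unit comes back from the field, the developers can read out what failed and when.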

Has static code analysis been run on the firmware?

Like many things, this indicates the maturity of the development team. If they do not do it, then they do not value finding bugs and becoming better developers. Hence your firmware will most likely have bugs.

Does the firmware need and use watchdog timers? 

Watchdogs are not needed for all products. However, if you ask about watchdog timers and the developers know what they are but have no documented reason why the product does not need one, you know you have problems.
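For readers who have not used one, here is a minimal sketch of one common way to service a watchdog: the timer is only reloaded once every monitored task has checked in, so a single hung task still causes a reset. The task bits, the function names and the wdt_kicks counter are hypothetical; on a real part the counter increment would be a write to the watchdog reload register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tasks that must each prove they are still running. */
#define TASK_SENSOR (1u << 0)
#define TASK_COMMS  (1u << 1)
#define TASK_ALL    (TASK_SENSOR | TASK_COMMS)

static uint32_t checked_in;
static uint32_t wdt_kicks;   /* stand-in for the hardware reload write */

/* Each task calls this once per pass through its own loop. */
void wdt_task_alive(uint32_t task_bit)
{
    checked_in |= task_bit;
}

/* Called periodically from the main loop. The watchdog is reloaded only
 * if every task has checked in since the last reload. */
bool wdt_service(void)
{
    if ((checked_in & TASK_ALL) != TASK_ALL) {
        return false;        /* someone is stuck: let the watchdog expire */
    }
    checked_in = 0;
    wdt_kicks++;
    return true;
}

/* Number of reloads so far, for diagnostics. */
uint32_t wdt_kick_count(void)
{
    return wdt_kicks;
}
```

Reloading the watchdog unconditionally from a timer interrupt defeats its purpose, which is why the reload here is tied to evidence that the tasks are alive.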

Does the firmware have an error recovery plan?

What does the firmware do if a driver fails to initialize properly? For most firmware I have seen, the answer is nothing, because they don't check. Often the developers don't check because they don't know what to do when an error happens. Hence if there is an error recovery plan (even if it is just a reboot), then the developers are more likely to check for errors.
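A recovery plan does not need to be elaborate. The sketch below shows one possible policy, written under assumptions of my own: retry the initialization a few times, then either run without the feature or ask for a reboot, depending on whether the driver is essential. The type and function names are hypothetical, and the three small init functions are stand-ins for real drivers.

```c
#include <stdbool.h>

/* Outcome of an initialization attempt, from best to worst. */
typedef enum {
    INIT_OK,          /* worked the first time */
    INIT_OK_RETRIED,  /* worked after at least one retry */
    INIT_DEGRADED,    /* failed; product runs without this feature */
    INIT_REBOOT       /* failed; caller should log the error and reboot */
} init_result_t;

typedef bool (*driver_init_fn)(void);

/* Try the init function up to (1 + max_retries) times, then apply the
 * policy for a driver that never came up. */
init_result_t init_with_recovery(driver_init_fn init, int max_retries,
                                 bool essential)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        if (init()) {
            return (attempt == 0) ? INIT_OK : INIT_OK_RETRIED;
        }
    }
    return essential ? INIT_REBOOT : INIT_DEGRADED;
}

/* Hypothetical stand-in drivers used only to exercise the policy. */
static int flaky_calls;
bool good_init(void)  { return true; }
bool dead_init(void)  { return false; }
bool flaky_init(void) { return ++flaky_calls >= 3; }  /* third call works */
```

The point is less the specific policy than that one exists: once the return value has somewhere to go, checking it stops being optional.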

Has the firmware gone through a peer review? 

If the firmware has not been reviewed by others, then it will have bugs. Often developers are scared of peer reviews because they know their code is not good enough. Hence be wary of any code where developers have not had peer reviews.

How many issues per hour were found in the peer review?

As you do peer reviews, count the bugs found per hour of review. This is perhaps the best actual metric for code quality. That is, if you find zero bugs after an 8 hour code review, then you most likely have pretty good code. If you find 8 bugs in the first 15 minutes, then you might want to stop the peer review and decide how to pivot.
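The arithmetic behind this metric is simple enough to write down. The helper below is only a sketch, and its name and its choice to return zero for a zero-length review are assumptions.

```c
/* Issues found per hour of review. Returns 0.0 if no review time was
 * recorded, to avoid dividing by zero. */
double issues_per_hour(unsigned issues_found, double review_minutes)
{
    if (review_minutes <= 0.0) {
        return 0.0;
    }
    return (double)issues_found * 60.0 / review_minutes;
}
```

For the two cases above, zero bugs in an 8 hour review is a rate of 0 per hour, while 8 bugs in 15 minutes is a rate of 32 per hour.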

Do the functions in the code have side effects?

Functions and methods in the code should do what they say they do and nothing more. For example, a function that changes a global variable has a side effect. That is, the next developer who modifies the code might not know that the function changes the global variable, and thus creates a new bug when modifying the code.

Yes! This means global variables are a bad thing and should be avoided!

Another example might be a function which initializes a driver and also turns on an LED. Turning on the LED is most likely not required to initialize the driver and hence is a side effect. Code that has functions with side effects might pass all tests, but the code base is unstable and minor edits can cause major failures, and hence it is considered poor quality.

I often look at the source files and if they have extern.h, or globals.h/c then this is a big red flag that even if the code works it will most likely require a near complete rewrite to be maintainable and stable. 
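The difference is easy to see in a small, made-up example. Both functions below scale a raw reading and add an offset. The first quietly modifies a file-scope variable on every call, while the second takes everything it needs as parameters. The names and the arithmetic are hypothetical and chosen only to show the effect.

```c
#include <stdint.h>

static int32_t g_offset;   /* shared state, visible to the whole file */

/* Side effect: every call changes g_offset, so the same input gives a
 * different answer next time, and nothing in the name warns the caller. */
int32_t read_scaled_hidden(int32_t raw)
{
    g_offset += 1;
    return raw * 10 + g_offset;
}

/* No side effect: the result depends only on the arguments. */
int32_t read_scaled(int32_t raw, int32_t offset)
{
    return raw * 10 + offset;
}
```

Calling read_scaled with the same arguments always gives the same result, so it can be tested in isolation and moved or reordered safely. The hidden version cannot, which is exactly what makes later edits risky.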

Is the project file structure and build process documented? 

You might have perfect firmware, but if your developer leaves the organization and no one knows how to build the firmware, you have a problem. If there are no instructions about how to build the firmware, including the tools needed (compiler versions, etc.), then you will have bugs in the near future. Also, if the project file structure is a mess, such that code is not organized by function, then you are guaranteed that even the original developer will not understand it in 6 months and will create new bugs each time the code is touched.

Is the firmware built with optimizations turned on? 

If the firmware is built with optimizations turned off for a release build, you can be 100% sure that you should not ship the product. That is, the developer most likely left optimizations off because the firmware would not work with them turned on and they could not figure out why. If they left optimizations off because they did not know better, then again they will have created many more bugs in the code.

Does the firmware have warnings when compiled with all warnings on? 

Often developers will turn off warning reporting on their projects, most often because they do not understand the warnings. More often than not, the warnings are actual bugs. If the project has warnings when compiled with all warnings turned on, or the developer turned off warnings, or worse just ignored them, then your code quality is low.

Does the firmware have a method of doing firmware updates, aka boot loader? 

Writing firmware that performs firmware updates is a difficult task. Most developers cringe when it comes up and avoid it like the plague, as it is one of the hardest tasks to get right. That is, it requires intimate knowledge of the processor and firmware to make it work. Hence if the developer has added this capability, it indicates they are experienced and know that their code will not be perfect. If they argue against it, or complain that it will take too long, then it is a good sign your code might have issues where it is needed.

If you go through each question above, by the time you are done you will have, if nothing else, an intuitive feel for the firmware's quality. For sole-developer projects, the reality is that the firmware quality is tied directly to the developer's capability. Therefore, for this developer, any metric on the quality is a direct attack on their livelihood and will be met with considerable resistance.

With all problems, I usually recommend first solving the problem, and second solving how the problem came into existence or was not found earlier. However, with developers and code quality, I recommend organizations first plan how to prevent the problem in the future. For example, if you have lined up training and mentoring for the developer before going through the code quality analysis, it might help them feel their livelihood is not threatened.

Misfit Tech offers consulting and contract services for your firmware or hardware development needs.