Saturday, April 25, 2015

Unit Test of Embedded System Code

Manifesto: Unit Test of Embedded System Code

I'm not sure exactly what this means, but I intend to apply it, starting now.
And why not give my efforts a grandiose name?

Having just spent an interesting Saturday in a Code Retreat graciously hosted by OC Tanner, including fabulous food, I hereby resolve to start applying some of what I learned, even though I am not at all certain what that means. Why let ignorance stand in the way of application? Here are my intended actions. To anyone who knows what they are doing these ideas may sound infantile, but so be it. Has anyone else already figured this all out? A minute of Googling the title of this post does produce some hits such as this paper from Parasoft.

What does it mean to test embedded code?

I'm envious of programmers who can live in the perfect world where their universe is just software: PC and web apps, databases, transaction processing, etc. Sure there is hardware there executing code and storing data, but you never have to touch it or debug it. Your PC either works or it doesn't. You have powerful debuggers and analyzers and never need to touch a multimeter or oscilloscope.

Embedded systems are a lot messier. For example, I am working on a driver for the AD7794, a 24-bit delta-sigma analog-to-digital converter with a PGA and a bunch of other programmable features. It works up to 125 °C, too! Good news: it does a lot and the options are useful. Bad news: it is really complex, so the driver has to manage that complexity. It uses a rather unusual (in my experience anyway) SPI interface which does not synchronize with assertion of the slave select. My first problem was that I could not find any good examples of general-purpose C code, so I started writing my own from scratch. The simplest reading of a device status register seemed to return nonsensical (but not random) data. How do I debug this? The failure might be in the part or in some hardware-access routine.

Then, another system uses a confusing series of shift registers to drive AC and DC loads in a system under control. How do I really test my code without also testing the outputs of these shift registers, for which I need special hardware? So in fact we are spinning a quick-turn circuit board to display the state of all system outputs and provide a means to test system inputs. It's not fully closed-loop, and therefore not capable of fully automated functional test, but it's better than the current state of affairs where we can't even tell if we are driving most of the loads (you can't see a low-wattage heat pad turn on). This board will let us visually watch ones and zeros being walked across outputs, both low-voltage DC and 120 VAC.
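A minimal sketch of the "walking ones" idea, assuming the outputs can be treated as one word of up to 32 bits. The names walkingOnes and walkOutputs are made up for illustration, and the writer callback stands in for whatever routine actually clocks data into the shift registers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds a walking-ones sequence for `width` outputs:
// 0b0001, 0b0010, 0b0100, ... one pattern per output.
inline std::vector<std::uint32_t> walkingOnes(unsigned width) {
    assert(width <= 32);  // this sketch only handles up to 32 outputs
    std::vector<std::uint32_t> patterns;
    for (unsigned bit = 0; bit < width; ++bit) {
        patterns.push_back(std::uint32_t{1} << bit);
    }
    return patterns;
}

// Hypothetical hook: hands each pattern to a caller-supplied writer, which
// on real hardware would shift it out and pause long enough to observe it.
template <typename Writer>
void walkOutputs(unsigned width, Writer&& writeOutputs) {
    for (std::uint32_t pattern : walkingOnes(width)) {
        writeOutputs(pattern);
    }
}
```

Inverting each pattern gives the matching "walking zeros" pass.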

Testing once is not enough

Maybe in the perfect world of software it is, but not in the embedded space. When you go home at night why not leave test code running with some way of logging errors? You might be shocked to find that there is some failure 0.001% of the time (one run in 100,000). If you can run a million or more tests overnight you might be lucky and see 10 such failures. This is indeed lucky: better to find and fix it now, early on, than have customers later report mysterious failures in the field.
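A minimal sketch of an overnight soak loop, assuming the test itself can be wrapped in a callable that returns true on pass. SoakResult and runSoak are hypothetical names. On a real target the failure log would go to flash, a serial port, or whatever survives the night.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Summary of a soak run: iterations executed, failures seen, and the
// iteration numbers of the first few failures for later analysis.
struct SoakResult {
    std::uint64_t runs = 0;
    std::uint64_t failures = 0;
    std::vector<std::uint64_t> firstFailures;  // capped to keep the log small
};

// Runs `test` repeatedly. `test` is a hypothetical callable that receives
// the iteration number and returns true on pass.
inline SoakResult runSoak(const std::function<bool(std::uint64_t)>& test,
                          std::uint64_t iterations,
                          std::size_t maxLogged = 16) {
    assert(static_cast<bool>(test));  // an empty callable is a caller bug
    SoakResult result;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        ++result.runs;
        if (!test(i)) {
            ++result.failures;
            if (result.firstFailures.size() < maxLogged) {
                result.firstFailures.push_back(i);
            }
        }
    }
    return result;
}
```

Recording the iteration number of each failure makes it possible to spot patterns, such as failures that cluster at a fixed interval.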

Consider functional test hardware as part of the original system design

What good is a system which can't be functionally tested? Usually this means special custom hardware (it can be simple, but it is still custom) just for the purpose of functional test. So consider that as part of the original system design and include it in the budget and schedule. If it's a design for someone else, explain to your customer the benefits of your doing so.

Add a C++ test class to every device driver

I'm working on several C/C++ device drivers for prototypes of commercial products. They are really C with just the thinnest layer of C++. I normally write tests as I go but don't save all of them. Many times they end up as chunks of commented-out code (a bad pattern/habit I'm working on correcting), or old versions of the driver which get discarded or lost in an archive. Instead, why not write a deliberate test class which can then be invoked from a simple driver test program and run throughout the product's life, as needed? This way the tests stay as part of the project but don't have to get built into the shipping binary. It seems like this should work. I'll try it straight off and let you know. Maybe everyone else in the world already does this and I'm the last to adopt it.
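A rough sketch of what such a test class might look like. ExampleDriver is a made-up stand-in for a real driver, and ExampleDriverTest is the companion class that would be compiled only into the test program, not the shipping binary.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a real device driver; names are illustrative.
class ExampleDriver {
public:
    bool init() { initialized_ = true; return true; }
    // Returns a 12-bit reading; a real driver would talk to hardware here.
    std::uint16_t read() const {
        assert(initialized_ && "call init() first");
        return 0x0ABC;
    }
private:
    bool initialized_ = false;
};

// Test class kept next to the driver but built only into the test program.
class ExampleDriverTest {
public:
    explicit ExampleDriverTest(ExampleDriver& driver) : driver_(driver) {}

    // Runs every check and returns the number of failures.
    int runAll() {
        failures_.clear();
        check(driver_.init(), "init() reports success");
        check(driver_.read() <= 0x0FFF, "reading fits in 12 bits");
        return static_cast<int>(failures_.size());
    }
    const std::vector<std::string>& failures() const { return failures_; }

private:
    void check(bool ok, const char* description) {
        if (!ok) failures_.push_back(description);
    }
    ExampleDriver& driver_;
    std::vector<std::string> failures_;
};
```

A small test program would construct the driver, call runAll(), and print or log the failure descriptions. The same class could be reused unchanged for long overnight runs.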

Mock up data to test algorithms

What I mean here is that if there is some data processing needed to convert from a sensor's native format (e.g., the TMP102 reports temperature as a 12-bit two's complement value, or 13-bit in extended mode, spread across two bytes) to a useful one (such as degrees C or F), it is a good idea to pass simulated data covering all possible values to that routine so that you know it works and the converted temperature values will be correct. I've seen open source drivers that just didn't bother with this and ignore the sign extension needed for negative temperatures. Simulated data matters even more if the algorithm has other dependencies, such as calibration coefficients pulled from device registers or calibration memory. What if those calibration values are wrong, or span all possible values for the given data type: will a calculation overflow or wrap around? This overlaps with boundary testing.
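A minimal sketch of sweeping every possible code through a conversion routine. It assumes the TMP102's default 12-bit mode, with the value left-justified across the two bytes and 0.0625 degrees C per count. That is my reading of the datasheet, so check it against yours. The function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: default 12-bit mode, data left-justified across two bytes.
inline int tmp102Counts(std::uint8_t msb, std::uint8_t lsb) {
    int counts = ((msb << 8) | lsb) >> 4;   // keep the upper 12 bits
    if (counts & 0x800) counts -= 0x1000;   // sign-extend negative readings
    assert(counts >= -2048 && counts <= 2047);
    return counts;
}

// Assumption: 0.0625 degrees C per count.
inline double tmp102Celsius(std::uint8_t msb, std::uint8_t lsb) {
    return tmp102Counts(msb, lsb) * 0.0625;
}

// Feeds every possible 12-bit code through the conversion and counts
// results that fall outside the expected range or break ordering.
inline int tmp102SweepFailures() {
    int failures = 0;
    double previous = -128.0625;  // one step below the most negative code
    for (int counts = -2048; counts <= 2047; ++counts) {
        std::uint16_t raw = static_cast<std::uint16_t>((counts & 0x0FFF) << 4);
        double c = tmp102Celsius(static_cast<std::uint8_t>(raw >> 8),
                                 static_cast<std::uint8_t>(raw & 0xFF));
        if (c < -128.0 || c > 127.9375) ++failures;
        if (c != previous + 0.0625) ++failures;  // each code is one step higher
        previous = c;
    }
    return failures;
}
```

With only 4096 codes, an exhaustive sweep is cheap. A driver that skips the sign extension would fail it on every negative code.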

Test all boundaries

If a 12-bit ADC returns a value into a 16-bit data type, and it is right-justified, the top nibble should always be zero filled. But what if it isn't: either the converter fails, or it is not initialized properly, or noise infects the data lines? Will downstream calculations or conversions fail? Test all such routines with the maximum possible data values. Clip values to the maximum permissible in your code if that makes sense. For example in a PID algorithm, there is the danger of "integral windup" but this can be handled by limiting the range of your integral variable.
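A minimal sketch of both ideas: checking and clipping a right-justified 12-bit sample, and limiting an integral term. All names here are hypothetical. Whether to clip, reject, or log an out-of-range sample is a design decision this sketch does not settle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint16_t kAdcMax = 0x0FFF;  // full scale for a 12-bit ADC

// True if the top nibble is clean, as expected for a right-justified sample.
inline bool adcSampleValid(std::uint16_t raw) {
    return (raw & 0xF000u) == 0;
}

// Clips a sample to 12-bit full scale so downstream math stays in range.
inline std::uint16_t clipAdcSample(std::uint16_t raw) {
    return std::min(raw, kAdcMax);
}

// Integral accumulator with a symmetric limit to prevent integral windup.
struct Integrator {
    explicit Integrator(double lim) : limit(lim) {
        assert(lim >= 0.0);  // a negative limit makes no sense here
    }
    double update(double error, double dt) {
        sum = std::max(-limit, std::min(limit, sum + error * dt));
        return sum;
    }
    double limit;
    double sum = 0.0;
};
```

A boundary test then feeds 0x0000, 0x0FFF, 0x1000, and 0xFFFF through the sample routines, and drives the integrator with a large error for many steps to confirm it stops at the limit.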

Catch all exceptions and at least report a unique code

Sometimes, if there is a possible but inconceivable error state, I put in a message such as "this should never happen: error XXX" where XXX is a unique integer so every error is distinct. Imagine my surprise when I see this very error message later in system test. Clearly something big has not met my expectations, but at least a) I know it happened and b) I have a clue where to look in the code.
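A minimal sketch of the pattern, using a made-up heater state machine. The code 1042 is arbitrary; the only rule is that no other line in the project reports the same number. Writing to stderr is an assumption, since many embedded targets would log to a UART or a fault register instead.

```cpp
#include <cassert>
#include <cstdio>

// Last "impossible" error seen; 0 means none has been reported.
inline int& lastErrorCode() {
    static int code = 0;
    return code;
}

// Records and reports an error that should never happen.
inline void reportImpossible(int code) {
    assert(code != 0 && "0 is reserved for 'no error'");
    lastErrorCode() = code;
    std::fprintf(stderr, "this should never happen: error %d\n", code);
}

enum class HeaterState { Off, Heating, Holding };

// Returns the duty cycle in percent for a hypothetical heater state machine.
inline int dutyFor(HeaterState state) {
    switch (state) {
        case HeaterState::Off:     return 0;
        case HeaterState::Heating: return 100;
        case HeaterState::Holding: return 30;
    }
    reportImpossible(1042);  // unique code: only this line emits 1042
    return 0;                // fail safe: heater off
}
```

Searching the source for the reported number leads straight to the line that raised it.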

Use a C/C++ documentation tool

Documentation was never mentioned in the code retreat. Apparently it is not even taught in most CS and CE curricula. What's wrong with this picture? That's a topic for another day. One thing I like and miss about Java is javadoc: it is baked right into the tools, so there's no valid reason not to use it. With C/C++ you have to take extra steps to even find and install a tool. I'll try Doxygen first.
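For a sense of what this looks like, here is a small sketch of a Doxygen-style comment block on a made-up conversion function. The tags used (@brief, @param, @return, @pre) are standard Doxygen commands and read much like javadoc. The function itself, adcCodeToVolts, is purely illustrative.

```cpp
#include <cassert>
#include <cstdint>

/**
 * @brief Converts a right-justified 12-bit ADC code to volts.
 *
 * @param code  ADC code, 0 to 4095.
 * @param vref  Reference voltage in volts.
 * @return Voltage in volts, between 0 and vref.
 * @pre code <= 4095
 */
inline double adcCodeToVolts(std::uint16_t code, double vref) {
    assert(code <= 4095);  // enforce the documented precondition
    return vref * code / 4095.0;
}
```

Keeping the documented precondition and the assert side by side means the generated documentation and the code's behavior are less likely to drift apart.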