HOW TO WRITE DEBUGGABLE CODE.

by Larry Colen

Quality can't be added, it has to be designed in.

A summer intern asked for my help learning to program in C. I thought that it would be worthwhile to write a quick missive explaining some of the tips and tricks that I've learned over the past 20 years that I've been attempting to bend computers to my will. As with many such projects, what was initially conceived as a quick half-hour project quickly grew in scope.

This missive will not attend to the whole software engineering process. It will concentrate on tips and techniques to use during the "implementation" phase, while the code is actually being written.

While the example language will be C, most of these tricks and techniques can be applied to any programming language.

Programmer Discipline:

Discipline is probably the most important requirement for writing reliable code that is easy to debug. By discipline I do not mean leather and whips, but a strict adherence to a consistent set of coding practices. This discipline can be imposed by the organization (the software engineering process), the tools (lint), or the programmer. A programmer with good discipline can write reliable, easy to debug code in any language, be it C++, Ada or even assembly.

Let us assume that proper coding practices are being followed, whether by the programmer's own self-discipline, or external mandate. What are the components that make coding practices effective, rather than merely a bureaucratic time sink? I feel that there are three of these components: Consistency, Comments and Clarity.

There are two advantages to consistency. First, as I tell my students when I teach performance driving, "You can't do it right every time until you can do it the same every time". Second, code is a lot easier to read when it is written in a consistent style.

Many people make the mistake of thinking that, since the comments are not compiled into the code, they are only of secondary importance. If, however, you look at the usage of a source file over the life of a product, you will find that compilers spend far less time reading the file than programmers do. There are two types of documentation associated with software: reference and tutorial. Most programmers make the mistake of only writing reference documentation, when they write any at all. Unfortunately, reference documentation does little, if any, good for someone who doesn't already know what is being documented in the first place.

The clarity of a programmer's style is often overlooked. Both the code and the comments should be laid out in a logical, orderly, easy to read manner. Use sufficient white space so that the information is readable, but not so much that one cannot keep a reasonable amount of code in view at one time. This is one of the hardest things to balance, depending almost entirely upon the work environment of the programmer:

Is the work done off of paper or off of a CRT?
How many lines per page?
How many columns per line?
How many pages are in view at a time?

Consistency:

One of the biggest nightmares a contract employee has to face is the plethora of coding styles that one is likely to find on a new project. Theologians can debate endlessly on the "One true indenting style" or the "One true naming convention". These are well and truly religious issues that cannot be proven, but must be accepted as matters of faith. Every style has its advantages and disadvantages. The important thing is that a single, consistent style be used.

Most structured languages do not impose their own discipline in this area. A C compiler does not care if the opening brace is on the same line as the if, lined up with both the if and the closing brace, or indented with a tab or a number of spaces equal to the number of vowels in the name of the month. It is, however, very difficult for most programmers to read code where the 'style practices' keep changing. While I truly believe that my own choice of styles is indeed the "One true programming style", it is far more important to be consistent. If you are modifying someone else's code, use a style consistent with the rest of the file, or project.

There are two things which an organization-wide coding standard will achieve. First of all, it is almost guaranteed to upset everyone a little bit. If one person mandates his own style upon the whole organization, then there will be only one person who doesn't disagree with some aspect of it. Second, after a period of adjustment, everyone will be able to read everyone else's code without the cognitive dissonance that comes from reading an unfamiliar style.

There are some people who get around the style question by using automated formatting tools (pretty printers) to automatically convert code from one programming style to another. In my experience, these tools do a pretty good job for most of the task, but will not be able to do a good job for the whole task.

Commenting style:

One of the most important things that I have learned is to do the documentation first. In thirteen years of professional programming, I have actually gone back after a program was done and cleaned up the documentation exactly once.

The corollary to this is of course that you must keep the documentation up to date as you go along.

NOTES ON COMMENTS:

There are several kinds of comment formats. They can be in the form of headers, they can be in comment boxes, they can be columnar at the end of a line, they can start in the first column, or they can be formatted along with the text.

The format of a comment will greatly affect how much it interrupts the flow of reading a program. I personally feel that the high level documentation (This is what the function does, and why) should be in a structured header. There should be enough running commentary that a competent programmer can follow the flow of a program, and no more. This in-line commentary should be in columns parallel to the code, so that the "code column" can be read without being interrupted by comments, and the "comment column" can be read with minimum interruption by long lines of code. I use hard tabs to start my comments in a consistent column and to end them in a consistent column:

					/* I find that having the	*/
					/* comment delimiters lined up 	*/
					/* makes it easier for the 	*/
					/* programmer's eye to pick the	*/
					/* comments out from the code.	*/

If I want a comment to stand out, either a warning, or a signpost to say that I am entering a new section of the code, I put in a comment box. This box, like the commentary, will have the delimiters in regular columns, but it will have the top and bottom of the box filled in with a line of stars.

				/****************************************/
				/*					*/
				/* Here there be Dragons.		*/
				/*					*/
				/****************************************/

I try to restrict any comments that must be so wordy as to require starting in the first column, to the headers.

My theological belief on comments formatted along with the code is that they just serve to interrupt the flow of reading the code. I find that when looking at the body of a program, I am mostly interested in reading the code, and only occasionally want to look at the comments for further explanation.

Mark questionable, or problem areas with
			/*!!! This is a warning		*/
or 
			/*??? I'm not sure about this	*/

HEADERS:

A very useful commenting technique is the use of standard headers. These fall into two classes: file headers and function headers.

Many years ago, I set up a header format so that I could easily write an editor macro to extract the headers into a separate file. I then took care to include everything in the header that I would want to find in a manual page on that function, or library. At Schlumberger, we have an automated tool, called "automan", for extracting this information from the headers to construct the manual pages.

At first, writing the function headers seems like an awful lot of bother. What I try to do is to write all the function headers during the low-level design phase of the program. If you look at the header, it has all the information that you need before writing a function. If you haven't already thought about these questions, you haven't really designed the function. As long as you are documenting your design (You are documenting your design, aren't you?) you might as well document your software at the same time.

There are software tools available that will analyze an entire program and automatically create a function header with all sorts of interesting data, such as what functions are called by the function, what functions call it, what variables are used, etc. I have found these tools to be very useful in some cases and an incredible waste of time and bother in others.

At the very top of the file should be the name of the file, and the time and date of the last revision. This is important because when you look at printouts it allows you to tell at a glance what they are, and how current they are.

The automan documentation goes into detail about all the various automan fields.

A SAMPLE TEMPLATE FOR A FILE HEADER

/*
cfhdr.c
XX/XX/XX XXXX
*/
/****************************************************************************/
/*MAN MSRT
NAME
    .c -- 

SECTION PURPOSE
  

DESCRIPTION
SECTION TEST SEQUENCE
<
>
SEE ALSO
    
DESCRIPTION

AUTHOR
    Larry Colen

UPDATES
     XX June    1996  L. Colen  Created

*/
/****************************************************************************/

/****************************************************************************\
*  Copyright (C) 1996  with all the usual disclaimers                       *
\****************************************************************************/

A SAMPLE TEMPLATE FOR A FUNCTION HEADER

/*MAN APEDIAG @begin_fn_hdr@==============================================
NAME
		
SYNOPSIS BELOW

PARAMETER	

RETURN VALUE

SIDE EFFECTS

DESCRIPTION

TESTING

WARNINGS

SEE ALSO

AUTHOR
    Larry Colen
	
UPDATES
    JUN  XX, 1996  Larry Colen	First created

============================================================@end_fn_hdr@*/

MY ORIGINAL FUNCTION HEADER LAYOUT

/*@begin_fn_hdr@=========================================================
 Name:		A plain-english version of the name.
 Synopsis:	A one or two line explanation of the function.
 Input:		What it requires (parameters and globals) as input.
 Output:	What is returned, or changed, by the function.
 Description:	A detailed description of the function, including
 		explanations of algorithms, intent, and any tricks.
 Testing:	Things to keep in mind while testing the program.
		Special cases to test for.
 Warnings:	Anything that might be dangerous, or a source of bugs.
 Updates:	Note who modified the function, when, why and how.
===========================================================@end_fn_hdr@*/

Alternate versions of this header can include an example declaration and a typical usage of the function. The premise is that if the headers of all the functions were extracted with the function declarations, it would make a good basis for a programmer's reference to the system. On a system with a good tags facility, the hypertext ability of tags makes extracting the headers less important.

TRICKS TO IMPROVE CLARITY :

Line things up.

Human vision is very good at recognizing patterns. One of the easiest patterns to recognize is a straight line. Standardized indentation of blocks of code is a technique that makes use of this. I have found several other places where lining things up helps the programmer to quickly find what he is looking for.

I have already discussed lining up comment delimiters. Lining up the various fields in variable declarations makes it a lot easier for the programmer to find the declaration of a particular variable, or all the variables of a particular type. I also strongly feel that every variable should be declared on a line by itself, and it should be commented as to what it does, or is used for. When looking at the function, it is often obvious what each variable does. However, when maintaining code, it is standard practice to grep for strings that have something to do with what you are trying to achieve. Documenting each variable makes it a lot easier to find the particular function that you are looking for.

Here are the variable declarations out of one of my functions:
   register int		what_to_do;	/* which selection from menu	*/
   register int		last_test;	/* the last APE board to test	*/
   	    int		loop_count = 1;


   register int		idx;		/* generic index		*/
					/* I used to always use 'i' for */
					/* my generic index, until a 	*/
					/* friend pointed out that 	*/
					/* searching on idx generated	*/
					/* fewer "false hits"		*/
            int		flag;		/* generic flag			*/
   register int		thead_idx;	/* test head index		*/
   STDItem_list 	*glob_ilist;	/* Which chs to test   		*/
      
   
   SYSMixed_config	*mix_sig_config;

When performing a series of calculations, I've found that lining up the '=' makes the code a lot more readable. Again, the pattern matching of human vision will separate out what is being changed from the inputs for each arithmetic step.

   test_function = reg_struct_ptr->reg_addr;
   test_mask     = reg_struct_ptr->reg_mask;
   a_mask        = test_mask & 0x5555;
   b_mask        = test_mask & 0xAAAA;

PREPROCESSOR TRICKS

Use the preprocessor and the compiler to automatically generate sanity checks, and to keep related pieces of information in sync with each other.

Use enums to keep things in order and to automatically generate a symbol for how many items are possible.

enum foo_limits
{
   LIMITA,
   LIMITB,
   LIMITC,
   NUM_OF_LIMITS
} foo_limits;
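As a minimal sketch of how the counting symbol can double as a sanity check, consider the following. All of the names (limit_idx, limit_name, limit_to_name) are hypothetical and chosen only for illustration. The array is deliberately left unsized, so that a compile-time test can catch a table that has fallen out of step with the enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only.				*/
enum limit_idx
{
   LIMIT_A,
   LIMIT_B,
   LIMIT_C,
   NUM_OF_LIMITS		/* must stay last: counts the entries above */
};

/* No explicit size, so the compiler counts the initializers.		*/
static const char *limit_name[] =
{
   "limit a",
   "limit b",
   "limit c"
};

/* Compile-time sanity check: the array size below becomes -1, which	*/
/* will not compile, if the table and the enum get out of step.	*/
typedef char limit_name_check[
   ((int)(sizeof limit_name / sizeof limit_name[0]) == NUM_OF_LIMITS) ? 1 : -1];

/* Return the name for idx, or NULL if idx is out of range.		*/
const char *limit_to_name(int idx)
{
   if (idx < 0 || idx >= NUM_OF_LIMITS)
      return NULL;
   assert(limit_name[idx] != NULL);
   return limit_name[idx];
}
```

With this arrangement, adding a LIMIT_D to the enum without adding its name to the table shows up as a compile error rather than as a bug found later.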

A friend taught me an extremely handy use of the preprocessor. When there is a lot of information that needs to be kept in sync, but the data needs to be stored in multiple locations, create a ".def" file with all of the information in tabular format. Set the table up as a macro applied to all of the fields of each line of the table. Then the macro can be redefined as needed to initialize arrays, structures, enums, or whatever, with the information in the table. That way, everything is kept in one place and you don't get errors from changing a table in one location, and forgetting to change it someplace else.

/* The following 2 lines are a comment, and one line out of a .def file. */
/*	 SYMBOL,         ACCESS, INDEX,     TYPE,   NUMERIC, BIT MASK */
REG_DEF("REGISTER_NAME", RW, REGISTER_NAME, TYPE_1, 0x02A80, 0x000000FF )

/* In a header file, we might have the          */
/* following to automatically generate the      */
/* index into an array of structures of the     */
/* table data.                                  */
#ifdef REG_DEF
#undef REG_DEF     /* Make sure the macro isn't already defined */
#endif

#define REG_DEF( R_SYMBOL, R_ACCESS, R_INDEX, R_TYPE, R_NUMERIC, R_BIT_MASK ) \
        R_INDEX,

enum table_idx
{
#include "foo_reg.def"          /* this is where the table is loaded    */
   NUM_OF_TABLE_LINES           /* the macro leaves a trailing comma    */
                                /* this also tells us the size of the table. */
} table_idx;

/* In a source file, we will initialize an      */
/* array of structures with the data from the   */
/* table in the .def file.                      */
typedef struct reg_struct
{
   int   access;                /* 1=W, 2=R, 3=RW...            */
   int   reg_type;              /* 1, 2, 3 ...                  */
   int   reg_mask;              /* Which bits are active        */
   char *reg_name;
} reg_struct;

#ifdef REG_DEF
#undef REG_DEF     /* Make sure the macro isn't already defined */
#endif

#define REG_DEF( R_SYMBOL, R_ACCESS, R_INDEX, R_TYPE, R_NUMERIC, R_BIT_MASK ) \
        R_ACCESS, R_TYPE, R_BIT_MASK, R_SYMBOL,

reg_struct *reg_struct_array;
reg_struct  foo_reg_struct_ary[] =
{
#include "foo_reg.def"
   0,0,0,0                      /* terminating entry */
};
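The same idea can be tried out in a single file. The sketch below is a variation, not the layout described above: instead of a separate .def file it keeps the table in a list macro (FOO_REG_TABLE) that takes the per-line macro as its argument. The register names, addresses and masks are made up for illustration, as is the foo_reg_lookup() helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table; stands in for the contents of a .def file.	*/
/*	    SYMBOL,    ACCESS, INDEX,    TYPE, NUMERIC, BIT MASK	*/
#define FOO_REG_TABLE(REG_DEF)						\
   REG_DEF("CONTROL", 3, REG_CONTROL, 1, 0x02A80, 0x000000FF)		\
   REG_DEF("STATUS",  2, REG_STATUS,  1, 0x02A84, 0x0000000F)		\
   REG_DEF("DATA",    3, REG_DATA,    2, 0x02A88, 0x0000FFFF)

/* First pass over the table: generate the index enum.			*/
#define REG_AS_INDEX(sym, acc, idx, type, num, mask)	idx,
enum table_idx
{
   FOO_REG_TABLE(REG_AS_INDEX)
   NUM_OF_TABLE_LINES		/* follows the macro's trailing comma	*/
};

typedef struct reg_struct
{
   int         access;		/* 1=W, 2=R, 3=RW			*/
   int         reg_type;
   int         reg_mask;	/* which bits are active		*/
   const char *reg_name;
} reg_struct;

/* Second pass over the same table: generate the array of structures.	*/
#define REG_AS_STRUCT(sym, acc, idx, type, num, mask)	{ acc, type, mask, sym },
static const reg_struct foo_reg_struct_ary[] =
{
   FOO_REG_TABLE(REG_AS_STRUCT)
};

/* Return the table entry for idx, or NULL if idx is out of range.	*/
const reg_struct *foo_reg_lookup(int idx)
{
   if (idx < 0 || idx >= NUM_OF_TABLE_LINES)
      return NULL;
   assert(foo_reg_struct_ary[idx].reg_name != NULL);
   return &foo_reg_struct_ary[idx];
}
```

Because the enum and the array are both generated from the one table, adding a register is a one-line change, and the index REG_STATUS always selects the "STATUS" entry.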

In order to prevent problems from a header file being included multiple times (if it is included by two different header files, both included by the source file), define a macro that is a permutation of the header file's name. Wrap everything, including that definition, in a preprocessor test to see if that macro is defined.

#ifndef _FILE_H_
#define _FILE_H_
      /* contents of the include file go here */
#endif

This is about the only occasion on which I will use an #ifdef rather than an #if. I prefer to have all of my flags explicitly turned either on or off. It also makes it easier to tell what the compilation options are.
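A small sketch of flags that are explicitly on or off might look like the following. The flag names (USE_RANGE_CHECK, USE_LOGGING) and the clamp_percent() function are hypothetical. One thing to keep in mind is that #if quietly treats an undefined name as 0, so this sketch assumes you want a missing definition to stop the build; in a real project the two #defines would normally live in a configuration header or on the compiler command line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical compilation options, each explicitly on (1) or off (0).	*/
#define USE_RANGE_CHECK	1
#define USE_LOGGING	0

/* #if treats an undefined name as 0, so insist on a definition.	*/
#if !defined(USE_RANGE_CHECK) || !defined(USE_LOGGING)
#error "USE_RANGE_CHECK and USE_LOGGING must each be defined as 0 or 1"
#endif

/* Force value into the range 0..100 when range checking is on.	*/
int clamp_percent(int value)
{
#if USE_LOGGING
   fprintf(stderr, "clamp_percent(%d)\n", value);
#endif
#if USE_RANGE_CHECK
   if (value < 0)
      value = 0;
   if (value > 100)
      value = 100;
   assert(value >= 0 && value <= 100);
#endif
   return value;
}
```

Reading the top of the file is enough to see which options are in effect, which is harder when a flag is "on" merely by being defined somewhere.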

One of my favorite quotes about programming comes, I believe, from Dennis Ritchie, and I paraphrase: "Never underestimate the power of a printf when debugging. But don't spend hours getting the format of a debugging statement perfect".

What I like to do, is to wrap my debugging statements in conditional compilation statements. But rather than using the macro as a simple flag, I use it as a bitwise flag, with each bit representing a different phase of debugging. This way, as I am writing the program, I can plan ahead what to check while debugging the program, and make sure that all important aspects of the code get checked during unit test.

As the program is debugged, I can turn off different sections of debugging code. Later in the life cycle of the product, people maintaining (modifying or debugging) the code, can then make use of the previous work developing the debugging statements.
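The bitwise flag can be sketched as follows. The flag name AVG_DEBUG, the phase bits and the average() function are all hypothetical; the point is only that each bit switches one class of debugging output on or off at compile time, while the parameter test itself stays in place.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debugging phases, one bit each.			*/
#define DBG_PARAMS	0x0001	/* report bad parameters		*/
#define DBG_TRACE	0x0002	/* trace each pass through the loop	*/
#define DBG_RESULTS	0x0004	/* report the final result		*/

/* Define AVG_DEBUG before this point to pick other phases.		*/
#ifndef AVG_DEBUG
#define AVG_DEBUG	(DBG_PARAMS | DBG_RESULTS)
#endif

#define AVG_ERROR	(-1)	/* simplification: -1 doubles as error	*/

/* Integer average of count values, or AVG_ERROR on bad parameters.	*/
int average(const int *vals, int count)
{
   long sum = 0;
   int  idx;

   if (vals == NULL || count <= 0)
   {
#if (AVG_DEBUG & DBG_PARAMS)
      fprintf(stderr, "average: bad parameters, count=%d\n", count);
#endif
      return AVG_ERROR;
   }

   for (idx = 0; idx < count; idx++)
   {
      sum += vals[idx];
#if (AVG_DEBUG & DBG_TRACE)
      fprintf(stderr, "average: idx=%d sum=%ld\n", idx, sum);
#endif
   }

   assert(count > 0);		/* guaranteed by the test above		*/
#if (AVG_DEBUG & DBG_RESULTS)
   fprintf(stderr, "average: result=%ld\n", sum / count);
#endif
   return (int)(sum / count);
}
```

Setting AVG_DEBUG to 0 removes all of the output without touching the code, and a later maintainer can turn on just DBG_TRACE to reuse the loop tracing.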

Testing the value of parameters on entry to a function can prevent a lot of grief. I have found that during the debugging phase, printing out a notice, rather than just trapping and returning an error on illegal input parameters, greatly speeds the debugging process. However, once the program is running, you don't want to carry around all the overhead of these parameter tests.

				/****************************************/
				/*					*/
				/* Macros used to test input parameters	*/
				/* of functions.			*/
				/* basic tests shared by classes of 	*/
				/* parameters.				*/
				/* These tests will return a true if	*/
				/* the parameter is out of range.	*/
				/* These macros take advantage of the	*/
				/* fact that most of the parameters are */
				/* in specific "classes" of data.	*/
				/* define these in the header file	*/
				/*					*/
				/****************************************/
				/* use with test_1param()		*/
#define bad_ana_ch(ch_num)	(((ch_num) < 0)   || ((ch_num)   >  MAX_ANALOG_CHANNEL))
#define bad_foo_num(foo_num)	(((foo_num) < 0)  || ((foo_num)  >  NUM_OF_FOO_ARRAYS))
#define bad_foo_addr(foo_addr)	(((foo_addr) < 0) || ((foo_addr) >= SIZE_OF_FOO_ARRAYS))
				/* use with test_2param()		*/
#define bad_foo_numval(numval, start_addr)  (((numval) < 0) ||			\
					     (((start_addr) + (numval)) > SIZE_OF_FOO_ARRAYS))

/*
 * HELP_DEBUG is a bitwise flag, which can be set in the .c file
 * to select what debugging features are activated.
 *   bit      Used for
 *   0x0001   printing out diagnostic messages on parameter tests.
 *
 */

				/****************************************/
				/*					*/
				/* If HELP_DEBUG is not defined BEFORE	*/
				/* this file is included,		*/
				/* define it here.			*/
				/* If it is defined later, it should	*/
				/* generate a warning.			*/
				/*					*/
				/****************************************/
#ifndef HELP_DEBUG
#define HELP_DEBUG 0
#endif

				/****************************************/
				/*					*/
				/* If the DEBUG flag is set, then print	*/
				/* diagnostic message,			*/
				/* else just return with an error	*/
				/* test: macro returning true on error	*/
				/* param1: parameter to test		*/
				/* string: diagnostic string that will	*/
				/* print out value of offending param	*/
				/*					*/
				/****************************************/
#if (HELP_DEBUG & 0x01)
#define test_1param(test, param1, string, errval)			\
   if (test(param1)) { warnPrintf(string, param1); return (errval); }
#else
#define test_1param(test, param1, string, errval)			\
   if (test(param1)) { return (errval); }
#endif /* HELP_DEBUG */

#if (HELP_DEBUG & 0x01)
#define test_2param(test, param1, param2, string, errval)		\
   if (test(param1, param2)) {						\
      warnPrintf(string, param1, param2);				\
      return (errval); }
#else
#define test_2param(test, param1, param2, string, errval)		\
   if (test(param1, param2)) { return (errval); }
#endif /* HELP_DEBUG */

			/********************************************************/
			/*							*/
			/* call DEBUG_PRINT_RET with the following		*/
			/* format in order to use variable args			*/
			/* DEBUG_PRINT_RET(ANALOG_ERROR,("Error Message"));	*/
			/* Note that the variable args for the printf are	*/
			/* enclosed in a second pair of parentheses to make	*/
			/* them look like a single parameter to the macro.	*/
			/*							*/
			/********************************************************/
#if (HELP_DEBUG & 0x02)
#define DEBUG_PRINT_RET(errval, print_args)				\
   { warnPrintf print_args; return (errval); }
#else
#define DEBUG_PRINT_RET(errval, print_args)				\
   { return (errval); }
#endif

#endif /* _HELP_DEBUG_H_ */

				/****************************************/
				/*					*/
				/* Here we actually use the macro to	*/
				/* test the parameter on entry to the	*/
				/* function.				*/
				/*					*/
				/****************************************/

int foo_ret_reg(register int array)
{
#if (BAR_DEBUG & 0x0004)
   warnPrintf("foo_ret_reg %d\n", array);
#endif
   test_1param(bad_foo_num, array,
	       "foo_ret_reg called with bad foo parameter %d\n", ERROR_VAL);
   return OK_VAL;
}

CODE REVIEWS:

One of the most useful tools available to a programmer is a code review. There are two styles. The first is a walkthrough, where the programmer sits down with someone and explains the code to them. While the second person will often find many problems, the greatest benefit comes from the walkthrough forcing the programmer to actually look at and think about every line of code. The other type of code review is called an inspection, where someone else, or several other people, will take a source listing, read through it, and mark comments and questions. Inspections often involve a formal checklist.

Larry Colen
lrc@netcom.com


Last Updated 1/10/98