Monday, November 9, 2009

meta level debugging

One of the challenges in meta programming is the ability to debug on the meta level. It is not satisfying to have to step through the generated code in order to figure out what is wrong in the model. And indeed many meta programming toolchains have poor support for proper debugging.

But wait, in principle this problem was solved ages ago. This becomes apparent when we reduce meta programming to code transformation and realize that a C compiler is, in that sense, a code transformer too. Almost nobody steps through the assembler code in order to find a bug in the C source. Instead the toolchain supports so-called source level debugging, which is exactly what we want: debugging on the meta level. So it seems advisable to understand the concepts of source level debugging so that we can copy or adapt them for meta programming. That's why I decided to explore GCC and GDB, together with its graphical frontend DDD, which are the tools I am most familiar with.


When we pass the -g flag to GCC it adds debugging information to the generated object file which can be evaluated by the debugger. For a simple hello world program on my platform the size of the object file goes from 1.5 kB up to 3.8 kB when adding debugging information. Apparently a lot of information is needed for debugging. strip -d removes it from an object file:

$ cat hello.c
#include <stdio.h>

int main()
{
    printf("hello world\n");
    return 0;
}

$ gcc -c hello.c   

$ ls -lh hello.o
-rw-r--r-- 1 alex alex 1.5K 2009-11-02 14:19 hello.o

$ gcc -g -c hello.c

$ ls -lh hello.o
-rw-r--r-- 1 alex alex 3.8K 2009-11-02 14:19 hello.o

$ strip -d hello.o

$ ls -lh hello.o
-rw-r--r-- 1 alex alex 1.5K 2009-11-02 14:19 hello.o

But let's take a closer look at what this information looks like. The file format of the object file is ELF, and binutils includes many tools that can deal with this format. One of them is nm, which lists symbols from object files. With nm -a even so-called debugging symbols are displayed, marked with a capital N. Let's see:

$ gcc -g -c hello.c
$ nm -a hello.o | grep " N "
0000000000000000 N .debug_abbrev
0000000000000000 N .debug_aranges
0000000000000000 N .debug_frame
0000000000000000 N .debug_info
0000000000000000 N .debug_line
0000000000000000 N .debug_loc
0000000000000000 N .debug_pubnames
0000000000000000 N .debug_str

What is the meaning of all these symbols and what are their contents? The man page of GCC says:
-g Produce debugging information in the operating system's native format (stabs, COFF, XCOFF, or DWARF 2). GDB can work with this debugging information.
stabs, COFF, XCOFF and DWARF are debugging formats. They specify the meaning, encoding and embedding of debugging information in the object file. My system's native format is DWARF, and the symbols listed above name the ELF sections in which the DWARF data is embedded.

Investigating DWARF

The name DWARF is a pun, because ELF is the major targeted object file format. But DWARF information can be embedded in many other object file formats, too. Furthermore DWARF is not restricted to a single language but can cover a broad range of languages such as Pascal, Ada, C, C++, FORTRAN and Modula2. Most importantly, there is publicly available, free and complete documentation. The introduction to the DWARF debugging format by Michael Eager gives a comprehensive overview of the basic concepts:
A program is described as a tree with nodes representing the various functions, data and types in the source in a compact language and machine independent fashion. The line table provides the mapping between the executable instructions and the source that generated them. The CFI [Call Frame Information] describes how to unwind the stack.
The introduction also explains the meaning of the various debug symbols which are used when DWARF information is embedded into ELF object files:
  • .debug_abbrev: Abbreviations used in the .debug_info section
  • .debug_aranges: A mapping between memory addresses and compilation units
  • .debug_frame: Call Frame Information
  • .debug_info: The core DWARF data containing DIEs
  • .debug_line: Line Number Program
  • .debug_loc: Location lists
  • .debug_macinfo: Macro descriptions
  • .debug_pubnames: A lookup table for global objects and functions
  • .debug_pubtypes: A lookup table for global types
  • .debug_ranges: Address ranges referenced by DIEs
  • .debug_str: String table used by .debug_info
DIE stands for Debugging Information Entry which is the "basic descriptive entity in DWARF". The Line Number Program is an encoded "mapping between the source lines [...] and the memory that contains the code that corresponds to the source". Macro descriptions provide information to enable the debugger to deal with code generated from macros. The rest of the listing should be roughly self-explanatory.

If we want a closer look at the contents of the debug sections we need a decoder for the DWARF format. Binutils ships a program called readelf, which displays information about ELF files and can also decode debugging information.

$ readelf -wi hello.o
The section .debug_info contains:

  Compilation Unit @ offset 0x0:
   Length:        0x8c (32-bit)
   Version:       2
   Abbrev Offset: 0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    < c>   DW_AT_producer    : (indirect string, offset: 0x28): GNU C 4.3.4     
    <10>   DW_AT_language    : 1        (ANSI C)
    <11>   DW_AT_name        : (indirect string, offset: 0x5d): hello.c   
    <15>   DW_AT_comp_dir    : (indirect string, offset: 0xd): /tmp/try 
    <19>   DW_AT_low_pc      : 0x0      
    <21>   DW_AT_high_pc     : 0x15     
    <29>   DW_AT_stmt_list   : 0x0   
 <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)
    <2e>   DW_AT_byte_size   : 8        
    <2f>   DW_AT_encoding    : 7        (unsigned)
    <30>   DW_AT_name        : (indirect string, offset: 0x16): long unsigned int       
 <1><6f>: Abbrev Number: 5 (DW_TAG_subprogram)
    <70>   DW_AT_external    : 1        
    <71>   DW_AT_name        : (indirect string, offset: 0x7a): main    
    <75>   DW_AT_decl_file   : 1        
    <76>   DW_AT_decl_line   : 4        
    <77>   DW_AT_type        : <0x57>   
    <7b>   DW_AT_low_pc      : 0x0      
    <83>   DW_AT_high_pc     : 0x15     
    <8b>   DW_AT_frame_base  : 0x0      (location list)

In this excerpt we see three DIEs, one for the compilation unit, one for the basic type long unsigned int and one for the main function. The attributes of the DIEs list the various properties of the described entity. As we can see DIEs are just that, a listing of attributes with values. That's why DWARF is considered extensible.
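To make the "listing of attributes with values" idea concrete, here is a minimal sketch in C of how a debugger might hold a DIE in memory. The struct and function names (attr, die, die_find) are my own invention and not part of any DWARF library. Only the numeric DW_TAG/DW_AT codes are taken from the DWARF standard, and the sketch assumes every value fits into an integer, which real DWARF does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric codes as defined by the DWARF standard. */
enum { DW_TAG_subprogram = 0x2e };
enum {
    DW_AT_low_pc    = 0x11,
    DW_AT_high_pc   = 0x12,
    DW_AT_decl_file = 0x3a,
    DW_AT_decl_line = 0x3b,
    DW_AT_external  = 0x3f
};

/* Simplification: values are plain integers. Real attributes also carry
   a form (string, reference, block, ...) that tells how to decode them. */
struct attr {
    unsigned name;
    uint64_t value;
};

/* A DIE is a tag plus an open-ended list of attributes (hypothetical layout). */
struct die {
    unsigned tag;
    const struct attr *attrs;
    size_t n_attrs;
};

/* Return the attribute with the given name, or NULL if the DIE lacks it. */
static const struct attr *die_find(const struct die *d, unsigned name)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->n_attrs; i++)
        if (d->attrs[i].name == name)
            return &d->attrs[i];
    return NULL;
}

/* The DIE for main from the dump above, without its string and
   reference attributes (name, type, frame base). */
static const struct attr main_attrs[] = {
    { DW_AT_external,  1 },
    { DW_AT_decl_file, 1 },
    { DW_AT_decl_line, 4 },
    { DW_AT_low_pc,    0x0 },
    { DW_AT_high_pc,   0x15 },
};
static const struct die main_die = { DW_TAG_subprogram, main_attrs, 5 };
```

A consumer that does not know some attribute simply never asks for it, and a producer can append new ones without breaking anybody. That is the extensibility in a nutshell.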

Let's have a look at the line number table:

$ readelf -wl hello.o
Raw dump of debug contents of section .debug_line:
 The File Name Table:
  Entry Dir Time    Size    Name
  1     0   0       0       hello.c

 Line Number Statements:
  Extended opcode 2: set Address to 0x0
  Special opcode 8: advance Address by 0 to 0x0 and Line by 3 to 4
  Special opcode 62: advance Address by 4 to 0x4 and Line by 1 to 5
  Special opcode 146: advance Address by 10 to 0xe and Line by 1 to 6
  Special opcode 76: advance Address by 5 to 0x13 and Line by 1 to 7
  Advance PC by 2 to 0x15
  Extended opcode 1: End of Sequence

This is how the debugger knows what to do when the user steps through the code: each opcode advances the address and the line, so the table yields one row per source line the program can stop at. Don't get confused by the addresses. They are all relative to the object file and will be relocated by the linker.
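The special opcodes in the dump can be replayed by hand. The following is a sketch, not a full line number program interpreter. It assumes the header values line_base = -5, line_range = 14 and minimum_instruction_length = 1, which are not shown in the excerpt but reproduce its numbers, and it assumes that readelf prints the opcode with opcode_base already subtracted. The names line_state, apply_special and replay_hello are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The two state machine registers we care about here. */
struct line_state {
    uint64_t address;
    int line;
};

/* Assumed header parameters (see the text above). */
enum { LINE_BASE = -5, LINE_RANGE = 14, MIN_INST_LENGTH = 1 };

/* Apply one special opcode, given in its adjusted form
   (raw opcode minus opcode_base). */
static void apply_special(struct line_state *st, unsigned adjusted)
{
    assert(st != NULL);
    st->address += (uint64_t)(adjusted / LINE_RANGE) * MIN_INST_LENGTH;
    st->line    += LINE_BASE + (int)(adjusted % LINE_RANGE);
}

/* Replay the four special opcodes of the hello.o dump. */
static struct line_state replay_hello(void)
{
    /* Initial state after "set Address to 0x0": address 0, line 1. */
    struct line_state st = { 0, 1 };
    static const unsigned ops[] = { 8, 62, 146, 76 };

    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++)
        apply_special(&st, ops[i]);
    return st;
}
```

Opcode 62, for example, gives 62 / 14 = 4 for the address advance and -5 + 62 % 14 = 1 for the line advance, which matches "advance Address by 4 to 0x4 and Line by 1 to 5". After all four opcodes the state is address 0x13 and line 7.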

For a deeper understanding of what we see here we should read the DWARF standard. But for the moment it is enough to see how source level debugging basically works. The compiler adds debugging information to the object file, which the debugger uses to undo the compilation. This information is so extensive that the debugger becomes little more than an interpreter for it. Beyond this information GDB needs no additional knowledge about the programming language that was used to generate the object file.

The same can be done for meta programming in general. The compiler has to persist information about which piece of source code has been transformed into which piece of target code. The easiest and most obvious way to do so is to annotate the generated code with comments. Basically this is no different from ELF object files enriched with DWARF debugging information. I have seen such approaches and they work, at least for the specific meta programming language, the specific compiler and the specific debugger. Something more platform independent, comparable to DWARF and GDB, does not exist in the meta programming community, at least to my knowledge. Please correct me if I am wrong.

C-based meta programming languages

Programs are supposed to be executed. The easiest way to achieve this for a meta program is to transform it into C. The big advantage is that there are C toolchains for almost any platform, so C-based meta programming languages are portable. If we deal with such a language, the question is whether we can use the existing debugging infrastructure to debug on the meta level. So let's see.

First, the C preprocessor has a handy feature, the #line directive, which GCC honors and which allows developers to manipulate the line number table:

$ cat magic.c
#include <stdio.h>

int main()
{
#line 23 "meta.stuff"
    printf("hello world\n");
#line 42
    return 0;
}

$ gcc -g -c magic.c

$ readelf -wl magic.o
Raw dump of debug contents of section .debug_line:
 The File Name Table:
  Entry Dir Time    Size    Name
  1     0   0       0       magic.c
  2     0   0       0       meta.stuff

 Line Number Statements:
  Extended opcode 2: set Address to 0x0
  Special opcode 8: advance Address by 0 to 0x0 and Line by 3 to 4
  Set File Name to entry 2 in the File Name Table
  Advance Line by 19 to 23
  Special opcode 61: advance Address by 4 to 0x4 and Line by 0 to 23
  Advance Line by 19 to 42
  Special opcode 145: advance Address by 10 to 0xe and Line by 0 to 42
  Special opcode 76: advance Address by 5 to 0x13 and Line by 1 to 43
  Advance PC by 2 to 0x15
  Extended opcode 1: End of Sequence

When we provide a file called meta.stuff with more than 42 lines and start DDD on the executable built from magic.c, meta.stuff is loaded. When we add a breakpoint for the function main, it is set on line 23 of that file. When we step through the code, DDD outputs "hello world" and then jumps to line 42. This is very convenient.
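The effect of the directive can also be observed from inside the program, because it changes what __FILE__ and __LINE__ expand to. The following small sketch shows this. The names origin, here and check_here are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* A source position as the compiler sees it. */
struct origin {
    const char *file;
    int line;
};

/* The directive renumbers the NEXT line: it becomes line 23 of meta.stuff. */
#line 23 "meta.stuff"
static struct origin here(void) { struct origin o = { __FILE__, __LINE__ }; return o; }

/* Abort if the compiler did not honor the directive. */
static void check_here(void)
{
    struct origin o = here();
    assert(o.line == 23);
    assert(strcmp(o.file, "meta.stuff") == 0);
}
```

Note that everything after the directive is attributed to meta.stuff, including compiler diagnostics and the message of a failing assert. The debugging information in the line number table is just one more consumer of the same renumbering.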

In fact this feature is used by the nesC compiler. nesC is an extension to the C programming language which was designed for the TinyOS project in order to provide an adequate programming paradigm for wireless sensor networks. The nesC compiler merges all involved nesC files of a given program into one single C program, which is then compiled by a standard C compiler for the target platform. The generated C program is annotated with #line directives so that debuggers step through the nesC code as opposed to the generated C code.
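On the generator side this kind of annotation is cheap. The sketch below is not taken from the nesC compiler. It only shows, under my own assumptions, how a code generator could prefix every emitted C statement with the position it came from. The function name emit_with_origin and the file name Blink.nc in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one generated C statement, preceded by a #line directive naming
   the meta-level position it was generated from. Returns what snprintf
   returns: the length the full output needs, excluding the final '\0'.
   Assumption: meta_file contains no double quote, so no escaping is done. */
static int emit_with_origin(char *buf, size_t size,
                            const char *meta_file, int meta_line,
                            const char *c_stmt)
{
    assert(buf != NULL && meta_file != NULL && c_stmt != NULL);
    assert(strchr(meta_file, '"') == NULL);
    return snprintf(buf, size, "#line %d \"%s\"\n%s\n",
                    meta_line, meta_file, c_stmt);
}
```

Calling emit_with_origin(buf, sizeof buf, "Blink.nc", 7, "led_toggle();") fills buf with a #line 7 "Blink.nc" directive followed by the statement on its own line. A real generator would only emit a new directive when the position jumps, since consecutive lines are counted automatically.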

As the documentation on debugging nesC code in GDB describes, there is a catch, though. The only alteration of the debugging information is the line number table. In particular the symbol table is unmodified, which means that all symbols are from the C code only. As a consequence the nesC developer must be familiar with the name mangling used by the nesC compiler in order to address nesC symbols by their C names. This is cumbersome.
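To give an impression of what the developer has to do in his head, here is a sketch of such a mangling function. It assumes a scheme in which the C name is the module name, a separator and the nesC name glued together, with '$' as the separator, which is what I have seen in nesC output. Treat the exact scheme as an assumption. The function name mangle and the names BlinkC and count are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the C-level name "module<sep>name" into buf.
   Returns 0 on success, -1 if buf is too small. */
static int mangle(char *buf, size_t size, const char *module,
                  const char *name, const char *sep)
{
    assert(buf != NULL && module != NULL && name != NULL && sep != NULL);

    /* Room for the three parts plus the terminating '\0'. */
    size_t need = strlen(module) + strlen(sep) + strlen(name) + 1;
    if (need > size)
        return -1;

    snprintf(buf, size, "%s%s%s", module, sep, name);
    return 0;
}
```

With these assumptions, a variable count in a module BlinkC has to be addressed in the debugger as BlinkC$count. The separator is a parameter here because the point is only the principle, not the exact convention.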

The Yeti 2 project is a great improvement on this. It is a TinyOS 2 plugin for Eclipse and allows you to work on the nesC level. In fact you almost never notice that nesC is actually first compiled into C. How does Yeti do this? First, it uses the Eclipse C/C++ Development Tooling (CDT) to communicate with the C toolchain. Second, it depends on the #line directives generated by the nesC compiler in order to step through the nesC code. Third, it provides a translation from nesC names to C names, so that you can set breakpoints in the nesC code and browse nesC modules and variables. There are only a few cases, such as stack traces, in which you are faced with the C nature of your application. I think this is acceptable.

From the user's perspective Eclipse plugins like Yeti solve the problem of meta level debugging. But when we disregard usability concerns and focus on the technical aspects only, this solution is not satisfying. First of all the plugin introduces functional redundancy into our toolchain. For example the parser for the meta language is implemented twice, once by the compiler and once by the Eclipse plugin, and the two implementations must be kept consistent manually. So besides writing a plugin for every meta language in the first place, it has to be maintained to follow language extensions and evolution. Last but not least we use the heavyweight Eclipse runtime to do simple name mangling. If we are not interested in Eclipse's GUI this is a high price to pay.

As GDB is little more than a DWARF interpreter, it should be possible to do meta level debugging directly with an unmodified GDB. All we need is to let the compiler write proper debugging information. This works as long as the meta language is C-like enough that DWARF's concepts can cover it. Then the following workflow should be possible:
  • The meta compiler translates the meta program into C.
  • The C toolchain translates the C program into binary and adds proper debugging information.
  • The meta compiler maps the meta level concepts to the binary code by analyzing and altering the debugging information.
  • The debugger uses the debugging information to undo the translation from the meta language to the binary code.
Fortunately there is libdwarf, which provides a consumer and a producer interface to DWARF debugging information. Together with libelf this is all we need to test the idea sketched above.
In the best case it is possible to hide the intermediate C step completely. But I would not be surprised if this fails in practice because of some design flaws or legacy issues. Theory complies with practice only in theory. So I will have to investigate that.
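The third step of the workflow needs a mapping from generated C lines back to meta-level positions. How the debugging information is actually rewritten with libdwarf is exactly what I still have to investigate, so the following sketch covers only the lookup part, under my own assumptions: the meta compiler records a table sorted by C line, and each entry covers all C lines up to the next entry. The names origin and origin_of are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One recorded mapping: from this generated C line on, the code stems
   from the given meta-level position (hypothetical layout). */
struct origin {
    int c_line;
    const char *meta_file;
    int meta_line;
};

/* tab must be sorted by c_line in ascending order. Returns the last
   entry whose c_line is <= the queried line, or NULL if the queried
   line lies before the first entry. */
static const struct origin *origin_of(const struct origin *tab, size_t n,
                                      int c_line)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);
    /* Binary search for the first entry that starts after c_line. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].c_line <= c_line)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &tab[lo - 1];
}
```

Each row of the line number table written by the C compiler could then be passed through origin_of and rewritten to carry the meta-level file and line. The same table would be the natural place to record symbol names as well, which is the part the #line approach leaves unsolved.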
