HOWTO send good bug reports for olsrd and how to get quick fixes back


Releases 0.5.6-r2 and 0.5.6-r3 unfortunately contained some bugs that were only discovered in the field. It turned out that people sometimes found it hard to describe the problem properly. This is quite understandable, since a MANET mesh is a complicated setup with a lot of parallelism. Debugging it is not easy, and neither is understanding the problem.
So this HOWTO summarizes the information developers need.

Needed data for debugging

Usually the following info is sufficient to analyze the problem.

  1. what version(s) of olsrd were you running? If the nodes ran mixed versions, which ones?
  2. your olsrd.conf file
  3. any parameters you used for calling olsrd
  4. a description of your network setup. Which nodes are around? What are their IP addresses?
  5. a tcpdump capture from the affected node. Start tcpdump like this (replace ath0 with the interface olsrd runs on):
    $ tcpdump -vv -ni ath0 port 698

  6. turn on logging and send the debug output (Loglevel in the config file)
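
The items above can be gathered into one directory before mailing the report. The following is only a minimal sketch: the function name collect_report, the file names and the README.txt template are assumptions made for this example and are not part of olsrd. It copies the config file, writes a template for the items that need a human answer, and prints the tcpdump command from item 5 instead of running it, since tcpdump usually needs root privileges.

```shell
#!/bin/sh
# Sketch: gather the bug report items into one directory.
# collect_report and all file names below are hypothetical, chosen for this example.

collect_report() {
    conf="$1"     # path to your olsrd.conf
    iface="$2"    # interface olsrd runs on, e.g. ath0
    out="$3"      # directory that will hold the report

    mkdir -p "$out" || return 1

    # Item 2: the configuration file
    cp "$conf" "$out/olsrd.conf" || return 1

    # Items 1, 3 and 4 need a human answer: write a template to fill in
    printf '%s\n' \
        "olsrd version(s) on all nodes:" \
        "parameters used for calling olsrd:" \
        "network setup (nodes and their IP addresses):" \
        > "$out/README.txt" || return 1

    # Item 5: print the capture command rather than running it here
    echo "tcpdump -vv -ni $iface port 698 > $out/tcpdump.txt"
}

# Usage (paths are examples only):
#   collect_report /etc/olsrd.conf ath0 ./olsrd-report
```

Run the printed tcpdump command while the problem is happening, fill in README.txt, and add the debug output from item 6 to the same directory.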

Stack traces

Sometimes when olsrd crashes, it is necessary to understand where the crash occurred. If you can reproduce the crash, a stack trace is the most valuable information the developers can get.
Here is how you make a stack trace:

  1. compile olsrd with debugging info: set DEBUG=1 in
  2. start olsrd in a screen session with debug output and in -nofork mode from within gdb:

    $ sudo gdb olsrd
    (gdb) set args -d 2 -nofork -i your_interface
    (gdb) run
    ... (olsrd runs, and crashes)....
    (gdb) bt

    The backtrace output helps the developers solve the problem very quickly. Of course, on a small embedded system, running gdb natively might be hard.
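
If typing into gdb interactively is inconvenient, the same session can be run in one go and its output saved to a file that you attach to the report. This is a sketch only: the function name bt_command and the log file name are made up for this example, and the olsrd arguments are the ones from the session above. It relies on the standard gdb options -batch (exit when done), -ex (execute a gdb command) and --args (pass the remaining arguments to the program).

```shell
#!/bin/sh
# Sketch: build a non-interactive gdb command line that runs olsrd
# and prints a backtrace after it stops.
# bt_command and the log file name are hypothetical names for this example.

bt_command() {
    iface="$1"    # interface olsrd should use
    log="$2"      # file that receives the gdb output

    # Same arguments as the interactive session: -d 2 -nofork -i <interface>
    echo "sudo gdb -batch -ex run -ex bt --args olsrd -d 2 -nofork -i $iface > $log 2>&1"
}

# Print the command for review; copy and run it once it looks right.
bt_command "your_interface" "olsrd-backtrace.txt"
```

The command is printed rather than executed so that you can check the interface name first.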