The most important advice is to start small when debugging. Don't build a 100-node network on the rooftops of your city only to discover later that nothing works. Get a couple of nodes into the same room first and test them there.
The second most important advice is to start testing at the lower OSI layers (http://en.wikipedia.org/wiki/OSI_model). If your link layer doesn't work, there is nothing the upper layers can do.
Link layer tests
The first step is to configure both nodes to use the same ad-hoc network on wifi. Start with two devices. Once everything works, you can add more devices to your test setup.
Use the iwconfig <wifi-interface> command to check that both nodes are configured with the same ESSID and channel. The output will look similar to this:
    # iwconfig wlan0
    wlan0     IEEE 802.11bg  ESSID:"myessid"
              Mode:Ad-Hoc  Frequency: 2.412 GHz  Cell: 0E:12:34:56:78:9a
              Bit Rate:54 Mb/s   Tx-Power=20 dBm
              Retry long limit: 7   RTS thr:off   Fragment thr:off
              Power Management:on
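To compare the settings of two nodes without reading the full output, the relevant fields can be pulled out with a small filter. This is only a sketch: the function name wifi_settings is made up for this example, and it assumes the iwconfig output format shown above, which can differ between driver and tool versions.

```shell
# Print the ad-hoc settings that have to match on every node.
# Reads the output of `iwconfig <wifi-interface>` from stdin.
# (wifi_settings is a hypothetical helper name, not part of any tool.)
wifi_settings() {
    awk '
        match($0, /ESSID:"[^"]*"/)              { print substr($0, RSTART, RLENGTH) }
        match($0, /Mode:[^ ]+/)                 { print substr($0, RSTART, RLENGTH) }
        match($0, /Frequency[=:] *[0-9.]+ GHz/) { print substr($0, RSTART, RLENGTH) }
    '
}
```

Run `iwconfig wlan0 | wifi_settings` on both nodes; the printed lines should be identical on each side.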
Next you can use the iw <wifi-interface> station dump command to check if both sides of your test link can see each other. The output will look similar to this:
    # iw wlan0 station dump
    Station 0E:12:34:56:78:9b (on wlan0)
            inactive time:  200 ms
            rx bytes:       10016
            rx packets:     122
            tx bytes:       6010
            tx packets:     31
            signal:         -20 dBm
            tx bitrate:     48.0 MBit/s
The command lists every known neighbor together with link layer metadata about the link to it.
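With more than a handful of neighbors the dump gets long. The following sketch condenses it to one line per neighbor. The function name summarize_stations is invented for this example, and it assumes each station block has a "signal:" line as in the output above; stations without one are not printed.

```shell
# Condense `iw <wifi-interface> station dump` output (read from stdin)
# to one line per neighbor: MAC address and signal level.
# (summarize_stations is a hypothetical helper name.)
summarize_stations() {
    awk '
        $1 == "Station" { mac = $2 }              # remember the current neighbor
        $1 == "signal:" { print mac, $2, "dBm" }  # report its signal level
    '
}
```

Usage: `iw wlan0 station dump | summarize_stations`. If the other node does not show up in the list, the link layer is not working yet.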
IP layer tests
You have confirmed that both sides of the link you want to test can see each other on the link layer. Now it's time to check whether you have IP connectivity.
You do not need a routing protocol for IP connectivity over a single wifi link; all you need is a correct IP configuration on both sides. For that reason, do not run the routing protocol during these tests.
You can use the ping <ip-address> command to test whether you have bidirectional IP connectivity with the other side of the link. If you have connectivity, the output will look similar to this, and you can skip the rest of the IP connectivity tests:
    # ping 10.10.0.1
    PING 10.10.0.1 (10.10.0.1) 56(84) bytes of data.
    64 bytes from 10.10.0.1: icmp_req=1 ttl=64 time=0.054 ms
    64 bytes from 10.10.0.1: icmp_req=2 ttl=64 time=0.041 ms
    64 bytes from 10.10.0.1: icmp_req=3 ttl=64 time=0.042 ms
    ^C
    --- 10.10.0.1 ping statistics ---
    3 packets transmitted, 3 received, 0% packet loss, time 1998ms
    rtt min/avg/max/mdev = 0.041/0.045/0.054/0.009 ms
If you have no connectivity, it will look similar to this:
    # ping 10.10.0.2
    PING 10.10.0.2 (10.10.0.2) 56(84) bytes of data.
    From 10.10.0.1 icmp_seq=1 Destination Host Unreachable
    From 10.10.0.1 icmp_seq=2 Destination Host Unreachable
    From 10.10.0.1 icmp_seq=3 Destination Host Unreachable
    ^C
    --- 10.10.0.2 ping statistics ---
    4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3015ms
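When several links have to be tested, the ping check can be wrapped in a small function. This is a sketch that assumes the Linux iputils ping, which exits with status zero when at least one reply arrives; check_peer is a name chosen for this example only.

```shell
# Send three echo requests and report whether the peer answered.
# usage: check_peer <ip-address>
# (check_peer is a hypothetical helper name.)
check_peer() {
    if ping -c 3 "$1" >/dev/null 2>&1; then
        echo "$1: reachable"
    else
        echo "$1: NOT reachable, check the IP configuration on both sides"
        return 1
    fi
}
```

Usage: `check_peer 10.10.0.1`. Run it from both sides of the link, since the test is only meaningful if both directions work.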
Run the ip addr command to check the IP configuration of the interface. The output will look similar to this:
    # ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff
        inet 10.64.3.150/24 brd 10.64.3.255 scope global dynamic wlan0
           valid_lft 914404sec preferred_lft 914404sec
        inet6 fe80::2e59:e5ff:feb9:b149/64 scope link
           valid_lft forever preferred_lft forever
    3: eth0 ...
Check the following:
- does each device have its own unique IP address on the wifi interface?
- do they all have the same subnet mask?
- are all IP addresses configured on the interfaces within the same subnet?
- have you used the same IP address or subnet on a second interface?
- do you have a firewall up that might block the routing protocol UDP packets?
- do both sides of your link have the same MTU set?
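The subnet questions in this checklist can be verified mechanically. The sketch below compares two IPv4 addresses under a given prefix length using plain shell arithmetic. The function names ip_to_int and same_subnet are invented for this example, and the code assumes well-formed dotted-quad addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# (ip_to_int and same_subnet are hypothetical helper names.)
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if both addresses fall into the same subnet.
# usage: same_subnet <addr-a> <addr-b> <prefix-length>
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For example, `same_subnet 10.10.0.1 10.10.0.2 24 && echo ok` prints "ok", while the same call with 10.10.1.2 as the second address prints nothing because the two addresses are in different /24 subnets.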
Basic routing tests
Start testing the routing agent with a minimal configuration unless you have to attach it to an existing network. Define only the interfaces it should use and nothing else.
One common mistake is to set routing tables without using the corresponding policy routing scripts.
Trouble with attached clients
If you cannot reach clients through an OLSR network, there are a couple of additional things you should check:
- can the routers and their local clients talk to each other (check with ping)?
- did you set the IP address (or prefix) of the attached clients in the configuration of the local router as an attached network?
- do the clients have a default route set towards their local router (or at least a prefix which contains all OLSR router IPs and all clients)?
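The default route check on a client can be read straight from the output of ip route. The following sketch prints the gateway of the default route, if there is one; default_gateway is a name made up for this example, and it assumes the usual "default via <gateway> dev <interface>" line format of iproute2.

```shell
# Print the gateway of every default route found in `ip route` output
# (read from stdin). Prints nothing if no default route is set.
# (default_gateway is a hypothetical helper name.)
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3 }'
}
```

Usage on a client: `ip route | default_gateway`. The printed address should be the IP address of the client's local router.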
Report to the mailing list
A couple of questions you should answer in your report to the mailing list:
- which version of the routing daemon do you use?
- was it compiled by yourself or is it part of a distribution (which might contain additional patches)?
- what configuration (file) did you use (strip it of empty lines and comments please)?
- what network topology do you use (nodes, interfaces, radio links between these interfaces)? The output of the commands ip addr and ip route for each of the nodes is really useful for us!
State that you followed this guide in your report!
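To gather the ip addr and ip route output in a form that is easy to paste into a report, something like the following sketch can be run on each node. The function name collect_ip_state and the "###" separator lines are choices made for this example only.

```shell
# Print the output of `ip addr` and `ip route`, each under its own header,
# so the result can be attached to a report as one block of text.
# (collect_ip_state is a hypothetical helper name.)
collect_ip_state() {
    for sub in addr route; do
        echo "### ip $sub"
        ip "$sub"
    done
}
```

Usage: `collect_ip_state > "report-$(hostname).txt"` writes one file per node, named after the node's hostname.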
Report for olsrd2
The original set of commands gives a lot of information, but their output is difficult to process. Instead of running all of them, just run a single "netjsoninfo" dump on each of your nodes:
echo /netjsoninfo graph route | nc 127.0.0.1 2009
The output is designed to be automatically processed by visualization tools.
Old debugging commands
Add the output of the following telnet commands to help us understand what is going on:
- echo /nhdpinfo link | nc 127.0.0.1 2009
- echo /nhdpinfo link_twohop | nc 127.0.0.1 2009
- echo /nhdpinfo neighbor | nc 127.0.0.1 2009
- echo /olsrv2info originator | nc 127.0.0.1 2009
- echo /olsrv2info route | nc 127.0.0.1 2009
For later debugging, the output of the following command might also be useful, but it will be quite a bit larger than the earlier ones:
- echo /olsrv2info edge | nc 127.0.0.1 2009
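The smaller commands listed above can be run in one go with a loop. This is a sketch using the same address and port as the commands above; the function name collect_olsrd2_debug and the "###" header lines are invented for this example. The larger olsrv2info edge dump is left out on purpose and can be added to the list if needed.

```shell
# Run the smaller debugging commands one after another and print each
# result under a header naming the command that produced it.
# (collect_olsrd2_debug is a hypothetical helper name.)
collect_olsrd2_debug() {
    for cmd in "nhdpinfo link" "nhdpinfo link_twohop" "nhdpinfo neighbor" \
               "olsrv2info originator" "olsrv2info route"; do
        echo "### /$cmd"
        echo "/$cmd" | nc 127.0.0.1 2009
    done
}
```

Usage: `collect_olsrd2_debug > "olsrd2-$(hostname).txt"` on each node, then attach the files to your report.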