Feature #353

open

Translate layer 3 addresses from non Layer 3 neighbors

Added by Andre Kasper over 6 years ago. Updated over 1 year ago.

Status: New
Priority: Normal
Target version: -
Start date: 04/12/2018
Due date:
% Done: 0%
Estimated time:
Description

To me it looks like it is possible to translate MACs via dc, because batctl is able to view dc. I also guess that the dc content is correct, because otherwise batman would be broken. So I can't follow why it isn't used as the first source for MAC/IP translation, falling back to the other stuff only if there is no hit.

I'm a user, not a developer. From my perspective it's all about functionality. I use batctl tr and batctl as a debugging tool. I think this may be the only use case for these commands. If there is an IP 192.168.4.3 in my network and I would like to find out why and where it is, I would traceroute it. I can't do that with layer 3 tools, so I need batctl. It is possible to do it manually, by showing and grepping dc and using the MAC for tr. From a user's perspective it would make much more sense if this also happened automatically when I translate, traceroute or ping the IP. As it is, I can resolve IPs I can't reach via layer 2 ping, and I can't resolve IPs I can reach via batman. Purely from a user's perspective, and as wishful thinking, I would wish that the debugging functions were able to translate every IP in the batman network and had no need to translate IPs that are not in the batman network (non-batman devices could maybe be filtered out?). But this seems less a bug than a feature request.


Original message

If I run batctl tr on a gateway against its own IP, the tr goes to the wrong MAC. Also, batctl is unable to find the MAC for other IPs.
batman 2018.0

root@node82:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:08:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.64.1/21 brd 10.110.71.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:208::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::d4a2:a7ff:fe6d:26c5/64 scope link
       valid_lft forever preferred_lft forever
root@node82:~# batctl tr 10.110.64.1
traceroute to 10.110.64.1 (72:8e:0a:4d:07:03), 50 hops max, 20 byte packets
 1: 02:00:00:02:05:00  0.267 ms  0.144 ms  0.168 ms
 2: 4e:70:0a:55:1a:fb  29.208 ms  27.537 ms  28.530 ms
 3: 1e:03:61:52:62:93  27.344 ms  26.860 ms  30.777 ms
 4: 72:8e:0a:4d:07:03  79.296 ms  75.739 ms  109.504 ms
root@node82:~#

root@node72:~# batctl tr 10.110.56.1
traceroute to 10.110.56.1 (72:8e:0a:4d:07:03), 50 hops max, 20 byte packets
 1: 02:00:00:02:05:00  0.256 ms  0.165 ms  0.219 ms
 2: 4e:70:0a:55:1a:fb  25.500 ms  25.870 ms  37.836 ms
 3: 1e:03:61:52:62:93  29.220 ms  27.655 ms  25.810 ms
 4: 72:8e:0a:4d:07:03  77.655 ms  145.679 ms  90.243 ms
root@node72:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:07:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.56.1/21 brd 10.110.63.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:207::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::307c:cbff:fe21:b4e2/64 scope link
       valid_lft forever preferred_lft forever
root@node72:~#

root@node52:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:05:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.40.1/21 brd 10.110.47.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:205::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::7c6f:2bff:fe98:a3a9/64 scope link
       valid_lft forever preferred_lft forever
root@node52:~# batctl tr 10.110.40.1
traceroute to 10.110.40.1 (aa:a5:39:b1:e3:63), 50 hops max, 20 byte packets
 1: 02:00:00:02:06:00  0.243 ms  0.081 ms  0.117 ms
 2: aa:a5:39:b1:e3:63  14.457 ms  14.159 ms  11.271 ms

root@node42:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:04:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.32.1/21 brd 10.110.39.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:204::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::acc9:d6ff:fe2b:3968/64 scope link
       valid_lft forever preferred_lft forever
root@node42:~# batctl tr 10.110.32.1
traceroute to 10.110.32.1 (72:8e:0a:4d:07:03), 50 hops max, 20 byte packets
 1: 02:00:00:02:05:00  0.235 ms  0.263 ms  0.266 ms
 2: 4e:70:0a:55:1a:fb  27.696 ms  25.413 ms  27.730 ms
 3: 1e:03:61:52:62:93  27.051 ms  29.464 ms  29.175 ms
 4: b2:bf:98:e5:c9:bb  26.780 ms  33.047 ms  35.286 ms
 5: 72:8e:0a:4d:07:03   *   *  28.838 ms
root@node42:~#

root@node32:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:03:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.24.1/21 brd 10.110.31.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:203::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::8c8e:cff:fe09:6c7c/64 scope link
       valid_lft forever preferred_lft forever
root@node32:~# batctl tr 10.110.24.1
traceroute to 10.110.24.1 (aa:a5:39:b1:e3:63), 50 hops max, 20 byte packets
 1: 02:00:00:02:06:00  0.209 ms  0.317 ms  0.240 ms
 2: aa:a5:39:b1:e3:63  11.947 ms  14.116 ms  13.883 ms

root@node22:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:02:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.16.1/21 brd 10.110.23.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:202::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::7c68:ffff:fe6c:480e/64 scope link
       valid_lft forever preferred_lft forever
root@node22:~# batctl tr 10.110.16.1
traceroute to 10.110.16.1 (72:8e:0a:4d:07:03), 50 hops max, 20 byte packets
 1: 02:00:00:02:05:00  0.063 ms  0.103 ms  0.098 ms
 2: 4e:70:0a:55:1a:fb  27.590 ms  29.041 ms  29.014 ms
 3: 1e:03:61:52:62:93  27.610 ms  25.379 ms  27.543 ms
 4: b2:bf:98:e5:c9:bb  28.462 ms  32.701 ms  64.105 ms
 5: 72:8e:0a:4d:07:03   *  42.850 ms  32.786 ms

root@node12:~# ip a s bat0
5: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:00:00:02:01:01 brd ff:ff:ff:ff:ff:ff
    inet 10.110.8.1/21 brd 10.110.15.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:300b:201::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::a4cb:6fff:fe9e:a115/64 scope link
       valid_lft forever preferred_lft forever
root@node12:~# batctl tr 10.110.8.1
traceroute to 10.110.8.1 (aa:a5:39:b1:e3:63), 50 hops max, 20 byte packets
 1: 02:00:00:02:06:00  0.288 ms  0.205 ms  0.189 ms
 2: aa:a5:39:b1:e3:63  12.672 ms  14.053 ms  14.329 ms

root@node12:~# batctl tr 10.110.16.1
Error - mac address of the ping destination could not be resolved and is not a bat-host name: 10.110.16.1
root@node12:~# batctl dc |grep 10.110.16.1
 *     10.110.16.1 02:00:00:02:02:01   -1      0:11
root@node12:~# batctl dc |grep 10.110.8.1
 *      10.110.8.1 02:00:00:02:01:01   -1      0:00
Actions #1

Updated by Sven Eckelmann over 6 years ago

  • Description updated (diff)
Actions #2

Updated by Sven Eckelmann over 6 years ago

  • Status changed from New to Feedback
  • Assignee changed from batman-adv developers to Andre Kasper

tr is not meant to translate an IP to its own MAC. It is a traceroute. The translate layer can sometimes be used to guess the remote originator for the traceroute.

The translate layer/command tries to get the remote originator which could handle a client MAC address (aka an address from the TT). An IP to client MAC address translation is attempted via the ARP table of the kernel, not via DAT. If you want something that uses DAT, then you have to implement your own command.

Right now it is not clear whether you just misunderstood the command or whether there is actually a problem in the way the ARP/TT/originator tables are parsed.

Actions #3

Updated by Sven Eckelmann over 6 years ago

Here is an example of how it looks in FFV. We have two gateway nodes (APs don't get an IPv4 address):

root@vpn01:~# ip addr show dev bat0
11: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:ba:7a:df:01:00 brd ff:ff:ff:ff:ff:ff
    inet 10.204.16.1/16 brd 10.204.255.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:100::1/56 scope global 
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:1337::1/48 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::ba:7aff:fedf:100/64 scope link 
       valid_lft forever preferred_lft forever

root@vpn02:~# ip addr show dev bat0
6: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:ba:7a:df:02:00 brd ff:ff:ff:ff:ff:ff
    inet 10.204.32.1/16 brd 10.204.255.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:200::1/56 scope global 
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:1337::2/48 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::ba:7aff:fedf:200/64 scope link 
       valid_lft forever preferred_lft forever

We now use the t(ranslate) command instead of misusing the tr(aceroute):

root@vpn01:~# batctl t 10.204.32.1
02:62:e7:ab:02:02

This sounds plausible to me. Why does it sound plausible? Because the following steps (roughly) were performed by the translate utility:

  • root@vpn01:~# ip neigh|grep '^10.204.32.1 '
    10.204.32.1 dev bat0 lladdr 02:ba:7a:df:02:00 REACHABLE
    

    So we know that its mac address (the one seen in the ethernet layer above batman-adv) is 02:ba:7a:df:02:00 (which seems to be correct)
  • root@vpn01:~# batctl tg|grep ' \* 02:ba:7a:df:02:00 '
     * 02:ba:7a:df:02:00   -1 [....] (  2) 02:62:e7:ab:02:02 (  2) (0x1f212fb6)
    

    The best remote originator (the one with the asterisk) for this address is 02:62:e7:ab:02:02
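The two lookups above can be sketched in Python. This is only an illustration of the described steps and not how batctl is implemented: it parses text in the shape of the `ip neigh` and `batctl tg` outputs shown in this comment, and the function names (`mac_from_neigh`, `originator_from_tg`, `translate`) as well as the assumed column layout are hypothetical.

```python
import re

def mac_from_neigh(neigh_output, ip, dev="bat0"):
    """Step 1: find the client MAC for `ip` in `ip neigh` style output."""
    for line in neigh_output.splitlines():
        fields = line.split()
        # Assumed row shape: <ip> dev <dev> lladdr <mac> <state>
        if (len(fields) >= 5 and fields[0] == ip and fields[1] == "dev"
                and fields[2] == dev and fields[3] == "lladdr"):
            return fields[4].lower()
    return None

# Assumed row shape of `batctl tg`, taken from the output in this thread:
#  [*] <client mac> <vid> [flags] (ttvn) <originator mac> ...
TG_LINE = re.compile(
    r"^\s*(\*)?\s*([0-9a-f:]{17})\s+-?\d+\s+\[[^\]]*\]"
    r"\s+\(\s*\d+\)\s+([0-9a-f:]{17})"
)

def originator_from_tg(tg_output, client_mac):
    """Step 2: find the best originator (row marked '*') for a client MAC."""
    for line in tg_output.splitlines():
        match = TG_LINE.match(line.lower())
        if match and match.group(1) == "*" and match.group(2) == client_mac:
            return match.group(3)
    return None

def translate(neigh_output, tg_output, ip):
    """Chain both steps; return None if either lookup has no hit."""
    mac = mac_from_neigh(neigh_output, ip)
    if mac is None:
        return None
    return originator_from_tg(tg_output, mac)
```

With the two output lines from this example as input, `translate` returns 02:62:e7:ab:02:02, and it returns None for an IP that has no neighbor entry.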
Actions #4

Updated by Andre Kasper over 6 years ago

Thank you for your explanation. It wasn't clear to me that the corresponding MAC is only guessed.

Now I can't say if this is a bug or not. What happens is that the guess is wrong. The tr leads to a MAC that doesn't belong to the IP.

I was searching for the cause of some strange network behaviour, so I tried to test the path to the gateway IPs via batctl tr. I was just too lazy to look up the MAC and thought that if the command accepts IPs, they will be resolved correctly.

What I've found:
1. The gateway could not tr the neighbour gateway via IP
2. The gateway could tr its own IP, but the layer 2 traceroute goes to the wrong MAC

In the outputs above you can see the MAC of each node's own bat0 interface. I don't understand why the MAC is guessed wrong. In the last example you can see that the correct IP is in batman's DAT. To me it is strange that the wrong MAC is resolved. If DAT is enabled, it would seem obvious for batman to use this IP->MAC translation to resolve the address, and it would get the correct MAC.

I don't understand what's happening here and why it is okay that the wrong MAC is resolved, but you know batman much better than I do. If this is the expected output, this issue can be closed.

Actions #5

Updated by Sven Eckelmann over 6 years ago

  • Project changed from batman-adv to batctl

Did you check the tables as I did?

Actions #6

Updated by Andre Kasper over 6 years ago

I can't follow why you ask, but here is the requested result:

Case 1: batctl t own address

root@node12:~# batctl t 10.110.8.1
fe:81:88:d0:6d:bb

...but in dc it is shown correctly:

root@node12:~# batctl dc|grep '.8.1 '
 *     10.110.48.1 02:00:00:02:06:01   -1      0:03
 *      10.110.8.1 02:00:00:02:01:01   -1      0:00

The own IP is not in ip neigh.

Additional tests:

root@node12:~# batctl t 10.110.8.1
fe:76:8c:61:d3:0b

root@node12:~# batctl tr 10.110.8.1
traceroute to 10.110.8.1 (de:f7:75:20:cc:33), 50 hops max, 20 byte packets
 1: 02:00:00:02:04:00  0.415 ms  0.165 ms  0.373 ms
 2: 8a:34:46:82:46:6b  23.519 ms  22.778 ms  22.229 ms
 3: de:f7:75:20:cc:33  29.571 ms  22.535 ms  24.285 ms

root@node12:~# batctl tr fe:76:8c:61:d3:0b
traceroute to fe:76:8c:61:d3:0b (fe:76:8c:61:d3:0b), 50 hops max, 20 byte packets
 1: 02:00:00:02:04:00  0.176 ms  0.200 ms  0.199 ms
 2: 6e:8a:91:80:59:cb  11.526 ms  11.153 ms  20.154 ms
 3: fe:76:8c:61:d3:0b  111.953 ms  186.147 ms  160.482 ms

To me it is not plausible that a node multiple hops away should be the originator of the gateway itself.
I've checked the MACs in our nodes.json http://services.freifunk-bochum.de:4000/nodes.json and they are nodes somewhere in the network.

As already mentioned, I'm searching for strange behaviour in our network and where it comes from, so someone else should check, but to me as a non-expert in batman this looks wrong.

Actions #7

Updated by Sven Eckelmann over 6 years ago

If the ip isn't in the neighbor table then it doesn't make a lot of sense that it translates it. Please build your own version of batctl from the branch ecsv/translate_debug and repeat the test:

git clone git://git.open-mesh.org/batctl.git -b ecsv/translate_debug batctl-debug
cd batctl-debug
make
sudo ./batctl t 10.110.8.1

I would need the output for a first assessment

Actions #8

Updated by Andre Kasper over 6 years ago

View log...

Actions #9

Updated by Andre Kasper over 6 years ago

What happens if you do the same with your gateways? In your example you ran a batctl tr to the other gateway. Does it behave differently from here if you run it against the own IP?
I'm asking because, as I said, I'm searching for an issue in my network, and if it's not the same in your network I can't guarantee that there isn't something wrong in my network causing this that I don't know about. I would suggest trying to reproduce the issue on another network to exclude the possibility that it's not batman related.

Actions #10

Updated by Sven Eckelmann over 6 years ago

It rejects it because it cannot find the IP address when doing the translate.

,-(ecsv@vpn01:pts/0:~)
'-(17:52:%)-%> ip addr show dev bat0
11: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:ba:7a:df:01:00 brd ff:ff:ff:ff:ff:ff
    inet 10.204.16.1/16 brd 10.204.255.255 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:100::1/56 scope global 
       valid_lft forever preferred_lft forever
    inet6 2a03:2260:200f:1337::1/48 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::ba:7aff:fedf:100/64 scope link 
       valid_lft forever preferred_lft forever
,-(ecsv@vpn01:pts/0:~)
'-(130:17:52:%)-%> sudo batctl t 10.204.16.1   
Error - mac address of the ping destination could not be resolved and is not a bat-host name: 10.204.16.1

But your device seems to have itself in the neighbor table (see resolve_mac_from_parse:645):

resolve_mac_from_parse:587 28
resolve_mac_from_parse:602 ndm_state 00000040
resolve_mac_from_parse:607: 10.110.8.1
resolve_mac_from_parse:637
resolve_mac_from_parse:645: 10.110.8.1
resolve_mac_from_parse:656
resolve_mac_from_parse:660
resolve_mac_from_parse:664
resolve_mac_from_cache:714
resolve_mac_from_cache:718

Please check whether the ip is really not in `ip neigh`. I would guess that it is actually there with the extra flag "noarp"

Actions #11

Updated by Andre Kasper over 6 years ago

It's really not there. Here is the complete output without grep:

root@node12:~/batctl-debug# ip neigh
10.110.11.158 dev bat0 lladdr 14:99:e2:70:be:09 REACHABLE
172.31.2.81 dev ens19  FAILED
10.110.11.113 dev bat0 lladdr b0:a2:e7:6f:66:19 REACHABLE
10.110.15.112 dev bat0 lladdr 54:f2:01:ad:aa:51 REACHABLE
10.110.9.189 dev bat0 lladdr 3e:bb:b2:36:6f:d0 STALE
10.110.15.132 dev bat0 lladdr 04:f1:28:5c:cc:b6 REACHABLE
10.110.12.37 dev bat0 lladdr 7c:11:be:07:c3:6c STALE
10.110.14.145 dev bat0 lladdr 04:d6:aa:05:47:e1 REACHABLE
10.110.9.203 dev bat0 lladdr 80:b0:3d:0a:6e:6f STALE
10.110.11.184 dev bat0 lladdr f4:f5:e8:3c:8c:f4 REACHABLE
10.110.12.174 dev bat0 lladdr a8:b8:6e:62:28:d3 REACHABLE
212.23.154.1 dev ens18 lladdr 00:22:83:d3:51:9c REACHABLE
10.110.10.197 dev bat0 lladdr e8:93:09:85:83:2f REACHABLE
10.110.14.196 dev bat0 lladdr 88:83:22:8f:e5:ee STALE
10.110.15.101 dev bat0 lladdr c0:bd:d1:e5:b8:8a STALE
10.110.15.60 dev bat0 lladdr d8:c4:e9:f7:de:f3 STALE
172.31.2.82 dev ens19 lladdr ba:9f:ee:b5:ca:f5 REACHABLE
10.110.15.244 dev bat0 lladdr 54:27:58:24:b7:c8 REACHABLE
10.110.11.167 dev bat0 lladdr 34:2d:0d:41:ee:81 STALE
10.110.13.99 dev bat0 lladdr c0:bd:d1:66:74:99 STALE
10.110.9.110 dev bat0  FAILED
10.110.14.230 dev bat0 lladdr 44:c3:46:0b:de:26 STALE
10.110.11.132 dev bat0 lladdr ec:10:7b:8d:e1:5a REACHABLE
10.110.13.156 dev bat0 lladdr 1c:7b:23:25:dc:0a STALE
10.110.9.79 dev bat0 lladdr 38:ca:da:5c:51:ba STALE
10.110.10.26 dev bat0 lladdr 00:ae:fa:43:11:6f REACHABLE
10.110.12.128 dev bat0 lladdr 44:74:6c:dc:2a:ba REACHABLE
10.110.14.158 dev bat0 lladdr 88:63:df:2a:53:10 REACHABLE
10.110.12.183 dev bat0 lladdr 98:52:b1:38:bf:10 REACHABLE
10.110.13.47 dev bat0 lladdr 3c:05:18:8a:02:18 REACHABLE
10.110.13.176 dev bat0 lladdr d4:28:d5:33:a3:e7 REACHABLE
10.110.14.41 dev bat0 lladdr 50:8f:4c:da:bb:6f STALE
10.110.12.70 dev bat0 lladdr 00:53:84:ad:41:cf REACHABLE
10.110.11.2 dev bat0 lladdr 24:18:1d:3e:e2:08 STALE
10.110.10.93 dev bat0 lladdr 5c:70:a3:43:1d:12 REACHABLE
10.110.12.203 dev bat0 lladdr 34:23:ba:4c:25:38 STALE
10.110.13.145 dev bat0 lladdr f0:27:65:dc:31:5a REACHABLE
10.110.13.104 dev bat0  FAILED
10.110.14.14 dev bat0 lladdr 5c:70:a3:6a:0f:59 STALE
10.110.11.135 dev bat0 lladdr 60:a4:d0:fd:0f:69 REACHABLE
172.31.2.72 dev ens19 lladdr f2:0b:73:eb:5f:db PROBE
10.110.14.184 dev bat0 lladdr a0:10:81:23:35:2d STALE
10.110.15.226 dev bat0 lladdr 9c:e0:63:34:09:ee REACHABLE
172.31.2.92 dev ens19 lladdr 2a:e4:dc:4e:03:1a STALE
10.110.15.109 dev bat0 lladdr e8:93:09:b1:d1:04 REACHABLE
10.110.9.176 dev bat0  FAILED
10.110.12.151 dev bat0 lladdr 2c:0e:3d:8f:9e:65 STALE
172.31.2.51 dev ens19  INCOMPLETE
10.110.14.9 dev bat0 lladdr a0:10:81:d9:c5:81 REACHABLE
10.110.12.38 dev bat0  FAILED
10.110.9.159 dev bat0 lladdr b8:53:ac:21:b9:44 STALE
172.31.2.71 dev ens19  INCOMPLETE
10.110.10.198 dev bat0 lladdr 1c:7b:21:68:84:4f REACHABLE
10.110.12.44 dev bat0 lladdr b4:74:43:7f:2f:93 STALE
10.110.12.222 dev bat0 lladdr a8:7c:01:aa:24:32 REACHABLE
10.110.14.244 dev bat0 lladdr c0:11:73:75:9d:6f REACHABLE
10.110.15.194 dev bat0 lladdr 80:4e:81:f7:f4:83 REACHABLE
10.110.11.68 dev bat0  FAILED
10.110.14.121 dev bat0 lladdr b4:bf:f6:10:a2:29 REACHABLE
10.110.11.201 dev bat0 lladdr c4:86:e9:af:d1:de STALE
10.110.9.15 dev bat0 lladdr 08:ee:8b:f9:89:c3 REACHABLE
10.110.12.105 dev bat0 lladdr e0:19:1d:5e:62:16 STALE
10.110.10.173 dev bat0 lladdr a0:cb:fd:f6:aa:b8 REACHABLE
10.110.15.122 dev bat0 lladdr 74:72:b0:ed:3c:24 REACHABLE
10.110.9.144 dev bat0  FAILED
10.110.13.147 dev bat0 lladdr 00:d5:06:d2:7a:bd REACHABLE
172.31.2.41 dev ens19  INCOMPLETE
10.110.13.126 dev bat0 lladdr 58:c5:cb:33:81:b3 REACHABLE
10.110.10.29 dev bat0 lladdr 44:6e:e5:5e:bd:72 STALE
10.110.11.30 dev bat0 lladdr 60:d8:19:01:a0:f7 REACHABLE
10.110.9.178 dev bat0 lladdr a0:cb:fd:8e:80:68 STALE
10.110.15.70 dev bat0 lladdr 8c:f5:a3:be:aa:d5 REACHABLE
10.110.11.200 dev bat0 lladdr b4:bf:f6:4b:23:55 REACHABLE
10.110.12.67 dev bat0 lladdr 7c:7d:3d:a7:c7:c0 STALE
172.31.2.61 dev ens19  INCOMPLETE
10.110.11.77 dev bat0 lladdr e8:93:09:9d:87:6a STALE
10.110.15.254 dev bat0 lladdr 30:07:4d:04:6c:15 STALE
10.110.11.5 dev bat0 lladdr 5c:51:81:34:6d:d7 REACHABLE
10.110.12.210 dev bat0 lladdr 8c:eb:c6:b1:fc:1e REACHABLE
172.31.2.42 dev ens19 lladdr 7e:c1:a2:71:2e:6c REACHABLE
10.110.10.155 dev bat0 lladdr 7c:1c:68:9f:66:6c REACHABLE
10.110.10.114 dev bat0 lladdr 54:f2:01:f6:fe:0d REACHABLE
10.110.9.40 dev bat0  FAILED
172.31.2.32 dev ens19 lladdr 06:4a:51:c9:27:ef STALE
10.110.11.240 dev bat0  FAILED
10.110.12.66 dev bat0 lladdr 3c:f7:a4:cf:e6:73 REACHABLE
172.31.2.62 dev ens19 lladdr ca:6f:10:46:d7:7e STALE
10.110.9.187 dev bat0 lladdr 34:69:87:c9:d0:f4 STALE
10.110.10.134 dev bat0 lladdr 00:6b:8e:bf:f1:d5 STALE
172.31.2.52 dev ens19 lladdr d6:57:ac:b7:1c:07 DELAY
10.110.10.62 dev bat0 lladdr 00:25:d3:e3:41:eb REACHABLE
10.110.13.206 dev bat0 lladdr 3a:3b:ff:e1:84:62 STALE
10.110.15.13 dev bat0 lladdr 54:40:ad:61:ba:e0 REACHABLE
10.110.15.197 dev bat0  FAILED
10.110.14.255 dev bat0 lladdr 00:cd:fe:d2:72:e1 DELAY
10.110.15.78 dev bat0 lladdr 10:30:47:c4:5f:8d REACHABLE
10.110.9.100 dev bat0 lladdr 00:16:dc:6f:5d:78 REACHABLE
10.110.15.248 dev bat0 lladdr bc:3d:85:55:4d:ab STALE
10.110.14.5 dev bat0 lladdr 8c:bf:a6:1a:c7:a9 REACHABLE
10.110.10.102 dev bat0 lladdr f4:42:8f:af:38:55 REACHABLE
10.110.14.56 dev bat0 lladdr a0:b4:a5:23:25:ef REACHABLE
10.110.14.19 dev bat0 lladdr 9c:e0:63:34:08:dc REACHABLE
10.110.14.234 dev bat0 lladdr f4:5c:89:8d:7e:1d STALE
10.110.12.173 dev bat0 lladdr 1c:15:1f:e3:9e:09 REACHABLE
10.110.11.64 dev bat0 lladdr 54:14:73:8f:63:e6 STALE
172.31.2.11 dev ens19  INCOMPLETE
172.31.2.1 dev ens19 lladdr 6a:cd:86:e2:0f:fa REACHABLE
10.110.14.90 dev bat0 lladdr 7c:2e:dd:4c:86:09 STALE
10.110.11.84 dev bat0 lladdr 48:88:ca:55:f9:46 STALE
172.31.2.31 dev ens19  INCOMPLETE
10.110.9.201 dev bat0  FAILED
172.31.2.21 dev ens19  INCOMPLETE
10.110.15.66 dev bat0 lladdr 08:fd:0e:4a:a8:2f STALE
10.110.10.254 dev bat0 lladdr 58:48:22:59:af:25 STALE
10.110.14.171 dev bat0 lladdr c0:c9:76:f7:9b:96 REACHABLE
172.31.2.2 dev ens19  INCOMPLETE
10.110.13.220 dev bat0 lladdr e0:aa:96:29:ca:4a REACHABLE
10.110.12.192 dev bat0 lladdr e4:f8:ef:57:7c:97 STALE
10.110.12.28 dev bat0 lladdr 04:d6:aa:a0:bb:26 REACHABLE
10.110.14.99 dev bat0 lladdr ac:5f:3e:2c:f2:7b REACHABLE
10.110.12.161 dev bat0 lladdr 9c:fc:01:3a:aa:f9 REACHABLE
172.31.2.22 dev ens19 lladdr 02:a9:4f:58:c3:96 REACHABLE
10.110.10.24 dev bat0 lladdr a8:0c:63:ee:33:da REACHABLE
172.31.2.91 dev ens19  INCOMPLETE
fe80::1429:23ae:3092:4f5b dev bat0 lladdr 70:ec:e4:5a:fd:56 STALE
fe80::8eeb:c6ff:feb1:fc1e dev bat0 lladdr 8c:eb:c6:b1:fc:1e STALE
2a03:2260:300b:2ff::22 dev ens19 lladdr 02:a9:4f:58:c3:96 router DELAY
2a03:2260:300b:2ff::91 dev ens19  FAILED
fe80::947c:b20c:aa61:b513 dev bat0 lladdr b0:52:16:26:6c:43 STALE
fe80::528f:4cff:feda:bb6f dev bat0 lladdr 50:8f:4c:da:bb:6f STALE
fe80::44a:51ff:fec9:27ef dev ens19 lladdr 06:4a:51:c9:27:ef router REACHABLE
fe80::2afc:f6ff:fe0c:56e9 dev bat0 lladdr 28:fc:f6:0c:56:e9 STALE
2a03:2260:300b:2ff::81 dev ens19  INCOMPLETE
fe80::9ee0:63ff:fe34:9ee dev bat0 lladdr 9c:e0:63:34:09:ee STALE
2a03:2260:300b:2ff::2 dev ens19  INCOMPLETE
2a03:2260:300b:2ff::71 dev ens19  FAILED
fe80::960e:6bff:feb4:ea2 dev bat0 lladdr 94:0e:6b:b4:0e:a2 STALE
fe80::7cc1:a2ff:fe71:2e6c dev ens19 lladdr 7e:c1:a2:71:2e:6c router PROBE
2a03:2260:300b:2ff::61 dev ens19  FAILED
fe80::28e4:dcff:fe4e:31a dev ens19 lladdr 2a:e4:dc:4e:03:1a router STALE
2a03:2260:300b:2ff::51 dev ens19  FAILED
fe80::d457:acff:feb7:1c07 dev ens19  router FAILED
fe80::1cf4:dd03:534c:d302 dev bat0 lladdr b0:ca:68:77:df:88 STALE
2a03:2260:300b:2ff::41 dev ens19  INCOMPLETE
fe80::7a62:56ff:fe4d:64a8 dev bat0 lladdr 78:62:56:4d:64:a8 STALE
fe80::a9:4fff:fe58:c396 dev ens19 lladdr 02:a9:4f:58:c3:96 router STALE
fe80::4e66:41ff:fe12:3311 dev bat0 lladdr 4c:66:41:12:33:11 STALE
fe80::4240:a7ff:fe51:4293 dev bat0 lladdr 40:40:a7:51:42:93 STALE
fe80::beee:7bff:fe04:6f29 dev bat0 lladdr bc:ee:7b:04:6f:29 STALE
fe80::b89f:eeff:feb5:caf5 dev ens19 lladdr ba:9f:ee:b5:ca:f5 router REACHABLE
2a03:2260:300b:2ff::31 dev ens19  INCOMPLETE
fe80::c86f:10ff:fe46:d77e dev ens19  router FAILED
fe80::287:1ff:fea6:8039 dev bat0 lladdr 00:87:01:a6:80:39 STALE
2a03:2260:300b:2ff::21 dev ens19  INCOMPLETE
2a03:2260:300b:2ff::11 dev ens19  FAILED
fe80::cf:86cb:4c3:c0f6 dev bat0 lladdr 5c:8d:4e:49:59:6a STALE
fe80::4a88:caff:fed4:8a5a dev bat0 lladdr 48:88:ca:d4:8a:5a STALE
fe80::68cd:86ff:fee2:ffa dev ens19 lladdr 6a:cd:86:e2:0f:fa router STALE
2a03:2260:300b:2ff::1 dev ens19 lladdr 6a:cd:86:e2:0f:fa router DELAY
fe80::8ea:184d:7aba:dd55 dev bat0 lladdr 70:ec:e4:4c:25:d6 STALE
fe80::d628:d5ff:fe33:a3e7 dev bat0 lladdr d4:28:d5:33:a3:e7 STALE
2a03:2260:300b:2ff::82 dev ens19 lladdr ba:9f:ee:b5:ca:f5 router REACHABLE
fe80::307e:cac3:2aaf:df84 dev bat0 lladdr 60:d8:19:01:a0:f7 STALE
fe80::4fa:3f9b:acd9:4fe dev bat0 lladdr 38:ca:da:b1:7f:a9 STALE
fe80::a21:efff:fe6e:2cb1 dev bat0 lladdr 08:21:ef:6e:2c:b1 STALE
2a03:2260:300b:2ff::72 dev ens19 lladdr f2:0b:73:eb:5f:db router REACHABLE
fe80::147f:6285:ae64:fd68 dev bat0 lladdr f4:31:c3:6d:b7:05 STALE
fe80::14a0:7ac4:19fa:f6f4 dev bat0 lladdr d4:a3:3d:51:6e:db STALE
fe80::d638:9cff:fea0:4ac0 dev bat0 lladdr d4:38:9c:a0:4a:c0 STALE
2a03:2260:300b:2ff::62 dev ens19 lladdr ca:6f:10:46:d7:7e router DELAY
fe80::8f:534e:f247:5c08 dev bat0 lladdr bc:54:36:2d:56:90 STALE
fe80::ea50:8bff:fea1:adba dev bat0 lladdr e8:50:8b:a1:ad:ba STALE
fe80::ae37:43ff:fe50:2f44 dev bat0 lladdr ac:37:43:50:2f:44 STALE
fe80::2a3f:69ff:fec9:8a93 dev bat0 lladdr 28:3f:69:c9:8a:93 STALE
fe80::14ff:a6dd:1559:a9db dev bat0 lladdr 7c:11:be:07:c3:6c REACHABLE
fe80::b247:bfff:fee3:c4d4 dev bat0 lladdr b0:47:bf:e3:c4:d4 STALE
2a03:2260:300b:2ff::42 dev ens19  router FAILED
fe80::f227:65ff:fedc:315a dev bat0 lladdr f0:27:65:dc:31:5a STALE
fe80::f00b:73ff:feeb:5fdb dev ens19 lladdr f2:0b:73:eb:5f:db router STALE
fe80::ff:fe02:901 dev bat0 lladdr 02:00:00:02:09:01 router STALE
fe80::2692:eff:fe98:f806 dev bat0 lladdr 24:92:0e:98:f8:06 STALE
2a03:2260:300b:2ff::32 dev ens19 lladdr 06:4a:51:c9:27:ef router PROBE
fe80::8638:38ff:fe44:4398 dev bat0 lladdr 84:38:38:44:43:98 STALE

The reason I'm searching for a bug in my network is that I get some bursts of

br-client: received packet on bat0 with own address as source address

on my nodes. But with batctl tg | grep macofbrclient I can't find anything. It's like a hidden loop, and I don't know how to find the problem. That's why I'm not sure whether another issue is responsible for this, and why I asked you to try to reproduce it in your network. I found this strange behaviour while searching for the other problem, but the two could be related if you can't reproduce it.

Actions #12

Updated by Sven Eckelmann over 6 years ago

Sven Eckelmann wrote:

Please check whether the ip is really not in `ip neigh`. I would guess that it is actually there with the extra flag "noarp"

Let me show more details about that. The kernel is returning a netlink message for 10.110.8.1 (0a 6e 08 01):

-- Debug: Received Message:
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 76
    .type = 28 <0x1c>
    .flags = 2 <MULTI>
    .seq = 1523633407
    .port = 1342203925
  [PAYLOAD] 60 octets
    02 00 00 00 01 00 00 00 40 00 00 02 08 00 01 00 ........@.......
    0a 6e 08 01 0a 00 02 00 00 00 00 00 00 00 00 00 .n..............
    08 00 04 00 00 00 00 00 14 00 03 00 73 2b 00 00 ............s+..
    03 14 00 00 03 14 00 00 00 00 00 00             ............
---------------------------  END NETLINK MESSAGE   ---------------------------
resolve_mac_from_parse:587 28
resolve_mac_from_parse:602 ndm_state 00000040
resolve_mac_from_parse:607: 10.110.8.1
resolve_mac_from_parse:637
resolve_mac_from_parse:645: 10.110.8.1
resolve_mac_from_parse:656
resolve_mac_from_parse:660
resolve_mac_from_parse:664
resolve_mac_from_cache:714
resolve_mac_from_cache:718

As you can see, the bytes for the mac address ("7e 5c 86 22 14 e3") are missing in this message. This leads me to the conclusion that libnl isn't cleaning out the earlier parsed information correctly.

A quick check of the nlmsg_parse code shows me that two parameters of this function are swapped in the batctl code. Can you please pull the current code from ecsv/translate_debug and try again?

Actions #13

Updated by Sven Eckelmann over 6 years ago

A quick check of the nlmsg_parse code shows me that two parameters of this function are swapped in the batctl code. Can you please pull the current code from ecsv/translate_debug and try again?

Forget it, I just looked at the wrong function declaration in libnl (nla_parse). So it doesn't make sense right now why libnl thinks that the lladdr 7e 5c 86 22 14 e3 is in the shown message.

Actions #14

Updated by Sven Eckelmann over 6 years ago

  • Status changed from Feedback to In Progress

Ok, I have now just parsed your two initial messages and the noarp entry for your local address is 00:00:00:00:00:00. I would therefore assume that your tg has now also an entry for 00:00:00:00:00:00. And this address 00:00:00:00:00:00 is now mapped to some originator (maybe multiple originators) in the global translation table.
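For illustration, the 60-octet payload shown in comment #12 can be decoded by hand. The following Python sketch assumes the standard Linux `struct ndmsg` layout (12 bytes) followed by 4-byte-aligned attributes, and a little-endian host; the function name `decode_neigh_payload` is hypothetical. It only shows how the NOARP state and the all-zero link-layer address can be read from that dump.

```python
import socket
import struct

# Attribute types and neighbor state from linux/neighbour.h
NDA_DST = 1
NDA_LLADDR = 2
NUD_NOARP = 0x40

# Payload bytes copied from the netlink debug dump in comment #12
PAYLOAD_HEX = """
02 00 00 00 01 00 00 00 40 00 00 02 08 00 01 00
0a 6e 08 01 0a 00 02 00 00 00 00 00 00 00 00 00
08 00 04 00 00 00 00 00 14 00 03 00 73 2b 00 00
03 14 00 00 03 14 00 00 00 00 00 00
"""
PAYLOAD = bytes.fromhex("".join(PAYLOAD_HEX.split()))

def decode_neigh_payload(payload):
    """Decode struct ndmsg plus its attributes (little-endian assumed)."""
    # family (u8), pad1 (u8), pad2 (u16), ifindex (s32),
    # state (u16), flags (u8), type (u8)
    family, _pad1, _pad2, ifindex, state, _flags, _type = struct.unpack_from(
        "<BBHiHBB", payload, 0
    )
    attrs = {}
    offset = 12
    while offset + 4 <= len(payload):
        attr_len, attr_type = struct.unpack_from("<HH", payload, offset)
        if attr_len < 4:
            break  # malformed attribute header, stop parsing
        attrs[attr_type] = payload[offset + 4:offset + attr_len]
        offset += (attr_len + 3) & ~3  # attributes are padded to 4 bytes
    lladdr = attrs.get(NDA_LLADDR)
    return {
        "family": family,
        "ifindex": ifindex,
        "noarp": bool(state & NUD_NOARP),
        "dst": socket.inet_ntoa(attrs[NDA_DST]) if NDA_DST in attrs else None,
        "lladdr": ":".join("%02x" % b for b in lladdr) if lladdr else None,
    }
```

Under these assumptions, `decode_neigh_payload(PAYLOAD)` reports the destination 10.110.8.1 in NOARP state with the link-layer address 00:00:00:00:00:00.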

The only thing I could do now in batctl is to do some mac address validation (non-multicast, non-zero). I've prepared a patch in https://patchwork.open-mesh.org/project/b.a.t.m.a.n./patch/20180413181618.24144-1-sven@narfation.org/

Please test it and reply with "Tested-by: Andre Kasper <>" when it works for you.
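The kind of check described here (non-multicast, non-zero) can be sketched as follows. This is a Python illustration of the idea only, not the content of the linked patch (batctl itself is written in C), and the function names are hypothetical.

```python
def parse_mac(text):
    """Parse 'aa:bb:cc:dd:ee:ff' into 6 bytes; raise ValueError if malformed."""
    parts = text.split(":")
    if len(parts) != 6 or any(len(part) != 2 for part in parts):
        raise ValueError("not a MAC address: %r" % text)
    return bytes(int(part, 16) for part in parts)

def is_valid_unicast_mac(text):
    """Accept only MAC addresses that are neither all-zero nor multicast."""
    mac = parse_mac(text)
    if mac == bytes(6):
        return False  # 00:00:00:00:00:00, e.g. from a NOARP neighbor entry
    if mac[0] & 0x01:
        return False  # group bit set: multicast or broadcast address
    return True
```

With such a check, the 00:00:00:00:00:00 address from the NOARP entry would be rejected before any lookup in the global translation table, while an address such as 02:00:00:02:01:01 would pass.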

Actions #15

Updated by Andre Kasper over 6 years ago

Good work. Thank you. You guessed right:

root@node12:~/batctl-debug# batctl tg|grep 00:00:00:00:00:00
 * 00:00:00:00:00:00   -1 [....] (191) be:fd:ef:b6:b3:6b (193) (0x97ac3875)
   00:00:00:00:00:00   -1 [....] (232) c2:57:2a:fe:15:8b (235) (0xc0afcc6c)

I can test it, but I need step-by-step instructions. I'm not experienced enough with these things to do it by myself. Sorry.

Actions #16

Updated by Sven Eckelmann over 6 years ago

You can just get batctl 2018.0 with this change from

git clone git://git.open-mesh.org/batctl.git -b ecsv/issue-353 batctl-issue-353
cd batctl-issue-353
make
sudo ./batctl t 10.110.8.1
Actions #17

Updated by Andre Kasper over 6 years ago

Error - mac address of the ping destination could not be resolved and is not a bat-host name: 10.110.8.1
Actions #18

Updated by Sven Eckelmann over 6 years ago

Sounds like the expected response. Now we only have to discuss the TT content with Antonio and Linus.

Actions #19

Updated by Andre Kasper over 6 years ago

I was unsure if it is the expected response.
To me it's unclear why you aren't using the dat cache if available to find the correct MAC. I would think this would result in correct answers....

Actions #20

Updated by Sven Eckelmann over 6 years ago

Because this is the completely wrong approach. And it would not result in the right answer - batman-adv doesn't send to its own originator. All the translate stuff is made for the global translation table.

Why must translate give you an answer how to reach the originator when batman-adv would not be involved in sending to this IP (yourself)? To be more precise: batman-adv would not even be able to send unicast frames to itself.

Actions #21

Updated by Andre Kasper over 6 years ago

To me this would make sense...

root@node12:~# batctl dc |grep 10.110.8.1
 *      10.110.8.1 02:00:00:02:01:01   -1      0:00

Maybe I don't get it. Sorry if so, but in my mind I would implement translation as follows:
If the IP is available in dc, use its MAC; otherwise do the magic stuff you are doing now.

Benefit:
If you use ip neigh, all LOCALLY known IPs will be resolved.
In my case there is a public IPv4 address configured on a non-batman interface on the same machine. batctl can translate it but will not be able to ping it, because it's not a batman-related MAC.

On the other hand, there are some 192.168.x.x addresses that are in the batman dc and reachable via tr, but are not resolved with ip neigh because the kernel doesn't know about them.

root@node12:~# batctl t 212.23.154.1
00:22:83:d3:51:9c
root@node12:~# batctl tr 00:22:83:d3:51:9c
traceroute to 00:22:83:d3:51:9c (00:22:83:d3:51:9c), 50 hops max, 20 byte packets
00:22:83:d3:51:9c: Destination Host Unreachable
root@node12:~# batctl t 192.168.178.65
Error - mac address of the ping destination could not be resolved and is not a bat-host name: 192.168.178.65
root@node12:~# batctl tr e8:94:f6:90:7e:0a
traceroute to e8:94:f6:90:7e:0a (62:0c:a9:50:55:23), 50 hops max, 20 byte packets
 1: 02:00:00:02:06:00  0.204 ms  0.154 ms  0.150 ms
 2: 62:0c:a9:50:55:23  21.010 ms  22.037 ms  24.758 ms

My use case is to find who is responsible for the misconfigured IPs in my network. It would make sense to me for batctl to resolve the MACs that I'm able to ping and trace with batctl.

Really: I'm not deep enough into this stuff and may not see the whole picture and why this is stupid, but this is what I think about it. Sorry if this is total bullshit.

Actions #22

Updated by Sven Eckelmann over 6 years ago

No, this is not how this is supposed to work. The DAT cache is not there to be accessed directly. The system will ask the DAT via ARP. And when the system accepts the answer from DAT, then batctl translate should use it, and not before that. Otherwise it would not reflect the upper layers' behavior (e.g. a normal ping) at all.

Actions #23

Updated by Andre Kasper over 6 years ago

But the system doesn't use it, because it would never ask DAT for IPs of other networks.
To me it looks like it is possible to translate MACs via dc, because batctl is able to view dc. Also, I guess that the dc content is correct, because otherwise batman would be broken. So I can't follow why it isn't used as the first source of MAC/IP translation, doing the other stuff only if this lookup doesn't match.

I'm a user, not a developer. From my perspective it's all about functionality. I use batctl tr and batctl as a debugging tool. I think this may be the only use case for these commands. If there is an IP 192.168.4.3 in my network and I would like to find out why and where it is, I would traceroute it. I can't do it with layer 3 tools, so I need batctl. It is possible to do it manually by showing and grepping dc and using the MAC for tr. From a user perspective it would make much more sense if this also happened automatically when I translate, traceroute or ping the IP. I can resolve IPs I can't reach via layer 2 ping, and I can't resolve IPs I can reach via batman. Just from a user perspective, and as wishful thinking, I would like the debugging functionality to be able to translate every IP in the batman network, with no need to translate IPs that are not in the batman network (non-batman devices could maybe be filtered out?). But this seems less a bug than a feature request.
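
The manual workaround mentioned above (grepping dc and passing the MAC to tr) could be sketched roughly as follows. This is only an illustration: the helper name `dc_lookup` is hypothetical, and it assumes the `batctl dc` column layout shown earlier in this thread (`* <IPv4> <MAC> <VID> <last-seen>`). It also only sees entries that happen to be in the local DAT cache.

```shell
#!/bin/sh
# dc_lookup IP  (hypothetical helper, not part of batctl)
# Reads `batctl dc` output on stdin and prints the MAC cached for IP.
# Assumed column layout, as shown in this thread:
#   " * <IPv4> <MAC> <VID> <last-seen>"
# Exit status is 0 if an entry was found, 1 otherwise.
dc_lookup() {
    awk -v ip="$1" '
        $2 == ip { print $3; found = 1; exit }
        END      { exit !found }
    '
}

# Possible usage on a mesh node (needs root and a running batman-adv):
#   mac=$(batctl dc | dc_lookup 192.168.178.65) && batctl tr "$mac"
```

Header lines of the `batctl dc` output are skipped implicitly, because their second field never equals the requested address.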

Actions #24

Updated by Sven Eckelmann over 6 years ago

  • Tracker changed from Bug to Feature
  • Subject changed from batctl tr mac resolution defect to Translate layer 3 addresses from non Layer 3 neighbors
  • Status changed from In Progress to New
  • Assignee changed from Andre Kasper to batman-adv developers

Then please find another person to implement it. This is not what I had in mind when implementing it and I would rather remove the translation completely than do that.

And btw., (batctl) traceroute is the wrong tool here. The hops in between have nothing to do with the content of the DAT cache. And the DAT cache is a DHT, so you don't have the complete view of the network.

"If there is an IP 192.168.4.3 in my network and I would like to find out why and where it is, I would traceroute it. I can't do it with layer 3 tools, so I need batctl. It is possible to do it manually by showing and grepping dc and using the MAC for tr. From a user perspective it would make much more sense if this also happened automatically when I translate, traceroute or ping the IP. I can resolve IPs I can't reach via layer 2 ping, and I can't resolve IPs I can reach via batman."

You can't reach IPs in the first place with batctl.

Actions #25

Updated by Sven Eckelmann over 6 years ago

  • Description updated (diff)
Actions #26

Updated by Linus Lüssing over 6 years ago

Can someone explain why there is this noarp entry and which configuration led to this?

Actions #27

Updated by Linus Lüssing over 1 year ago

Rereading this ticket I'm getting the feeling that there are maybe two misunderstandings:

1) While what is in the DAT cache should be correct, it is not complete. A node will distribute MAC<->IPv4 pairs from itself and the bridged-in clients it is responsible for to only 3 other nodes. And only when an unknown IPv4 address is requested via ARP will batman-adv's DAT resolve it via the 3 responsible nodes and add the response to its own DAT cache. So you'd nevertheless need to use ARP over the upper layer first to populate DAT accordingly.

As gateway nodes in a Freifunk setup usually tend to communicate with all/many other hosts, it might appear as if the DAT cache were a complete cache of all MAC<->IPv4 pairs when looking at a gateway's DAT cache. But that's not generally the case.

2) When you do a "batctl tr/ping <IPv4 address>", it should indirectly use the DAT cache (edit: I think). "batctl tr/ping" will ask the Linux kernel to resolve the IPv4 address. Then the Linux kernel will create and transmit an ARP request towards bat0 (if it has no matching entry in "ip neigh" yet). And then batman-adv will either intercept it and create an ARP response directly if it has a matching entry in its DAT cache, without forwarding the ARP request into the mesh, or it will query the 3 nodes which are responsible for this address.

The nice thing about this approach is that it will/should also work if DAT is disabled.
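
To illustrate the resolution path described in 2), here is a rough sketch of checking what the kernel's neighbor table ended up with after an ARP exchange on bat0. The helper name `neigh_lookup` is hypothetical, and the sketch assumes the usual `ip neigh show` line format, where the link-layer address follows the keyword `lladdr`. Whether the ping in the usage comment actually triggers an ARP request on bat0 depends on the local routing setup.

```shell
#!/bin/sh
# neigh_lookup IP  (hypothetical helper, not part of batctl or iproute2)
# Reads `ip neigh show` output on stdin and prints the link-layer address
# the kernel has learned for IP. Entries without an "lladdr" field
# (e.g. FAILED or INCOMPLETE) produce no output and exit status 1.
neigh_lookup() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") { print $(i + 1); found = 1; exit }
        }
        END { exit !found }
    '
}

# Possible usage on a mesh node (interface name bat0 as in this thread):
#   ping -c 1 -W 1 -I bat0 10.110.8.1 >/dev/null 2>&1   # may trigger ARP on bat0
#   mac=$(ip neigh show dev bat0 | neigh_lookup 10.110.8.1) && batctl tr "$mac"
```

Unlike reading the DAT cache directly, this only reports addresses the kernel itself accepted, which matches what a normal upper-layer ping would use.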


Hoping this clarifies things. Can the ticket be closed?
