Bug #380
Closed: Too high number of OGMs after interface up->down->up
100%
Description
There seems to be a problem with an excessive number of management frames on some devices/firmwares (Gluon). The culprit seems to be the OGM interval on these devices.
The details are still slightly unclear, but Linus wrote the following on IRC:
ok, I think I found the issue. it seems that the interface activate functions are called twice, with a deactivate in between
however each activate starts OGM timer
so there are basically two timers running, explaining why we see this 2s, 3s, 2s, 3s interval for OGMs
and we calculated and compared statistics with compat14 / batman-adv-legacy. there, we do not have this issue
I'm wondering whether I might have messed something up with the forw-packet changes I had made somewhere around v2016.x
and yeah, it's reproducible in VMs. "batctl if add dummy0; ip link set up dev dummy0" -> 1x OGM per interval. then "ip link set down dummy0; ip link set up dev dummy0" -> 2x OGMs per interval
nope, not my forw-packet fix in v2016.5. the issue was introduced with the API changes in v2016.3
v2016.2 -> good, v2016.3 -> bad
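The alternating 2s, 3s pattern described above is what two periodic timers with the same interval produce when the second one starts partway through the first one's period. The following is only a toy model, not batman-adv code: the function name `ogm_gaps`, the 5000 ms interval and the 2000 ms offset are assumptions chosen for illustration, and the jitter that real OGM scheduling applies is ignored.

```python
def ogm_gaps(interval_ms, timer_offsets_ms, duration_ms):
    """Return the gaps (in ms) between consecutive OGMs when several
    periodic timers with the same interval run side by side.

    Hypothetical helper for illustration; each entry in
    timer_offsets_ms is the time at which one timer fires first.
    """
    fire_times = sorted(
        offset + n * interval_ms
        for offset in timer_offsets_ms
        for n in range((duration_ms - offset) // interval_ms + 1)
    )
    # Distance between each OGM and the next one on the wire.
    return [later - earlier for earlier, later in zip(fire_times, fire_times[1:])]


# One timer: a steady 5 s spacing.
print(ogm_gaps(5000, [0], 20000))
# -> [5000, 5000, 5000, 5000]

# A second timer started 2 s after the first: gaps alternate 2 s / 3 s.
print(ogm_gaps(5000, [0, 2000], 20000))
# -> [2000, 3000, 2000, 3000, 2000, 3000, 2000, 3000]
```

The two gaps always add up to the configured interval, so a capture showing this pattern points at a duplicated timer rather than a wrong interval setting.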
Updated by Sven Eckelmann over 5 years ago
It can be reproduced in an Emulation Environment setup with kernel 4.4.180. The commands to run in the virtual machine are:
    rmmod batman-adv || true
    insmod /host/batman-adv/net/batman-adv/batman-adv.ko
    /host/batctl/batctl if add dummy0
    /host/batctl/batctl it 5000
    /host/batctl/batctl if add enp0s3
    ip link set up dev enp0s3
    ip link set up dev bat0

    ip link set down dev dummy0
    sleep 2
    ip link set up dev dummy0
A Wireshark process was attached to the tap interface to analyze the interval. A third OGM (besides the two expected ones for the two interfaces) appeared when I ran the lower part (down + sleep + up).
This behavior was introduced with commit 0d8468553c3c ("batman-adv: remove ogm_emit and ogm_schedule API calls")
Updated by Sven Eckelmann over 5 years ago
The problem in that patch is that bat_iface_activate is used to start the sending of OGMs, directly via batadv_iv_ogm_schedule from batadv_hardif_activate_interface. In the past, this was done by batadv_schedule_bat_ogm, which was called from batadv_hardif_enable_interface.
One difference here is that batadv_hardif_activate_interface is only called by batadv_hardif_enable_interface when batadv_hardif_is_iface_up is true. BUT this dependency on IFF_UP also requires that batadv_hardif_activate_interface is called again when we are notified via NETDEV_UP that an interface went up. Thus every time we receive a NETDEV_UP event, a new OGM is scheduled :(
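The call flow described above can be replayed in a small toy model. This is a sketch under stated assumptions, not kernel code: the class `HardIf`, its `schedule_on_activate` switch and the `replay` helper are hypothetical, and the model assumes (as the observed behaviour suggests) that an interface going down does not cancel an OGM worker that is already pending. The second variant, which schedules once at enable time as the pre-v2016.3 code did, is only one possible way to avoid the duplication and is not necessarily what the proposed patch does.

```python
class HardIf:
    """Toy model of a batman-adv hard interface (illustration only)."""

    def __init__(self, schedule_on_activate):
        # True  -> OGM scheduling starts in activate() (v2016.3 behaviour
        #          as described in this issue)
        # False -> OGM scheduling starts once in enable() (older behaviour)
        self.schedule_on_activate = schedule_on_activate
        self.active = False
        self.ogm_workers = 0  # number of OGM timers running in parallel

    def enable(self, iface_is_up):
        """Models batadv_hardif_enable_interface ("batctl if add")."""
        if not self.schedule_on_activate:
            self.ogm_workers += 1
        if iface_is_up:
            self.activate()

    def activate(self):
        """Models batadv_hardif_activate_interface."""
        if self.active:
            return
        self.active = True
        if self.schedule_on_activate:
            self.ogm_workers += 1

    def netdev_up(self):
        """Models the NETDEV_UP notification."""
        self.activate()

    def netdev_down(self):
        """Interface goes down; assumed not to cancel pending OGM workers."""
        self.active = False


def replay(iface):
    """Replay: add (interface down), up, down, up. Return the worker count."""
    iface.enable(iface_is_up=False)
    iface.netdev_up()
    iface.netdev_down()
    iface.netdev_up()
    return iface.ogm_workers


print(replay(HardIf(schedule_on_activate=True)))   # -> 2
print(replay(HardIf(schedule_on_activate=False)))  # -> 1
```

In the first variant every further down/up cycle adds one more worker, which matches the extra OGM seen in the capture after the down + sleep + up part of the test.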
Updated by Sven Eckelmann over 5 years ago
- Status changed from New to In Progress
- Assignee changed from Linus Lüssing to Sven Eckelmann
- Target version set to 2019.3
Updated by Sven Eckelmann over 5 years ago
- Status changed from In Progress to Resolved
Patch was proposed at https://patchwork.open-mesh.org/project/b.a.t.m.a.n./patch/20190602095135.15604-1-sven@narfation.org/
- 2019.2 backport: https://github.com/openwrt-routing/packages/pull/475
- 2018.1 backport: https://github.com/openwrt-routing/packages/pull/474
- 2016.5 backport: https://github.com/openwrt-routing/packages/pull/473
Updated by Sven Eckelmann over 5 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
Thanks for the test and the ack. It is now queued up for 2019.3