Bug #416
B.A.T.M.A.N. V: include packet loss in link throughput estimation
Status: open
Description
Scenario:
I have two dual-radio APs (one 2.4GHz and one 5GHz radio each, both ath10k).
The APs are placed in two different rooms with several walls in between. Because of that, meshing over 5GHz is quite unreliable.
Problem:
batman-adv often selects the route over the 5GHz radio because its tx rate (used to estimate the throughput) is often higher.
This route selection, however, turns out to be a very bad choice because the packet loss makes the 5GHz link unusable (I can hardly ping the other AP with batctl p).
(I wonder though, why is the tx rate often this high if packet loss is high as well...?)
Proposal:
One way to mitigate this issue would be to include the packet loss in the 1-hop link throughput estimation logic.
Mixing throughput and packet loss can be quite complicated, therefore I would like to keep it simple: i.e. when packet loss over a link is above 50%, drop the throughput to 0.1Mbps.
This way that link is heavily penalized and excluded from the routing (unless it's the only choice we have).
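To make the proposal concrete, here is a minimal sketch of the penalty rule in Python (not batman-adv code). All names are hypothetical; only the 50% threshold and the 0.1Mbps penalty come from the text above. Capping with min() so that an already slower link is never raised to the penalty value is my own assumption.

```python
# Hypothetical names; the two constants come from the proposal above.
LOSS_THRESHOLD = 0.5            # penalize links losing more than 50% of packets
PENALTY_THROUGHPUT_MBPS = 0.1   # throughput assigned to a penalized link


def effective_throughput_mbps(estimated_mbps: float, packet_loss: float) -> float:
    """Return the throughput to advertise for a 1-hop link."""
    if not 0.0 <= packet_loss <= 1.0:
        raise ValueError("packet_loss must be a fraction between 0 and 1")
    if packet_loss > LOSS_THRESHOLD:
        # Assumption: never raise a link that is already slower than the penalty.
        return min(estimated_mbps, PENALTY_THROUGHPUT_MBPS)
    return estimated_mbps


def pick_link(links: dict) -> str:
    """Pick the link name with the highest effective throughput.

    `links` maps a link name to (estimated_mbps, packet_loss).
    """
    return max(links, key=lambda name: effective_throughput_mbps(*links[name]))


# Made-up numbers resembling the scenario: fast but lossy 5GHz, slow but clean 2.4GHz.
links = {"5GHz": (400.0, 0.7), "2.4GHz": (60.0, 0.05)}
print(pick_link(links))                      # 2.4GHz
print(pick_link({"5GHz": (400.0, 0.7)}))     # 5GHz (still usable as the only choice)
```

The penalized link keeps a non-zero throughput, so it is still selected when no alternative exists.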
To measure the 1-hop packet loss we could either use the OGMs (similarly to what we did in B.A.T.M.A.N. IV, but it may become ugly quite fast) or we could rely on counting the received ELPs and sending back a periodic report to the sender.
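A rough sketch of the ELP-counting variant, in Python for illustration only. It assumes the receiver can read a 32-bit sequence number from each ELP and that ELPs arrive in order; the class and method names are hypothetical, and how the report travels back to the sender is left out.

```python
SEQNO_MOD = 2 ** 32  # assumed width of the ELP sequence number


class ElpLossEstimator:
    """Receiver-side loss estimate for one neighbor, based on ELP seqnos."""

    def __init__(self):
        self._base = None      # last seqno covered by the previous report
        self._last = None      # newest seqno seen in the current interval
        self._received = 0     # ELPs received in the current interval

    def on_elp(self, seqno: int) -> None:
        if self._base is None:
            # First ELP ever: start counting from the packet just before it.
            self._base = (seqno - 1) % SEQNO_MOD
        self._last = seqno
        self._received += 1

    def report(self):
        """Return the loss fraction for this interval, or None if unknown.

        Receiving nothing at all yields None here; a real implementation
        would also need the ELP interval to detect a completely dead link.
        """
        if self._received == 0:
            return None
        expected = (self._last - self._base) % SEQNO_MOD
        received = self._received
        self._base = self._last
        self._received = 0
        if expected == 0:
            return None  # only duplicates were seen
        return max(0.0, 1.0 - received / expected)


est = ElpLossEstimator()
for seqno in (10, 11, 13, 14):   # seqno 12 was lost
    est.on_elp(seqno)
print(est.report())              # about 0.2 (1 of 5 expected ELPs missing)
```

The modulo arithmetic keeps the expected count correct across a sequence number wrap-around.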
Opinions? Comments?
Updated by Simon Wunderlich over 4 years ago
Shouldn't we discuss this on the mailing list? :)
But anyway: Did you check where the throughput estimation comes from? It could be that only the selected modulation is reported, and not an actual throughput estimate based on modulation + success rate (like we get from minstrel with ath9k).
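To illustrate the difference between the two kinds of estimate, here is a deliberately simplified sketch in Python. It is not how minstrel computes its value (the real calculation is more involved); the function name and all numbers are made up.

```python
def expected_throughput_mbps(modulation_rate_mbps: float, success_ratio: float) -> float:
    """Simplified estimate: modulation rate scaled by delivery success."""
    if not 0.0 <= success_ratio <= 1.0:
        raise ValueError("success_ratio must be between 0 and 1")
    return modulation_rate_mbps * success_ratio


# Made-up numbers: a high modulation rate with few successful transmissions
# versus a lower rate that is almost always delivered.
lossy_5ghz = expected_throughput_mbps(400.0, 0.1)
clean_2ghz = expected_throughput_mbps(60.0, 0.9)

# Comparing raw modulation rates prefers 5GHz; the scaled estimate does not.
print(400.0 > 60.0)              # True
print(lossy_5ghz > clean_2ghz)   # False
```

If only the modulation rate is reported, the first comparison is all the routing protocol gets to see.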
If we have the choice between ELP and OGM to implement this extension, I think ELP is the better place since this is really about the link.
Updated by Sven Eckelmann over 4 years ago
Yes, the throughput is not actually the expected throughput for ath10k but the modulation rate. It is a "regression" caused by this commit: https://git.open-mesh.org/batman-adv.git/commit/dcb63377d9de914e15c500aead14f28c4eda8d23
Updated by Linus Lüssing over 4 years ago
From skimming the 802.11s code, it seems that 802.11s is able to do a more reliable, driver-independent throughput estimation here:
https://elixir.bootlin.com/linux/v5.8.2/source/net/mac80211/mesh_hwmp.c#L295
It does so by checking the IEEE80211_TX_STAT_ACK flag in "(struct ieee80211_tx_info *)txinfo->flags" and by reading "&sta->tx_stats.last_rate". The result should then be stored in "&sta->mesh->tx_rate_avg".
Maybe that's a more reliable source for us as well, especially for ath10k?
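A rough Python model of that idea: keep a moving average of ACK failures and of the last tx rate per neighbor, updated on every tx status. This is a sketch, not the mac80211 code. The exponentially weighted average, its weight, all names, and especially the way the two averages are combined into one throughput value are my assumptions.

```python
class Ewma:
    """Exponentially weighted moving average (the weight is an assumption)."""

    def __init__(self, weight: float = 0.125):
        self.weight = weight
        self.value = None

    def add(self, sample: float) -> float:
        if self.value is None:
            self.value = float(sample)
        else:
            self.value += self.weight * (sample - self.value)
        return self.value


class LinkStats:
    """Per-neighbor statistics fed from tx status reports (hypothetical)."""

    def __init__(self):
        self.fail_avg = Ewma()      # percent of frames that were not ACKed
        self.tx_rate_avg = Ewma()   # average of the last used rate, in Mbit/s

    def on_tx_status(self, acked: bool, last_rate_mbps: float) -> None:
        # Feed a failure as 100 and a success as 0, so the average is a percentage.
        self.fail_avg.add(0 if acked else 100)
        self.tx_rate_avg.add(last_rate_mbps)

    def estimated_throughput_mbps(self):
        """Assumed combination: average rate scaled by the ACK success ratio."""
        if self.tx_rate_avg.value is None:
            return None
        return self.tx_rate_avg.value * (1.0 - self.fail_avg.value / 100.0)


stats = LinkStats()
for _ in range(10):
    stats.on_tx_status(acked=True, last_rate_mbps=400.0)
print(stats.estimated_throughput_mbps())   # 400.0
for _ in range(50):
    stats.on_tx_status(acked=False, last_rate_mbps=400.0)
print(stats.estimated_throughput_mbps())   # close to 0 although the rate is still 400
```

Because the ACK status comes from mac80211 rather than from the driver's rate control, an estimate built this way would not depend on what the driver reports as expected throughput.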