Bug #261

closed

Refcount for usage wrong again

Added by Ruben Kelevra over 8 years ago. Updated over 7 years ago.

Status: Closed
Priority: Immediate
Assignee:
Target version:
Start date: 06/24/2016
Due date:
% Done: 0%
Estimated time:
Description

We had this bug a while ago, and now it's back. 2016.0 does not have it; I updated to 2016.2 yesterday and now cannot restart my server because batman is blocking the restart.

Please fix this as fast as possible!

Thanks


Actions #1

Updated by Sven Eckelmann over 8 years ago

Since you can reproduce it, please bisect to find which commit introduced the problem. Otherwise we will not be able to fix the problem in time.

You can also force a reboot via:

echo s > /proc/sysrq-trigger   # sync all mounted filesystems
echo b > /proc/sysrq-trigger   # reboot immediately, without unmounting
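
The bisect requested above could look roughly like the following sketch. It runs on a throwaway repository so that it is self-contained; the file names `state` and `counter` and the `grep` check are hypothetical stand-ins. For batman-adv, the endpoints would presumably be the tags from the report (good: v2016.0, bad: v2016.2), and each step would be judged by hand with `git bisect good` or `git bisect bad` after building, installing, and rebooting, instead of `git bisect run`.

```shell
#!/bin/sh
# Demo of the git-bisect workflow on a throwaway repository.
# Everything created here is illustrative, not part of batman-adv.

demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

# Five commits; the third one introduces the simulated regression.
for i in 1 2 3 4 5; do
    if [ "$i" -ge 3 ]; then echo leak > state; else echo ok > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
done

# Known-good endpoint: the root commit. Known-bad endpoint: HEAD.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" > /dev/null

# Exit status 0 marks a commit as good, 1 marks it as bad.
git bisect run sh -c 'grep -q ok state' > /dev/null

# refs/bisect/bad now points at the first bad commit; read it before reset.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1

echo "first bad commit: $first_bad"
```

On a production machine that can only be rebooted occasionally, the same session can stay open across reboots: git keeps the bisect state in the repository until `git bisect reset` is run.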
Actions #2

Updated by Sven Eckelmann over 8 years ago

  • Status changed from New to Feedback
  • Assignee set to Ruben Kelevra
Actions #3

Updated by Sven Eckelmann over 8 years ago

Were you able to find the faulty commit via git-bisect?

There are also a couple of reference-counting fixes in the maint branch. They do not address anything that broke after v2016.0, but maybe your problem was already present in v2016.0.

  • 20df5c5 batman-adv: Free last_bonding_candidate on release of orig_node
    • fix for problem since v2014.1.0
  • 6ecc711 batman-adv: Fix reference leak in batadv_find_router
    • fix for problem since v2014.1.0
  • e401297 batman-adv: Fix non-atomic bla_claim::backbone_gw access
    • fix for problem since v2012.2.0/v2015.2
  • 719afd2 batman-adv: Fix orig_node_vlan leak on orig_node_release
    • fix for problem since v2014.0.0

There are also other fixes in the maint branch.

Actions #4

Updated by Sven Eckelmann over 8 years ago

  • Status changed from Feedback to Closed

Closing because:

  • cannot reproduce problem with maint branch (which contains fixes for pre-v2016.0 problems)
  • not enough information on how to reproduce the problem at all
  • review of the changes between v2016.0 and v2016.2 did not find a problematic change which could cause reference imbalances
  • reporter has not reacted to questions for one month (even though he wanted to have it fixed "as fast as possible")
Actions #5

Updated by Ruben Kelevra over 8 years ago

This issue does not disappear if you close this ticket, so please reopen it until it's fixed.

I was not able to reproduce it in a testing environment, and can't reboot a production system several dozen times to do a bisect.

I would still appreciate a fast fix since I have to shut down this system hard on every reboot since I installed this update.

Actions #6

Updated by Marek Lindner over 8 years ago

  • Status changed from Closed to In Progress

Ruben Kelevra wrote:

> This issue does not disappear if you close this ticket, so please reopen it until it's fixed.
>
> I was not able to reproduce it in a testing environment, and can't reboot a production system several dozen times to do a bisect.
>
> I would still appreciate a fast fix since I have to shut down this system hard on every reboot since I installed this update.

The ticket was closed mainly due to your perceived inactivity. Apart from the initial ticket description, you did not respond, which led us to believe the issue had been forgotten.

Please note that with the limited information you provided we are unable to fix the issue (though we tried).

Since you have to reboot your system every now and then, wouldn't it be possible to bisect the issue on your live system over a longer period of time? That would provide us with the information we need to get to a fix.

In short: without more support from you or somebody else who can reproduce the issue, we are unable to address the bug, no matter how long the ticket stays open.

Actions #7

Updated by Sven Eckelmann over 8 years ago

Ruben Kelevra wrote:

> This issue does not disappear if you close this ticket, so please reopen it until it's fixed.

How should we know whether it is fixed when you are not testing anything, answering questions, or giving status updates? We have to time out tickets like this. Otherwise, tickets with MIA reporters and unreproducible problems would crowd out the relevant ones.

And ticket updates like "you still have not fixed it and I haven't tested anything" don't count.

> I would still appreciate a fast fix since I have to shut down this system hard on every reboot since I installed this update.

I have several servers running (and rebooting) perfectly fine with the maint branch. Let's start there. If this doesn't work, then try to do a slow bisect (as suggested by Marek). Just give some status updates from time to time and try to respond.

Actions #8

Updated by Sven Eckelmann about 8 years ago

What is the state of the tests?

Actions #9

Updated by Sven Eckelmann almost 8 years ago

  • Status changed from In Progress to Feedback
Actions #10

Updated by Sven Eckelmann over 7 years ago

Ruben, have you gathered any information or anything else that we can use to reproduce it? I was not successful when I tried to reproduce it (multiple times), even when we tried on some Freifunk gateway servers (used by 300 gluon nodes as VPN servers).

Actions #11

Updated by Sven Eckelmann over 7 years ago

ping?

Actions #12

Updated by Ruben Kelevra over 7 years ago

I’m sorry for the late response; this bug is gone. I am not having any issues with 2017.1 :)

Actions #13

Updated by Sven Eckelmann over 7 years ago

  • Status changed from Feedback to Closed
  • Target version set to 2017.1