Bug #428
closedTT local memory leak / RCU race condition on shutdown
100%
Description
On shutdown/rmmod of batman-adv.ko with batman-adv compiled from the main branch Slab complains with the following error message about a memory leak in the batman-adv TT local cache (full dmesg output in attachment):
[ 112.208633] =============================================================================
[ 112.210359] BUG batadv_tl_cache (Tainted: G OE ): Objects remaining in batadv_tl_cache on __kmem_cache_shutdown()
[ 112.211943] -----------------------------------------------------------------------------
[ 112.212517] Slab 0xffffe8afc0216d00 objects=16 used=1 fp=0xffff93f4085b4340 flags=0xfffffc0000a00(workingset|slab|node=0|zone=1|lastcpupid=0x1fffff)
[ 112.212517] CPU: 1 PID: 776 Comm: rmmod Tainted: G OE 6.8.12-amd64 #1 Debian 6.8.12-1
[ 112.212517] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 112.212517] Call Trace:
[ 112.212517] <TASK>
[ 112.212517] dump_stack_lvl+0x64/0x80
[ 112.212517] slab_err+0xe6/0x120
[ 112.212517] __kmem_cache_shutdown+0x160/0x2e0
[ 112.212517] kmem_cache_destroy+0x55/0x160
[ 112.220849] batadv_tt_cache_destroy+0x15/0x60 [batman_adv]
[ 112.220849] __do_sys_delete_module+0x1d5/0x320
[ 112.220849] do_syscall_64+0x83/0x190
[ 112.220849] ? do_syscall_64+0x8f/0x190
[ 112.220849] ? exc_page_fault+0x7f/0x180
[ 112.220849] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 112.224478] RIP: 0033:0x7f2ac8434977
[ 112.224478] Code: 73 01 c3 48 8b 0d a9 94 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 94 0c 00 f7 d8 64 89 01 48
[ 112.224478] RSP: 002b:00007ffe0adf6138 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 112.224478] RAX: ffffffffffffffda RBX: 000055db9018e770 RCX: 00007f2ac8434977
[ 112.224478] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055db9018e7d8
[ 112.224478] RBP: 0000000000000000 R08: 1999999999999999 R09: 0000000000000000
[ 112.224478] R10: 00007f2ac84a6ac0 R11: 0000000000000206 R12: 00007ffe0adf6390
[ 112.224478] R13: 000055db9018e770 R14: 000055db9018d2a0 R15: 0000000000000000
[ 112.233961] </TASK>
[ 112.233961] Disabling lock debugging due to kernel taint
[ 112.233961] Object 0xffff93f4085b4140 @offset=320
[ 112.233961] Allocated in batadv_tt_local_add+0x297/0xa20 [batman_adv] age=15835 cpu=1 pid=755
[ 112.233961] batadv_tt_local_add+0x297/0xa20 [batman_adv]
[ 112.233961] batadv_interface_set_mac_addr+0xf6/0x120 [batman_adv]
[ 112.233961] dev_set_mac_address+0xde/0x140
[ 112.233961] dev_set_mac_address_user+0x30/0x50
[ 112.233961] do_setlink+0x261/0x12d0
[ 112.233961] rtnl_setlink+0x11f/0x1d0
[ 112.233961] rtnetlink_rcv_msg+0x152/0x3c0
[ 112.241772] netlink_rcv_skb+0x5b/0x110
[ 112.241772] netlink_unicast+0x1a6/0x290
[ 112.241772] netlink_sendmsg+0x223/0x490
[ 112.241772] __sys_sendto+0x1df/0x1f0
[ 112.241772] __x64_sys_sendto+0x24/0x30
[ 112.241772] do_syscall_64+0x83/0x190
[ 112.241772] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 112.245994] ------------[ cut here ]------------
[ 112.246650] kmem_cache_destroy batadv_tl_cache: Slab cache still has objects when called from batadv_tt_cache_destroy+0x15/0x60 [batman_adv]
[ 112.246668] WARNING: CPU: 1 PID: 776 at mm/slab_common.c:493 kmem_cache_destroy+0x14d/0x160
[ 112.249584] Modules linked in: veth batman_adv(OE-) cfg80211 rfkill bridge stp llc libcrc32c crc32c_generic crc16 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver binfmt_misc pcspkr button joydev evdev serio_raw loop dm_mod efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables autofs4 nfsv3 nfs_acl nfs lockd grace sunrpc 9pnet_rdma rdma_cm iw_cm ib_cm ib_core configfs 9p netfs ata_generic ata_piix libata psmouse scsi_mod 9pnet_virtio i2c_piix4 9pnet e1000 scsi_common floppy crypto_simd cryptd
[ 112.256555] CPU: 1 PID: 776 Comm: rmmod Tainted: G B OE 6.8.12-amd64 #1 Debian 6.8.12-1
[ 112.258457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 112.260410] RIP: 0010:kmem_cache_destroy+0x14d/0x160
[ 112.261687] Code: 00 eb be 5b 5d 41 5c 41 5d c3 cc cc cc cc 48 8b 53 60 48 8b 4c 24 20 48 c7 c6 60 d5 e3 98 48 c7 c7 b8 ec 2d 99 e8 43 0d d8 ff <0f> 0b e9 e2 fe ff ff c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90
[ 112.265219] RSP: 0018:ffffb3b2806e7e48 EFLAGS: 00010282
[ 112.266044] RAX: 0000000000000000 RBX: ffff93f4270a2640 RCX: 0000000000000027
[ 112.267157] RDX: ffff93f43c521708 RSI: 0000000000000001 RDI: ffff93f43c521700
[ 112.268268] RBP: 000055db9018e7d8 R08: 0000000000000000 R09: ffffb3b2806e7cd8
[ 112.269418] R10: ffffb3b2806e7cd0 R11: 0000000000000003 R12: 0000000080012d00
[ 112.270572] R13: ffffb3b2806e7f58 R14: 0000000000000000 R15: 0000000000000000
[ 112.271699] FS: 00007f2ac8308440(0000) GS:ffff93f43c500000(0000) knlGS:0000000000000000
[ 112.273001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 112.273923] CR2: 00005584ef830110 CR3: 000000000787c000 CR4: 00000000000006f0
[ 112.275050] Call Trace:
[ 112.275464] <TASK>
[ 112.275810] ? kmem_cache_destroy+0x14d/0x160
[ 112.276518] ? __warn+0x81/0x130
[ 112.277043] ? kmem_cache_destroy+0x14d/0x160
[ 112.277730] ? report_bug+0x171/0x1a0
[ 112.278315] ? prb_read_valid+0x1b/0x30
[ 112.278919] ? handle_bug+0x3c/0x80
[ 112.279467] ? exc_invalid_op+0x17/0x70
[ 112.280071] ? asm_exc_invalid_op+0x1a/0x20
[ 112.280741] ? kmem_cache_destroy+0x14d/0x160
[ 112.281603] ? kmem_cache_destroy+0x14d/0x160
[ 112.282489] batadv_tt_cache_destroy+0x15/0x60 [batman_adv]
[ 112.283373] __do_sys_delete_module+0x1d5/0x320
[ 112.284080] do_syscall_64+0x83/0x190
[ 112.284696] ? do_syscall_64+0x8f/0x190
[ 112.285315] ? exc_page_fault+0x7f/0x180
[ 112.285970] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 112.286768] RIP: 0033:0x7f2ac8434977
[ 112.287355] Code: 73 01 c3 48 8b 0d a9 94 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 94 0c 00 f7 d8 64 89 01 48
[ 112.290282] RSP: 002b:00007ffe0adf6138 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 112.291465] RAX: ffffffffffffffda RBX: 000055db9018e770 RCX: 00007f2ac8434977
[ 112.292595] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055db9018e7d8
[ 112.293724] RBP: 0000000000000000 R08: 1999999999999999 R09: 0000000000000000
[ 112.294863] R10: 00007f2ac84a6ac0 R11: 0000000000000206 R12: 00007ffe0adf6390
[ 112.295982] R13: 000055db9018e770 R14: 000055db9018d2a0 R15: 0000000000000000
[ 112.297103] </TASK>
[ 112.297465] ---[ end trace 0000000000000000 ]---
The system is an x86-64 KVM instance running the latest Debian Sid, including the Linux kernel as provided by Debian through the package linux-image-6.8.12-amd64 / 6.8.12-1. The batman-adv kernel module was compiled with gcc 13.2.0-25, as provided by Debian.
qemu-system-x86-64 command:
$ qemu-system-x86_64 -enable-kvm -nodefaults -smp 2 -m 1024M -serial mon:stdio -nographic -kernel /home/linus/mesh-node/boot/vmlinuz-6.8.
12-amd64 -initrd /home/linus/mesh-node/boot/initrd.img-6.8.12-amd64 -append nodeid=1 slub_debug console=ttyS0 uts=1 rootflags=ro ro net
.ifnames=1 ip=192.168.23.2::192.168.23.1:255.255.255.0:mesh-node:ens2:off root=/dev/nfs nfsroot=192.168.23.1:/home/linus/mesh-node -dev
ice e1000,netdev=net0,mac=02:04:64:a4:39:b1 -netdev tap,id=net0,ifname=vmtap0,script=/home/linus/dev-priv/vm/ens2-ifup.sh -device e1000
,netdev=net1,mac=02:04:64:a4:39:c1 -netdev tap,id=net1,ifname=vmtap1,script=no -device e1000,netdev=net2,mac=02:04:64:a4:39:d1 -netdev
tap,id=net2,ifname=vmtap2,script=no -device e1000,netdev=net3,mac=02:04:64:a4:39:e1 -netdev tap,id=net3,ifname=vmtap3,script=no -device
e1000,netdev=net4,mac=02:04:64:a4:39:f1 -netdev tap,id=net4,ifname=vmtap4,script=no -fsdev local,security_model=passthrough,id=fsdev-r
oot,path=/home/linus/mesh-node -device virtio-9p-pci,id=fs-root,fsdev=fsdev-root,mount_tag=virtfs_root
Within the VM the issue is reliably reproduceable with the following commands (CONFIG_BATMAN_ADV_MCAST=n to keep the TT local cache and therefore the splat short and debugging easier):
$ modprobe crc16; modprobe libcrc32c; modprobe bridge; modprobe cfg80211
$ cd /mnt/batman-adv/
$ make clean && make CONFIG_BATMAN_ADV_MCAST=n
$ insmod ./net/batman-adv/batman-adv.ko
$ ip link add dev veth0 type veth peer name veth1
$ ip link add dev bat0 type batadv
$ ip link set dev veth0 master bat0
$ ip link set up dev veth0
$ ip link set up dev veth1
$ ip link set up dev bat0
$ batctl tl
[B.A.T.M.A.N. adv 2024.1-7-g08e8aefb, MainIF/MAC: veth0/ba:8d:34:7b:53:01 (bat0/da:00:ca:2f:34:50 BATMAN_IV), TTVN: 1]
Client VID Flags Last seen (CRC )
da:00:ca:2f:34:50 -1 [.P....] 0.000 (0x2babba54)
$ rmmod batman-adv.ko
Afterwards an entry in slabinfo remains for batadv_tl_cache, however with 0 objects:
$ cat /proc/slabinfo | grep batadv
batadv_tl_cache 0 16 256 16 1 : tunables 0 0 0 : slabdata 1 1 0
The issue seems to have started to appear with the following commit, reverting it makes the kernel dmesg splat disappear: https://git.open-mesh.org/batman-adv.git/commit/cf9845d4566698731c23cd7a02db956706d058a1
Alternatively, adding 10x calls with msleep(500) between rcu_barrier() and batadv_tt_cache_destroy() in batadv_exit() makes the issue disappear, too.
With extra debug output I also see the kref refcount to decrease to zero for this TT local entry.
So overall it seems that somehow rcu_barrier() call is not waiting long enough for this entry to be freed since switching from call_rcu() to kfree_rcu() for freeing TT local entries?
Update/Note: This can also be reproduced with "-smp 1" instead of "-smp 2" to qemu.
Files
Updated by Linus Lüssing 8 months ago
Interestingly, even when I add msleep()s (here: 3x msleep(100)) before the rcu_barrier(), so that the rcu_barrier() is called quite some time after the kfree_rcu(), the rcu_barrier() call does not seem to wait for a kfree().
[ 8679.302055] ~~~ batadv_tt_local_entry_release:229: before kfree_rcu()
[ 8679.302723] ~~~ batadv_tt_local_entry_release:231: after kfree_rcu()
[ 8679.628280] ~~~ batadv_exit:143: before rcu_barrier()
[ 8679.629267] ~~~ batadv_exit:145: after rcu_barrier()
[ 8679.630066] ~~~ batadv_exit:170: before batadv_tt_cache_destroy()
[ 8679.630582] =============================================================================
[ 8679.631282] BUG batadv_tl_cache (Tainted: G B W OE ): Objects remaining in batadv_tl_cache on __kmem_cache_shutdown()
[ 8679.632233] -----------------------------------------------------------------------------
...
[ 8679.705199] ~~~ batadv_exit:172: after batadv_tt_cache_destroy()
Could it be that kfree_rcu() + rcu_barrier() is buggy?
Updated by Linus Lüssing 8 months ago
- File config-6.8.12-amd64.txt config-6.8.12-amd64.txt added
Updated by Sven Eckelmann 8 months ago
Just as weird idea: Is there another rcu indirection caused by using kfree_rcu? So that the memory is freed correctly by kfree_rcu but the adjustment in the kmem_cache is done by another indirection. Would need to check the rcu-tree or slub code (or related code) for that.
I would propose to test this on the current kernel, revert this and inform the original author of the patch (for example by Cc'ing this person in the revert).
Updated by Linus Lüssing 8 months ago
- File config-6.10.0-rc3+ config-6.10.0-rc3+ added
Also reproduceable on a current, vanilla Linux master (2ef5971ff345d3c000873725db555085e0131961). Will prepare a revert.
Updated by Sven Eckelmann 3 months ago
- Status changed from New to Closed
- Target version set to 2024.2
- % Done changed from 0 to 100
This was reverted in 2024.2 and a new version of the change will be added in 2024.4/linux 6.13