Project

General

Profile

Bug #395

Updated by Sven Eckelmann over 5 years ago

Just compiled a new linux-next kernel as described in https://www.open-mesh.org/projects/open-mesh/wiki/Emulation_Debug/38 and linux v4.19.56. Batman-adv was e2b064ff5cf53e75fb8176b1c749474366a20ab9 with the two attached patches (sorry that it was an unclean test environment). I have then bridged my local node to a freifunk vogtland mesh node. 

 The freifunk vogtland node was just using one ethernet device (eth0) as slave of its bat0 device. And I was therefore able to see the mesh traffic in my virtual machine. And thus it could participate in the mesh via its own bat0 interface. 

 The node was pinging the IPv6 address of the freifunk vogtland node for some hours and suddenly I saw following error: 

 <pre> 
 Message from syslogd@sven-edge at Jul    7 16:49:57 ... 
  kernel:page:ffffea00001ed000 count:1 mapcount:0 mapping:ffff88801a40c000 index:0xffff888007b43b80 compound_mapcount: 0 

 Message from syslogd@sven-edge at Jul    7 16:49:57 ... 
  kernel:flags: 0x4000000000008100(slab|head) 
 </pre> 

 The oops for that was: 

 <pre> 
 [ 5074.200617] ================================================================== 
 [ 5074.202112] BUG: KASAN: use-after-free in batadv_mcast_want_rtr4_update+0x335/0x3a0 [batman_adv] 
 [ 5074.203871] Write of size 8 at addr ffff888007b41a38 by task swapper/0/0 

 [ 5074.205232] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G             O        4.19.56 #1 
 [ 5074.206686] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 
 [ 5074.208356] Call Trace: 
 [ 5074.208661]    <IRQ> 
 [ 5074.208862]    dump_stack+0x9a/0xf0 
 [ 5074.209458]    print_address_description.cold.1+0x9/0x225 
 [ 5074.210472]    kasan_report.cold.2+0x6d/0xa0 
 [ 5074.211221]    batadv_mcast_want_rtr4_update+0x335/0x3a0 [batman_adv] 
 [ 5074.212493]    batadv_mcast_purge_orig+0x71/0xa0 [batman_adv] 
 [ 5074.213591]    batadv_orig_node_free_rcu+0x19/0x60 [batman_adv] 
 [ 5074.214734]    ? batadv_frag_check_entry+0x90/0x90 [batman_adv] 
 [ 5074.215865]    rcu_process_callbacks+0xab1/0x1950 
 [ 5074.216706]    ? call_rcu_sched+0x20/0x20 
 [ 5074.217382]    __do_softirq+0x238/0x987 
 [ 5074.218017]    irq_exit+0x101/0x130 
 [ 5074.218567]    smp_apic_timer_interrupt+0x16e/0x510 
 [ 5074.219445]    apic_timer_interrupt+0xf/0x20 
 [ 5074.220155]    </IRQ> 
 [ 5074.220406] RIP: 0010:native_safe_halt+0x24/0x30 
 [ 5074.221238] Code: 00 00 00 00 00 90 be 04 00 00 00 48 c7 c7 40 f7 00 84 e8 1f b4 fb fe 8b 05 59 a7 af 01 85 c0 7e 07 0f 00 2d 0e 39 52 00 fb f4 <c3> 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 57 41 56 41 
 [ 5074.225342] RSP: 0018:ffffffff83007de0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 
 [ 5074.226802] RAX: 0000000000000000 RBX: ffffffff83761ea8 RCX: ffffffff82514fe1 
 [ 5074.228036] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff8400f740 
 [ 5074.229251] RBP: dffffc0000000000 R08: fffffbfff0801ee9 R09: fffffbfff0801ee8 
 [ 5074.230475] R10: fffffbfff0801ee8 R11: 0000000000000003 R12: 0000000000000000 
 [ 5074.231694] R13: 0000000000000000 R14: ffffffff8376c7a8 R15: ffffffff83007e50 
 [ 5074.232913]    ? native_safe_halt+0x11/0x30 
 [ 5074.233495]    default_idle+0x68/0x340 
 [ 5074.233989]    do_idle+0x264/0x2c0 
 [ 5074.234418]    ? arch_cpu_idle_exit+0x40/0x40 
 [ 5074.235033]    ? finish_task_switch+0x214/0x620 
 [ 5074.235679]    cpu_startup_entry+0x19/0x1b 
 [ 5074.236240]    start_kernel+0x473/0x494 
 [ 5074.236749]    ? mem_encrypt_init+0x6/0x6 
 [ 5074.237290]    ? memcpy_orig+0x54/0x110 
 [ 5074.237796]    secondary_startup_64+0xa4/0xb0 

 [ 5074.238481] Allocated by task 0: 
 [ 5074.238901]    kasan_kmalloc+0xbf/0xe0 
 [ 5074.239400]    __kmalloc_track_caller+0x114/0x2c0 
 [ 5074.240084]    __kmalloc_reserve.isra.7+0x29/0xa0 
 [ 5074.240774]    __alloc_skb+0x100/0x550 
 [ 5074.241269]    skb_copy+0x10c/0x2d0 
 [ 5074.241718]    batadv_iv_ogm_process_per_outif+0xba/0x2f80 [batman_adv] 
 [ 5074.242806]    batadv_iv_ogm_receive+0x980/0x1690 [batman_adv] 
 [ 5074.243760]    batadv_batman_skb_recv+0x472/0x5a0 [batman_adv] 
 [ 5074.244791]    __netif_receive_skb_one_core+0xda/0x140 
 [ 5074.245659]    netif_receive_skb_internal+0x12a/0x3e0 
 [ 5074.246513]    napi_gro_receive+0x1d0/0x340 
 [ 5074.247162]    receive_buf+0x5b3/0x2320 
 [ 5074.247726]    virtnet_poll+0x6fd/0xca9 
 [ 5074.248297]    net_rx_action+0x30d/0xba0 
 [ 5074.248886]    __do_softirq+0x238/0x987 

 [ 5074.249516] Freed by task 0: 
 [ 5074.249901]    __kasan_slab_free+0x12e/0x180 
 [ 5074.250570]    kfree+0x117/0x2e0 
 [ 5074.250997]    consume_skb+0xc6/0x280 
 [ 5074.251529]    batadv_iv_ogm_process_per_outif+0x82e/0x2f80 [batman_adv] 
 [ 5074.252756]    batadv_iv_ogm_receive+0x980/0x1690 [batman_adv] 
 [ 5074.253784]    batadv_batman_skb_recv+0x472/0x5a0 [batman_adv] 
 [ 5074.254811]    __netif_receive_skb_one_core+0xda/0x140 
 [ 5074.255666]    netif_receive_skb_internal+0x12a/0x3e0 
 [ 5074.256518]    napi_gro_receive+0x1d0/0x340 
 [ 5074.257162]    receive_buf+0x5b3/0x2320 
 [ 5074.257714]    virtnet_poll+0x6fd/0xca9 
 [ 5074.258305]    net_rx_action+0x30d/0xba0 
 [ 5074.258908]    __do_softirq+0x238/0x987 

 [ 5074.259568] The buggy address belongs to the object at ffff888007b41980 
                 which belongs to the cache kmalloc-2048 of size 2048 
 [ 5074.261891] The buggy address is located 184 bytes inside of 
                 2048-byte region [ffff888007b41980, ffff888007b42180) 
 [ 5074.263797] The buggy address belongs to the page: 
 [ 5074.264537] page:ffffea00001ed000 count:1 mapcount:0 mapping:ffff88801a40c000 index:0xffff888007b43b80 compound_mapcount: 0 
 [ 5074.266636] flags: 0x4000000000008100(slab|head) 
 [ 5074.267421] raw: 4000000000008100 ffffea00002f8600 0000000400000004 ffff88801a40c000 
 [ 5074.268941] raw: ffff888007b43b80 00000000800f0003 00000001ffffffff 0000000000000000 
 [ 5074.270474] page dumped because: kasan: bad access detected 

 [ 5074.271569] Memory state around the buggy address: 
 [ 5074.272416]    ffff888007b41900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
 [ 5074.273807]    ffff888007b41980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
 [ 5074.275213] >ffff888007b41a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
 [ 5074.276614]                                           ^ 
 [ 5074.277516]    ffff888007b41a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
 [ 5074.278930]    ffff888007b41b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
 [ 5074.280261] ================================================================== 
 </pre> 

 Some interesting address translations for you 

 <pre> 

 # use-after-free 
 $ ./scripts/faddr2line vmlinux.o 'skb_copy+0x10c/0x2d0'                                
 skb_copy+0x10c/0x2d0: 
 skb_copy at /home/sven/tmp/qemu-batman/linux-next/net/core/skbuff.c:1349 

 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_mcast_want_rtr4_update+0x335/0x3a0'    
 batadv_mcast_want_rtr4_update+0x335/0x3a0: 
 hlist_add_head_rcu at /home/sven/tmp/qemu-batman/linux-next/./include/linux/rculist.h:516 
 (inlined by) batadv_mcast_want_rtr4_update at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/multicast.c:1902 


 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_iv_ogm_process_per_outif+0xba/0x2f80' 
 skipping batadv_iv_ogm_process_per_outif address at 0x6a6a due to size mismatch (0x2f80 != 0x10b0) 
 no match for batadv_iv_ogm_process_per_outif+0xba/0x2f80 

 $ gdb net/batman-adv/batman-adv.ko  
 (gdb) l    *(&batadv_mcast_want_rtr4_update+0x335) 
 0x36785 is in batadv_mcast_want_rtr4_update (./include/linux/list.h:666). 
 661               h->pprev = NULL; 
 662       } 
 663 
 664       static inline int hlist_unhashed(const struct hlist_node *h) 
 665       { 
 666               return !h->pprev; 
 667       } 
 668 
 669       static inline int hlist_empty(const struct hlist_head *h) 
 670       { 

 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_mcast_purge_orig+0x71'          
 batadv_mcast_purge_orig+0x71/0xa0: 
 batadv_mcast_purge_orig at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/multicast.c:2424 

 $ gdb net/batman-adv/batman-adv.ko  
 (gdb) l    *(&batadv_mcast_purge_orig+0x71) 
 0x3db51 is in batadv_mcast_purge_orig (/home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/multicast.c:2424). 
 2419 
 2420              batadv_mcast_want_unsnoop_update(bat_priv, orig, BATADV_NO_FLAGS); 
 2421              batadv_mcast_want_ipv4_update(bat_priv, orig, BATADV_NO_FLAGS); 
 2422              batadv_mcast_want_ipv6_update(bat_priv, orig, BATADV_NO_FLAGS); 
 2423              batadv_mcast_want_rtr4_update(bat_priv, orig, BATADV_NO_FLAGS); 
 2424              batadv_mcast_want_rtr6_update(bat_priv, orig, BATADV_NO_FLAGS); 
 2425 
 2426              spin_unlock_bh(&orig->mcast_handler_lock); 
 2427      } 


 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_orig_node_free_rcu+0x19' 
 batadv_orig_node_free_rcu+0x19/0x60: 
 batadv_orig_node_free_rcu at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/originator.c:894 

 $ gdb net/batman-adv/batman-adv.ko  
 (gdb) l    *(&batadv_orig_node_free_rcu+0x19) 
 0x43009 is in batadv_orig_node_free_rcu (/home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/originator.c:894). 
 889 
 890               orig_node = container_of(rcu, struct batadv_orig_node, rcu); 
 891 
 892               batadv_mcast_purge_orig(orig_node); 
 893 
 894               batadv_frag_purge_orig(orig_node, NULL); 
 895 
 896               kfree(orig_node->tt_buff); 
 897               kfree(orig_node); 
 898       } 


 # allocated 
 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_iv_ogm_process_per_outif+0xba/0x2f80' 
 skipping batadv_iv_ogm_process_per_outif address at 0x6a6a due to size mismatch (0x2f80 != 0x10b0) 
 no match for batadv_iv_ogm_process_per_outif+0xba/0x2f80 

 $ gdb net/batman-adv/batman-adv.ko  
 (gdb) l    *(&batadv_iv_ogm_process_per_outif+0xba) 
 0x6aaa is in batadv_iv_ogm_process_per_outif (/home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/bat_iv_ogm.c:1285). 
 1280              bool is_bidirect; 
 1281 
 1282              /* create a private copy of the skb, as some functions change tq value 
 1283               * and/or flags. 
 1284               */ 
 1285              skb_priv = skb_copy(skb, GFP_ATOMIC); 
 1286              if (!skb_priv) 
 1287                      return; 
 1288 
 1289              ethhdr = eth_hdr(skb_priv); 


 # freed 
 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_iv_ogm_process_per_outif+0x82e' 
 batadv_iv_ogm_process_per_outif+0x82e/0x10b0: 
 batadv_iv_ogm_process_per_outif at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/bat_iv_ogm.c:1263 

 (gdb) l    *(&batadv_iv_ogm_process_per_outif+0x82e) 
 0x721e is in batadv_iv_ogm_process_per_outif (/home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/bat_iv_ogm.c:1454). 
 1449              if (orig_neigh_router) 
 1450                      batadv_neigh_node_put(orig_neigh_router); 
 1451              if (hardif_neigh) 
 1452                      batadv_hardif_neigh_put(hardif_neigh); 
 1453 
 1454              consume_skb(skb_priv); 
 1455      } 
 1456 
 1457      /** 
 1458       * batadv_iv_ogm_process_reply() - Check OGM for direct reply and process it 


 # common for allocated + freed 
 $ ./scripts/faddr2line ../batman-adv/net/batman-adv/batman-adv.o 'batadv_iv_ogm_receive+0x980/0x1690'           
 batadv_iv_ogm_receive+0x980/0x1690: 
 __preempt_count_add at /home/sven/tmp/qemu-batman/linux-next/./arch/x86/include/asm/preempt.h:76 
 (inlined by) __rcu_read_lock at /home/sven/tmp/qemu-batman/linux-next/./include/linux/rcupdate.h:81 
 (inlined by) rcu_read_lock at /home/sven/tmp/qemu-batman/linux-next/./include/linux/rcupdate.h:625 
 (inlined by) batadv_iv_ogm_process at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/bat_iv_ogm.c:1619 
 (inlined by) batadv_iv_ogm_receive at /home/sven/tmp/qemu-batman/batman-adv/net/batman-adv/bat_iv_ogm.c:1708 


 $ gdb net/batman-adv/batman-adv.ko  
 (gdb) l    *(&batadv_iv_ogm_receive+0x980) 
 0xa2f0 is in batadv_iv_ogm_receive (./arch/x86/include/asm/preempt.h:76). 
 71         * The various preempt_count add/sub methods 
 72         */ 
 73 
 74        static __always_inline void __preempt_count_add(int val) 
 75        { 
 76                raw_cpu_add_4(__preempt_count, val); 
 77        } 
 78 
 79        static __always_inline void __preempt_count_sub(int val) 
 80        { 

 </pre> 

 --- 

 Please check whether this looks to you more like an bug in the RCU implementation or whether this could actually be related to your multicast code.

Back