aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2024-04-29ipv6: introduce dst_rt6_info() helperGravatar Eric Dumazet 23-70/+60
Instead of (struct rt6_info *)dst casts, we can use : #define dst_rt6_info(_ptr) \ container_of_const(_ptr, struct rt6_info, dst) Some places needed missing const qualifiers : ip6_confirm_neigh(), ipv6_anycast_destination(), ipv6_unicast_destination(), has_gateway() v2: added missing parts (David Ahern) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29inet: use call_rcu_hurry() in inet_free_ifa()Gravatar Eric Dumazet 1-1/+6
This is a followup of commit c4e86b4363ac ("net: add two more call_rcu_hurry()") Our reference to ifa->ifa_dev must be freed ASAP to release the reference to the netdev the same way. inet_rcu_free_ifa() in_dev_put() -> in_dev_finish_destroy() -> netdev_put() This should speedup device/netns dismantles when CONFIG_RCU_LAZY=y Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29net: give more chances to rcu in netdev_wait_allrefs_any()Gravatar Eric Dumazet 1-1/+2
This came while reviewing commit c4e86b4363ac ("net: add two more call_rcu_hurry()"). Paolo asked if adding one synchronize_rcu() would help. While synchronize_rcu() does not help, making sure to call rcu_barrier() before msleep(wait) is definitely helping to make sure lazy call_rcu() are completed. Instead of waiting ~100 seconds in my tests, the ref_tracker splats occurs one time only, and netdev_wait_allrefs_any() latency is reduced to the strict minimum. Ideally we should audit our call_rcu() users to make sure no refcount (or cascading call_rcu()) is held too long, because rcu_barrier() is quite expensive. Fixes: 0e4be9e57e8c ("net: use exponential backoff in netdev_wait_allrefs") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/all/28bbf698-befb-42f6-b561-851c67f464aa@kernel.org/T/#m76d73ed6b03cd930778ac4d20a777f22a08d6824 Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-26tcp: fix tcp_grow_skb() vs tstampsGravatar Eric Dumazet 1-14/+19
I forgot to call tcp_skb_collapse_tstamp() in the case we consume the second skb in write queue. Neal suggested to create a common helper used by tcp_mtu_probe() and tcp_grow_skb(). Fixes: 8ee602c63520 ("tcp: try to send bigger TSO packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20240425193450.411640-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26rstreason: make it work in trace worldGravatar Jason Xing 3-3/+3
At last, we should let it work by introducing this reset reason in trace world. One of the possible expected outputs is: ... tcp_send_reset: skbaddr=xxx skaddr=xxx src=xxx dest=xxx state=TCP_ESTABLISHED reason=NOT_SPECIFIED Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26mptcp: introducing a helper into active reset logicGravatar Jason Xing 3-7/+14
Since we have mapped every mptcp reset reason definition in enum sk_rst_reason, introducing a new helper can cover some missing places where we have already set the subflow->reset_reason. Note: using SK_RST_REASON_NOT_SPECIFIED is the same as SK_RST_REASON_MPTCP_RST_EUNSPEC. They are both unknown. So we can convert it directly. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26mptcp: support rstreason for passive resetGravatar Jason Xing 2-5/+44
It relies on what reset options in the skb are as rfc8684 says. Reusing this logic can save us much energy. This patch replaces most of the prior NOT_SPECIFIED reasons. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26tcp: support rstreason for passive resetGravatar Jason Xing 2-8/+14
Reuse the dropreason logic to show the exact reason of tcp reset, so we can finally display the corresponding item in enum sk_reset_reason instead of reinventing new reset reasons. This patch replaces all the prior NOT_SPECIFIED reasons. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26rstreason: prepare for active resetGravatar Jason Xing 5-12/+24
Like what we did to passive reset: only passing possible reset reason in each active reset path. No functional changes. Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26rstreason: prepare for passive resetGravatar Jason Xing 7-24/+37
Adjust the parameter and support passing reason of reset which is for now NOT_SPECIFIED. No functional changes. Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26net: hsr: Provide RedBox support (HSR-SAN)Gravatar Lukasz Majewski 8-16/+203
Introduce RedBox support (HSR-SAN to be more precise) for HSR networks. Following traffic reduction optimizations have been implemented: - Do not send HSR supervisory frames to Port C (interlink) - Do not forward to HSR ring frames addressed to Port C - Do not forward to Port C frames from HSR ring - Do not send duplicate HSR frame to HSR ring when destination is Port C The corresponding patch to modify iptable2 sources has already been sent: https://lore.kernel.org/netdev/20240308145729.490863-1-lukma@denx.de/T/ Testing procedure (veth and netns): ----------------------------------- One shall run: linux-vanila/tools/testing/selftests/net/hsr/hsr_redbox.sh (Detailed description of the setup one can find in the test script file). Testing procedure (real hardware): ---------------------------------- The EVB-KSZ9477 has been used for testing on net-next branch (SHA1: 5fc68320c1fb3c7d456ddcae0b4757326a043e6f). Ports 4/5 were used for SW managed HSR (hsr1) as first hsr0 for ports 1/2 (with HW offloading for ksz9477) was created. Port 3 has been used as interlink port (single USB-ETH dongle). Configuration - RedBox (EVB-KSZ9477): if link set lan1 down;ip link set lan2 down ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45 version 1 ip link add name hsr1 type hsr slave1 lan4 slave2 lan5 interlink lan3 supervision 45 version 1 ip link set lan4 up;ip link set lan5 up ip link set lan3 up ip addr add 192.168.0.11/24 dev hsr1 ip link set hsr1 up Configuration - DAN-H (EVB-KSZ9477): ip link set lan1 down;ip link set lan2 down ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45 version 1 ip link add name hsr1 type hsr slave1 lan4 slave2 lan5 supervision 45 version 1 ip link set lan4 up;ip link set lan5 up ip addr add 192.168.0.12/24 dev hsr1 ip link set hsr1 up This approach uses only SW based HSR devices (hsr1). -------------- ----------------- ------------ DAN-H Port5 | <------> | Port5 | | Port4 | <------> | Port4 Port3 | <---> | PC | | (RedBox) | | (USB-ETH) EVB-KSZ9477 | | EVB-KSZ9477 | | -------------- ----------------- ------------ Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-26net/sched: fix false lockdep warning on qdisc root lockGravatar Davide Caratti 2-19/+6
Xiumei and Christoph reported the following lockdep splat, complaining of the qdisc root lock being taken twice: ============================================ WARNING: possible recursive locking detected 6.7.0-rc3+ #598 Not tainted -------------------------------------------- swapper/2/0 is trying to acquire lock: ffff888177190110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70 but task is already holding lock: ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&sch->q.lock); lock(&sch->q.lock); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by swapper/2/0: #0: ffff888135a09d98 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x11a/0x510 #1: ffffffffaaee5260 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x2c0/0x1ed0 #2: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70 #3: ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70 #4: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70 stack backtrace: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.7.0-rc3+ #598 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x4a/0x80 __lock_acquire+0xfdd/0x3150 lock_acquire+0x1ca/0x540 _raw_spin_lock+0x34/0x80 __dev_queue_xmit+0x1560/0x2e70 tcf_mirred_act+0x82e/0x1260 [act_mirred] tcf_action_exec+0x161/0x480 tcf_classify+0x689/0x1170 prio_enqueue+0x316/0x660 [sch_prio] dev_qdisc_enqueue+0x46/0x220 __dev_queue_xmit+0x1615/0x2e70 ip_finish_output2+0x1218/0x1ed0 __ip_finish_output+0x8b3/0x1350 ip_output+0x163/0x4e0 igmp_ifc_timer_expire+0x44b/0x930 call_timer_fn+0x1a2/0x510 run_timer_softirq+0x54d/0x11a0 __do_softirq+0x1b3/0x88f irq_exit_rcu+0x18f/0x1e0 sysvec_apic_timer_interrupt+0x6f/0x90 </IRQ> This happens when TC does a mirred egress redirect from the root qdisc of device A to the root qdisc of device B. As long as these two locks aren't protecting the same qdisc, they can be acquired in chain: add a per-qdisc lockdep key to silence false warnings. This dynamic key should safely replace the static key we have in sch_htb: it was added to allow enqueueing to the device "direct qdisc" while still holding the qdisc root lock. v2: don't use static keys anymore in HTB direct qdiscs (thanks Eric Dumazet) CC: Maxim Mikityanskiy <maxim@isovalent.com> CC: Xiumei Mu <xmu@redhat.com> Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/451 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7dc06d6158f72053cf877a82e2a7a5bd23692faa.1713448007.git.dcaratti@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-25Merge branch '40GbE' of ↵Gravatar Jakub Kicinski 1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== net: intel: start The Great Code Dedup + Page Pool for iavf Alexander Lobakin says: Here's a two-shot: introduce {,Intel} Ethernet common library (libeth and libie) and switch iavf to Page Pool. Details are in the commit messages; here's a summary: Not a secret there's a ton of code duplication between two and more Intel ethernet modules. Before introducing new changes, which would need to be copied over again, start decoupling the already existing duplicate functionality into a new module, which will be shared between several Intel Ethernet drivers. The first name that came to my mind was "libie" -- "Intel Ethernet common library". Also this sounds like "lovelie" (-> one word, no "lib I E" pls) and can be expanded as "lib Internet Explorer" :P The "generic", pure-software part is placed separately, so that it can be easily reused in any driver by any vendor without linking to the Intel pre-200G guts. In a few words, it's something any modern driver does the same way, but nobody moved it level up (yet). The series is only the beginning. From now on, adding every new feature or doing any good driver refactoring will remove much more lines than add for quite some time. There's a basic roadmap with some deduplications planned already, not speaking of that touching every line now asks: "can I share this?". The final destination is very ambitious: have only one unified driver for at least i40e, ice, iavf, and idpf with a struct ops for each generation. That's never gonna happen, right? But you still can at least try. PP conversion for iavf lands within the same series as these two are tied closely. libie will support Page Pool model only, so that a driver can't use much of the lib until it's converted. iavf is only the example, the rest will eventually be converted soon on a per-driver basis. That is when it gets really interesting. Stay tech. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: MAINTAINERS: add entry for libeth and libie iavf: switch to Page Pool iavf: pack iavf_ring more efficiently libeth: add Rx buffer management page_pool: add DMA-sync-for-CPU inline helper page_pool: constify some read-only function arguments slab: introduce kvmalloc_array_node() and kvcalloc_node() iavf: drop page splitting and recycling iavf: kill "legacy-rx" for good net: intel: introduce {, Intel} Ethernet common library ==================== Link: https://lore.kernel.org/r/20240424203559.3420468-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25net: add two more call_rcu_hurry()Gravatar Eric Dumazet 1-1/+1
I had failures with pmtu.sh selftests lately, with netns dismantles firing ref_tracking alerts [1]. After much debugging, I found that some queued rcu callbacks were delayed by minutes, because of CONFIG_RCU_LAZY=y option. Joel Fernandes had a similar issue in the past, fixed with commit 483c26ff63f4 ("net: Use call_rcu_hurry() for dst_release()") In this commit, I make sure nexthop_free_rcu() and free_fib_info_rcu() are not delayed too much because they both can release device references. tools/testing/selftests/net/pmtu.sh no longer fails. Traces were: [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 3/5 users at dst_alloc+0x76/0x160 ip6_dst_alloc+0x25/0x80 ip6_pol_route+0x2a8/0x450 ip6_pol_route_output+0x1f/0x30 fib6_rule_lookup+0x163/0x270 ip6_route_output_flags+0xda/0x190 ip6_dst_lookup_tail.constprop.0+0x1d0/0x260 ip6_dst_lookup_flow+0x47/0xa0 udp_tunnel6_dst_lookup+0x158/0x210 vxlan_xmit_one+0x4c2/0x1550 [vxlan] vxlan_xmit+0x52d/0x14f0 [vxlan] dev_hard_start_xmit+0x7b/0x1e0 __dev_queue_xmit+0x20b/0xe40 ip6_finish_output2+0x2ea/0x6e0 ip6_finish_output+0x143/0x320 ip6_output+0x74/0x140 [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 1/5 users at netdev_get_by_index+0xc0/0xe0 fib6_nh_init+0x1a9/0xa90 rtm_new_nexthop+0x6fa/0x1580 rtnetlink_rcv_msg+0x155/0x3e0 netlink_rcv_skb+0x61/0x110 rtnetlink_rcv+0x19/0x20 netlink_unicast+0x23f/0x380 netlink_sendmsg+0x1fc/0x430 ____sys_sendmsg+0x2ef/0x320 ___sys_sendmsg+0x86/0xd0 __sys_sendmsg+0x67/0xc0 __x64_sys_sendmsg+0x21/0x30 x64_sys_call+0x252/0x2030 do_syscall_64+0x6c/0x190 entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 1/5 users at ipv6_add_dev+0x136/0x530 addrconf_notify+0x19d/0x770 notifier_call_chain+0x65/0xd0 raw_notifier_call_chain+0x1a/0x20 call_netdevice_notifiers_info+0x54/0x90 register_netdevice+0x61e/0x790 veth_newlink+0x230/0x440 __rtnl_newlink+0x7d2/0xaa0 rtnl_newlink+0x4c/0x70 rtnetlink_rcv_msg+0x155/0x3e0 netlink_rcv_skb+0x61/0x110 rtnetlink_rcv+0x19/0x20 netlink_unicast+0x23f/0x380 netlink_sendmsg+0x1fc/0x430 ____sys_sendmsg+0x2ef/0x320 ___sys_sendmsg+0x86/0xd0 .... [ 1079.316024] ? show_regs+0x68/0x80 [ 1079.316087] ? __warn+0x8c/0x140 [ 1079.316103] ? ref_tracker_free+0x1a0/0x270 [ 1079.316117] ? report_bug+0x196/0x1c0 [ 1079.316135] ? handle_bug+0x42/0x80 [ 1079.316149] ? exc_invalid_op+0x1c/0x70 [ 1079.316162] ? asm_exc_invalid_op+0x1f/0x30 [ 1079.316193] ? ref_tracker_free+0x1a0/0x270 [ 1079.316208] ? _raw_spin_unlock+0x1a/0x40 [ 1079.316222] ? free_unref_page+0x126/0x1a0 [ 1079.316239] ? destroy_large_folio+0x69/0x90 [ 1079.316251] ? __folio_put+0x99/0xd0 [ 1079.316276] dst_dev_put+0x69/0xd0 [ 1079.316308] fib6_nh_release_dsts.part.0+0x3d/0x80 [ 1079.316327] fib6_nh_release+0x45/0x70 [ 1079.316340] nexthop_free_rcu+0x131/0x170 [ 1079.316356] rcu_do_batch+0x1ee/0x820 [ 1079.316370] ? rcu_do_batch+0x179/0x820 [ 1079.316388] rcu_core+0x1aa/0x4d0 [ 1079.316405] rcu_core_si+0x12/0x20 [ 1079.316417] __do_softirq+0x13a/0x3dc [ 1079.316435] __irq_exit_rcu+0xa3/0x110 [ 1079.316449] irq_exit_rcu+0x12/0x30 [ 1079.316462] sysvec_apic_timer_interrupt+0x5b/0xe0 [ 1079.316474] asm_sysvec_apic_timer_interrupt+0x1f/0x30 [ 1079.316569] RIP: 0033:0x7f06b65c63f0 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240423205408.39632-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar Jakub Kicinski 33-165/+248
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25tcp: avoid premature drops in tcp_add_backlog()Gravatar Eric Dumazet 1-2/+11
While testing TCP performance with latest trees, I saw suspect SOCKET_BACKLOG drops. tcp_add_backlog() computes its limit with : limit = (u32)READ_ONCE(sk->sk_rcvbuf) + (u32)(READ_ONCE(sk->sk_sndbuf) >> 1); limit += 64 * 1024; This does not take into account that sk->sk_backlog.len is reset only at the very end of __release_sock(). Both sk->sk_backlog.len and sk->sk_rmem_alloc could reach sk_rcvbuf in normal conditions. We should double sk->sk_rcvbuf contribution in the formula to absorb bubbles in the backlog, which happen more often for very fast flows. This change maintains decent protection against abuses. Fixes: c377411f2494 ("net: sk_add_backlog() take rmem_alloc into account") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240423125620.3309458-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25Merge tag 'wireless-next-2024-04-24' of ↵Gravatar Jakub Kicinski 13-77/+195
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.10 The second "new features" pull request for v6.10 with changes both in stack and in drivers. This time the pull request is rather small and nothing special standing out except maybe that we have several kernel-doc fixes. Great to see that we are getting warning free wireless code (until new warnings are added). Major changes: rtl8xxxu: * enable Management Frame Protection (MFP) support rtw88: * disable unsupported interface type of mesh point for all chips, and only support station mode for SDIO chips. * tag 'wireless-next-2024-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (63 commits) wifi: mac80211: handle link ID during management Tx wifi: mac80211: handle sdata->u.ap.active flag with MLO wifi: cfg80211: add return docs for regulatory functions wifi: cfg80211: make some regulatory functions void wifi: mac80211: add return docs for sta_info_flush() wifi: mac80211: keep mac80211 consistent on link activation failure wifi: mac80211: simplify ieee80211_assign_link_chanctx() wifi: mac80211: reserve chanctx during find wifi: cfg80211: fix cfg80211 function kernel-doc wifi: mac80211_hwsim: Use wider regulatory for custom for 6GHz tests wifi: iwlwifi: mvm: Don't allow EMLSR when the RSSI is low wifi: iwlwifi: mvm: disable EMLSR when we suspend with wowlan wifi: iwlwifi: mvm: get periodic statistics in EMLSR wifi: iwlwifi: mvm: don't recompute EMLSR mode in can_activate_links wifi: iwlwifi: mvm: implement EMLSR prevention mechanism. wifi: iwlwifi: mvm: exit EMLSR upon missed beacon wifi: iwlwifi: mvm: init vif works only once wifi: iwlwifi: mvm: Add helper functions to update EMLSR status wifi: iwlwifi: mvm: Implement new link selection algorithm wifi: iwlwifi: mvm: move EMLSR/links code ... ==================== Link: https://lore.kernel.org/r/20240424100122.217AEC113CE@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25Merge tag 'net-6.9-rc6' of ↵Gravatar Linus Torvalds 32-98/+226
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wireless and bluetooth. Nothing major, regression fixes are mostly in drivers, two more of those are flowing towards us thru various trees. I wish some of the changes went into -rc5, we'll try to keep an eye on frequency of PRs from sub-trees. Also disproportional number of fixes for bugs added in v6.4, strange coincidence. Current release - regressions: - igc: fix LED-related deadlock on driver unbind - wifi: mac80211: small fixes to recent clean up of the connection process - Revert "wifi: iwlwifi: bump FW API to 90 for BZ/SC devices", kernel doesn't have all the code to deal with that version, yet - Bluetooth: - set power_ctrl_enabled on NULL returned by gpiod_get_optional() - qca: fix invalid device address check, again - eth: ravb: fix registered interrupt names Current release - new code bugs: - wifi: mac80211: check EHT/TTLM action frame length Previous releases - regressions: - fix sk_memory_allocated_{add|sub} for architectures where __this_cpu_{add|sub}* are not IRQ-safe - dsa: mv88e6xx: fix link setup for 88E6250 Previous releases - always broken: - ip: validate dev returned from __in_dev_get_rcu(), prevent possible null-derefs in a few places - switch number of for_each_rcu() loops using call_rcu() on the iterator to for_each_safe() - macsec: fix isolation of broadcast traffic in presence of offload - vxlan: drop packets from invalid source address - eth: mlxsw: trap and ACL programming fixes - eth: bnxt: PCIe error recovery fixes, fix counting dropped packets - Bluetooth: - lots of fixes for the command submission rework from v6.4 - qca: fix NULL-deref on non-serdev suspend Misc: - tools: ynl: don't ignore errors in NLMSG_DONE messages" * tag 'net-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits) af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). net: b44: set pause params only when interface is up tls: fix lockless read of strp->msg_ready in ->poll dpll: fix dpll_pin_on_pin_register() for multiple parent pins net: ravb: Fix registered interrupt names octeontx2-af: fix the double free in rvu_npc_freemem() net: ethernet: ti: am65-cpts: Fix PTPv1 message type on TX packets ice: fix LAG and VF lock dependency in ice_reset_vf() iavf: Fix TC config comparison with existing adapter TC config i40e: Report MFS in decimal base instead of hex i40e: Do not use WQ_MEM_RECLAIM flag for workqueue net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns() net/mlx5e: Advertise mlx5 ethernet driver updates sk_buff md_dst for MACsec macsec: Detect if Rx skb is macsec-related for offloading devices that update md_dst ethernet: Add helper for assigning packet type when dest address does not match device address macsec: Enable devices to advertise whether they update sk_buff md_dst during offloads net: phy: dp83869: Fix MII mode failure netfilter: nf_tables: honor table dormant flag from netdev release event path eth: bnxt: fix counting packets discarded due to OOM and netpoll igc: Fix LED-related deadlock on driver unbind ...
2024-04-25Merge tag 'nf-24-04-25' of ↵Gravatar Jakub Kicinski 2-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains two Netfilter/IPVS fixes for net: Patch #1 fixes SCTP checksumming for IPVS with gso packets, from Ismael Luceno. Patch #2 honor dormant flag from netdev event path to fix a possible double hook unregistration. * tag 'nf-24-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: honor table dormant flag from netdev release event path ipvs: Fix checksumming on GSO of SCTP packets ==================== Link: https://lore.kernel.org/r/20240425090149.1359547-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().Gravatar Kuniyuki Iwashima 1-1/+1
syzbot reported a lockdep splat regarding unix_gc_lock and unix_state_lock(). One is called from recvmsg() for a connected socket, and another is called from GC for TCP_LISTEN socket. So, the splat is false-positive. Let's add a dedicated lock class for the latter to suppress the splat. Note that this change is not necessary for net-next.git as the issue is only applied to the old GC impl. [0]: WARNING: possible circular locking dependency detected 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Not tainted ----------------------------------------------------- kworker/u8:1/11 is trying to acquire lock: ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 but task is already holding lock: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (unix_gc_lock){+.+.}-{2:2}: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] unix_notinflight+0x13d/0x390 net/unix/garbage.c:140 unix_detach_fds net/unix/af_unix.c:1819 [inline] unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876 skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188 skb_release_all net/core/skbuff.c:1200 [inline] __kfree_skb net/core/skbuff.c:1216 [inline] kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252 kfree_skb include/linux/skbuff.h:1262 [inline] manage_oob net/unix/af_unix.c:2672 [inline] unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749 unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981 do_splice_read fs/splice.c:985 [inline] splice_file_to_pipe+0x299/0x500 fs/splice.c:1295 do_splice+0xf2d/0x1880 fs/splice.c:1379 __do_splice fs/splice.c:1436 [inline] __do_sys_splice fs/splice.c:1652 [inline] __se_sys_splice+0x331/0x4a0 fs/splice.c:1634 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&u->lock){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(unix_gc_lock); lock(&u->lock); lock(unix_gc_lock); lock(&u->lock); *** DEADLOCK *** 3 locks held by kworker/u8:1/11: #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 stack backtrace: CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: events_unbound __unix_gc Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()") Reported-and-tested-by: syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25tls: fix lockless read of strp->msg_ready in ->pollGravatar Sabrina Dubroca 2-4/+4
tls_sk_poll is called without locking the socket, and needs to read strp->msg_ready (via tls_strp_msg_ready). Convert msg_ready to a bool and use READ_ONCE/WRITE_ONCE where needed. The remaining reads are only performed when the socket is locked. Fixes: 121dca784fc0 ("tls: suppress wakeups unless we have a full record") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/0b7ee062319037cf86af6b317b3d72f7bfcd2e97.1713797701.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25ethernet: Add helper for assigning packet type when dest address does not ↵Gravatar Rahul Rameshbabu 1-11/+1
match device address Enable reuse of logic in eth_type_trans for determining packet type. Suggested-by: Sabrina Dubroca <sd@queasysnail.net> Cc: stable@vger.kernel.org Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240423181319.115860-3-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25net: bridge: remove redundant check of f->dstGravatar linke li 1-1/+1
In br_fill_forward_path(), f->dst is checked not to be NULL, then immediately read using READ_ONCE and checked again. The first check is useless, so this patch aims to remove the redundant check of f->dst. Signed-off-by: linke li <lilinke99@qq.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-25Merge tag 'wireless-2024-04-23' of ↵Gravatar David S. Miller 11-38/+138
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes berg says: ==================== Fixes for the current cycle: * ath11k: convert to correct RCU iteration of IPv6 addresses * iwlwifi: link ID, FW API version, scanning and PASN fixes * cfg80211: NULL-deref and tracing fixes * mac80211: connection mode, mesh fast-TX, multi-link and various other small fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-25netfilter: nf_tables: honor table dormant flag from netdev release event pathGravatar Pablo Neira Ayuso 1-1/+3
Check for table dormant flag otherwise netdev release event path tries to unregister an already unregistered hook. [524854.857999] ------------[ cut here ]------------ [524854.858010] WARNING: CPU: 0 PID: 3386599 at net/netfilter/core.c:501 __nf_unregister_net_hook+0x21a/0x260 [...] [524854.858848] CPU: 0 PID: 3386599 Comm: kworker/u32:2 Not tainted 6.9.0-rc3+ #365 [524854.858869] Workqueue: netns cleanup_net [524854.858886] RIP: 0010:__nf_unregister_net_hook+0x21a/0x260 [524854.858903] Code: 24 e8 aa 73 83 ff 48 63 43 1c 83 f8 01 0f 85 3d ff ff ff e8 98 d1 f0 ff 48 8b 3c 24 e8 8f 73 83 ff 48 63 43 1c e9 26 ff ff ff <0f> 0b 48 83 c4 18 48 c7 c7 00 68 e9 82 5b 5d 41 5c 41 5d 41 5e 41 [524854.858914] RSP: 0018:ffff8881e36d79e0 EFLAGS: 00010246 [524854.858926] RAX: 0000000000000000 RBX: ffff8881339ae790 RCX: ffffffff81ba524a [524854.858936] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881c8a16438 [524854.858945] RBP: ffff8881c8a16438 R08: 0000000000000001 R09: ffffed103c6daf34 [524854.858954] R10: ffff8881e36d79a7 R11: 0000000000000000 R12: 0000000000000005 [524854.858962] R13: ffff8881c8a16000 R14: 0000000000000000 R15: ffff8881351b5a00 [524854.858971] FS: 0000000000000000(0000) GS:ffff888390800000(0000) knlGS:0000000000000000 [524854.858982] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [524854.858991] CR2: 00007fc9be0f16f4 CR3: 00000001437cc004 CR4: 00000000001706f0 [524854.859000] Call Trace: [524854.859006] <TASK> [524854.859013] ? __warn+0x9f/0x1a0 [524854.859027] ? __nf_unregister_net_hook+0x21a/0x260 [524854.859044] ? report_bug+0x1b1/0x1e0 [524854.859060] ? handle_bug+0x3c/0x70 [524854.859071] ? exc_invalid_op+0x17/0x40 [524854.859083] ? asm_exc_invalid_op+0x1a/0x20 [524854.859100] ? __nf_unregister_net_hook+0x6a/0x260 [524854.859116] ? __nf_unregister_net_hook+0x21a/0x260 [524854.859135] nf_tables_netdev_event+0x337/0x390 [nf_tables] [524854.859304] ? __pfx_nf_tables_netdev_event+0x10/0x10 [nf_tables] [524854.859461] ? packet_notifier+0xb3/0x360 [524854.859476] ? _raw_spin_unlock_irqrestore+0x11/0x40 [524854.859489] ? dcbnl_netdevice_event+0x35/0x140 [524854.859507] ? __pfx_nf_tables_netdev_event+0x10/0x10 [nf_tables] [524854.859661] notifier_call_chain+0x7d/0x140 [524854.859677] unregister_netdevice_many_notify+0x5e1/0xae0 Fixes: d54725cd11a5 ("netfilter: nf_tables: support for multiple devices per netdev hook") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-25tcp: update sacked after tracepoint in __tcp_retransmit_skbGravatar Philo Lu 1-5/+6
Marking TCP_SKB_CB(skb)->sacked with TCPCB_EVER_RETRANS after the traceopint (trace_tcp_retransmit_skb), then we can get the retransmission efficiency by counting skbs w/ and w/o TCPCB_EVER_RETRANS mark in this tracepoint. We have discussed to achieve this with BPF_SOCK_OPS in [0], and using tracepoint is thought to be a better solution. [0] https://lore.kernel.org/all/20240417124622.35333-1-lulie@linux.alibaba.com/ Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-24Merge tag 'for-net-2024-04-24' of ↵Gravatar Jakub Kicinski 7-30/+50
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - qca: set power_ctrl_enabled on NULL returned by gpiod_get_optional() - hci_sync: Using hci_cmd_sync_submit when removing Adv Monitor - qca: fix invalid device address check - hci_sync: Use advertised PHYs on hci_le_ext_create_conn_sync - Fix type of len in {l2cap,sco}_sock_getsockopt_old() - btusb: mediatek: Fix double free of skb in coredump - btusb: Add Realtek RTL8852BE support ID 0x0bda:0x4853 - btusb: Fix triggering coredump implementation for QCA * tag 'for-net-2024-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: qca: set power_ctrl_enabled on NULL returned by gpiod_get_optional() Bluetooth: hci_sync: Using hci_cmd_sync_submit when removing Adv Monitor Bluetooth: qca: fix NULL-deref on non-serdev setup Bluetooth: qca: fix NULL-deref on non-serdev suspend Bluetooth: btusb: mediatek: Fix double free of skb in coredump Bluetooth: MGMT: Fix failing to MGMT_OP_ADD_UUID/MGMT_OP_REMOVE_UUID Bluetooth: qca: fix invalid device address check Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZE Bluetooth: btusb: Fix triggering coredump implementation for QCA Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0bda:0x4853 Bluetooth: hci_sync: Use advertised PHYs on hci_le_ext_create_conn_sync Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old() ==================== Link: https://lore.kernel.org/r/20240424204102.2319483-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24Merge branch '100GbE' of ↵Gravatar Jakub Kicinski 2-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Support 5 layer Tx scheduler topology Mateusz Polchlopek says: For performance reasons there is a need to have support for selectable Tx scheduler topology. Currently firmware supports only the default 9-layer and 5-layer topology. This patch series enables switch from default to 5-layer topology, if user decides to opt-in. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Document tx_scheduling_layers parameter ice: Add tx_scheduling_layers devlink param ice: Enable switching default Tx scheduler topology ice: Adjust the VSI/Aggregator layers ice: Support 5 layer topology devlink: extend devlink_param *set pointer ==================== Link: https://lore.kernel.org/r/20240422203913.225151-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24net: openvswitch: Fix Use-After-Free in ovs_ct_exitGravatar Hyunwoo Kim 1-2/+2
Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal of ovs_ct_limit_exit, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free. To prevent this, it should be changed to hlist_for_each_entry_safe. Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24net: openvswitch: Release reference to netdevGravatar Jun Gu 1-2/+6
dev_get_by_name will provide a reference on the netdev. So ensure that the reference of netdev is released after completed. Fixes: 2540088b836f ("net: openvswitch: Check vport netdev name") Signed-off-by: Jun Gu <jun.gu@easystack.cn> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20240423073751.52706-1-jun.gu@easystack.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25ipvs: Fix checksumming on GSO of SCTP packetsGravatar Ismael Luceno 1-2/+4
It was observed in the wild that pairs of consecutive packets would leave the IPVS with the same wrong checksum, and the issue only went away when disabling GSO. IPVS needs to avoid computing the SCTP checksum when using GSO. Fixes: 90017accff61 ("sctp: Add GSO support") Co-developed-by: Firo Yang <firo.yang@suse.com> Signed-off-by: Ismael Luceno <iluceno@suse.de> Tested-by: Andreas Taschner <andreas.taschner@suse.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-24Bluetooth: hci_sync: Using hci_cmd_sync_submit when removing Adv MonitorGravatar Chun-Yi Lee 1-2/+2
Since the d883a4669a1de be introduced in v6.4, bluetooth daemon got the following failed message of MGMT_OP_REMOVE_ADV_MONITOR command when controller is power-off: bluetoothd[20976]: src/adapter.c:reset_adv_monitors_complete() Failed to reset Adv Monitors: Failed> Normally this situation is happened when the bluetoothd deamon be started manually after system booting. Which means that bluetoothd received MGMT_EV_INDEX_ADDED event after kernel runs hci_power_off(). Base on doc/mgmt-api.txt, the MGMT_OP_REMOVE_ADV_MONITOR command can be used when the controller is not powered. This patch changes the code in remove_adv_monitor() to use hci_cmd_sync_submit() instead of hci_cmd_sync_queue(). Fixes: d883a4669a1de ("Bluetooth: hci_sync: Only allow hci_cmd_sync_queue if running") Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Manish Mandlik <mmandlik@google.com> Cc: Archie Pusaka <apusaka@chromium.org> Cc: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Chun-Yi Lee <jlee@suse.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-04-24Bluetooth: MGMT: Fix failing to MGMT_OP_ADD_UUID/MGMT_OP_REMOVE_UUIDGravatar Luiz Augusto von Dentz 1-5/+15
These commands don't require the adapter to be up and running so don't use hci_cmd_sync_queue which would check that flag, instead use hci_cmd_sync_submit which would ensure mgmt_class_complete is set properly regardless if any command was actually run or not. Link: https://github.com/bluez/bluez/issues/809 Fixes: d883a4669a1d ("Bluetooth: hci_sync: Only allow hci_cmd_sync_queue if running") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-04-24Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZEGravatar Luiz Augusto von Dentz 1-3/+2
The code shall always check if HCI_QUIRK_BROKEN_READ_ENC_KEY_SIZE has been set before attempting to use HCI_OP_READ_ENC_KEY_SIZE. Fixes: c569242cd492 ("Bluetooth: hci_event: set the conn encrypted before conn establishes") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-04-24Bluetooth: hci_sync: Use advertised PHYs on hci_le_ext_create_conn_syncGravatar Luiz Augusto von Dentz 4-14/+23
The extended advertising reports do report the PHYs so this store then in hci_conn so it can be later used in hci_le_ext_create_conn_sync to narrow the PHYs to be scanned since the controller will also perform a scan having a smaller set of PHYs shall reduce the time it takes to find and connect peers. Fixes: 288c90224eec ("Bluetooth: Enable all supported LE PHY by default") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-04-24Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old()Gravatar Nathan Chancellor 2-6/+8
After an innocuous optimization change in LLVM main (19.0.0), x86_64 allmodconfig (which enables CONFIG_KCSAN / -fsanitize=thread) fails to build due to the checks in check_copy_size(): In file included from net/bluetooth/sco.c:27: In file included from include/linux/module.h:13: In file included from include/linux/stat.h:19: In file included from include/linux/time.h:60: In file included from include/linux/time32.h:13: In file included from include/linux/timex.h:67: In file included from arch/x86/include/asm/timex.h:6: In file included from arch/x86/include/asm/tsc.h:10: In file included from arch/x86/include/asm/msr.h:15: In file included from include/linux/percpu.h:7: In file included from include/linux/smp.h:118: include/linux/thread_info.h:244:4: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small 244 | __bad_copy_from(); | ^ The same exact error occurs in l2cap_sock.c. The copy_to_user() statements that are failing come from l2cap_sock_getsockopt_old() and sco_sock_getsockopt_old(). This does not occur with GCC with or without KCSAN or Clang without KCSAN enabled. len is defined as an 'int' because it is assigned from '__user int *optlen'. However, it is clamped against the result of sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit platforms). This is done with min_t() because min() requires compatible types, which results in both len and the result of sizeof() being casted to 'unsigned int', meaning len changes signs and the result of sizeof() is truncated. From there, len is passed to copy_to_user(), which has a third parameter type of 'unsigned long', so it is widened and changes signs again. This excessive casting in combination with the KCSAN instrumentation causes LLVM to fail to eliminate the __bad_copy_from() call, failing the build. The official recommendation from LLVM developers is to consistently use long types for all size variables to avoid the unnecessary casting in the first place. Change the type of len to size_t in both l2cap_sock_getsockopt_old() and sco_sock_getsockopt_old(). This clears up the error while allowing min_t() to be replaced with min(), resulting in simpler code with no casts and fewer implicit conversions. While len is a different type than optlen now, it should result in no functional change because the result of sizeof() will clamp all values of optlen in the same manner as before. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2007 Link: https://github.com/llvm/llvm-project/issues/85647 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-04-24page_pool: constify some read-only function argumentsGravatar Alexander Lobakin 1-5/+5
There are several functions taking pointers to data they don't modify. This includes statistics fetching, page and page_pool parameters, etc. Constify the pointers, so that call sites will be able to pass const pointers as well. No functional changes, no visible changes in functions sizes. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-24net: create a dummy net_device allocatorGravatar Breno Leitao 1-18/+38
It is impossible to use init_dummy_netdev together with alloc_netdev() as the 'setup' argument. This is because alloc_netdev() initializes some fields in the net_device structure, and later init_dummy_netdev() memzero them all. This causes some problems as reported here: https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/ Split the init_dummy_netdev() function in two. Create a new function called init_dummy_netdev_core() that does not memzero the net_device structure. Then have init_dummy_netdev() memzero-ing and calling init_dummy_netdev_core(), keeping the old behaviour. init_dummy_netdev_core() is the new function that could be called as an argument for alloc_netdev(). Also, create a helper to allocate and initialize dummy net devices, leveraging init_dummy_netdev_core() as the setup argument. This function basically simplify the allocation of dummy devices, by allocating and initializing it. Freeing the device continue to be done through free_netdev() Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-24net: free_netdev: exit earlier if dummyGravatar Breno Leitao 1-1/+2
For dummy devices, exit earlier at free_netdev() instead of executing the whole function. This is necessary, because dummy devices are special, and shouldn't have the second part of the function executed. Otherwise reg_state, which is NETREG_DUMMY, will be overwritten and there will be no way to identify that this is a dummy device. Also, this device do not need the final put_device(), since dummy devices are not registered (through register_netdevice()), where the device reference is increased (at netdev_register_kobject()/device_add()). Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-24net: core: Fix documentationGravatar Breno Leitao 1-2/+2
Fix bad grammar in description of init_dummy_netdev() function. This topic showed up in the review of the "allocate dummy device dynamically" patch set. Suggested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-23tcp: Fix Use-After-Free in tcp_ao_connect_initGravatar Hyunwoo Kim 1-1/+2
Since call_rcu, which is called in the hlist_for_each_entry_rcu traversal of tcp_ao_connect_init, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free. To prevent this, it should be changed to hlist_for_each_entry_safe. Fixes: 7c2ffaf21bd6 ("net/tcp: Calculate TCP-AO traffic keys") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/ZiYu9NJ/ClR8uSkH@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23neighbour: fix neigh_master_filtered()Gravatar Eric Dumazet 1-1/+1
If we no longer hold RTNL, we must use netdev_master_upper_dev_get_rcu() instead of netdev_master_upper_dev_get(). Fixes: ba0f78069423 ("neighbour: no longer hold RTNL in neigh_dump_info()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240421185753.1808077-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23ipv4: check for NULL idev in ip_route_use_hint()Gravatar Eric Dumazet 1-0/+3
syzbot was able to trigger a NULL deref in fib_validate_source() in an old tree [1]. It appears the bug exists in latest trees. All calls to __in_dev_get_rcu() must be checked for a NULL result. [1] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 2 PID: 3257 Comm: syz-executor.3 Not tainted 5.10.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:fib_validate_source+0xbf/0x15a0 net/ipv4/fib_frontend.c:425 Code: 18 f2 f2 f2 f2 42 c7 44 20 23 f3 f3 f3 f3 48 89 44 24 78 42 c6 44 20 27 f3 e8 5d 88 48 fc 4c 89 e8 48 c1 e8 03 48 89 44 24 18 <42> 80 3c 20 00 74 08 4c 89 ef e8 d2 15 98 fc 48 89 5c 24 10 41 bf RSP: 0018:ffffc900015fee40 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88800f7a4000 RCX: ffff88800f4f90c0 RDX: 0000000000000000 RSI: 0000000004001eac RDI: ffff8880160c64c0 RBP: ffffc900015ff060 R08: 0000000000000000 R09: ffff88800f7a4000 R10: 0000000000000002 R11: ffff88800f4f90c0 R12: dffffc0000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88800f7a4000 FS: 00007f938acfe6c0(0000) GS:ffff888058c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f938acddd58 CR3: 000000001248e000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip_route_use_hint+0x410/0x9b0 net/ipv4/route.c:2231 ip_rcv_finish_core+0x2c4/0x1a30 net/ipv4/ip_input.c:327 ip_list_rcv_finish net/ipv4/ip_input.c:612 [inline] ip_sublist_rcv+0x3ed/0xe50 net/ipv4/ip_input.c:638 ip_list_rcv+0x422/0x470 net/ipv4/ip_input.c:673 __netif_receive_skb_list_ptype net/core/dev.c:5572 [inline] __netif_receive_skb_list_core+0x6b1/0x890 net/core/dev.c:5620 __netif_receive_skb_list net/core/dev.c:5672 [inline] netif_receive_skb_list_internal+0x9f9/0xdc0 net/core/dev.c:5764 netif_receive_skb_list+0x55/0x3e0 net/core/dev.c:5816 xdp_recv_frames net/bpf/test_run.c:257 [inline] xdp_test_run_batch net/bpf/test_run.c:335 [inline] bpf_test_run_xdp_live+0x1818/0x1d00 net/bpf/test_run.c:363 bpf_prog_test_run_xdp+0x81f/0x1170 net/bpf/test_run.c:1376 bpf_prog_test_run+0x349/0x3c0 kernel/bpf/syscall.c:3736 __sys_bpf+0x45c/0x710 kernel/bpf/syscall.c:5115 __do_sys_bpf kernel/bpf/syscall.c:5201 [inline] __se_sys_bpf kernel/bpf/syscall.c:5199 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5199 Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240421184326.1704930-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23netlink: support all extack types in dumpsGravatar Jakub Kicinski 1-5/+10
Note that when this commit message refers to netlink dump it only means the actual dumping part, the parsing / dump start is handled by the same code as "doit". Commit 4a19edb60d02 ("netlink: Pass extack to dump handlers") added support for returning extack messages from dump handlers, but left out other extack info, e.g. bad attribute. This used to be fine because until YNL we had little practical use for the machine readable attributes, and only messages were used in practice. YNL flips the preference 180 degrees, it's now much more useful to point to a bad attr with NL_SET_BAD_ATTR() than type an English message saying "attribute XYZ is $reason-why-bad". Support all of extack. The fact that extack only gets added if it fits remains unaddressed. Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240420023543.3300306-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23netlink: move extack writing helpersGravatar Jakub Kicinski 1-63/+63
Next change will need them in netlink_dump_done(), pure move. Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240420023543.3300306-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23netdev: support dumping a single netdev in qstatsGravatar Jakub Kicinski 2-13/+40
Having to filter the right ifindex in the tests is a bit tedious. Add support for dumping qstats for a single ifindex. Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240420023543.3300306-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23af_unix: Don't access successor in unix_del_edges() during GC.Gravatar Kuniyuki Iwashima 1-5/+12
syzbot reported use-after-free in unix_del_edges(). [0] What the repro does is basically repeat the following quickly. 1. pass a fd of an AF_UNIX socket to itself socketpair(AF_UNIX, SOCK_DGRAM, 0, [3, 4]) = 0 sendmsg(3, {..., msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[4]}], ...}, 0) = 0 2. pass other fds of AF_UNIX sockets to the socket above socketpair(AF_UNIX, SOCK_SEQPACKET, 0, [5, 6]) = 0 sendmsg(3, {..., msg_control=[{cmsg_len=48, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[5, 6]}], ...}, 0) = 0 3. close all sockets Here, two skb are created, and every unix_edge->successor is the first socket. Then, __unix_gc() will garbage-collect the two skb: (a) free skb with self-referencing fd (b) free skb holding other sockets After (a), the self-referencing socket will be scheduled to be freed later by the delayed_fput() task. syzbot repeated the sequences above (1. ~ 3.) quickly and triggered the task concurrently while GC was running. So, at (b), the socket was already freed, and accessing it was illegal. unix_del_edges() accesses the receiver socket as edge->successor to optimise GC. However, we should not do it during GC. Garbage-collecting sockets does not change the shape of the rest of the graph, so we need not call unix_update_graph() to update unix_graph_grouped when we purge skb. However, if we clean up all loops in the unix_walk_scc_fast() path, unix_graph_maybe_cyclic remains unchanged (true), and __unix_gc() will call unix_walk_scc_fast() continuously even though there is no socket to garbage-collect. To keep that optimisation while fixing UAF, let's add the same updating logic of unix_graph_maybe_cyclic in unix_walk_scc_fast() as done in unix_walk_scc() and __unix_walk_scc(). Note that when unix_del_edges() is called from other places, the receiver socket is always alive: - sendmsg: the successor's sk_refcnt is bumped by sock_hold() unix_find_other() for SOCK_DGRAM, connect() for SOCK_STREAM - recvmsg: the successor is the receiver, and its fd is alive [0]: BUG: KASAN: slab-use-after-free in unix_edge_successor net/unix/garbage.c:109 [inline] BUG: KASAN: slab-use-after-free in unix_del_edge net/unix/garbage.c:165 [inline] BUG: KASAN: slab-use-after-free in unix_del_edges+0x148/0x630 net/unix/garbage.c:237 Read of size 8 at addr ffff888079c6e640 by task kworker/u8:6/1099 CPU: 0 PID: 1099 Comm: kworker/u8:6 Not tainted 6.9.0-rc4-next-20240418-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: events_unbound __unix_gc Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 unix_edge_successor net/unix/garbage.c:109 [inline] unix_del_edge net/unix/garbage.c:165 [inline] unix_del_edges+0x148/0x630 net/unix/garbage.c:237 unix_destroy_fpl+0x59/0x210 net/unix/garbage.c:298 unix_detach_fds net/unix/af_unix.c:1811 [inline] unix_destruct_scm+0x13e/0x210 net/unix/af_unix.c:1826 skb_release_head_state+0x100/0x250 net/core/skbuff.c:1127 skb_release_all net/core/skbuff.c:1138 [inline] __kfree_skb net/core/skbuff.c:1154 [inline] kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1190 __skb_queue_purge_reason include/linux/skbuff.h:3251 [inline] __skb_queue_purge include/linux/skbuff.h:3256 [inline] __unix_gc+0x1732/0x1830 net/unix/garbage.c:575 process_one_work kernel/workqueue.c:3218 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299 worker_thread+0x86d/0xd70 kernel/workqueue.c:3380 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 14427: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:312 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3897 [inline] slab_alloc_node mm/slub.c:3957 [inline] kmem_cache_alloc_noprof+0x135/0x290 mm/slub.c:3964 sk_prot_alloc+0x58/0x210 net/core/sock.c:2074 sk_alloc+0x38/0x370 net/core/sock.c:2133 unix_create1+0xb4/0x770 unix_create+0x14e/0x200 net/unix/af_unix.c:1034 __sock_create+0x490/0x920 net/socket.c:1571 sock_create net/socket.c:1622 [inline] __sys_socketpair+0x33e/0x720 net/socket.c:1773 __do_sys_socketpair net/socket.c:1822 [inline] __se_sys_socketpair net/socket.c:1819 [inline] __x64_sys_socketpair+0x9b/0xb0 net/socket.c:1819 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 1805: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256 kasan_slab_free include/linux/kasan.h:184 [inline] slab_free_hook mm/slub.c:2190 [inline] slab_free mm/slub.c:4393 [inline] kmem_cache_free+0x145/0x340 mm/slub.c:4468 sk_prot_free net/core/sock.c:2114 [inline] __sk_destruct+0x467/0x5f0 net/core/sock.c:2208 sock_put include/net/sock.h:1948 [inline] unix_release_sock+0xa8b/0xd20 net/unix/af_unix.c:665 unix_release+0x91/0xc0 net/unix/af_unix.c:1049 __sock_release net/socket.c:659 [inline] sock_close+0xbc/0x240 net/socket.c:1421 __fput+0x406/0x8b0 fs/file_table.c:422 delayed_fput+0x59/0x80 fs/file_table.c:445 process_one_work kernel/workqueue.c:3218 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299 worker_thread+0x86d/0xd70 kernel/workqueue.c:3380 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 The buggy address belongs to the object at ffff888079c6e000 which belongs to the cache UNIX of size 1920 The buggy address is located 1600 bytes inside of freed 1920-byte region [ffff888079c6e000, ffff888079c6e780) Reported-by: syzbot+f3f3eef1d2100200e593@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f3f3eef1d2100200e593 Fixes: 77e5593aebba ("af_unix: Skip GC if no cycle exists.") Fixes: fd86344823b5 ("af_unix: Try not to hold unix_gc_lock during accept().") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240419235102.31707-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-23ax25: Fix netdev refcount issueGravatar Duoming Zhou 1-1/+1
The dev_tracker is added to ax25_cb in ax25_bind(). When the ax25 device is detaching, the dev_tracker of ax25_cb should be deallocated in ax25_kill_by_device() instead of the dev_tracker of ax25_dev. The log reported by ref_tracker is shown below: [ 80.884935] ref_tracker: reference already released. [ 80.885150] ref_tracker: allocated in: [ 80.885349] ax25_dev_device_up+0x105/0x540 [ 80.885730] ax25_device_event+0xa4/0x420 [ 80.885730] notifier_call_chain+0xc9/0x1e0 [ 80.885730] __dev_notify_flags+0x138/0x280 [ 80.885730] dev_change_flags+0xd7/0x180 [ 80.885730] dev_ifsioc+0x6a9/0xa30 [ 80.885730] dev_ioctl+0x4d8/0xd90 [ 80.885730] sock_do_ioctl+0x1c2/0x2d0 [ 80.885730] sock_ioctl+0x38b/0x4f0 [ 80.885730] __se_sys_ioctl+0xad/0xf0 [ 80.885730] do_syscall_64+0xc4/0x1b0 [ 80.885730] entry_SYSCALL_64_after_hwframe+0x67/0x6f [ 80.885730] ref_tracker: freed in: [ 80.885730] ax25_device_event+0x272/0x420 [ 80.885730] notifier_call_chain+0xc9/0x1e0 [ 80.885730] dev_close_many+0x272/0x370 [ 80.885730] unregister_netdevice_many_notify+0x3b5/0x1180 [ 80.885730] unregister_netdev+0xcf/0x120 [ 80.885730] sixpack_close+0x11f/0x1b0 [ 80.885730] tty_ldisc_kill+0xcb/0x190 [ 80.885730] tty_ldisc_hangup+0x338/0x3d0 [ 80.885730] __tty_hangup+0x504/0x740 [ 80.885730] tty_release+0x46e/0xd80 [ 80.885730] __fput+0x37f/0x770 [ 80.885730] __x64_sys_close+0x7b/0xb0 [ 80.885730] do_syscall_64+0xc4/0x1b0 [ 80.885730] entry_SYSCALL_64_after_hwframe+0x67/0x6f [ 80.893739] ------------[ cut here ]------------ [ 80.894030] WARNING: CPU: 2 PID: 140 at lib/ref_tracker.c:255 ref_tracker_free+0x47b/0x6b0 [ 80.894297] Modules linked in: [ 80.894929] CPU: 2 PID: 140 Comm: ax25_conn_rel_6 Not tainted 6.9.0-rc4-g8cd26fd90c1a #11 [ 80.895190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qem4 [ 80.895514] RIP: 0010:ref_tracker_free+0x47b/0x6b0 [ 80.895808] Code: 83 c5 18 4c 89 eb 48 c1 eb 03 8a 04 13 84 c0 0f 85 df 01 00 00 41 83 7d 00 00 75 4b 4c 89 ff 9 [ 80.896171] RSP: 0018:ffff888009edf8c0 EFLAGS: 00000286 [ 80.896339] RAX: 1ffff1100141ac00 RBX: 1ffff1100149463b RCX: dffffc0000000000 [ 80.896502] RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffff88800a0d6518 [ 80.896925] RBP: ffff888009edf9b0 R08: ffff88806d3288d3 R09: 1ffff1100da6511a [ 80.897212] R10: dffffc0000000000 R11: ffffed100da6511b R12: ffff88800a4a31d4 [ 80.897859] R13: ffff88800a4a31d8 R14: dffffc0000000000 R15: ffff88800a0d6518 [ 80.898279] FS: 00007fd88b7fe700(0000) GS:ffff88806d300000(0000) knlGS:0000000000000000 [ 80.899436] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 80.900181] CR2: 00007fd88c001d48 CR3: 000000000993e000 CR4: 00000000000006f0 ... [ 80.935774] ref_tracker: sp%d@000000000bb9df3d has 1/1 users at [ 80.935774] ax25_bind+0x424/0x4e0 [ 80.935774] __sys_bind+0x1d9/0x270 [ 80.935774] __x64_sys_bind+0x75/0x80 [ 80.935774] do_syscall_64+0xc4/0x1b0 [ 80.935774] entry_SYSCALL_64_after_hwframe+0x67/0x6f Change ax25_dev->dev_tracker to the dev_tracker of ax25_cb in order to mitigate the bug. Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/20240419020456.29826-1-duoming@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-22net: openvswitch: Check vport netdev nameGravatar Jun Gu 1-1/+4
Ensure that the provided netdev name is not one of its aliases to prevent unnecessary creation and destruction of the vport by ovs-vswitchd. Signed-off-by: Jun Gu <jun.gu@easystack.cn> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/20240419061425.132723-1-jun.gu@easystack.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-22netfilter: nfnetlink: Handle ACK flags for batch messagesGravatar Donald Hunter 1-0/+5
The NLM_F_ACK flag is ignored for nfnetlink batch begin and end messages. This is a problem for ynl which wants to receive an ack for every message it sends, not just the commands in between the begin/end messages. Add processing for ACKs for begin/end messages and provide responses when requested. I have checked that iproute2, pyroute2 and systemd are unaffected by this change since none of them use NLM_F_ACK for batch begin/end. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240418104737.77914-5-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>