aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-12-07net/mlx5: Introduce IFC bits for migratableGravatar Yishai Hadas 1-1/+5
Introduce IFC related capabilities to enable setting VF to be able to perform live migration. e.g.: to be migratable. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07Merge tag 'ieee802154-for-net-next-2022-12-05' of ↵Gravatar Jakub Kicinski 2-0/+61
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== ieee802154-next 2022-12-05 Miquel continued his work towards full scanning support. For this, we now allow the creation of dedicated coordinator interfaces to allow a PAN coordinator to serve in the network and set the needed address filters with the hardware. On top of this we have the first part to allow scanning for available 15.4 networks. A new netlink scan group, within the existing nl802154 API, was added. In addition Miquel fixed two issues that have been introduced in the former patches to free an skb correctly and clarifying an expression in the stack. From David Girault we got tracing support when registering new PANs. * tag 'ieee802154-for-net-next-2022-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next: mac802154: Trace the registration of new PANs ieee802154: Advertize coordinators discovery mac802154: Allow the creation of coordinator interfaces mac802154: Clarify an expression mac802154: Move an skb free within the rx path ==================== Link: https://lore.kernel.org/r/20221205131909.1871790-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06net: ethernet: mtk_wed: add reset to rx_ring_setup callbackGravatar Lorenzo Bianconi 1-4/+4
This patch adds reset parameter to mtk_wed_rx_ring_setup signature in order to align rx_ring_setup callback to tx_ring_setup one introduced in 'commit 23dca7a90017 ("net: ethernet: mtk_wed: add reset to tx_ring_setup callback")' Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/29c6e7a5469e784406cf3e2920351d1207713d05.1670239984.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-05ethtool: add netlink based get rss supportGravatar Sudheer Mogilappagari 1-0/+14
Add netlink based support for "ethtool -x <dev> [context x]" command by implementing ETHTOOL_MSG_RSS_GET netlink message. This is equivalent to functionality provided via ETHTOOL_GRSSH in ioctl path. It sends RSS table, hash key and hash function of an interface to user space. This patch implements existing functionality available in ioctl path and enables addition of new RSS context based parameters in future. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Link: https://lore.kernel.org/r/20221202002555.241580-1-sudheer.mogilappagari@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-05Merge tag 'rxrpc-next-20221201-b' of ↵Gravatar David S. Miller 2-117/+371
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Increasing SACK size and moving away from softirq, parts 2 & 3 Here are the second and third parts of patches in the process of moving rxrpc from doing a lot of its stuff in softirq context to doing it in an I/O thread in process context and thereby making it easier to support a larger SACK table. The full description is in the description for the first part[1] which is already in net-next. The second part includes some cleanups, adds some testing and overhauls some tracing: (1) Remove declaration of rxrpc_kernel_call_is_complete() as the definition is no longer present. (2) Remove the knet() and kproto() macros in favour of using tracepoints. (3) Remove handling of duplicate packets from recvmsg. The input side isn't now going to insert overlapping/duplicate packets into the recvmsg queue. (4) Don't use the rxrpc_conn_parameters struct in the rxrpc_connection or rxrpc_bundle structs - rather put the members in directly. (5) Extract the abort code from a received abort packet right up front rather than doing it in multiple places later. (6) Use enums and symbol lists rather than __builtin_return_address() to indicate where a tracepoint was triggered for local, peer, conn, call and skbuff tracing. (7) Add a refcount tracepoint for the rxrpc_bundle struct. (8) Implement an in-kernel server for the AFS rxperf testing program to talk to (enabled by a Kconfig option). This is tagged as rxrpc-next-20221201-a. The third part introduces the I/O thread and switches various bits over to running there: (1) Fix call timers and call and connection workqueues to not hold refs on the rxrpc_call and rxrpc_connection structs to thereby avoid messy cleanup when the last ref is put in softirq mode. (2) Split input.c so that the call packet processing bits are separate from the received packet distribution bits. Call packet processing gets bumped over to the call event handler. (3) Create a per-local endpoint I/O thread. Barring some tiny bits that still get done in softirq context, all packet reception, processing and transmission is done in this thread. That will allow a load of locking to be removed. (4) Perform packet processing and error processing from the I/O thread. (5) Provide a mechanism to process call event notifications in the I/O thread rather than queuing a work item for that call. (6) Move data and ACK transmission into the I/O thread. ACKs can then be transmitted at the point they're generated rather than getting delegated from softirq context to some process context somewhere. (7) Move call and local processor event handling into the I/O thread. (8) Move cwnd degradation to after packets have been transmitted so that they don't shorten the window too quickly. A bunch of simplifications can then be done: (1) The input_lock is no longer necessary as exclusion is achieved by running the code in the I/O thread only. (2) Don't need to use sk->sk_receive_queue.lock to guard socket state changes as the socket mutex should suffice. (3) Don't take spinlocks in RCU callback functions as they get run in softirq context and thus need _bh annotations. (4) RCU is then no longer needed for the peer's error_targets list. (5) Simplify the skbuff handling in the receive path by dropping the ref in the basic I/O thread loop and getting an extra ref as and when we need to queue the packet for recvmsg or another context. (6) Get the peer address earlier in the input process and pass it to the users so that we only do it once. This is tagged as rxrpc-next-20221201-b. Changes: ======== ver #2) - Added a patch to change four assertions into warnings in rxrpc_read() and fixed a checker warning from a __user annotation that should have been removed.. - Change a min() to min_t() in rxperf as PAGE_SIZE doesn't seem to match type size_t on i386. - Three error handling issues in rxrpc_new_incoming_call(): - If not DATA or not seq #1, should drop the packet, not abort. - Fix a goto that went to the wrong place, dropping a non-held lock. - Fix an rcu_read_lock that should've been an unlock. Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: kafs-testing+fedora36_64checkkafs-build-144@auristor.com Link: https://lore.kernel.org/r/166794587113.2389296.16484814996876530222.stgit@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/166982725699.621383.2358362793992993374.stgit@warthog.procyon.org.uk/ # v1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05net: stmmac: Power up SERDES after the PHY linkGravatar Revanth Kumar Uppala 1-0/+1
The Tegra MGBE ethernet controller requires that the SERDES link is powered-up after the PHY link is up, otherwise the link fails to become ready following a resume from suspend. Add a variable to indicate that the SERDES link must be powered-up after the PHY link. Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com> Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-03net: add netdev_sw_irq_coalesce_default_on()Gravatar Heiner Kallweit 1-0/+1
Add a helper for drivers wanting to set SW IRQ coalescing by default. The related sysfs attributes can be used to override the default values. Follow Jakub's suggestion and put this functionality into net core so that drivers wanting to use software interrupt coalescing per default don't have to open-code it. Note that this function needs to be called before the netdevice is registered. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-02Merge tag 'wireless-next-2022-12-02' of ↵Gravatar Jakub Kicinski 4-13/+38
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.2 Third set of patches for v6.2. mt76 has a new driver for mt7996 Wi-Fi 7 devices and iwlwifi also got initial Wi-Fi 7 support. Otherwise smaller features and fixes. Major changes: ath10k - store WLAN firmware version in SMEM image table mt76 - mt7996: new driver for MediaTek Wi-Fi 7 (802.11be) devices - mt7986, mt7915: enable Wireless Ethernet Dispatch (WED) offload support - mt7915: add ack signal support - mt7915: enable coredump support - mt7921: remain_on_channel support - mt7921: channel context support iwlwifi - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support * tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (144 commits) wifi: ath10k: fix QCOM_SMEM dependency wifi: mt76: mt7921e: add pci .shutdown() support wifi: mt76: mt7915: mmio: fix naming convention wifi: mt76: mt7996: add support to configure spatial reuse parameter set wifi: mt76: mt7996: enable ack signal support wifi: mt76: mt7996: enable use_cts_prot support wifi: mt76: mt7915: rely on band_idx of mt76_phy wifi: mt76: mt7915: enable per bandwidth power limit support wifi: mt76: mt7915: introduce mt7915_get_power_bound() mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2() wifi: mt76: do not send firmware FW_FEATURE_NON_DL region wifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc wifi: mt76: fix coverity overrun-call in mt76_get_txpower() wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices wifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power wifi: mt76: mt7915: fix band_idx usage wifi: mt76: mt7915: enable .sta_set_txpwr support wifi: mt76: mt7915: add basedband Txpower info into debugfs wifi: mt76: mt7915: add support to configure spatial reuse parameter set wifi: mt76: mt7915: add missing MODULE_PARM_DESC ... ==================== Link: https://lore.kernel.org/r/20221202214254.D0D3DC433C1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01sctp: delete free member from struct sctp_sched_opsGravatar Xin Long 1-2/+0
After commit 9ed7bfc79542 ("sctp: fix memory leak in sctp_stream_outq_migrate()"), sctp_sched_set_sched() is the only place calling sched->free(), and it can actually be replaced by sched->free_sid() on each stream, and yet there's already a loop to traverse all streams in sctp_sched_set_sched(). This patch adds a function sctp_sched_free_sched() where it calls sched->free_sid() for each stream to replace sched->free() calls in sctp_sched_set_sched() and then deletes the unused free member from struct sctp_sched_ops. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/e10aac150aca2686cb0bd0570299ec716da5a5c0.1669849471.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01mptcp: add pm listener eventsGravatar Geliang Tang 1-0/+9
This patch adds two new MPTCP netlink event types for PM listening socket create and close, named MPTCP_EVENT_LISTENER_CREATED and MPTCP_EVENT_LISTENER_CLOSED. Add a new function mptcp_event_pm_listener() to push the new events with family, port and addr to userspace. Invoke mptcp_event_pm_listener() with MPTCP_EVENT_LISTENER_CREATED in mptcp_listen() and mptcp_pm_nl_create_listen_socket(), invoke it with MPTCP_EVENT_LISTENER_CLOSED in __mptcp_close_ssk(). Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destructionGravatar Dmitry Safonov 1-3/+7
To do that, separate two scenarios: - where it's the first MD5 key on the system, which means that enabling of the static key may need to sleep; - copying of an existing key from a listening socket to the request socket upon receiving a signed TCP segment, where static key was already enabled (when the key was added to the listening socket). Now the life-time of the static branch for TCP-MD5 is until: - last tcp_md5sig_info is destroyed - last socket in time-wait state with MD5 key is closed. Which means that after all sockets with TCP-MD5 keys are gone, the system gets back the performance of disabled md5-key static branch. While at here, provide static_key_fast_inc() helper that does ref counter increment in atomic fashion (without grabbing cpus_read_lock() on CONFIG_JUMP_LABEL=y). This is needed to add a new user for a static_key when the caller controls the lifetime of another user. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01jump_label: Prevent key->enabled int overflowGravatar Dmitry Safonov 1-4/+17
1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any protection against key->enabled refcounter overflow. 2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked() still may turn the refcounter negative as (v + 1) may overflow. key->enabled is indeed a ref-counter as it's documented in multiple places: top comment in jump_label.h, Documentation/staging/static-keys.rst, etc. As -1 is reserved for static key that's in process of being enabled, functions would break with negative key->enabled refcount: - for CONFIG_JUMP_LABEL=n negative return of static_key_count() breaks static_key_false(), static_key_true() - the ref counter may become 0 from negative side by too many static_key_slow_inc() calls and lead to use-after-free issues. These flaws result in that some users have to introduce an additional mutex and prevent the reference counter from overflowing themselves, see bpf_enable_runtime_stats() checking the counter against INT_MAX / 2. Prevent the reference counter overflow by checking if (v + 1) > 0. Change functions API to return whether the increment was successful. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01wifi: mac80211: add support for restricting netdev features per vifGravatar Felix Fietkau 2-6/+15
This can be used to selectively disable feature flags for checksum offload, scatter/gather or GSO by changing vif->netdev_features. Removing features from vif->netdev_features does not affect the netdev features themselves, but instead fixes up skbs in the tx path so that the offloads are not needed in the driver. Aside from making it easier to deal with vif type based hardware limitations, this also makes it possible to optimize performance on hardware without native GSO support by declaring GSO support in hw->netdev_features and removing it from vif->netdev_features. This allows mac80211 to handle GSO segmentation after the sta lookup, but before itxq enqueue, thus reducing the number of unnecessary sta lookups, as well as some other per-packet processing. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20221010094338.78070-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01rxrpc: Transmit ACKs at the point of generationGravatar David Howells 1-3/+0
For ACKs generated inside the I/O thread, transmit the ACK at the point of generation. Where the ACK is generated outside of the I/O thread, it's offloaded to the I/O thread to transmit it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Trace/count transmission underflows and cwnd resetsGravatar David Howells 1-0/+38
Add a tracepoint to log when a cwnd reset occurs due to lack of transmission on a call. Add stat counters to count transmission underflows (ie. when we have tx window space, but sendmsg doesn't manage to keep up), cwnd resets and transmission failures. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Make the I/O thread take over the call and local processor workGravatar David Howells 1-22/+20
Move the functions from the call->processor and local->processor work items into the domain of the I/O thread. The call event processor, now called from the I/O thread, then takes over the job of cranking the call state machine, processing incoming packets and transmitting DATA, ACK and ABORT packets. In a future patch, rxrpc_send_ACK() will transmit the ACK on the spot rather than queuing it for later transmission. The call event processor becomes purely received-skb driven. It only transmits things in response to events. We use "pokes" to queue a dummy skb to make it do things like start/resume transmitting data. Timer expiry also results in pokes. The connection event processor, becomes similar, though crypto events, such as dealing with CHALLENGE and RESPONSE packets is offloaded to a work item to avoid doing crypto in the I/O thread. The local event processor is removed and VERSION response packets are generated directly from the packet parser. Similarly, ABORTs generated in response to protocol errors will be transmitted immediately rather than being pushed onto a queue for later transmission. Changes: ======== ver #2) - Fix a couple of introduced lock context imbalances. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Simplify skbuff accounting in receive pathGravatar David Howells 1-1/+2
A received skbuff needs a ref when it gets put on a call data queue or conn packet queue, and rxrpc_input_packet() and co. jump through a lot of hoops to avoid double-dropping the skbuff ref so that we can avoid getting a ref when we queue the packet. Change this so that the skbuff ref is unconditionally dropped by the caller of rxrpc_input_packet(). An additional ref is then taken on the packet if it is pushed onto a queue. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Move DATA transmission into call processor work itemGravatar David Howells 1-1/+5
Move DATA transmission into the call processor work item. In a future patch, this will be called from the I/O thread rather than being itsown work item. This will allow DATA transmission to be driven directly by incoming ACKs, pokes and timers as those are processed. The Tx queue is also split: The queue of packets prepared by sendmsg is now places in call->tx_sendmsg and the packet dispatcher decants the packets into call->tx_buffer as space becomes available in the transmission window. This allows sendmsg to run ahead of the available space to try and prevent an underflow in transmission. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Copy client call parameters into rxrpc_call earlierGravatar David Howells 1-0/+3
Copy client call parameters into rxrpc_call earlier so that that can be used to convey them to the connection code - which can then be offloaded to the I/O thread. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Implement a mechanism to send an event notification to a callGravatar David Howells 1-0/+52
Provide a means by which an event notification can be sent to a call such that the I/O thread can process it rather than it being done in a separate workqueue. This will allow a lot of locking to be removed. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Don't hold a ref for connection workqueueGravatar David Howells 1-6/+5
Currently, rxrpc gives the connection's work item a ref on the connection when it queues it - and this is called from the timer expiration function. The problem comes when queue_work() fails (ie. the work item is already queued): the timer routine must put the ref - but this may cause the cleanup code to run. This has the unfortunate effect that the cleanup code may then be run in softirq context - which means that any spinlocks it might need to touch have to be guarded to disable softirqs (ie. they need a "_bh" suffix). (1) Don't give a ref to the work item. (2) Simplify handling of service connections by adding a separate active count so that the refcount isn't also used for this. (3) Connection destruction for both client and service connections can then be cleaned up by putting rxrpc_put_connection() out of line and making a tidy progression through the destruction code (offloaded to a workqueue if put from softirq or processor function context). The RCU part of the cleanup then only deals with the freeing at the end. (4) Make rxrpc_queue_conn() return immediately if it sees the active count is -1 rather then queuing the connection. (5) Make sure that the cleanup routine waits for the work item to complete. (6) Stash the rxrpc_net pointer in the conn struct so that the rcu free routine can use it, even if the local endpoint has been freed. Unfortunately, neither the timer nor the work item can simply get around the problem by just using refcount_inc_not_zero() as the waits would still have to be done, and there would still be the possibility of having to put the ref in the expiration function. Note the connection work item is mostly going to go away with the main event work being transferred to the I/O thread, so the wait in (6) will become obsolete. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Don't hold a ref for call timer or workqueueGravatar David Howells 1-5/+1
Currently, rxrpc gives the call timer a ref on the call when it starts it and this is passed along to the workqueue by the timer expiration function. The problem comes when queue_work() fails (ie. the work item is already queued): the timer routine must put the ref - but this may cause the cleanup code to run. This has the unfortunate effect that the cleanup code may then be run in softirq context - which means that any spinlocks it might need to touch have to be guarded to disable softirqs (ie. they need a "_bh" suffix). Fix this by: (1) Don't give a ref to the timer. (2) Making the expiration function not do anything if the refcount is 0. Note that this is more of an optimisation. (3) Make sure that the cleanup routine waits for timer to complete. However, this has a consequence that timer cannot give a ref to the work item. Therefore the following fixes are also necessary: (4) Don't give a ref to the work item. (5) Make the work item return asap if it sees the ref count is 0. (6) Make sure that the cleanup routine waits for the work item to complete. Unfortunately, neither the timer nor the work item can simply get around the problem by just using refcount_inc_not_zero() as the waits would still have to be done, and there would still be the possibility of having to put the ref in the expiration function. Note the call work item is going to go away with the work being transferred to the I/O thread, so the wait in (6) will become obsolete. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: trace: Don't use __builtin_return_address for sk_buff tracingGravatar David Howells 1-24/+33
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the sk_buff tracepoint. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Trace rxrpc_bundle refcountGravatar David Howells 1-0/+34
Add a tracepoint for the rxrpc_bundle refcounting. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: trace: Don't use __builtin_return_address for rxrpc_call tracingGravatar David Howells 1-34/+49
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_call tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: trace: Don't use __builtin_return_address for rxrpc_conn tracingGravatar David Howells 1-21/+37
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_conn tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracingGravatar David Howells 1-17/+26
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_peer tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: trace: Don't use __builtin_return_address for rxrpc_local tracingGravatar David Howells 1-13/+36
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_local tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Remove the [k_]proto() debugging macrosGravatar David Howells 1-0/+60
Remove the kproto() and _proto() debugging macros in preference to using tracepoints for this. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Remove decl for rxrpc_kernel_call_is_complete()Gravatar David Howells 1-1/+0
rxrpc_kernel_call_is_complete() has been removed, so remove its declaration too. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-12-01rxrpc: Implement an in-kernel rxperf server for testing purposesGravatar David Howells 1-0/+1
Implement an in-kernel rxperf server to allow kernel-based rxrpc services to be tested directly, unlike with AFS where they're accessed by the fileserver when the latter decides it wants to. This is implemented as a module that, if loaded, opens UDP port 7009 (afs3-rmtsys) and listens on it for incoming calls. Calls can be generated using the rxperf command shipped with OpenAFS, for example. Changes ======= ver #2) - Use min_t() instead of min(). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: Jakub Kicinski <kuba@kernel.org>
2022-12-01wifi: cfg80211: Correct example of ieee80211_iface_limitGravatar Philipp Hortmann 1-1/+1
Correct wrong closing bracket. Signed-off-by: Philipp Hortmann <philipp.g.hortmann@gmail.com> Link: https://lore.kernel.org/r/20221114200135.GA100176@matrix-ESPRIMO-P710 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: ieee80211: Do not open-code qos address offsetsGravatar Kees Cook 1-6/+22
When building with -Wstringop-overflow, GCC's KASAN implementation does not correctly perform bounds checking within some complex structures when faced with literal offsets, and can get very confused. For example, this warning is seen due to literal offsets into sturct ieee80211_hdr that may or may not be large enough: drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c: In function 'iwl_mvm_rx_mpdu_mq': drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c:2022:29: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=] 2022 | *qc &= ~IEEE80211_QOS_CTL_A_MSDU_PRESENT; In file included from drivers/net/wireless/intel/iwlwifi/mvm/fw-api.h:32, from drivers/net/wireless/intel/iwlwifi/mvm/sta.h:15, from drivers/net/wireless/intel/iwlwifi/mvm/mvm.h:27, from drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c:10: drivers/net/wireless/intel/iwlwifi/mvm/../fw/api/rx.h:559:16: note: at offset [78, 166] into destination object 'mpdu_len' of size 2 559 | __le16 mpdu_len; | ^~~~~~~~ Refactor ieee80211_get_qos_ctl() to avoid using literal offsets, requiring the creation of the actual structure that is described in the comments. Explicitly choose the desired offset, making the code more human-readable too. This is one of the last remaining warning to fix before enabling -Wstringop-overflow globally. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97490 Link: https://github.com/KSPP/linux/issues/181 Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kalle Valo <kvalo@kernel.org> Cc: Gregory Greenman <gregory.greenman@intel.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221130212641.never.627-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-11-30Merge tag 'mlx5-updates-2022-11-29' of ↵Gravatar Jakub Kicinski 2-6/+3
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-11-29 Misc update for mlx5 driver 1) Various trivial cleanups 2) Maor Dickman, Adds support for trap offload with additional actions 3) From Tariq, UMR (device memory registrations) cleanups, UMR WQE must be aligned to 64B per device spec, (not a bug fix). * tag 'mlx5-updates-2022-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Support devlink reload of IPsec core net/mlx5e: TC, Add offload support for trap with additional actions net/mlx5e: Do early return when setup vports dests for slow path flow net/mlx5: Remove redundant check net/mlx5e: Delete always true DMA check net/mlx5e: Don't access directly DMA device pointer net/mlx5e: Don't use termination table when redundant net/mlx5: Fix orthography errors in documentation net/mlx5: Use generic definition for UMR KLM alignment net/mlx5: Generalize name of UMR alignment definition net/mlx5: Remove unused UMR MTT definitions net/mlx5e: Add padding when needed in UMR WQEs net/mlx5: Remove unused ctx variables net/mlx5e: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper net/mlx5e: Remove unneeded io-mapping.h #include ==================== Link: https://lore.kernel.org/r/20221130051152.479480-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: phy: Add link between phy dev and mac devGravatar Xiaolei Wang 1-0/+4
If the external phy used by current mac interface is managed by another mac interface, it means that this network port cannot work independently, especially when the system suspends and resumes, the following trace may appear, so we should create a device link between phy dev and mac dev. WARNING: CPU: 0 PID: 24 at drivers/net/phy/phy.c:983 phy_error+0x20/0x68 Modules linked in: CPU: 0 PID: 24 Comm: kworker/0:2 Not tainted 6.1.0-rc3-00011-g5aaef24b5c6d-dirty #34 Hardware name: Freescale i.MX6 SoloX (Device Tree) Workqueue: events_power_efficient phy_state_machine unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x68/0x90 dump_stack_lvl from __warn+0xb4/0x24c __warn from warn_slowpath_fmt+0x5c/0xd8 warn_slowpath_fmt from phy_error+0x20/0x68 phy_error from phy_state_machine+0x22c/0x23c phy_state_machine from process_one_work+0x288/0x744 process_one_work from worker_thread+0x3c/0x500 worker_thread from kthread+0xf0/0x114 kthread from ret_from_fork+0x14/0x28 Exception stack(0xf0951fb0 to 0xf0951ff8) Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221130021216.1052230-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: devlink: let the core report the driver name instead of the driversGravatar Vincent Mailhol 1-2/+0
The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. In order to factorize code, make devlink_nl_info_fill() add the driver name attribute. Now that the core sets the driver name attribute, drivers are not supposed to call devlink_info_driver_name_put() anymore. Remove devlink_info_driver_name_put() and clean-up all the drivers using this function in their callback. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: support directly reading from region memoryGravatar Jacob Keller 2-0/+18
To read from a region, user space must currently request a new snapshot of the region and then read from that snapshot. This can sometimes be overkill if user space only reads a tiny portion. They first create the snapshot, then request a read, then destroy the snapshot. For regions which have a single underlying "contents", it makes sense to allow supporting direct reading of the region data. Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if requested via the new DEVLINK_ATTR_REGION_DIRECT. If this attribute is set, then perform a direct read instead of using a snapshot. Direct read is mutually exclusive with DEVLINK_ATTR_REGION_SNAPSHOT_ID, and care is taken to ensure that we reject commands which provide incorrect attributes. Regions must enable support for direct read by implementing the .read() callback function. If a region does not support such direct reads, a suitable extended error message is reported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29net/mlx5: Use generic definition for UMR KLM alignmentGravatar Tariq Toukan 1-1/+1
MLX5_UMR_KLM_ALIGNMENT is in units of number of entries, while MLX5_UMR_MTT_ALIGNMENT (generalized and renamed to MLX5_UMR_FLEX_ALIGNMENT) is in byte units. This is misleading and confusing. Replace this KLM definition with one based on the generic definition. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-29net/mlx5: Generalize name of UMR alignment definitionGravatar Tariq Toukan 1-2/+2
Per the device spec, MLX5_UMR_MTT_ALIGNMENT is good not only for UMR MTT entries, but for all other entries as well, like KLMs and KSMs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-29net/mlx5: Remove unused UMR MTT definitionsGravatar Tariq Toukan 1-2/+0
Defines MLX5_UMR_MTT_MASK and MLX5_UMR_MTT_MIN_CHUNK_SIZE are not in use. Remove them. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-29net/mlx5e: Add padding when needed in UMR WQEsGravatar Tariq Toukan 1-0/+1
Per the device spec, MTTs/KLMs list in a UMR WQE must be aligned to 64B. Per our SW design, the MTT/KLMs list would need alignment only if it's too small, for example on PPC when PAGE_SIZE is 64KB, and only 4 pages are needed to cover a MPWQE of size 256KB. Padding, if needed, is taken into account when calculating the UMR WQE fields (ds_cnt and xlt_octowords), however no entries are provided, instead garbage is passed. No real harm though, as these parts act as gaps between the RX MPWQEs and not used by any of them. Hence, in practice, device does not try to write any incoming packet to them. Still, prefer providing clean padding marking the end of the list, and do not map garbage into the RQ memory region. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-29net/mlx5: Remove unused ctx variablesGravatar Petr Pavlu 1-2/+0
Remove mlx5_priv.ctx_list and ctx_lock which are no longer used after commit 601c10c89cbb ("net/mlx5: Delete custom device management logic"). Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-29Merge branch 'master' of ↵Gravatar Jakub Kicinski 1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== ipsec-next 2022-11-26 1) Remove redundant variable in esp6. From Colin Ian King. 2) Update x->lastused for every packet. It was used only for outgoing mobile IPv6 packets, but showed to be usefull to check if the a SA is still in use in general. From Antony Antony. 3) Remove unused variable in xfrm_byidx_resize. From Leon Romanovsky. 4) Finalize extack support for xfrm. From Sabrina Dubroca. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: add extack to xfrm_set_spdinfo xfrm: add extack to xfrm_alloc_userspi xfrm: add extack to xfrm_do_migrate xfrm: add extack to xfrm_new_ae and xfrm_replay_verify_len xfrm: add extack to xfrm_del_sa xfrm: add extack to xfrm_add_sa_expire xfrm: a few coding style clean ups xfrm: Remove not-used total variable xfrm: update x->lastused for every packet esp6: remove redundant variable err ==================== Link: https://lore.kernel.org/r/20221126110303.1859238-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar Jakub Kicinski 18-23/+51
tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge tag 'net-6.1-rc8-2' of ↵Gravatar Linus Torvalds 2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and wifi. Current release - new code bugs: - eth: mlx5e: - use kvfree() in mlx5e_accel_fs_tcp_create() - MACsec, fix RX data path 16 RX security channel limit - MACsec, fix memory leak when MACsec device is deleted - MACsec, fix update Rx secure channel active field - MACsec, fix add Rx security association (SA) rule memory leak Previous releases - regressions: - wifi: cfg80211: don't allow multi-BSSID in S1G - stmmac: set MAC's flow control register to reflect current settings - eth: mlx5: - E-switch, fix duplicate lag creation - fix use-after-free when reverting termination table Previous releases - always broken: - ipv4: fix route deletion when nexthop info is not specified - bpf: fix a local storage BPF map bug where the value's spin lock field can get initialized incorrectly - tipc: re-fetch skb cb after tipc_msg_validate - wifi: wilc1000: fix Information Element parsing - packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE - sctp: fix memory leak in sctp_stream_outq_migrate() - can: can327: fix potential skb leak when netdev is down - can: add number of missing netdev freeing on error paths - aquantia: do not purge addresses when setting the number of rings - wwan: iosm: - fix incorrect skb length leading to truncated packet - fix crash in peek throughput test due to skb UAF" * tag 'net-6.1-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed MAINTAINERS: Update maintainer list for chelsio drivers ionic: update MAINTAINERS entry sctp: fix memory leak in sctp_stream_outq_migrate() packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE net/mlx5: Lag, Fix for loop when checking lag Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path" net: marvell: prestera: Fix a NULL vs IS_ERR() check in some functions net: tun: Fix use-after-free in tun_detach() net: mdiobus: fix unbalanced node reference count net: hsr: Fix potential use-after-free tipc: re-fetch skb cb after tipc_msg_validate mptcp: fix sleep in atomic at close time mptcp: don't orphan ssk in mptcp_close() dsa: lan9303: Correct stat name ipv4: Fix route deletion when nexthop info is not specified net: wwan: iosm: fix incorrect skb length net: wwan: iosm: fix crash in peek throughput test net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type net: wwan: iosm: fix kernel test robot reported error ...
2022-11-29sctp: fix memory leak in sctp_stream_outq_migrate()Gravatar Zhengchao Shao 1-0/+2
When sctp_stream_outq_migrate() is called to release stream out resources, the memory pointed to by prio_head in stream out is not released. The memory leak information is as follows: unreferenced object 0xffff88801fe79f80 (size 64): comm "sctp_repo", pid 7957, jiffies 4294951704 (age 36.480s) hex dump (first 32 bytes): 80 9f e7 1f 80 88 ff ff 80 9f e7 1f 80 88 ff ff ................ 90 9f e7 1f 80 88 ff ff 90 9f e7 1f 80 88 ff ff ................ backtrace: [<ffffffff81b215c6>] kmalloc_trace+0x26/0x60 [<ffffffff88ae517c>] sctp_sched_prio_set+0x4cc/0x770 [<ffffffff88ad64f2>] sctp_stream_init_ext+0xd2/0x1b0 [<ffffffff88aa2604>] sctp_sendmsg_to_asoc+0x1614/0x1a30 [<ffffffff88ab7ff1>] sctp_sendmsg+0xda1/0x1ef0 [<ffffffff87f765ed>] inet_sendmsg+0x9d/0xe0 [<ffffffff8754b5b3>] sock_sendmsg+0xd3/0x120 [<ffffffff8755446a>] __sys_sendto+0x23a/0x340 [<ffffffff87554651>] __x64_sys_sendto+0xe1/0x1b0 [<ffffffff89978b49>] do_syscall_64+0x39/0xb0 [<ffffffff89a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://syzkaller.appspot.com/bug?exrid=29c402e56c4760763cc0 Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler") Reported-by: syzbot+29c402e56c4760763cc0@syzkaller.appspotmail.com Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20221126031720.378562-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path"Gravatar Saeed Mahameed 1-0/+7
This reverts commit c0071be0e16c461680d87b763ba1ee5e46548fde. The cited commit removed the validity checks which initialized the window_sz and never removed the use of the now uninitialized variable, so now we are left with wrong value in the window size and the following clang warning: [-Wuninitialized] drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c:232:45: warning: variable 'window_sz' is uninitialized when used here MLX5_SET(macsec_aso, aso_ctx, window_size, window_sz); Revet at this time to address the clang issue due to lack of time to test the proper solution. Fixes: c0071be0e16c ("net/mlx5e: MACsec, remove replay window size limitation in offload path") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221129093006.378840-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29ieee802154: Advertize coordinators discoveryGravatar Miquel Raynal 2-0/+61
Let's introduce the basics for advertizing discovered PANs and coordinators, which is: - A new "scan" netlink message group. - A couple of netlink command/attribute. - The main netlink helper to send a netlink message with all the necessary information to forward the main information to the user. Two netlink attributes are proactively added to support future UWB complex channels, but are not actually used yet. Co-developed-by: David Girault <david.girault@qorvo.com> Signed-off-by: David Girault <david.girault@qorvo.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20221129135535.532513-2-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2022-11-29net: ethernet: mtk_wed: add reset to tx_ring_setup callbackGravatar Lorenzo Bianconi 1-4/+4
Introduce reset parameter to mtk_wed_tx_ring_setup signature. This is a preliminary patch to add Wireless Ethernet Dispatcher reset support. Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-29net: ethernet: mtk_wed: update mtk_wed_stopGravatar Lorenzo Bianconi 1-0/+4
Update mtk_wed_stop routine and rename old mtk_wed_stop() to mtk_wed_deinit(). This is a preliminary patch to add Wireless Ethernet Dispatcher reset support. Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>