aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
5 hoursMerge tag 'probes-v6.10' of ↵Gravatar Linus Torvalds 1-106/+6
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - tracing/probes: Add new pseudo-types %pd and %pD support for dumping dentry name from 'struct dentry *' and file name from 'struct file *' - uprobes performance optimizations: - Speed up the BPF uprobe event by delaying the fetching of the uprobe event arguments that are not used in BPF - Avoid locking by speculatively checking whether uprobe event is valid - Reduce lock contention by using read/write_lock instead of spinlock for uprobe list operation. This improved BPF uprobe benchmark result 43% on average - rethook: Remove non-fatal warning messages when tracing stack from BPF and skip rcu_is_watching() validation in rethook if possible - objpool: Optimize objpool (which is used by kretprobes and fprobe as rethook backend storage) by inlining functions and avoid caching nr_cpu_ids because it is a const value - fprobe: Add entry/exit callbacks types (code cleanup) - kprobes: Check ftrace was killed in kprobes if it uses ftrace * tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: kprobe/ftrace: bail out if ftrace was killed selftests/ftrace: Fix required features for VFS type test case objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids objpool: enable inlining objpool_push() and objpool_pop() operations rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get() ftrace: make extra rcu_is_watching() validation check optional uprobes: reduce contention on uprobes_tree access rethook: Remove warning messages printed for finding return address of a frame. fprobe: Add entry/exit callbacks types selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD" selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD" Documentation: tracing: add new type '%pd' and '%pD' for kprobe tracing/probes: support '%pD' type for print struct file's name tracing/probes: support '%pd' type for print struct dentry's name uprobes: add speculative lockless system-wide uprobe filter check uprobes: prepare uprobe args buffer lazily uprobes: encapsulate preparation of uprobe args buffer
3 daysMerge tag 'net-next-6.10' of ↵Gravatar Linus Torvalds 7-28/+201
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Complete rework of garbage collection of AF_UNIX sockets. AF_UNIX is prone to forming reference count cycles due to fd passing functionality. New method based on Tarjan's Strongly Connected Components algorithm should be both faster and remove a lot of workarounds we accumulated over the years. - Add TCP fraglist GRO support, allowing chaining multiple TCP packets and forwarding them together. Useful for small switches / routers which lack basic checksum offload in some scenarios (e.g. PPPoE). - Support using SMP threads for handling packet backlog i.e. packet processing from software interfaces and old drivers which don't use NAPI. This helps move the processing out of the softirq jumble. - Continue work of converting from rtnl lock to RCU protection. Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link information available via rtnetlink. - Small optimizations from Eric to UDP wake up handling, memory accounting, RPS/RFS implementation, TCP packet sizing etc. - Allow direct page recycling in the bulk API used by XDP, for +2% PPS. - Support peek with an offset on TCP sockets. - Add MPTCP APIs for querying last time packets were received/sent/acked and whether MPTCP "upgrade" succeeded on a TCP socket. - Add intra-node communication shortcut to improve SMC performance. - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver. - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver. - Add reset reasons for tracing what caused a TCP reset to be sent. - Introduce direction attribute for xfrm (IPSec) states. State can be used either for input or output packet processing. Things we sprinkled into general kernel code: - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS(). This required touch-ups and renaming of a few existing users. - Add Endian-dependent __counted_by_{le,be} annotations. - Make building selftests "quieter" by printing summaries like "CC object.o" rather than full commands with all the arguments. Netfilter: - Use GFP_KERNEL to clone elements, to deal better with OOM situations and avoid failures in the .commit step. BPF: - Add eBPF JIT for ARCv2 CPUs. - Support attaching kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. "Session mode" is a common use-case for tetragon and bpftrace. - Add the ability to specify and retrieve BPF cookie for raw tracepoint programs in order to ease migration from classic to raw tracepoints. - Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86, ARM64 and RISC-V JITs. This allows inlining functions which need to access per-CPU state. - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction. Support BPF arena on ARM64. - Add a new bpf_wq API for deferring events and refactor process-context bpf_timer code to keep common code where possible. - Harden the BPF verifier's and/or/xor value tracking. - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs. - Support bpf_tail_call_static() helper for BPF programs with GCC 13. - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled. Driver API: - Skip software TC processing completely if all installed rules are marked as HW-only, instead of checking the HW-only flag rule by rule. - Add support for configuring PoE (Power over Ethernet), similar to the already existing support for PoDL (Power over Data Line) config. - Initial bits of a queue control API, for now allowing a single queue to be reset without disturbing packet flow to other queues. - Common (ethtool) statistics for hardware timestamping. Tests and tooling: - Remove the need to create a config file to run the net forwarding tests so that a naive "make run_tests" can exercise them. - Define a method of writing tests which require an external endpoint to communicate with (to send/receive data towards the test machine). Add a few such tests. - Create a shared code library for writing Python tests. Expose the YAML Netlink library from tools/ to the tests for easy Netlink access. - Move netfilter tests under net/, extend them, separate performance tests from correctness tests, and iron out issues found by running them "on every commit". - Refactor BPF selftests to use common network helpers. - Further work filling in YAML definitions of Netlink messages for: nftables, team driver, bonding interfaces, vlan interfaces, VF info, TC u32 mark, TC police action. - Teach Python YAML Netlink to decode attribute policies. - Extend the definition of the "indexed array" construct in the specs to cover arrays of scalars rather than just nests. - Add hyperlinks between definitions in generated Netlink docs. Drivers: - Make sure unsupported flower control flags are rejected by drivers, and make more drivers report errors directly to the application rather than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen). - Ethernet high-speed NICs: - Broadcom (bnxt): - support multiple RSS contexts and steering traffic to them - support XDP metadata - make page pool allocations more NUMA aware - Intel (100G, ice, idpf): - extract datapath code common among Intel drivers into a library - use fewer resources in switchdev by sharing queues with the PF - add PFCP filter support - add Ethernet filter support - use a spinlock instead of HW lock in PTP clock ops - support 5 layer Tx scheduler topology - nVidia/Mellanox: - 800G link modes and 100G SerDes speeds - per-queue IRQ coalescing configuration - Marvell Octeon: - support offloading TC packet mark action - Ethernet NICs consumer, embedded and virtual: - stop lying about skb->truesize in USB Ethernet drivers, it messes up TCP memory calculations - Google cloud vNIC: - support changing ring size via ethtool - support ring reset using the queue control API - VirtIO net: - expose flow hash from RSS to XDP - per-queue statistics - add selftests - Synopsys (stmmac): - support controllers which require an RX clock signal from the MII bus to perform their hardware initialization - TI: - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices - icssg_prueth: add SW TX / RX Coalescing based on hrtimers - cpsw: minimal XDP support - Renesas (ravb): - support describing the MDIO bus - Realtek (r8169): - add support for RTL8168M - Microchip Sparx5: - matchall and flower actions mirred and redirect - Ethernet switches: - nVidia/Mellanox: - improve events processing performance - Marvell: - add support for MV88E6250 family internal PHYs - Microchip: - add DCB and DSCP mapping support for KSZ switches - vsc73xx: convert to PHYLINK - Realtek: - rtl8226b/rtl8221b: add C45 instances and SerDes switching - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup - Ethernet PHYs: - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY. - micrel: lan8814: add support for PPS out and external timestamp trigger - WiFi: - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers. Modern devices can only be configured using nl80211. - mac80211/cfg80211 - handle color change per link for WiFi 7 Multi-Link Operation - Intel (iwlwifi): - don't support puncturing in 5 GHz - support monitor mode on passive channels - BZ-W device support - P2P with HE/EHT support - re-add support for firmware API 90 - provide channel survey information for Automatic Channel Selection - MediaTek (mt76): - mt7921 LED control - mt7925 EHT radiotap support - mt7920e PCI support - Qualcomm (ath11k): - P2P support for QCA6390, WCN6855 and QCA2066 - support hibernation - ieee80211-freq-limit Device Tree property support - Qualcomm (ath12k): - refactoring in preparation of multi-link support - suspend and hibernation support - ACPI support - debugfs support, including dfs_simulate_radar support - RealTek: - rtw88: RTL8723CS SDIO device support - rtw89: RTL8922AE Wi-Fi 7 PCI device support - rtw89: complete features of new WiFi 7 chip 8922AE including BT-coexistence and Wake-on-WLAN - rtw89: use BIOS ACPI settings to set TX power and channels - rtl8xxxu: enable Management Frame Protection (MFP) support - Bluetooth: - support for Intel BlazarI and Filmore Peak2 (BE201) - support for MediaTek MT7921S SDIO - initial support for Intel PCIe BT driver - remove HCI_AMP support" * tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits) selftests: netfilter: fix packetdrill conntrack testcase net: gro: fix napi_gro_cb zeroed alignment Bluetooth: btintel_pcie: Refactor and code cleanup Bluetooth: btintel_pcie: Fix warning reported by sparse Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1 Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config Bluetooth: btintel_pcie: Fix compiler warnings Bluetooth: btintel_pcie: Add *setup* function to download firmware Bluetooth: btintel_pcie: Add support for PCIe transport Bluetooth: btintel: Export few static functions Bluetooth: HCI: Remove HCI_AMP support Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init() Bluetooth: qca: Fix error code in qca_read_fw_build_info() Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning Bluetooth: btintel: Add support for Filmore Peak2 (BE201) Bluetooth: btintel: Add support for BlazarI LE Create Connection command timeout increased to 20 secs dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth Bluetooth: compute LE flow credits based on recvbuf space Bluetooth: hci_sync: Use cmd->num_cis instead of magic number ...
4 daysMerge tag 'linux_kselftest-kunit-6.10-rc1' of ↵Gravatar Linus Torvalds 7-32/+99
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - fix race condition in try-catch completion - change __kunit_test_suites_init() to exit early if there is nothing to test - change string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER - move fault tests behind KUNIT_FAULT_TEST Kconfig option - kthread test fixes and improvements - iov_iter test fixes * tag 'linux_kselftest-kunit-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: bail out early in __kunit_test_suites_init() if there are no suites to test kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option kunit: unregister the device on error kunit: Fix race condition in try-catch completion kunit: Add tests for fault kunit: Print last test location on fault kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests kunit: Handle test faults kunit: Fix timeout message kunit: Fix kthread reference kunit: Handle thread creation error
4 daysMerge tag 'irq-core-2024-05-12' of ↵Gravatar Linus Torvalds 2-0/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: "Core code: - Interrupt storm detection for the lockup watchdog: Lockups which are caused by interrupt storms are not easy to debug because there is no information about the events which make the lockup detector trigger. To make this more user friendly, provide an extenstion to interrupt statistics which allows to take snapshots and an interface to retrieve the delta to the snapshot. Use this new mechanism in the watchdog code to do a two stage lockup analysis by taking the snapshot and printing the deltas for the topmost active interrupts on the second trigger. Note: This contains both the interrupt and the watchdog changes as the latter depend on the former obviously. - Avoid summation loops in the /proc/interrupts output and use the global counter when possible - Skip suspended interrupts on CPU hotplug operations to ensure that they are not delivered before the system resumes the device drivers when coming out of suspend. - On CPU hot-unplug interrupts which are affine to the outgoing CPU are migrated to a different CPU in the affinity mask. This can fail when the CPUs have no vectors left. Instead of giving up try to migrate it to any online CPU and thereby breaking the affinity setting in order to prevent a stale device interrupt which targets an offline CPU - The usual small cleanups Driver code: - Support for the RISCV AIA MSI controller - Make the interrupt allocation for the Loongson PCH controller more flexible to prevent vector exhaustion - The usual set of cleanups and fixes all over the place" * tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc cpuidle: Avoid explicit cpumask allocation on stack irqchip/sifive-plic: Avoid explicit cpumask allocation on stack irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack cpumask: Introduce cpumask_first_and_and() irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown genirq: Reuse irq_is_nmi() genirq/cpuhotplug: Retry with cpu_online_mask when migration fails genirq/cpuhotplug: Skip suspended interrupts when restoring affinity arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251 arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251 ARM: dts: stm32: List exti parent interrupts on stm32mp131 ARM: dts: stm32: List exti parent interrupts on stm32mp151 arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32 irqchip/stm32-exti: Mark events reserved with RIF configuration check irqchip/stm32-exti: Skip secure events irqchip/stm32-exti: Convert driver to standard PM ...
4 daysMerge tag 'timers-core-2024-05-12' of ↵Gravatar Linus Torvalds 2-19/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: "Core code: - Make timekeeping and VDSO time readouts resilent against math overflow: In guest context the kernel is prone to math overflow when the host defers the timer interrupt due to overload, malfunction or malice. This can be mitigated by checking the clocksource delta for the maximum deferrement which is readily available. If that value is exceeded then the code uses a slowpath function which can handle the multiplication overflow. This functionality is enabled unconditionally in the kernel, but made conditional in the VDSO code. The latter is conditional because it allows architectures to optimize the check so it is not causing performance regressions. On X86 this is achieved by reworking the existing check for negative TSC deltas as a negative delta obviously exceeds the maximum deferrement when it is evaluated as an unsigned value. That avoids two conditionals in the hotpath and allows to hide both the negative delta and the large delta handling in the same slow path. - Add an initial minimal ktime_t abstraction for Rust - The usual boring cleanups and enhancements Drivers: - Boring updates to device trees and trivial enhancements in various drivers" * tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) clocksource/drivers/arm_arch_timer: Mark hisi_161010101_oem_info const clocksource/drivers/timer-ti-dm: Remove an unused field in struct dmtimer clocksource/drivers/renesas-ostm: Avoid reprobe after successful early probe clocksource/drivers/renesas-ostm: Allow OSTM driver to reprobe for RZ/V2H(P) SoC dt-bindings: timer: renesas: ostm: Document Renesas RZ/V2H(P) SoC rust: time: doc: Add missing C header links clocksource: Make the int help prompt unit readable in ncurses hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active() timerqueue: Remove never used function timerqueue_node_expires() rust: time: Add Ktime vdso: Fix powerpc build U64_MAX undeclared error clockevents: Convert s[n]printf() to sysfs_emit() clocksource: Convert s[n]printf() to sysfs_emit() clocksource: Make watchdog and suspend-timing multiplication overflow safe timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow timekeeping: Make delta calculation overflow safe timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety timekeeping: Fold in timekeeping_delta_to_ns() timekeeping: Consolidate timekeeping helpers timekeeping: Refactor timekeeping helpers ...
4 daysMerge tag 'sched-core-2024-05-13' of ↵Gravatar Linus Torvalds 1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Add cpufreq pressure feedback for the scheduler - Rework misfit load-balancing wrt affinity restrictions - Clean up and simplify the code around ::overutilized and ::overload access. - Simplify sched_balance_newidle() - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output. - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch() - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix - Simplify the balancing flag code (sched_balance_running) - Miscellaneous cleanups & fixes * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/pelt: Remove shift of thermal clock sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() thermal/cpufreq: Remove arch_update_thermal_pressure() sched/cpufreq: Take cpufreq feedback into account cpufreq: Add a cpufreq pressure feedback for the scheduler sched/fair: Fix update of rd->sg_overutilized sched/vtime: Do not include <asm/vtime.h> header s390/irq,nmi: Include <asm/vtime.h> header directly s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover sched/vtime: Get rid of generic vtime_task_switch() implementation sched/vtime: Remove confusing arch_vtime_task_switch() declaration sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized() sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded() sched/fair: Rename root_domain::overload to ::overloaded sched/fair: Use helper functions to access root_domain::overload sched/fair: Check root_domain::overload value before update sched/fair: Combine EAS check with root_domain::overutilized access sched/fair: Simplify the continue_balancing logic in sched_balance_newidle() ...
4 daysMerge tag 'hardening-6.10-rc1' of ↵Gravatar Linus Torvalds 8-393/+644
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "The bulk of the changes here are related to refactoring and expanding the KUnit tests for string helper and fortify behavior. Some trivial strncpy replacements in fs/ were carried in my tree. Also some fixes to SCSI string handling were carried in my tree since the helper for those was introduce here. Beyond that, just little fixes all around: objtool getting confused about LKDTM+KCFI, preparing for future refactors (constification of sysctl tables, additional __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding more options in the hardening.config Kconfig fragment. Summary: - selftests: Add str*cmp tests (Ivan Orlov) - __counted_by: provide UAPI for _le/_be variants (Erick Archer) - Various strncpy deprecation refactors (Justin Stitt) - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas Weißschuh) - UBSAN: Work around i386 -regparm=3 bug with Clang prior to version 19 - Provide helper to deal with non-NUL-terminated string copying - SCSI: Fix older string copying bugs (with new helper) - selftests: Consolidate string helper behavioral tests - selftests: add memcpy() fortify tests - string: Add additional __realloc_size() annotations for "dup" helpers - LKDTM: Fix KCFI+rodata+objtool confusion - hardening.config: Enable KCFI" * tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits) uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be} stackleak: Use a copy of the ctl_table argument string: Add additional __realloc_size() annotations for "dup" helpers kunit/fortify: Fix replaced failure path to unbreak __alloc_size hardening: Enable KCFI and some other options lkdtm: Disable CFI checking for perms functions kunit/fortify: Add memcpy() tests kunit/fortify: Do not spam logs with fortify WARNs kunit/fortify: Rename tests to use recommended conventions init: replace deprecated strncpy with strscpy_pad kunit/fortify: Fix mismatched kvalloc()/vfree() usage scsi: qla2xxx: Avoid possible run-time warning with long model_num scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings fs: ecryptfs: replace deprecated strncpy with strscpy hfsplus: refactor copy_name to not use strncpy reiserfs: replace deprecated strncpy with scnprintf virt: acrn: replace deprecated strncpy with strscpy ubsan: Avoid i386 UBSAN handler crashes with Clang ubsan: Remove 1-element array usage in debug reporting ...
4 daysMerge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linuxGravatar Linus Torvalds 1-4/+4
Pull block updates from Jens Axboe: - Add a partscan attribute in sysfs, fixing an issue with systemd relying on an internal interface that went away. - Attempt #2 at making long running discards interruptible. The previous attempt went into 6.9, but we ended up mostly reverting it as it had issues. - Remove old ida_simple API in bcache - Support for zoned write plugging, greatly improving the performance on zoned devices. - Remove the old throttle low interface, which has been experimental since 2017 and never made it beyond that and isn't being used. - Remove page->index debugging checks in brd, as it hasn't caught anything and prepares us for removing in struct page. - MD pull request from Song - Don't schedule block workers on isolated CPUs * tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits) blk-throttle: delay initialization until configuration blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW block: fix that util can be greater than 100% block: support to account io_ticks precisely block: add plug while submitting IO bcache: fix variable length array abuse in btree_iter bcache: Remove usage of the deprecated ida_simple_xx() API md: Revert "md: Fix overflow in is_mddev_idle" blk-lib: check for kill signal in ioctl BLKDISCARD block: add a bio_await_chain helper block: add a blk_alloc_discard_bio helper block: add a bio_chain_and_submit helper block: move discard checks into the ioctl handler block: remove the discard_granularity check in __blkdev_issue_discard block/ioctl: prefer different overflow check null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION() block: fix and simplify blkdevparts= cmdline parsing block: refine the EOF check in blkdev_iomap_begin block: add a partscan sysfs attribute for disks block: add a disk_has_partscan helper ...
5 daysMerge tag 'tpmdd-next-6.10-rc1' of ↵Gravatar Linus Torvalds 3-0/+265
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM updates from Jarkko Sakkinen: "These are the changes for the TPM driver with a single major new feature: TPM bus encryption and integrity protection. The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. Other than that, a few minor fixes and documentation for tpm_tis to clarify basics of TPM localities for future patch review discussions (will be extended and refined over times, just a seed)" Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1] * tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits) Documentation: tpm: Add TPM security docs toctree entry tpm: disable the TPM if NULL name changes Documentation: add tpm-security.rst tpm: add the null key name as a sysfs export KEYS: trusted: Add session encryption protection to the seal/unseal path tpm: add session encryption protection to tpm2_get_random() tpm: add hmac checks to tpm2_pcr_extend() tpm: Add the rest of the session HMAC API tpm: Add HMAC session name/handle append tpm: Add HMAC session start and end functions tpm: Add TCG mandated Key Derivation Functions (KDFs) tpm: Add NULL primary creation tpm: export the context save and load commands tpm: add buffer function to point to returned parameters crypto: lib - implement library version of AES in CFB mode KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers tpm: Add tpm_buf_read_{u8,u16,u32} tpm: TPM2B formatted buffers tpm: Store the length of the tpm_buf data separately. tpm: Update struct tpm_buf documentation comments ...
5 daysMerge tag 'slab-for-6.10' of ↵Gravatar Linus Torvalds 1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: "This time it's mostly random cleanups and fixes, with two performance fixes that might have significant impact, but limited to systems experiencing particular bad corner case scenarios rather than general performance improvements. The memcg hook changes are going through the mm tree due to dependencies. - Prevent stalls when reading /proc/slabinfo (Jianfeng Wang) This fixes the long-standing problem that can happen with workloads that have alloc/free patterns resulting in many partially used slabs (in e.g. dentry cache). Reading /proc/slabinfo will traverse the long partial slab list under spinlock with disabled irqs and thus can stall other processes or even trigger the lockup detection. The traversal is only done to count free objects so that <active_objs> column can be reported along with <num_objs>. To avoid affecting fast paths with another shared counter (attempted in the past) or complex partial list traversal schemes that allow rescheduling, the chosen solution resorts to approximation - when the partial list is over 10000 slabs long, we will only traverse first 5000 slabs from head and tail each and use the average of those to estimate the whole list. Both head and tail are used as the slabs near head to tend to have more free objects than the slabs towards the tail. It is expected the approximation should not break existing /proc/slabinfo consumers. The <num_objs> field is still accurate and reflects the overall kmem_cache footprint. The <active_objs> was already imprecise due to cpu and percpu-partial slabs, so can't be relied upon to determine exact cache usage. The difference between <active_objs> and <num_objs> is mainly useful to determine the slab fragmentation, and that will be possible even with the approximation in place. - Prevent allocating many slabs when a NUMA node is full (Chen Jun) Currently, on NUMA systems with a node under significantly bigger pressure than other nodes, the fallback strategy may result in each kmalloc_node() that can't be safisfied from the preferred node, to allocate a new slab on a fallback node, and not reuse the slabs already on that node's partial list. This is now fixed and partial lists of fallback nodes are checked even for kmalloc_node() allocations. It's still preferred to allocate a new slab on the requested node before a fallback, but only with a GFP_NOWAIT attempt, which will fail quickly when the node is under a significant memory pressure. - More SLAB removal related cleanups (Xiu Jianfeng, Hyunmin Lee) - Fix slub_kunit self-test with hardened freelists (Guenter Roeck) - Mark racy accesses for KCSAN (linke li) - Misc cleanups (Xiongwei Song, Haifeng Xu, Sangyun Kim)" * tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slub: remove the check for NULL kmalloc_caches mm/slub: create kmalloc 96 and 192 caches regardless cache size order mm/slub: mark racy access on slab->freelist slub: use count_partial_free_approx() in slab_out_of_memory() slub: introduce count_partial_free_approx() slub: Set __GFP_COMP in kmem_cache by default mm/slub: remove duplicate initialization for early_kmem_cache_node_alloc() mm/slub: correct comment in do_slab_free() mm/slub, kunit: Use inverted data to corrupt kmem cache mm/slub: simplify get_partial_node() mm/slub: add slub_get_cpu_partial() helper mm/slub: remove the check of !kmem_cache_has_cpu_partial() mm/slub: Reduce memory consumption in extreme scenarios mm/slub: mark racy accesses on slab->slabs mm/slub: remove dummy slabinfo functions
5 daysMerge tag 'cmpxchg.2024.05.11a' of ↵Gravatar Linus Torvalds 2-0/+46
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull cmpxchg updates from Paul McKenney: "Provide one-byte and two-byte cmpxchg() support on sparc32, parisc, and csky This provides native one-byte and two-byte cmpxchg() support for sparc32 and parisc, courtesy of Al Viro. This support is provided by the same hashed-array-of-locks technique used for the other atomic operations provided for these two platforms. There is also emulated one-byte cmpxchg() support for csky using a new cmpxchg_emu_u8() function that uses a four-byte cmpxchg() to emulate the one-byte variant. Similar patches for emulation of one-byte cmpxchg() for arc, sh, and xtensa have not yet received maintainer acks, so they are slated for the v6.11 merge window" * tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: csky: Emulate one-byte cmpxchg lib: Add one-byte emulation function parisc: add u16 support to cmpxchg() parisc: add missing export of __cmpxchg_u8() parisc: unify implementations of __cmpxchg_u{8,32,64} parisc: __cmpxchg_u32(): lift conversion into the callers sparc32: add __cmpxchg_u{8,16}() and teach __cmpxchg() to handle those sizes sparc32: unify __cmpxchg_u{32,64} sparc32: make the first argument of __cmpxchg_u64() volatile u64 * sparc32: make __cmpxchg_u32() return u32
7 daysMerge tag 'mm-hotfixes-stable-2024-05-10-13-14' of ↵Gravatar Linus Torvalds 3-17/+49
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "18 hotfixes, 7 of which are cc:stable. More fixups for this cycle's page_owner updates. And a few userfaultfd fixes. Otherwise, random singletons - see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Barry Song selftests/mm: fix powerpc ARCH check mailmap: add entry for John Garry XArray: set the marks correctly when splitting an entry selftests/vDSO: fix runtime errors on LoongArch selftests/vDSO: fix building errors on LoongArch mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry() fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan mm/vmalloc: fix return value of vb_alloc if size is 0 mm: use memalloc_nofs_save() in page_cache_ra_order() kmsan: compiler_types: declare __no_sanitize_or_inline lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add() tools: fix userspace compilation with new test_xarray changes MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER mm: page_owner: fix wrong information in dump_page_owner maple_tree: fix mas_empty_area_rev() null pointer dereference mm/userfaultfd: reset ptes when close() for wr-protected ones
8 dayscrypto: lib - implement library version of AES in CFB modeGravatar Ard Biesheuvel 3-0/+265
Implement AES in CFB mode using the existing, mostly constant-time generic AES library implementation. This will be used by the TPM code to encrypt communications with TPM hardware, which is often a discrete component connected using sniffable wires or traces. While a CFB template does exist, using a skcipher is a major pain for non-performance critical synchronous crypto where the algorithm is known at compile time and the data is in contiguous buffers with valid kernel virtual addresses. Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lore.kernel.org/all/20230216201410.15010-1-James.Bottomley@HansenPartnership.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
9 daysMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar Jakub Kicinski 1-1/+5
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 dayslib: Allow for the DIM library to be modularGravatar Florian Fainelli 3-3/+6
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a module. This is particularly useful in an Android GKI (Google Kernel Image) configuration where everything is built as a module, including Ethernet controller drivers. Having to build DIMLIB into the kernel image with potentially no user is wasteful. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 dayskunit: bail out early in __kunit_test_suites_init() if there are no suites ↵Gravatar Scott Mayhew 1-0/+3
to test Commit c72a870926c2 added a mutex to prevent kunit tests from running concurrently. Unfortunately that mutex gets locked during module load regardless of whether the module actually has any kunit tests. This causes a problem for kunit tests that might need to load other kernel modules (e.g. gss_krb5_test loading the camellia module). So check to see if there are actually any tests to run before locking the kunit_run_lock mutex. Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs") Reported-by: Nico Pache <npache@redhat.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPERGravatar Ivan Orlov 1-10/+2
Use KUNIT_DEFINE_ACTION_WRAPPER macro to define the 'kfree' and 'string_stream_destroy' wrappers for kunit_add_action. Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: Rae Moar <rmoar@google.com> Acked-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig optionGravatar David Gow 2-4/+15
The NULL dereference tests in kunit_fault deliberately trigger a kernel BUG(), and therefore print the associated stack trace, even when the test passes. This is both annoying (as it bloats the test output), and can confuse some test harnesses, which assume any BUG() is a failure. Allow these tests to be specifically disabled (without disabling all of KUnit's other tests), by placing them behind the CONFIG_KUNIT_FAULT_TEST Kconfig option. This is enabled by default, but can be set to 'n' to disable the test. An empty 'kunit_fault' suite is left behind, which will automatically be marked 'skipped'. As the fault tests already were disabled under UML (as they weren't compatible with its fault handling), we can simply adapt those conditions, and add a dependency on !UML for our new option. Suggested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/928249cc-e027-4f7f-b43f-502f99a1ea63@roeck-us.net/ Fixes: 82b0beff3497 ("kunit: Add tests for fault") Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Mickaël Salaün <mic@digikod.net> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: unregister the device on errorGravatar Wander Lairson Costa 1-1/+1
kunit_init_device() should unregister the device on bus register error, but mistakenly it tries to unregister the bus. Unregister the device instead of the bus. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Fixes: d03c720e03bd ("kunit: Add APIs for managing devices") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Fix race condition in try-catch completionGravatar David Gow 1-3/+7
KUnit's try-catch infrastructure now uses vfork_done, which is always set to a valid completion when a kthread is created, but which is set to NULL once the thread terminates. This creates a race condition, where the kthread exits before we can wait on it. Keep a copy of vfork_done, which is taken before we wake_up_process() and so valid, and wait on that instead. Fixes: 93533996100c ("kunit: Handle test faults") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/lkml/20240410102710.35911-1-naresh.kamboju@linaro.org/ Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Acked-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Add tests for faultGravatar Mickaël Salaün 1-1/+44
Add a test case to check NULL pointer dereference and make sure it would result as a failed test. The full kunit_fault test suite is marked as skipped when run on UML because it would result to a kernel panic. Tested with: ./tools/testing/kunit/kunit.py run --arch x86_64 kunit_fault ./tools/testing/kunit/kunit.py run --arch arm64 \ --cross_compile=aarch64-linux-gnu- kunit_fault Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-8-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Print last test location on faultGravatar Mickaël Salaün 1-3/+7
This helps identify the location of test faults with opportunistic calls to _KUNIT_SAVE_LOC(). This can be useful while writing tests or debugging them. It is possible to call KUNIT_SUCCESS() to explicit save last location. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-7-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Fix KUNIT_SUCCESS() calls in iov_iter testsGravatar Mickaël Salaün 1-9/+9
Fix KUNIT_SUCCESS() calls to pass a test argument. This is a no-op for now because this macro does nothing, but it will be required for the next commit. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-6-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Handle test faultsGravatar Mickaël Salaün 1-7/+12
Previously, when a kernel test thread crashed (e.g. NULL pointer dereference, general protection fault), the KUnit test hanged for 30 seconds and exited with a timeout error. Fix this issue by waiting on task_struct->vfork_done instead of the custom kunit_try_catch.try_completion, and track the execution state by initially setting try_result with -EINTR and only setting it to 0 if the test passed. Fix kunit_generic_run_threadfn_adapter() signature by returning 0 instead of calling kthread_complete_and_exit(). Because thread's exit code is never checked, always set it to 0 to make it clear. To make this explicit, export kthread_exit() for KUnit tests built as module. Fix the -EINTR error message, which couldn't be reached until now. This is tested with a following patch. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Tested-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-5-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Fix timeout messageGravatar Mickaël Salaün 1-1/+2
The exit code is always checked, so let's properly handle the -ETIMEDOUT error code. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-4-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Fix kthread referenceGravatar Mickaël Salaün 1-3/+6
There is a race condition when a kthread finishes after the deadline and before the call to kthread_stop(), which may lead to use after free. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Fixes: adf505457032 ("kunit: fix UAF when run kfence test case test_gfpzero") Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-3-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
11 dayskunit: Handle thread creation errorGravatar Mickaël Salaün 1-0/+1
Previously, if a thread creation failed (e.g. -ENOMEM), the function was called (kunit_catch_run_case or kunit_catch_run_case_cleanup) without marking the test as failed. Instead, fill try_result with the error code returned by kthread_run(), which will mark the test as failed and print "internal error occurred...". Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-2-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
12 daysXArray: set the marks correctly when splitting an entryGravatar Matthew Wilcox (Oracle) 2-5/+32
If we created a new node to replace an entry which had search marks set, we were setting the search mark on every entry in that node. That works fine when we're splitting to order 0, but when splitting to a larger order, we must not set the search marks on the sibling entries. Link: https://lkml.kernel.org/r/20240501153120.4094530-1-willy@infradead.org Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/ZjFGCOYk3FK_zVy3@bombadil.infradead.org Tested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 dayslib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()Gravatar Luis Chamberlain 1-4/+9
While testing lib/test_xarray in userspace I've noticed we can fail with: make -C tools/testing/radix-tree ./tools/testing/radix-tree/xarray BUG at check_xa_multi_store_adv_add:749 xarray: 0x55905fb21a00x head 0x55905fa1d8e0x flags 0 marks 0 0 0 0: 0x55905fa1d8e0x xarray: ../../../lib/test_xarray.c:749: check_xa_multi_store_adv_add: Assertion `0' failed. Aborted We get a failure with a BUG_ON(), and that is because we actually can fail due to -ENOMEM, the check in xas_nomem() will fix this for us so it makes no sense to expect no failure inside the loop. So modify the check and since this is also useful for instructional purposes clarify the situation. The check for XA_BUG_ON(xa, xa_load(xa, index) != p) is already done at the end of the loop so just remove the bogus on inside the loop. With this we now pass the test in both kernel and userspace: In userspace: ./tools/testing/radix-tree/xarray XArray: 149092856 of 149092856 tests passed In kernel space: XArray: 148257077 of 148257077 tests passed Link: https://lkml.kernel.org/r/20240423192221.301095-3-mcgrof@kernel.org Fixes: a60cc288a1a2 ("test_xarray: add tests for advanced multi-index use") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 daysmaple_tree: fix mas_empty_area_rev() null pointer dereferenceGravatar Liam R. Howlett 1-8/+8
Currently the code calls mas_start() followed by mas_data_end() if the maple state is MA_START, but mas_start() may return with the maple state node == NULL. This will lead to a null pointer dereference when checking information in the NULL node, which is done in mas_data_end(). Avoid setting the offset if there is no node by waiting until after the maple state is checked for an empty or single entry state. A user could trigger the events to cause a kernel oops by unmapping all vmas to produce an empty maple tree, then mapping a vma that would cause the scenario described above. Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Marius Fleischer <fleischermarius@gmail.com> Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/ Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/ Tested-by: Marius Fleischer <fleischermarius@gmail.com> Tested-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysMerge tag 'char-misc-6.9-rc7' of ↵Gravatar Linus Torvalds 1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc/other driver fixes and new device ids for 6.9-rc7 that resolve some reported problems. Included in here are: - iio driver fixes - mei driver fix and new device ids - dyndbg bugfix - pvpanic-pci driver bugfix - slimbus driver bugfix - fpga new device id All have been in linux-next with no reported problems" * tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: slimbus: qcom-ngd-ctrl: Add timeout for wait operation dyndbg: fix old BUG_ON in >control parser misc/pvpanic-pci: register attributes via pci_driver fpga: dfl-pci: add PCI subdevice ID for Intel D5005 card mei: me: add lunar lake point M DID mei: pxp: match against PCI_CLASS_DISPLAY_OTHER iio:imu: adis16475: Fix sync mode setting iio: accel: mxc4005: Reset chip on probe() and resume() iio: accel: mxc4005: Interrupt handling fixes dt-bindings: iio: health: maxim,max30102: fix compatible check iio: pressure: Fixes SPI support for BMP3xx devices iio: pressure: Fixes BME280 SPI driver data
2024-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar Jakub Kicinski 3-5/+6
Cross-merge networking fixes after downstream PR. Conflicts: include/linux/filter.h kernel/bpf/core.c 66e13b615a0c ("bpf: verifier: prevent userspace memory access") d503a04f8bc0 ("bpf: Add support for certain atomics in bpf_arena to x86 JIT") https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02Merge tag 'net-6.9-rc7' of ↵Gravatar Linus Torvalds 2-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf. Relatively calm week, likely due to public holiday in most places. No known outstanding regressions. Current release - regressions: - rxrpc: fix wrong alignmask in __page_frag_alloc_align() - eth: e1000e: change usleep_range to udelay in PHY mdic access Previous releases - regressions: - gro: fix udp bad offset in socket lookup - bpf: fix incorrect runtime stat for arm64 - tipc: fix UAF in error path - netfs: fix a potential infinite loop in extract_user_to_sg() - eth: ice: ensure the copied buf is NUL terminated - eth: qeth: fix kernel panic after setting hsuid Previous releases - always broken: - bpf: - verifier: prevent userspace memory access - xdp: use flags field to disambiguate broadcast redirect - bridge: fix multicast-to-unicast with fraglist GSO - mptcp: ensure snd_nxt is properly initialized on connect - nsh: fix outer header access in nsh_gso_segment(). - eth: bcmgenet: fix racing registers access - eth: vxlan: fix stats counters. Misc: - a bunch of MAINTAINERS file updates" * tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) MAINTAINERS: mark MYRICOM MYRI-10G as Orphan MAINTAINERS: remove Ariel Elior net: gro: add flush check in udp_gro_receive_segment net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb ipv4: Fix uninit-value access in __ip_make_skb() s390/qeth: Fix kernel panic after setting hsuid vxlan: Pull inner IP header in vxlan_rcv(). tipc: fix a possible memleak in tipc_buf_append tipc: fix UAF in error path rxrpc: Clients must accept conn from any address net: core: reject skb_copy(_expand) for fraglist GSO skbs net: bridge: fix multicast-to-unicast with fraglist GSO mptcp: ensure snd_nxt is properly initialized on connect e1000e: change usleep_range to udelay in PHY mdic access net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341 cxgb4: Properly lock TX queue for the selftest. rxrpc: Fix using alignmask being zero for __page_frag_alloc_align() vxlan: Add missing VNI filter counter update in arp_reduce(). vxlan: Fix racy device stats updates. net: qede: use return from qede_parse_actions() ...
2024-05-02string: Add additional __realloc_size() annotations for "dup" helpersGravatar Kees Cook 1-0/+26
Several other "dup"-style interfaces could use the __realloc_size() attribute. (As a reminder to myself and others: "realloc" is used here instead of "alloc" because the "alloc_size" attribute implies that the memory contents are uninitialized. Since we're copying contents into the resulting allocation, it must use "realloc_size" to avoid confusing the compiler's optimization passes.) Add KUnit test coverage where possible. (KUnit still does not have the ability to manipulate userspace memory.) Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240502145218.it.729-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-01kunit/fortify: Fix replaced failure path to unbreak __alloc_sizeGravatar Kees Cook 1-3/+3
The __alloc_size annotation for kmemdup() was getting disabled under KUnit testing because the replaced fortify_panic macro implementation was using "return NULL" as a way to survive the sanity checking. But having the chance to return NULL invalidated __alloc_size, so kmemdup was not passing the __builtin_dynamic_object_size() tests any more: [23:26:18] [PASSED] fortify_test_alloc_size_kmalloc_const [23:26:19] # fortify_test_alloc_size_kmalloc_dynamic: EXPECTATION FAILED at lib/fortify_kunit.c:265 [23:26:19] Expected __builtin_dynamic_object_size(p, 1) == expected, but [23:26:19] __builtin_dynamic_object_size(p, 1) == -1 (0xffffffffffffffff) [23:26:19] expected == 11 (0xb) [23:26:19] __alloc_size() not working with __bdos on kmemdup("hello there", len, gfp) [23:26:19] [FAILED] fortify_test_alloc_size_kmalloc_dynamic Normal builds were not affected: __alloc_size continued to work there. Use a zero-sized allocation instead, which allows __alloc_size to behave. Fixes: 4ce615e798a7 ("fortify: Provide KUnit counters for failure testing") Fixes: fa4a3f86d498 ("fortify: Add KUnit tests for runtime overflows") Link: https://lore.kernel.org/r/20240501232937.work.532-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-01objpool: cache nr_possible_cpus() and avoid caching nr_cpu_idsGravatar Andrii Nakryiko 1-6/+6
Profiling shows that calling nr_possible_cpus() in objpool_pop() takes a noticeable amount of CPU (when profiled on 80-core machine), as we need to recalculate number of set bits in a CPU bit mask. This number can't change, so there is no point in paying the price for recalculating it. As such, cache this value in struct objpool_head and use it in objpool_pop(). On the other hand, cached pool->nr_cpus isn't necessary, as it's not used in hot path and is also a pretty trivial value to retrieve. So drop pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size of struct objpool_head remains the same, which is a nice bonus. Same BPF selftests benchmarks were used to evaluate the effect. Using changes in previous patch (inlining of objpool_pop/objpool_push) as baseline, here are the differences: BASELINE ======== kretprobe : 9.937 ± 0.174M/s kretprobe-multi: 10.440 ± 0.108M/s AFTER ===== kretprobe : 10.106 ± 0.120M/s (+1.7%) kretprobe-multi: 10.515 ± 0.180M/s (+0.7%) Link: https://lore.kernel.org/all/20240424215214.3956041-3-andrii@kernel.org/ Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01objpool: enable inlining objpool_push() and objpool_pop() operationsGravatar Andrii Nakryiko 1-100/+0
objpool_push() and objpool_pop() are very performance-critical functions and can be called very frequently in kretprobe triggering path. As such, it makes sense to allow compiler to inline them completely to eliminate function calls overhead. Luckily, their logic is quite well isolated and doesn't have any sprawling dependencies. This patch moves both objpool_push() and objpool_pop() into include/linux/objpool.h and marks them as static inline functions, enabling inlining. To avoid anyone using internal helpers (objpool_try_get_slot, objpool_try_add_slot), rename them to use leading underscores. We used kretprobe microbenchmark from BPF selftests (bench trig-kprobe and trig-kprobe-multi benchmarks) running no-op BPF kretprobe/kretprobe.multi programs in a tight loop to evaluate the effect. BPF own overhead in this case is minimal and it mostly stresses the rest of in-kernel kretprobe infrastructure overhead. Results are in millions of calls per second. This is not super scientific, but shows the trend nevertheless. BEFORE ====== kretprobe : 9.794 ± 0.086M/s kretprobe-multi: 10.219 ± 0.032M/s AFTER ===== kretprobe : 9.937 ± 0.174M/s (+1.5%) kretprobe-multi: 10.440 ± 0.108M/s (+2.2%) Link: https://lore.kernel.org/all/20240424215214.3956041-2-andrii@kernel.org/ Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-30kunit/fortify: Add memcpy() testsGravatar Kees Cook 1-3/+82
Add fortify tests for memcpy() and memmove(). This can use a similar method to the fortify_panic() replacement, only we can do it for what was the WARN_ONCE(), which can be redefined. Since this is primarily testing the fortify behaviors of the memcpy() and memmove() defenses, the tests for memcpy() and memmove() are identical. Link: https://lore.kernel.org/r/20240429194342.2421639-3-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30kunit/fortify: Do not spam logs with fortify WARNsGravatar Kees Cook 1-1/+8
When running KUnit fortify tests, we're already doing precise tracking of which warnings are getting hit. Don't fill the logs with WARNs unless we've been explicitly built with DEBUG enabled. Link: https://lore.kernel.org/r/20240429194342.2421639-2-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30kunit/fortify: Rename tests to use recommended conventionsGravatar Kees Cook 1-40/+40
The recommended conventions for KUnit tests is ${module}_test_${what}. Adjust the fortify tests to match. Link: https://lore.kernel.org/r/20240429194342.2421639-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30dyndbg: fix old BUG_ON in >control parserGravatar Jim Cromie 1-1/+5
Fix a BUG_ON from 2009. Even if it looks "unreachable" (I didn't really look), lets make sure by removing it, doing pr_err and return -EINVAL instead. Cc: stable <stable@kernel.org> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20240429193145.66543-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-29Merge tag 'for-netdev' of ↵Gravatar Jakub Kicinski 1-1/+1
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-04-29 We've added 147 non-merge commits during the last 32 day(s) which contain a total of 158 files changed, 9400 insertions(+), 2213 deletions(-). The main changes are: 1) Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86 BPF JIT. This allows inlining per-CPU array and hashmap lookups and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko. 2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song. 3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction, from Alexei Starovoitov. 4) Add support for passing mark with bpf_fib_lookup helper, from Anton Protopopov. 5) Add a new bpf_wq API for deferring events and refactor sleepable bpf_timer code to keep common code where possible, from Benjamin Tissoires. 6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs to check when NULL is passed for non-NULLable parameters, from Eduard Zingerman. 7) Harden the BPF verifier's and/or/xor value tracking, from Harishankar Vishwanathan. 8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel crypto subsystem, from Vadim Fedorenko. 9) Various improvements to the BPF instruction set standardization doc, from Dave Thaler. 10) Extend libbpf APIs to partially consume items from the BPF ringbuffer, from Andrea Righi. 11) Bigger batch of BPF selftests refactoring to use common network helpers and to drop duplicate code, from Geliang Tang. 12) Support bpf_tail_call_static() helper for BPF programs with GCC 13, from Jose E. Marchesi. 13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled, from Kumar Kartikeya Dwivedi. 14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs, from David Vernet. 15) Extend the BPF verifier to allow different input maps for a given bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu. 16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan. 17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison the stack memory, from Martin KaFai Lau. 18) Improve xsk selftest coverage with new tests on maximum and minimum hardware ring size configurations, from Tushar Vyavahare. 19) Various ReST man pages fixes as well as documentation and bash completion improvements for bpftool, from Rameez Rehman & Quentin Monnet. 20) Fix libbpf with regards to dumping subsequent char arrays, from Quentin Deslandes. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits) bpf, docs: Clarify PC use in instruction-set.rst bpf_helpers.h: Define bpf_tail_call_static when building with GCC bpf, docs: Add introduction for use in the ISA Internet Draft selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args selftests/bpf: dummy_st_ops should reject 0 for non-nullable params bpf: check bpf_dummy_struct_ops program params for test runs selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops selftests/bpf: adjust dummy_st_ops_success to detect additional error bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable selftests/bpf: Add ring_buffer__consume_n test. bpf: Add bpf_guard_preempt() convenience macro selftests: bpf: crypto: add benchmark for crypto functions selftests: bpf: crypto skcipher algo selftests bpf: crypto: add skcipher to bpf crypto bpf: make common crypto API for TC/XDP programs bpf: update the comment for BTF_FIELDS_MAX selftests/bpf: Fix wq test. selftests/bpf: Use make_sockaddr in test_sock_addr selftests/bpf: Use connect_to_addr in test_sock_addr ... ==================== Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26Merge tag 'for-netdev' of ↵Gravatar Jakub Kicinski 1-2/+3
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-04-26 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 14 files changed, 168 insertions(+), 72 deletions(-). The main changes are: 1) Fix BPF_PROBE_MEM in verifier and JIT to skip loads from vsyscall page, from Puranjay Mohan. 2) Fix a crash in XDP with devmap broadcast redirect when the latter map is in process of being torn down, from Toke Høiland-Jørgensen. 3) Fix arm64 and riscv64 BPF JITs to properly clear start time for BPF program runtime stats, from Xu Kuohai. 4) Fix a sockmap KCSAN-reported data race in sk_psock_skb_ingress_enqueue, from Jason Xing. 5) Fix BPF verifier error message in resolve_pseudo_ldimm64, from Anton Protopopov. 6) Fix missing DEBUG_INFO_BTF_MODULES Kconfig menu item, from Andrii Nakryiko. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test PROBE_MEM of VSYSCALL_ADDR on x86-64 bpf, x86: Fix PROBE_MEM runtime load check bpf: verifier: prevent userspace memory access xdp: use flags field to disambiguate broadcast redirect arm32, bpf: Reimplement sign-extension mov instruction riscv, bpf: Fix incorrect runtime stats bpf, arm64: Fix incorrect runtime stats bpf: Fix a verifier verbose message bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers MAINTAINERS: Update email address for Puranjay Mohan bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition ==================== Link: https://lore.kernel.org/r/20240426224248.26197-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26kunit/fortify: Fix mismatched kvalloc()/vfree() usageGravatar Kees Cook 1-8/+8
The kv*() family of tests were accidentally freeing with vfree() instead of kvfree(). Use kvfree() instead. Fixes: 9124a2640148 ("kunit/fortify: Validate __alloc_size attribute results") Link: https://lore.kernel.org/r/20240425230619.work.299-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-26Merge tag 'mm-hotfixes-stable-2024-04-26-13-30' of ↵Gravatar Linus Torvalds 1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address post-6.8 issues or aren't considered suitable for backporting. All except one of these are for MM. I see no particular theme - it's singletons all over" * tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() selftests: mm: protection_keys: save/restore nr_hugepages value from launch script stackdepot: respect __GFP_NOLOCKDEP allocation flag hugetlb: check for anon_vma prior to folio allocation mm: zswap: fix shrinker NULL crash with cgroup_disable=memory mm: turn folio_test_hugetlb into a PageType mm: support page_mapcount() on page_has_type() pages mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros mm/hugetlb: fix missing hugetlb_lock for resv uncharge selftests: mm: fix unused and uninitialized variable warning selftests/harness: remove use of LINE_MAX
2024-04-26Fix a potential infinite loop in extract_user_to_sg()Gravatar David Howells 1-1/+1
Fix extract_user_to_sg() so that it will break out of the loop if iov_iter_extract_pages() returns 0 rather than looping around forever. [Note that I've included two fixes lines as the function got moved to a different file and renamed] Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator") Fixes: f5f82cd18732 ("Move netfs_extract_iter_to_sg() to lib/scatterlist.c") Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: netfs@lists.linux.dev Link: https://lore.kernel.org/r/1967121.1714034372@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26sbitmap: use READ_ONCE to access map->wordGravatar linke li 1-4/+4
In __sbitmap_queue_get_batch(), map->word is read several times, and update atomically using atomic_long_try_cmpxchg(). But the first two read of map->word is not protected. This patch moves the statement val = READ_ONCE(map->word) forward, eliminating unprotected accesses to map->word within the function. It is aimed at reducing the number of benign races reported by KCSAN in order to focus future debugging effort on harmful races. Signed-off-by: linke li <lilinke99@qq.com> Link: https://lore.kernel.org/r/tencent_0B517C25E519D3D002194E8445E86C04AD0A@qq.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar Jakub Kicinski 2-11/+29
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24stackdepot: respect __GFP_NOLOCKDEP allocation flagGravatar Andrey Ryabinin 1-2/+2
If stack_depot_save_flags() allocates memory it always drops __GFP_NOLOCKDEP flag. So when KASAN tries to track __GFP_NOLOCKDEP allocation we may end up with lockdep splat like bellow: ====================================================== WARNING: possible circular locking dependency detected 6.9.0-rc3+ #49 Not tainted ------------------------------------------------------ kswapd0/149 is trying to acquire lock: ffff88811346a920 (&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590 [xfs] but task is already holding lock: ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x5d9/0xad0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (fs_reclaim){+.+.}-{0:0}: __lock_acquire+0x7da/0x1030 lock_acquire+0x15d/0x400 fs_reclaim_acquire+0xb5/0x100 prepare_alloc_pages.constprop.0+0xc5/0x230 __alloc_pages+0x12a/0x3f0 alloc_pages_mpol+0x175/0x340 stack_depot_save_flags+0x4c5/0x510 kasan_save_stack+0x30/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x83/0x90 kmem_cache_alloc+0x15e/0x4a0 __alloc_object+0x35/0x370 __create_object+0x22/0x90 __kmalloc_node_track_caller+0x477/0x5b0 krealloc+0x5f/0x110 xfs_iext_insert_raw+0x4b2/0x6e0 [xfs] xfs_iext_insert+0x2e/0x130 [xfs] xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs] xfs_btree_visit_block+0xfb/0x290 [xfs] xfs_btree_visit_blocks+0x215/0x2c0 [xfs] xfs_iread_extents+0x1a2/0x2e0 [xfs] xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs] iomap_iter+0x1d1/0x2d0 iomap_file_buffered_write+0x120/0x1a0 xfs_file_buffered_write+0x128/0x4b0 [xfs] vfs_write+0x675/0x890 ksys_write+0xc3/0x160 do_syscall_64+0x94/0x170 entry_SYSCALL_64_after_hwframe+0x71/0x79 Always preserve __GFP_NOLOCKDEP to fix this. Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com Fixes: cd11016e5f52 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Reported-by: Xiubo Li <xiubli@redhat.com> Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/ Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource.wdc.com/ Suggested-by: Dave Chinner <david@fromorbit.com> Tested-by: Xiubo Li <xiubli@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-24ubsan: Avoid i386 UBSAN handler crashes with ClangGravatar Kees Cook 1-14/+27
When generating Runtime Calls, Clang doesn't respect the -mregparm=3 option used on i386. Hopefully this will be fixed correctly in Clang 19: https://github.com/llvm/llvm-project/pull/89707 but we need to fix this for earlier Clang versions today. Force the calling convention to use non-register arguments. Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://github.com/KSPP/linux/issues/350 Link: https://lore.kernel.org/r/20240424224026.it.216-kees@kernel.org Acked-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Kees Cook <keescook@chromium.org>