aboutsummaryrefslogtreecommitdiff
path: root/fs/bcachefs
AgeCommit message (Collapse)AuthorFilesLines
6 daysbcachefs: Fix rcu_read_lock() leak in drop_extra_replicasGravatar Kent Overstreet 1-2/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Add missing bch_inode_info.ei_flags initGravatar Kent Overstreet 1-0/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Add missing synchronize_srcu_expedited() call when shutting downGravatar Kent Overstreet 1-1/+3
We use the polling interface to srcu for tracking pending frees; when shutting down we don't need to wait for an srcu barrier to free them, but SRCU still gets confused if we shutdown with an outstanding grace period. Reported-by: syzbot+6a038377f0a594d7d44e@syzkaller.appspotmail.com Reported-by: syzbot+0ece6edfd05ed20e32d9@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()Gravatar Kent Overstreet 8-47/+135
Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Replace bucket_valid() asserts in bucket lookup with proper checksGravatar Kent Overstreet 4-2/+10
The bucket_gens array and gc_buckets array known their own size; we should be using those members, and returning an error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Fix snapshot_create_lock lock orderingGravatar Kent Overstreet 1-12/+5
====================================================== WARNING: possible circular locking dependency detected 6.10.0-rc2-ktest-00018-gebd1d148b278 #144 Not tainted ------------------------------------------------------ fio/1345 is trying to acquire lock: ffff88813e200ab8 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_truncate+0x76/0xf0 but task is already holding lock: ffff888105a1fa38 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: do_truncate+0x7b/0xc0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}: down_write+0x3d/0xd0 bch2_write_iter+0x1c0/0x10f0 vfs_write+0x24a/0x560 __x64_sys_pwrite64+0x77/0xb0 x64_sys_call+0x17e5/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #1 (sb_writers#10){.+.+}-{0:0}: mnt_want_write+0x4a/0x1d0 filename_create+0x69/0x1a0 user_path_create+0x38/0x50 bch2_fs_file_ioctl+0x315/0xbf0 __x64_sys_ioctl+0x297/0xaf0 x64_sys_call+0x10cb/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (&c->snapshot_create_lock){++++}-{3:3}: __lock_acquire+0x1445/0x25b0 lock_acquire+0xbd/0x2b0 down_read+0x40/0x180 bch2_truncate+0x76/0xf0 bchfs_truncate+0x240/0x3f0 bch2_setattr+0x7b/0xb0 notify_change+0x322/0x4b0 do_truncate+0x8b/0xc0 do_ftruncate+0x110/0x270 __x64_sys_ftruncate+0x43/0x80 x64_sys_call+0x1373/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 other info that might help us debug this: Chain exists of: &c->snapshot_create_lock --> sb_writers#10 --> &sb->s_type->i_mutex_key#13 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#13); lock(sb_writers#10); lock(&sb->s_type->i_mutex_key#13); rlock(&c->snapshot_create_lock); *** DEADLOCK *** Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Fix refcount leak in check_fix_ptrs()Gravatar Kent Overstreet 1-116/+133
fsck_err() does a goto fsck_err on error; factor out check_fix_ptr() so that our error label can drop our device ref. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Leave a buffer in the btree key cache to avoid lock thrashingGravatar Kent Overstreet 1-0/+8
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Fix reporting of freed objects from key cache shrinkerGravatar Kent Overstreet 1-8/+5
We count objects as freed when we move them to the srcu-pending lists because we're doing the equivalent of a kfree_srcu(); the only difference is managing the pending list ourself means we can allocate from the pending list. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: set sb->s_shrinker->seeks = 0Gravatar Kent Overstreet 1-0/+1
inodes and dentries are still present in the btree node cache, in much more compact form Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: increase key cache shrinker batch sizeGravatar Kent Overstreet 1-1/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Enable automatic shrinking for rhashtablesGravatar Kent Overstreet 4-14/+18
Since the key cache shrinker walks the rhashtable, a mostly empty rhashtable leads to really nasty reclaim performance issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: fix the display format for show-superGravatar Hongbo Li 1-3/+3
There are three keys displayed in non-uniform format. Let's fix them. [Before] ``` Label: testbcachefs Version: 1.9: (unknown version) Version upgrade complete: 0.0: (unknown version) ``` [After] ``` Label: testbcachefs Version: 1.9: (unknown version) Version upgrade complete: 0.0: (unknown version) ``` Fixes: 7423330e30ab ("bcachefs: prt_printf() now respects \r\n\t") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: fix stack frame size in fsck.cGravatar Kent Overstreet 1-0/+3
fsck.c always runs top of the stack so we're not too concerned here; noinline_for_stack is sufficient Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Delete incorrect BTREE_ID_NR assertionGravatar Kent Overstreet 1-6/+1
for forwards compat we now explicitly allow mounting and using filesystems with unknown btrees, and we have to walk them for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Fix incorrect error handling found_btree_node_is_readable()Gravatar Kent Overstreet 1-4/+5
error handling here is slightly odd, which is why we were accidently calling evict() on an error pointer Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 daysbcachefs: Split out btree_write_submit_wqGravatar Kent Overstreet 3-8/+13
Split the workqueues for btree read completions and btree write submissions; we don't want concurrency control on btree read completions, but we do want concurrency control on write submissions, else blocking in submit_bio() will cause a ton of kworkers to be allocated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 daysbcachefs: Fix trans->locked assertGravatar Kent Overstreet 1-0/+1
in bch2_move_data_btree, we might start with the trans unlocked from a previous loop iteration - we need a trans_begin() before iter_init(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 daysbcachefs: Rereplicate now moves data off of durability=0 devicesGravatar Kent Overstreet 1-1/+14
This fixes an issue where setting a device to durability=0 after it's been used makes it impossible to remove. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 daysbcachefs: Fix GFP_KERNEL allocation in break_cycle()Gravatar Kent Overstreet 1-0/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-31Merge tag 'bcachefs-2024-05-30' of https://evilpiepirate.org/git/bcachefsGravatar Linus Torvalds 27-686/+706
Pull bcachefs fixes from Kent Overstreet: "Assorted odds and ends... - two downgrade fixes - a couple snapshot deletion and repair fixes, thanks to noradtux for finding these and providing the image to debug them - a couple assert fixes - convert to folio helper, from Matthew - some improved error messages - bit of code reorganization (just moving things around); doing this while things are quiet so I'm not rebasing fixes past reorgs - don't return -EROFS on inconsistency error in recovery, this confuses util-linux and has it retry the mount - fix failure to return error on misaligned dio write; reported as an issue with coreutils shred" * tag 'bcachefs-2024-05-30' of https://evilpiepirate.org/git/bcachefs: (21 commits) bcachefs: Fix failure to return error on misaligned dio write bcachefs: Don't return -EROFS from mount on inconsistency error bcachefs: Fix uninitialized var warning bcachefs: Split out sb-errors_format.h bcachefs: Split out journal_seq_blacklist_format.h bcachefs: Split out replicas_format.h bcachefs: Split out disk_groups_format.h bcachefs: split out sb-downgrade_format.h bcachefs: split out sb-members_format.h bcachefs: Better fsck error message for key version bcachefs: btree_gc can now handle unknown btrees bcachefs: add missing MODULE_DESCRIPTION() bcachefs: Fix setting of downgrade recovery passes/errors bcachefs: Run check_key_has_snapshot in snapshot_delete_keys() bcachefs: Refactor delete_dead_snapshots() bcachefs: Fix locking assert bcachefs: Fix lookup_first_inode() when inode_generations are present bcachefs: Plumb bkey into __btree_err() bcachefs: Use copy_folio_from_iter_atomic() bcachefs: Fix sb-downgrade validation ...
2024-05-29bcachefs: Fix failure to return error on misaligned dio writeGravatar Kent Overstreet 1-1/+3
This was reported as an error when running coreutils shred. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Don't return -EROFS from mount on inconsistency errorGravatar Kent Overstreet 1-2/+10
We were accidentally returning -EROFS during recovery on filesystem inconsistency - since this is what the journal returns on emergency shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix uninitialized var warningGravatar Kent Overstreet 2-2/+2
Can't actually be used uninitialized, but gcc was being silly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Split out sb-errors_format.hGravatar Kent Overstreet 3-292/+297
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Split out journal_seq_blacklist_format.hGravatar Kent Overstreet 2-10/+16
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Split out replicas_format.hGravatar Kent Overstreet 2-32/+36
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Split out disk_groups_format.hGravatar Kent Overstreet 2-18/+22
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: split out sb-downgrade_format.hGravatar Kent Overstreet 2-13/+18
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: split out sb-members_format.hGravatar Kent Overstreet 2-101/+111
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Better fsck error message for key versionGravatar Kent Overstreet 1-4/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: btree_gc can now handle unknown btreesGravatar Kent Overstreet 5-73/+55
Compatibility fix - we no longer have a separate table for which order gc walks btrees in, and special case the stripes btree directly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: add missing MODULE_DESCRIPTION()Gravatar Jeff Johnson 1-0/+1
Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/bcachefs/mean_and_variance_test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix setting of downgrade recovery passes/errorsGravatar Kent Overstreet 1-9/+3
bch2_check_version_downgrade() was setting c->sb.version, which bch2_sb_set_downgrade() expects to be at the previous version; and it shouldn't even have been set directly because c->sb.version is updated by write_super(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Run check_key_has_snapshot in snapshot_delete_keys()Gravatar Kent Overstreet 3-24/+29
delete_dead_snapshots now runs before the main fsck.c passes which check for keys for invalid snapshots; thus, it needs those checks as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Refactor delete_dead_snapshots()Gravatar Kent Overstreet 1-40/+25
Consolidate per-key work into delete_dead_snapshots_process_key(), so we now walk all keys once, not twice. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix locking assertGravatar Kent Overstreet 1-5/+5
We now track whether a transaction is locked, and verify that we don't have nodes locked when the transaction isn't locked; reorder relocks to not pop the new assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix lookup_first_inode() when inode_generations are presentGravatar Kent Overstreet 1-14/+10
This function is used for finding the hash seed (which is the same in all versions of an inode in different snapshots): ff an inode has been deleted in a child snapshot we need to iterate until we find a live version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Plumb bkey into __btree_err()Gravatar Kent Overstreet 1-40/+45
It can be useful to know the exact byte offset within a btree node where an error occured. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-26bcachefs: Use copy_folio_from_iter_atomic()Gravatar Matthew Wilcox (Oracle) 1-3/+3
copy_page_from_iter_atomic() will be removed at some point. Also fixup a comment for folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-26bcachefs: Fix sb-downgrade validationGravatar Kent Overstreet 1-3/+10
Superblock downgrade entries are only two byte aligned, but section sizes are 8 byte aligned, which means we have to be careful about overrun checks; an entry that crosses the end of the section is allowed (and ignored) as long as it has zero errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-26bcachefs: Fix debug assertGravatar Kent Overstreet 1-1/+1
Reported-by: syzbot+a8074a75b8d73328751e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-24Merge tag 'bcachefs-2024-05-24' of https://evilpiepirate.org/git/bcachefsGravatar Linus Torvalds 15-57/+96
Pull bcachefs fixes from Kent Overstreet: "Nothing exciting, just syzbot fixes (except for the one FMODE_CAN_ODIRECT patch). Looks like syzbot reports have slowed down; this is all catch up from two weeks of conferences. Next hardening project is using Thomas's error injection tooling to torture test repair" * tag 'bcachefs-2024-05-24' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix race path in bch2_inode_insert() bcachefs: Ensure we're RW before journalling bcachefs: Fix shutdown ordering bcachefs: Fix unsafety in bch2_dirent_name_bytes() bcachefs: Fix stack oob in __bch2_encrypt_bio() bcachefs: Fix btree_trans leak in bch2_readahead() bcachefs: Fix bogus verify_replicas_entry() assert bcachefs: Check for subvolues with bogus snapshot/inode fields bcachefs: bch2_checksum() returns 0 for unknown checksum type bcachefs: Fix bch2_alloc_ciphers() bcachefs: Add missing guard in bch2_snapshot_has_children() bcachefs: Fix missing parens in drop_locks_do() bcachefs: Improve bch2_assert_pos_locked() bcachefs: Fix shift overflows in replicas.c bcachefs: Fix shift overflow in btree_lost_data() bcachefs: Fix ref in trans_mark_dev_sbs() error path bcachefs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method bcachefs: Fix rcu splat in check_fix_ptrs()
2024-05-22bcachefs: Fix race path in bch2_inode_insert()Gravatar Kent Overstreet 1-2/+1
__destroy_new_inode() is appropriate when we have _just_allocated the inode, but not when it's been fully initialized and on i_sb_list. Reported-by: syzbot+a0ddc9873c280a4cb18f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22bcachefs: Ensure we're RW before journallingGravatar Kent Overstreet 1-1/+3
Reported-by: syzbot+c60cd352aedb109528bf@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22tracing/treewide: Remove second parameter of __assign_str()Gravatar Steven Rostedt (Google) 1-3/+3
With the rework of how the __string() handles dynamic strings where it saves off the source string in field in the helper structure[1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again. This means that with: __string(field, mystring) Which use to be assigned with __assign_str(field, mystring), no longer needs the second parameter and it is unused. With this, __assign_str() will now only get a single parameter. There's over 700 users of __assign_str() and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script: git grep -l __assign_str | while read a ; do sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file; mv /tmp/test-file $a; done I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch. Note, the same updates will need to be done for: __assign_str_len() __assign_rel_str() __assign_rel_str_len() I tested this with both an allmodconfig and an allyesconfig (build only for both). [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts. Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22bcachefs: Fix shutdown orderingGravatar Kent Overstreet 2-1/+8
the btree key cache uses the srcu struct created/destroyed by btree_iter.c; btree_iter needs to be exited last. Reported-by: syzbot+3af9daea347788b15213@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22bcachefs: Fix unsafety in bch2_dirent_name_bytes()Gravatar Kent Overstreet 1-0/+3
Reported-by: syzbot+84fa6fb8c7f98b93cdea@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22bcachefs: Fix stack oob in __bch2_encrypt_bio()Gravatar Kent Overstreet 1-2/+6
Reported-by: syzbot+fff6b0fb00259873576a@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22bcachefs: Fix btree_trans leak in bch2_readahead()Gravatar Kent Overstreet 1-2/+2
Reported-by: syzbot+d797fe78808e968d6c84@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>