aboutsummaryrefslogtreecommitdiff
path: root/fs/bcachefs/super-io.c
AgeCommit message (Collapse)AuthorFilesLines
2023-10-22bcachefs: Printbuf reworkGravatar Kent Overstreet 1-222/+217
This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Rename group to label for remaining strings.Gravatar Daniel Hill 1-1/+1
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add persistent countersGravatar Daniel Hill 1-0/+3
This adds a new superblock field for persisting counters and adds a sysfs interface in counters/ exposing these counters. The superblock field is ignored by older versions letting us avoid an on disk version bump. Each sysfs file outputs a counter that tracks since filesystem creation and a counter for the current mount session. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fold bucket_state in to BCH_DATA_TYPES()Gravatar Kent Overstreet 1-1/+0
Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be conuted when checking current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a tracepoint for superblock writesGravatar Kent Overstreet 1-0/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't write partially-initialized superblocksGravatar Kent Overstreet 1-0/+7
This neatly avoids bugs where we fail partway through initializing a new filesystem, if we just don't write out partly-initialized state. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Freespace, need_discard btreesGravatar Kent Overstreet 1-0/+5
This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch_sb_field_journal_v2Gravatar Kent Overstreet 1-80/+2
Add a new superblock field which represents journal buckets as ranges: also move code for the superblock journal fields to journal_sb.c. This also reworks the code for resizing the journal to write the new superblock before using the new journal buckets, and thus be a bit safer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add printf format attribute to bch2_pr_buf()Gravatar Kent Overstreet 1-1/+1
This tells the compiler to check printf format strings, and catches a few bugs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Reset journal flush delay to default value if zeroedGravatar Kent Overstreet 1-3/+16
We've been seeing a very strange bug where journal flush & reclaim delay end up getting inexplicably zeroed, in the superblock. We're now validating all the options in bch2_validate_super(), and 0 is no longer a valid value for those options, but we need to be careful not to prevent people's filesystems from mounting because of the new validation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Better superblock opt validationGravatar Kent Overstreet 1-0/+16
This moves validation of superblock options to bch2_sb_validate(), so they'll be checked in the write path as well. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: x-macro metadata version enumGravatar Kent Overstreet 1-4/+4
Now we've got strings for metadata versions - this changes bch2_sb_to_text() and our mount log message to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_sb_to_text to master option listGravatar Kent Overstreet 1-101/+90
Options no longer have to be manually added to bch2_sb_to_text() - it now uses the master list of options in opts.h. Also, improve some of the formatting by converting it to tabstops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Journal seq now incremented at entry open, not closeGravatar Kent Overstreet 1-1/+1
This patch changes journal_entry_open() to initialize the new journal entry, not __journal_entry_close(). This also means that journal_cur_seq() refers to the sequence number of the last journal entry when we don't have an open journal entry, not the next one. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a memory leakGravatar Kent Overstreet 1-8/+9
This fixes a regression from "bcachefs: Heap allocate printbufs" - bch2_sb_field_validate() was leaking an error string. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Heap allocate printbufsGravatar Kent Overstreet 1-22/+11
This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix 32 bit buildGravatar Kent Overstreet 1-5/+5
vstruct_bytes() was returning a u64 - it should be a size_t, the corect type for the size of anything that fits in memory. Also replace a 64 bit divide with div_u64(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add tabstops to printbufsGravatar Kent Overstreet 1-6/+6
Now, when outputting to printbufs, we can set tabstops and left or right justify text to them - this is to be used by the userspace 'bcachefs fs usage' command. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add .to_text() methods for all superblock sectionsGravatar Kent Overstreet 1-12/+352
This patch improves the superblock .to_text() methods and adds methods for all types that were missing them. It also improves printbufs by allowing them to specfiy what units we want to be printing in, and adds new wrapper methods for unifying our kernel and userspace environments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Delete some flag bits that are no longer usedGravatar Kent Overstreet 1-3/+0
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improved superblock-related error messagesGravatar Kent Overstreet 1-176/+290
This patch converts bch2_sb_validate() and the .validate methods for the various superblock sections to take printbuf, to which they can print detailed error messages, including printing the entire section that was invalid. This is a great improvement over the previous situation, where we could only return static strings that didn't have precise information about what was wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve error messages in superblock write pathGravatar Kent Overstreet 1-4/+17
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch2_journal_entry_to_text()Gravatar Kent Overstreet 1-3/+3
This adds a _to_text() pretty printer for journal entries - including every subtype - which will shortly be used by the 'bcachefs list_journal' subcommand. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Turn encoded_extent_max into a regular optionGravatar Kent Overstreet 1-1/+0
It'll now be handled at format time and in sysfs like other options - it still can only be set at format time, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Option improvementsGravatar Kent Overstreet 1-8/+9
This adds flags for options that must be a power of two (block size and btree node size), and options that are stored in the superblock as a power of two (encoded extent max). Also: options are now stored in memory in the same units they're displayed in (bytes): we now convert when getting and setting from the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix BCH_FS_ERROR flag handlingGravatar Kent Overstreet 1-10/+0
We were setting BCH_FS_ERROR on startup if the superblock was marked as containing errors, which is not what we wanted - BCH_FS_ERROR indicates whether errors have been found, so that after a successful fsck we're able to clear the error bit in the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Disk space accounting fix on brand-new fsGravatar Kent Overstreet 1-0/+8
The filesystem initialization path first marks superblock and journal buckets non transactionally, since the btree isn't functional yet. That path was updating the per-journal-buf percpu counters via bch2_dev_usage_update(), and updating the wrong set of counters so those updates didn't get written out until journal entry 4. The relevant code is going to get significantly rewritten in the future as we transition away from the in memory bucket array, so this just hacks around it for now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve error message in bch2_write_super()Gravatar Kent Overstreet 1-1/+2
It's helpful to know what the superblock write is for. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Mask out unknown compat features when going read-writeGravatar Kent Overstreet 1-0/+1
Compat features should be cleared if the filesystem was touched by a version that doesn't support them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: fix a possible bcachefs checksum mapping error opt-checksum enum ↵Gravatar Janpieter Sollie 1-1/+1
to type-checksum enum This fixes some rare cases where the metadata checksum option specified may map to the wrong actual checksum type. Signed-off-by: Janpieter Sollie <janpieter.sollie@edpnet.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Assorted endianness fixesGravatar Kent Overstreet 1-7/+7
Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix time handlingGravatar Kent Overstreet 1-2/+8
There were some overflows in the time conversion functions - fix this by converting tv_sec and tv_nsec separately. Also, set sb->time_min and sb->time_max. Fixes xfstest generic/258. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New and improved topology repair codeGravatar Kent Overstreet 1-0/+7
This splits out btree topology repair into a separate pass, and makes some improvements: - When we have to pick which of two overlapping nodes to drop keys from, we use the btree node header sequence number to preserve the newer node - the gc code has been changed so that it doesn't bail out if we're continuing/ignoring on fsck error - this way the dump tool can skip running the repair pass but still walk all reachable metadata - add a new superblock flag indicating when a filesystem is known to have btree topology issues, and the topology repair pass should be run - changing the start/end of a node might mean keys in that node have to be deleted: this patch handles that better by splitting it out into a separate function and running it explicitly in the topology repair code, previously those keys were only being dropped when the btree node was read in. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Eliminate more PAGE_SIZE usesGravatar Kent Overstreet 1-16/+15
In userspace, we don't really have a well defined PAGE_SIZE and shouln't be relying on it. This is some more incremental work to remove references to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Validate bset version field against sb version fieldsGravatar Kent Overstreet 1-0/+1
The superblock version fields need to be accurate to know whether a filesystem is supported, thus we should be verifying them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix compat code for superblockGravatar Kent Overstreet 1-6/+25
The bkey compat code wasn't being run for btree roots in the superblock clean section - this patch fixes it to use the journal entry validate code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Rename BTREE_ID enums for consistency with other enumsGravatar Kent Overstreet 1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix bch2_write_super to obey very_degraded optionGravatar Kent Overstreet 1-2/+6
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use x-macros for compat feature bitsGravatar Kent Overstreet 1-3/+2
This is to generate strings for them, so that we can print them out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Extents may now cross btree node boundariesGravatar Kent Overstreet 1-3/+1
When snapshots arrive, we won't necessarily be able to arbitrarily split existis - when we need to split an existing extent, we'll have to check if the extent was overwritten in child snapshots and if so emit a whiteout for the split in the child snapshot. Because extents couldn't span btree nodes previously, journal replay would sometimes have to split existing extents. That's no good anymore, but fortunately since extent handling has already been lifted above most of the btree code there's no real need for that rule anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Redo checks for sufficient devicesGravatar Kent Overstreet 1-4/+3
When the replicas mechanism was added, for tracking data by which drives it's replicated on, the check for whether we have sufficient devices was never updated to make use of it. This patch finally does that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Journal updates to dev usageGravatar Kent Overstreet 1-1/+21
This eliminates the need to scan every bucket to regenerate dev_usage at mount time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Persist 64 bit io clocksGravatar Kent Overstreet 1-33/+27
Originally, bcachefs - going back to bcache - stored, for each bucket, a 16 bit counter corresponding to how long it had been since the bucket was read from. But, this required periodically rescaling counters on every bucket to avoid wraparound. That wasn't an issue in bcache, where we'd perodically rewrite the per bucket metadata all at once, but in bcachefs we're trying to avoid having to walk every single bucket. This patch switches to persisting 64 bit io clocks, corresponding to the 64 bit bucket timestaps introduced in the previous patch with KEY_TYPE_alloc_v2. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix BCH_REPLICAS_MAX checkGravatar Kent Overstreet 1-4/+4
Ideally, this limit will be going away in the future. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve some IO error messagesGravatar Kent Overstreet 1-1/+1
it's useful to know whether an error was for a read or a write - this also standardizes error messages a bit more. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Refactor filesystem usage accountingGravatar Kent Overstreet 1-1/+1
Various filesystem usage counters are kept in percpu counters, with one set per in flight journal buffer. Right now all the code that deals with it assumes that there's only two buffers/sets of counters, but the number of journal bufs is getting increased to 4 in the next patch - so refactor that code to not assume a constant. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add bch2_blk_status_to_str()Gravatar Kent Overstreet 1-1/+1
We define our own BLK_STS_REMOVED, so we need our own to_str helper too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use x-macros for data typesGravatar Kent Overstreet 1-2/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use blk_status_to_str()Gravatar Kent Overstreet 1-1/+2
Improved error messages are always a good thing Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Interior btree updates are now fully transactionalGravatar Kent Overstreet 1-20/+2
We now update the alloc info (bucket sector counts) atomically with journalling the update to the interior btree nodes, and we also set new btree roots atomically with the journalled part of the btree update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>