aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs/libxfs/xfs_attr_leaf.c
AgeCommit message (Collapse)AuthorFilesLines
2019-11-22xfs: remove the mappedbno argument to xfs_da_get_bufGravatar Christoph Hellwig 1-2/+2
Use the xfs_da_get_buf_daddr function directly for the two callers that pass a mapped disk address, and then remove the mappedbno argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_da_read_bufGravatar Christoph Hellwig 1-1/+1
Move the code for reading an already mapped block into xfs_da3_node_read_mapped, which is the only caller ever passing a block number in the mappedbno argument and replace the mappedbno argument with the simple xfs_dabuf_get flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_attr3_leaf_readGravatar Christoph Hellwig 1-9/+8
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-15xfs: fix attr leaf header freemap.size underflowGravatar Brian Foster 1-1/+3
The leaf format xattr addition helper xfs_attr3_leaf_add_work() adjusts the block freemap in a couple places. The first update drops the size of the freemap that the caller had already selected to place the xattr name/value data. Before the function returns, it also checks whether the entries array has encroached on a freemap range by virtue of the new entry addition. This is necessary because the entries array grows from the start of the block (but end of the block header) towards the end of the block while the name/value data grows from the end of the block in the opposite direction. If the associated freemap is already empty, however, size is zero and the subtraction underflows the field and causes corruption. This is reproduced rarely by generic/070. The observed behavior is that a smaller sized freemap is aligned to the end of the entries list, several subsequent xattr additions land in larger freemaps and the entries list expands into the smaller freemap until it is fully consumed and then underflows. Note that it is not otherwise a corruption for the entries array to consume an empty freemap because the nameval list (i.e. the firstused pointer in the xattr header) starts beyond the end of the corrupted freemap. Update the freemap size modification to account for the fact that the freemap entry can be empty and thus stale. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: add a btree entries pointer to struct xfs_da3_icnode_hdrGravatar Christoph Hellwig 1-4/+2
All but two callers of the ->node_tree_p dir operation already have a xfs_da3_icnode_hdr from a previous call to xfs_da3_node_hdr_from_disk at hand. Add a pointer to the btree entries to struct xfs_da3_icnode_hdr to clean up this pattern. The two remaining callers now expand the whole header as well, but that isn't very expensive and not in a super hot path anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->node_hdr_to_diskGravatar Christoph Hellwig 1-1/+1
Replace the ->node_hdr_to_disk dir ops method with a directly called xfs_da_node_hdr_to_disk helper that takes care of the v4 vs v5 difference. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->node_hdr_from_diskGravatar Christoph Hellwig 1-1/+1
Replace the ->node_hdr_from_disk dir ops method with a directly called xfs_da_node_hdr_from_disk helper that takes care of the v4 vs v5 difference. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: Correct comment tyops -> typosGravatar Joe Perches 1-1/+1
Just fix the typos checkpatch notices... Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-04xfs: always log corruption errorsGravatar Darrick J. Wong 1-3/+9
Make sure we log something to dmesg whenever we return -EFSCORRUPTED up the call stack. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-10-29xfs: check attribute leaf block structureGravatar Darrick J. Wong 1-2/+65
Add missing structure checks in the attribute leaf verifier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21xfs: fix inode fork extent count overflowGravatar Dave Chinner 1-8/+10
[commit message is verbose for discussion purposes - will trim it down later. Some questions about implementation details at the end.] Zorro Lang recently ran a new test to stress single inode extent counts now that they are no longer limited by memory allocation. The test was simply: # xfs_io -f -c "falloc 0 40t" /mnt/scratch/big-file # ~/src/xfstests-dev/punch-alternating /mnt/scratch/big-file This test uncovered a problem where the hole punching operation appeared to finish with no error, but apparently only created 268M extents instead of the 10 billion it was supposed to. Further, trying to punch out extents that should have been present resulted in success, but no change in the extent count. It looked like a silent failure. While running the test and observing the behaviour in real time, I observed the extent coutn growing at ~2M extents/minute, and saw this after about an hour: # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next ; \ > sleep 60 ; \ > xfs_io -f -c "stat" /mnt/scratch/big-file |grep next fsxattr.nextents = 127657993 fsxattr.nextents = 129683339 # And a few minutes later this: # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next fsxattr.nextents = 4177861124 # Ah, what? Where did that 4 billion extra extents suddenly come from? Stop the workload, unmount, mount: # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next fsxattr.nextents = 166044375 # And it's back at the expected number. i.e. the extent count is correct on disk, but it's screwed up in memory. I loaded up the extent list, and immediately: # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next fsxattr.nextents = 4192576215 # It's bad again. So, where does that number come from? xfs_fill_fsxattr(): if (ip->i_df.if_flags & XFS_IFEXTENTS) fa->fsx_nextents = xfs_iext_count(&ip->i_df); else fa->fsx_nextents = ip->i_d.di_nextents; And that's the behaviour I just saw in a nutshell. The on disk count is correct, but once the tree is loaded into memory, it goes whacky. Clearly there's something wrong with xfs_iext_count(): inline xfs_extnum_t xfs_iext_count(struct xfs_ifork *ifp) { return ifp->if_bytes / sizeof(struct xfs_iext_rec); } Simple enough, but 134M extents is 2**27, and that's right about where things went wrong. A struct xfs_iext_rec is 16 bytes in size, which means 2**27 * 2**4 = 2**31 and we're right on target for an integer overflow. And, sure enough: struct xfs_ifork { int if_bytes; /* bytes in if_u1 */ .... Once we get 2**27 extents in a file, we overflow if_bytes and the in-core extent count goes wrong. And when we reach 2**28 extents, if_bytes wraps back to zero and things really start to go wrong there. This is where the silent failure comes from - only the first 2**28 extents can be looked up directly due to the overflow, all the extents above this index wrap back to somewhere in the first 2**28 extents. Hence with a regular pattern, trying to punch a hole in the range that didn't have holes mapped to a hole in the first 2**28 extents and so "succeeded" without changing anything. Hence "silent failure"... Fix this by converting if_bytes to a int64_t and converting all the index variables and size calculations to use int64_t types to avoid overflows in future. Signed integers are still used to enable easy detection of extent count underflows. This enables scalability of extent counts to the limits of the on-disk format - MAXEXTNUM (2**31) extents. Current testing is at over 500M extents and still going: fsxattr.nextents = 517310478 Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: move local to extent inode logging into bmap helperGravatar Brian Foster 1-2/+1
The callers of xfs_bmap_local_to_extents_empty() log the inode external to the function, yet this function is where the on-disk format value is updated. Push the inode logging down into the function itself to help prevent future mistakes. Note that internal bmap callers track the inode logging flags independently and thus may log the inode core twice due to this change. This is harmless, so leave this code around for consistency with the other attr fork conversion functions. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: remove broken error handling on failed attr sf to leaf changeGravatar Brian Foster 1-17/+2
xfs_attr_shortform_to_leaf() attempts to put the shortform fork back together after a failed attempt to convert from shortform to leaf format. While this code reallocates and copies back the shortform attr fork data, it never resets the inode format field back to local format. Further, now that the inode is properly logged after the initial switch from local format, any error that triggers the recovery code will eventually abort the transaction and shutdown the fs. Therefore, remove the broken and unnecessary error handling code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: log the inode on directory sf to block format changeGravatar Brian Foster 1-0/+1
When a directory changes from shortform (sf) to block format, the sf format is copied to a temporary buffer, the inode format is modified and the updated format filled with the dentries from the temporary buffer. If the inode format is modified and attempt to grow the inode fails (due to I/O error, for example), it is possible to return an error while leaving the directory in an inconsistent state and with an otherwise clean transaction. This results in corruption of the associated directory and leads to xfs_dabuf_map() errors as subsequent lookups cannot accurately determine the format of the directory. This problem is reproduced occasionally by generic/475. The fundamental problem is that xfs_dir2_sf_to_block() changes the on-disk inode format without logging the inode. The inode is eventually logged by the bmapi layer in the common case, but error checking introduces the possibility of failing the high level request before this happens. Update both of the dir2 and attr callers of xfs_bmap_local_to_extents_empty() to log the inode core as consistent with the bmap local to extent format change codepath. This ensures that any subsequent errors after the format has changed cause the transaction to abort. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30xfs: allocate xattr buffer on demandGravatar Dave Chinner 1-0/+6
When doing file lookups and checking for permissions, we end up in xfs_get_acl() to see if there are any ACLs on the inode. This requires and xattr lookup, and to do that we have to supply a buffer large enough to hold an maximum sized xattr. On workloads were we are accessing a wide range of cache cold files under memory pressure (e.g. NFS fileservers) we end up spending a lot of time allocating the buffer. The buffer is 64k in length, so is a contiguous multi-page allocation, and if that then fails we fall back to vmalloc(). Hence the allocation here is /expensive/ when we are looking up hundreds of thousands of files a second. Initial numbers from a bpf trace show average time in xfs_get_acl() is ~32us, with ~19us of that in the memory allocation. Note these are average times, so there are going to be affected by the worst case allocations more than the common fast case... To avoid this, we could just do a "null" lookup to see if the ACL xattr exists and then only do the allocation if it exists. This, however, optimises the path for the "no ACL present" case at the expense of the "acl present" case. i.e. we can halve the time in xfs_get_acl() for the no acl case (i.e down to ~10-15us), but that then increases the ACL case by 30% (i.e. up to 40-45us). To solve this and speed up both cases, drive the xattr buffer allocation into the attribute code once we know what the actual xattr length is. For the no-xattr case, we avoid the allocation completely, speeding up that case. For the common ACL case, we'll end up with a fast heap allocation (because it'll be smaller than a page), and only for the rarer "we have a remote xattr" will we have a multi-page allocation occur. Hence the common ACL case will be much faster, too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30xfs: consolidate attribute value copyingGravatar Dave Chinner 1-39/+49
The same code is used to copy do the attribute copying in three different places. Consolidate them into a single function in preparation from on-demand buffer allocation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30xfs: move remote attr retrieval into xfs_attr3_leaf_getvalueGravatar Dave Chinner 1-1/+1
Because we repeat exactly the same code to get the remote attribute value after both calls to xfs_attr3_leaf_getvalue() if it's a remote attr. Just do it in xfs_attr3_leaf_getvalue() so the callers don't have to care about it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30xfs: remove unnecessary indenting from xfs_attr3_leaf_getvalueGravatar Dave Chinner 1-16/+17
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30xfs: make attr lookup returns consistentGravatar Dave Chinner 1-4/+11
Shortform, leaf and remote value attr value retrieval return different values for success. This makes it more complex to handle actual errors xfs_attr_get() as some errors mean success and some mean failure. Make the return values consistent for success and failure consistent for all attribute formats. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26fs: xfs: Remove KM_NOSLEEP and KM_SLEEP.Gravatar Tetsuo Handa 1-4/+4
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: remove unused header filesGravatar Eric Sandeen 1-3/+0
There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28xfs: add struct xfs_mount pointer to struct xfs_bufGravatar Christoph Hellwig 1-6/+6
We need to derive the mount pointer from a buffer in a lot of place. Add a direct pointer to short cut the pointer chasing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-02-18xfs: fix xfs_buf magic number endian checksGravatar Darrick J. Wong 1-2/+2
Create a separate magic16 check function so that we don't run afoul of static checkers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-02-11xfs: factor xfs_da3_blkinfo verification into common helperGravatar Brian Foster 1-12/+4
With the verifier magic value helper in place, we've left a bit more duplicate code across the verifiers that involve struct xfs_da3_blkinfo. This includes the da node, xattr leaf and dir leaf verifiers, all of which perform similar checks for v4 and v5 filesystems. Create a common helper to verify an xfs_da3_blkinfo structure, taking care to only access v5 fields where appropriate, and refactor the aforementioned verifiers to use the helper. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-02-11xfs: miscellaneous verifier magic value fixupsGravatar Brian Foster 1-6/+5
Most buffer verifiers have hardcoded magic value checks conditionalized on the version of the filesystem. The magic value field of the verifier structure facilitates abstraction of some of this code. Populate the ->magic field of various verifiers to take advantage of this abstraction. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-02-11xfs: always check magic values in on-disk byte orderGravatar Brian Foster 1-2/+2
Most verifiers that check on-disk magic values convert the CPU endian magic value constant to disk endian to facilitate compile time optimization of the byte swap and reduce the need for runtime byte swaps in buffer verifiers. Several buffer verifiers do not follow this pattern. Update those verifiers for consistency. Also fix up a random typo in the inode readahead verifier name. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-06xfs: fix overflow in xfs_attr3_leaf_verifyGravatar Dave Chinner 1-2/+9
generic/070 on 64k block size filesystems is failing with a verifier corruption on writeback or an attribute leaf block: [ 94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480 [ 94.975623] XFS (pmem0): Unmount and run xfs_repair [ 94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer: [ 94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;....... [ 94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00 ................ [ 94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6 "{\.-\GL.1.7.... [ 94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00 ................ [ 94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00 .......h........ [ 94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00 .-2.. ...-Xe.... [ 94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00 .-[f.....-q5.|.. [ 94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00 .-r7.....-...... [ 94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3 This is failing this check: end = ichdr.freemap[i].base + ichdr.freemap[i].size; if (end < ichdr.freemap[i].base) >>>>> return __this_address; if (end > mp->m_attr_geo->blksize) return __this_address; And from the buffer output above, the freemap array is: freemap[0].base = 0x00a0 freemap[0].size = 0xdcf4 end = 0xdd94 freemap[1].base = 0xfe98 freemap[1].size = 0x0168 end = 0x10000 freemap[2].base = 0xf0d8 freemap[2].size = 0x07e0 end = 0xf8b8 These all look valid - the block size is 0x10000 and so from the last check in the above verifier fragment we know that the end of freemap[1] is valid. The problem is that end is declared as: uint16_t end; And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a corruption. Fix the verifier to use uint32_t types for the check and hence avoid the overflow. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577 Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-08-01xfs: refactor log recovery checkGravatar Darrick J. Wong 1-2/+1
Add a predicate to decide if the log is actively in recovery and use that instead of open-coding a pagf_init check in the attr leaf verifier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2018-07-30xfs: remove the xfs_ifork_t typedefGravatar Christoph Hellwig 1-3/+3
We only have a few more callers left, so seize the opportunity and kill it off. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-23xfs: check leaf attribute block freemap in verifierGravatar Darrick J. Wong 1-0/+22
Check the leaf attribute freemap when we're verifying the block. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-07-17xfs_attr_leaf: use swap macro in xfs_attr3_leaf_rebalanceGravatar Gustavo A. R. Silva 1-10/+3
Make use of the swap macro and remove some unnecessary variables. This makes the code easier to read and maintain. Also, reduces the stack usage. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11xfs: use ->t_firstblock in xattr opsGravatar Brian Foster 1-2/+0
Similar to the dirops code, the xattr code uses an on-stack firstblock variable for the various operations. This code rolls the underlying transaction in various places, however, which means we cannot simply replace the local firstblock vars with ->t_firstblock. Doing so (without further changes) would invalidate the memory pointed to by xfs_da_args.firstblock as soon as the first transaction rolls. To avoid this problem, remove xfs_da_args.firstblock and replace all such accesses with ->t_firstblock at the same time. This ensures that accesses to the current firstblock always occur through the current transaction rather than a potentially invalid xfs_da_args pointer. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11xfs: replace xfs_da_args->dfops accesses with ->t_dfops and removeGravatar Brian Foster 1-13/+11
Now that xfs_da_args->dfops is always assigned from a ->t_dfops pointer (or one that is immediately attached), replace all downstream accesses of the former with the latter and remove the field from struct xfs_da_args. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-08xfs: don't call xfs_da_shrink_inode with NULL bpGravatar Eric Sandeen 1-3/+2
xfs_attr3_leaf_create may have errored out before instantiating a buffer, for example if the blkno is out of range. In that case there is no work to do to remove it, and in fact xfs_da_shrink_inode will lead to an oops if we try. This also seems to fix a flaw where the original error from xfs_attr3_leaf_create gets overwritten in the cleanup case, and it removes a pointless assignment to bp which isn't used after this. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199969 Reported-by: Xu, Wen <wen.xu@gatech.edu> Tested-by: Xu, Wen <wen.xu@gatech.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-08xfs: clean up MIN/MAXGravatar Dave Chinner 1-1/+1
Get rid of the MIN/MAX macros and just use the native min/max macros directly in the XFS code. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-06xfs: convert to SPDX license tagsGravatar Dave Chinner 1-13/+1
Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-29Split buffer's b_fspriv fieldGravatar Carlos Maiolino 1-1/+1
By splitting the b_fspriv field into two different fields (b_log_item and b_li_list). It's possible to get rid of an old ABI workaround, by using the new b_log_item field to store xfs_buf_log_item separated from the log items attached to the buffer, which will be linked in the new b_li_list field. This way, there is no more need to reorder the log items list to place the buf_log_item at the beginning of the list, simplifying a bit the logic to handle buffer IO. This also opens the possibility to change buffer's log items list into a proper list_head. b_log_item field is still defined as a void *, because it is still used by the log buffers to store xlog_in_core structures, and there is no need to add an extra field on xfs_buf just for xlog_in_core. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style changes] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-17xfs: attr leaf verifier needs to check for obviously bad countGravatar Darrick J. Wong 1-5/+21
In the attribute leaf verifier, we can check for obviously bad values of firstused and count so that later attempts at lasthash don't run off the end of the memory buffer. Found by ones fuzzing hdr.count in xfs/400 with KASAN. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-01-08xfs: create a new buf_ops pointer to verify structure metadataGravatar Darrick J. Wong 1-0/+1
Expose all metadata structure buffer verifier functions via buf_ops. These will be used by the online scrub mechanism to look for problems with buffers that are already sitting around in memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: fail out of xfs_attr3_leaf_lookup_int if it looks corruptGravatar Darrick J. Wong 1-3/+6
If the xattr leaf block looks corrupt, return -EFSCORRUPTED to userspace instead of ASSERTing on debug kernels or running off the end of the buffer on regular kernels. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: create structure verifier function for shortform xattrsGravatar Darrick J. Wong 1-0/+74
Create a function to perform structure verification for short form extended attributes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: refactor verifier callers to print address of failing checkGravatar Darrick J. Wong 1-5/+11
Refactor the callers of verifiers to print the instruction address of a failing check. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: have buffer verifier functions report failing addressGravatar Darrick J. Wong 1-10/+10
Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: refactor xfs_verifier_error and xfs_buf_ioerrorGravatar Darrick J. Wong 1-7/+3
Since all verification errors also mark the buffer as having an error, we can combine these two calls. Later we'll add a xfs_failaddr_t parameter to promote the idea of reporting corruption errors and the address of the failing check to enable better debugging reports. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-12-14xfs: hold xfs_buf locked between shortform->leaf conversion and the addition ↵Gravatar Darrick J. Wong 1-3/+6
of an attribute The new attribute leaf buffer is not held locked across the transaction roll between the shortform->leaf modification and the addition of the new entry. As a result, the attribute buffer modification being made is not atomic from an operational perspective. Hence the AIL push can grab it in the transient state of "just created" after the initial transaction is rolled, because the buffer has been released. This leads to xfs_attr3_leaf_verify() asserting that hdr.count is zero, treating this as in-memory corruption, and shutting down the filesystem. Darrick ported the original patch to 4.15 and reworked it use the xfs_defer_bjoin helper and hold/join the buffer correctly across the second transaction roll. Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-10-26xfs: remove the never fully implemented UUID fork formatGravatar Christoph Hellwig 1-5/+1
Remove the dead code dealing with the UUID fork format that was never implemented in Linux (and neither in IRIX as far as I know). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-01xfs: refactor xfs_trans_rollGravatar Christoph Hellwig 1-3/+3
Split xfs_trans_roll into a low-level helper that just rolls the actual transaction and a new higher level xfs_trans_roll_inode that takes care of logging and rejoining the inode. This gets rid of the NULL inode case, and allows to simplify the special cases in the deferred operation code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-07-07xfs: don't crash on unexpected holes in dir/attr btreesGravatar Darrick J. Wong 1-1/+1
In quite a few places we call xfs_da_read_buf with a mappedbno that we don't control, then assume that the function passes back either an error code or a buffer pointer. Unfortunately, if mappedbno == -2 and bno maps to a hole, we get a return code of zero and a NULL buffer, which means that we crash if we actually try to use that buffer pointer. This happens immediately when we set the buffer type for transaction context. Therefore, check that we have no error code and a non-NULL bp before trying to use bp. This patch is a follow-up to an incomplete fix in 96a3aefb8ffde231 ("xfs: don't crash if reading a directory results in an unexpected hole"). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-09xfs: ignore leaf attr ichdr.count in verifier during log replayGravatar Eric Sandeen 1-1/+7
When we create a new attribute, we first create a shortform attribute, and try to fit the new attribute into it. If that fails, we copy the (empty) attribute into a leaf attribute, and do the copy again. Thus there can be a transient state where we have an empty leaf attribute. If we encounter this during log replay, the verifier will fail. So add a test to ignore this part of the leaf attr verification during log replay. Thanks as usual to dchinner for spotting the problem. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-08-03xfs: rename flist/free_list to dfopsGravatar Darrick J. Wong 1-2/+2
Mechanical change of flist/free_list to dfops, since they're now deferred ops, not just a freeing list. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>