aboutsummaryrefslogtreecommitdiff
path: root/fs/gfs2
AgeCommit message (Collapse)AuthorFilesLines
2009-09-14Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-blockGravatar Linus Torvalds 1-2/+4
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits) block: use blkdev_issue_discard in blk_ioctl_discard Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads block: don't assume device has a request list backing in nr_requests store block: Optimal I/O limit wrapper cfq: choose a new next_req when a request is dispatched Seperate read and write statistics of in_flight requests aoe: end barrier bios with EOPNOTSUPP block: trace bio queueing trial only when it occurs block: enable rq CPU completion affinity by default cfq: fix the log message after dispatched a request block: use printk_once cciss: memory leak in cciss_init_one() splice: update mtime and atime on files block: make blk_iopoll_prep_sched() follow normal 0/1 return convention cfq-iosched: get rid of must_alloc flag block: use interrupts disabled version of raise_softirq_irqoff() block: fix comment in blk-iopoll.c block: adjust default budget for blk-iopoll block: fix long lines in block/blk-iopoll.c block: add blk-iopoll, a NAPI like approach for block devices ...
2009-09-14GFS2: Whitespace fixesGravatar Steven Whitehouse 3-4/+4
Reported-by: Daniel Walker <dwalker@fifo99.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-09-14block: use blkdev_issue_discard in blk_ioctl_discardGravatar Christoph Hellwig 1-2/+4
blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard, the only difference between the two is that blkdev_issue_discard needs to send a barrier discard request and blk_ioctl_discard a non-barrier one, and blk_ioctl_discard needs to wait on the request. To facilitates this add a flags argument to blkdev_issue_discard to control both aspects of the behaviour. This will be very useful later on for using the waiting funcitonality for other callers. Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-09GFS2: Remove unused sysfs fileGravatar Steven Whitehouse 3-14/+1
The /sys/fs/gfs2/<fsname>/lock_module/id file has been unused for some time now, so we can remove it. We still accept the mount option though, as userspace still sends that. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-09-08GFS2: Be extra careful about deallocating inodesGravatar Steven Whitehouse 4-35/+51
There is a potential race in the inode deallocation code if two nodes try to deallocate the same inode at the same time. Most of the issue is solved by the iopen locking. There is still a small window which is not covered by the iopen lock. This patches fixes that and also makes the deallocation code more robust in the face of any errors in the rgrp bitmaps, or erroneous iopen callbacks from other nodes. This does introduce one extra disk read, but that is generally not an issue since its the same block that must be written to later in the deallocation process. The total disk accesses therefore stay the same, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-27GFS2: Remove no_formal_ino generating codeGravatar Steven Whitehouse 5-190/+10
The inum structure used throughout GFS2 has two fields. One no_addr is the disk block number of the inode in question and is used everywhere as the inode number. The other, no_formal_ino, is used only as the generation number for NFS. Historically the no_formal_ino field was set using a complicated system of one global and one per-node file containing inode numbers in order to ensure that each no_formal_ino was unique. Also this code made no provision for what would happen when eventually the (64 bit) numbers ran out. Now I know that is pretty unlikely to happen given the large space of numbers, but it is possible nevertheless. The only guarantee required for no_formal_ino is that, for any single inode, the same number doesn't get reused too quickly. We already have a generation number which is kept in the inode and initialised from a counter in the resource group (almost no overhead, since we have to touch the resource group anyway in order to allocate an inode in the first place). Aside from ensuring that we never use the value 0 in the no_formal_ino field, we can use that counter directly. As a result of that change, we lose about 200 lines of code and also gain about 10 creates/sec on the postmark benchmark (on my test machine). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-26GFS2: Rename eattr.[ch] as xattr.[ch]Gravatar Steven Whitehouse 7-6/+6
Use the more conventional name for the extended attribute support code. Update all the places which care. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-26GFS2: Clean up of extended attribute supportGravatar Steven Whitehouse 11-526/+333
This has been on my list for some time. We need to change the way in which we handle extended attributes to allow faster file creation times (by reducing the number of transactions required) and the extended attribute code is the main obstacle to this. In addition to that, the VFS provides a way to demultiplex the xattr calls which we ought to be using, rather than rolling our own. This patch changes the GFS2 code to use that VFS feature and as a result the code shrinks by a couple of hundred lines or so, and becomes easier to read. I'm planning on doing further clean up work in this area, but this patch is a good start. The cleaned up code also uses the more usual "xattr" shorthand, I plan to eliminate the use of "eattr" eventually and in the mean time it serves as a flag as to which bits of the code have been updated. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-24GFS2: Add "-o errors=panic|withdraw" mount optionsGravatar Bob Peterson 4-14/+71
This patch adds "-o errors=panic" and "-o errors=withdraw" to the gfs2 mount options. The "errors=withdraw" option is today's current behaviour, meaning to withdraw from the file system if a non-serious gfs2 error occurs. The new "errors=panic" option tells gfs2 to force a kernel panic if a non-serious gfs2 file system error occurs. This may be useful, for example, where fabric-level fencing is used that has no way to reboot (such as fence_scsi). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-24GFS2: jumping to wrong label?Gravatar Roel Kluin 1-1/+1
Also a gfs2_glock_dq() is required here. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-18GFS2: free disk inode which is deleted by remote node -V2Gravatar Wengang Wang 1-0/+18
this patch is for the same problem that Benjamin Marzinski fixes at commit b94a170e96dc416828af9d350ae2e34b70ae7347 quotation of the original problem: ---cut here--- When a file is deleted from a gfs2 filesystem on one node, a dcache entry for it may still exist on other nodes in the cluster. If this happens, gfs2 will be unable to free this file on disk. Because of this, it's possible to have a gfs2 filesystem with no files on it and no free space. With this patch, when a node receives a callback notifying it that the file is being deleted on another node, it schedules a new workqueue thread to remove the file's dcache entry. ---end cut--- after applying Benjamin's patch, I think there is still a case in which the disk inode remains even when "no space" is hit. the case is that when running d_prune_aliases() against the inode, there are one or more dentries(aliases) which have reference count number > 0. in this case the dentries won't be pruned. and even later, the reference count becomes to 0, the dentries can still be cached in memory. unfortunately, no callback come again, things come back to the state before the callback runs. thus the on disk inode remains there until in memoryinode is removed for some other reason(shrinking inode cache or unmount the volume..). this patch is to remove those dentries when their reference count becomes to 0 and the inode is deleted by remote node. for implementation, gfs2_dentry_delete() is added as dentry_operations.d_delete. the function returns true when the inode is deleted by remote node. in dput(), gfs2_dentry_delete() is called and since it returns true, the dentry is unhashed from dcache and then removed. when all dentries are removed, the in memory inode get removed so that the on disk inode is freed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17GFS2: Add sysfs link to deviceGravatar Steven Whitehouse 1-0/+10
This adds a link from the per-gfs2 sb sysfs directory to the block device upon which the filesystem is mounted. The link is called "device", strangely enough :-) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17GFS2: Replace assertion with proper error handlingGravatar Steven Whitehouse 1-1/+3
One fewer assert, one more place we can recover gracefully if there is an error. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17GFS2: Improve error handling in inode allocationGravatar Steven Whitehouse 3-13/+27
A little while back, block allocation was given some improved error handling which meant that -EIO was returned in the case of there being a problem in the resource group data. In addition a message is printed explaning what went wrong and how to fix it. This extends that error handling so that it also covers inode allocation too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17GFS2: Add some more info to ueventsGravatar Steven Whitehouse 1-3/+10
With each uevent, we now always include the journal ID. We can't call it JID since that is already in use by some of the individual events relating to recovery, so we use JOURNALID instead. We don't send the JOURNALID for spectator mounts, since there isn't one. Also the ADD event now has both RDONLY and SPECTATOR information to match that of the ONLINE event. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17GFS2: Add online uevent to GFS2Gravatar Steven Whitehouse 3-3/+15
We already have an offline uevent (used when a withdraw occurs) but no online uevent. This adds an online uevent so that userspace will be able to detect a successful mount by means other than not receiving a remove event after the add & recovery (change) uevents. It has also been added to the remount path as well - we can't use a change uevent there as older GFS2 userspace acts on change uevents according to the state that it thinks the fs is in, so we can't easily add any new ones. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-14GFS2: Fix permissions on "recover" fileGravatar Steven Whitehouse 1-10/+10
Although this file is only ever written and not read by userspace, it seems that the utils are opening this file O_RDWR, so we need to allow that. Also fixes the whitespace which seemed to be broken. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>
2009-07-30GFS2: remove dcache entries for remote deleted inodesGravatar Benjamin Marzinski 5-5/+65
When a file is deleted from a gfs2 filesystem on one node, a dcache entry for it may still exist on other nodes in the cluster. If this happens, gfs2 will be unable to free this file on disk. Because of this, it's possible to have a gfs2 filesystem with no files on it and no free space. With this patch, when a node receives a callback notifying it that the file is being deleted on another node, it schedules a new workqueue thread to remove the file's dcache entry. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: Fix incorrent statfs consistency checkGravatar Benjamin Marzinski 1-11/+3
Since both linked and unlinked inodes are counted by rgd->rd_dinodes, It makes no sense to count them with the used data blocks (first check that I changed), it makes sense to count them with the linked inodes (second check), and it makes no sense to care if there are more unlinked inodes than linked ones. This fixes these errors. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: Don't put unlikely reclaim candidates on the reclaim list.Gravatar Benjamin Marzinski 1-26/+46
GFS2 was placing far too many glocks on the reclaim list that were not good candidates for freeing up from cache. These locks would sit there and repeatedly get scanned to see if they could be reclaimed, wasting a lot of time when there was memory pressure. This fix does more checks on the locks to see if they are actually likely to be removable from cache. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: Don't try and dealloc own inodeGravatar Steven Whitehouse 1-3/+6
When searching for unlinked, but still allocated inodes during block allocation, avoid the block relating to the inode that is doing the allocation. This fixes a hang caused when an unlinked, but still open, inode tries to allocate some more blocks and lands up finding itself during the search for deallocatable inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: Fix panic in glock memory shrinkerGravatar Benjamin Marzinski 1-0/+4
It is possible for gfs2_shrink_glock_memory() to check a glock for demotion that's in the process of being freed by gfs2_glock_put(). In this case, gfs2_shrink_glock_memory() will acquire a new reference to this glock, and then try to free the glock itself when it drops the refernce. To solve this, gfs2_shrink_glock_memory() just needs to check if the glock is in the process of being freed, and if so skip it without ever unlocking the lru_lock. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: keep statfs info in sync on growsGravatar Benjamin Marzinski 3-14/+68
GFS2 wasn't syncing its statfs info on grows. This causes a problem when you grow the filesystem on multiple nodes. GFS2 would calculate the new space based on the resource groups (which are always current), and then assume that the filesystem had grown the from the existing statfs size. If you grew the filesystem on two different nodes in a short time, the second node wouldn't see the statfs size change from the first node, and would assume that it was grown by a larger amount than it was. When all these changes were synced out, the total fileystem size would be incorrect (the first grow would be counted twice). This patch syncs makes GFS2 read in the statfs changes from disk before a grow, and write them out after the grow, while the master statfs inode is locked. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30GFS2: Shrink the shrinkerGravatar Steven Whitehouse 1-18/+5
This patch removes some of the special cases that the shrinker was trying to deal with. As a result we leave fewer items on the list and none at all which cannot be demoted. This makes the list scanning more efficient and solves some issues seen with large numbers of inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-13tracing/events: Move TRACE_SYSTEM outside of include guardGravatar Li Zefan 1-4/+4
If TRACE_INCLDUE_FILE is defined, <trace/events/TRACE_INCLUDE_FILE.h> will be included and compiled, otherwise it will be <trace/events/TRACE_SYSTEM.h> So TRACE_SYSTEM should be defined outside of #if proctection, just like TRACE_INCLUDE_FILE. Imaging this scenario: #include <trace/events/foo.h> -> TRACE_SYSTEM == foo ... #include <trace/events/bar.h> -> TRACE_SYSTEM == bar ... #define CREATE_TRACE_POINTS #include <trace/events/foo.h> -> TRACE_SYSTEM == bar !!! and then bar.h will be included and compiled. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A5A9CF1.2010007@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-19block: rename CONFIG_LBD to CONFIG_LBDAFGravatar Bartlomiej Zolnierkiewicz 1-1/+1
Follow-up to "block: enable by default support for large devices and files on 32-bit archs". Rename CONFIG_LBD to CONFIG_LBDAF to: - allow update of existing [def]configs for "default y" change - reflect that it is used also for large files support nowadays Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-12GFS2: Remove lock_kernel from gfs2_put_super()Gravatar Steven Whitehouse 1-4/+0
It is not required here. Signed-off-by: Steven Whitehouse <swhiteho@redhat,com> Cc: Christoph Hellwig <hch@infradead.org>
2009-06-12GFS2: Add tracepointsGravatar Steven Whitehouse 8-6/+442
This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-06-11push BKL down into ->put_superGravatar Christoph Hellwig 1-0/+4
Move BKL into ->put_super from the only caller. A couple of filesystems had trivial enough ->put_super (only kfree and NULLing of s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs, hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most of them probably don't need it, but I'd rather sort that out individually. Preferably after all the other BKL pushdowns in that area. [AV: original used to move lock_super() down as well; these changes are removed since we don't do lock_super() at all in generic_shutdown_super() now] [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11gfs2: remove ->write_super and stop maintaining ->s_dirtGravatar Christoph Hellwig 2-15/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-blockGravatar Linus Torvalds 2-3/+3
* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits) block: add request clone interface (v2) floppy: fix hibernation ramdisk: remove long-deprecated "ramdisk=" boot-time parameter fs/bio.c: add missing __user annotation block: prevent possible io_context->refcount overflow Add serial number support for virtio_blk, V4a block: Add missing bounce_pfn stacking and fix comments Revert "block: Fix bounce limit setting in DM" cciss: decode unit attention in SCSI error handling code cciss: Remove no longer needed sendcmd reject processing code cciss: change SCSI error handling routines to work with interrupts enabled. cciss: separate error processing and command retrying code in sendcmd_withirq_core() cciss: factor out fix target status processing code from sendcmd functions cciss: simplify interface of sendcmd() and sendcmd_withirq() cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code cciss: Use schedule_timeout_uninterruptible in SCSI error handling code block: needs to set the residual length of a bidi request Revert "block: implement blkdev_readpages" block: Fix bounce limit setting in DM Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt ... Manually fix conflicts with tracing updates in: block/blk-sysfs.c drivers/ide/ide-atapi.c drivers/ide/ide-cd.c drivers/ide/ide-floppy.c drivers/ide/ide-tape.c include/trace/events/block.h kernel/trace/blktrace.c
2009-06-10GFS2: Merge gfs2_get_sb into gfs2_get_sb_metaGravatar Steven Whitehouse 1-12/+4
These don't need to be separate functions. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-06-10GFS2: Fix cache coherency between truncate and O_DIRECT readGravatar Steven Whitehouse 1-1/+1
If a page was partially zeroed as the result of a truncate, then it was not being correctly marked dirty. This resulted in the deleted data reappearing if the file was read back via direct I/O. Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-06-05GFS2: Fix locking issue mounting gfs2meta fsGravatar Steven Whitehouse 1-14/+22
This patch uses sget() to get a reference to the existing gfs2 sb when mouting the gfs2meta filesystem (in fact thats just another mount of the gfs2 filesystem with a different root and this interface is for backward compatibility). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Tested-by: Benjamin Marzinski <bmarzins@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>
2009-06-03GFS2: Remove unused variableGravatar Steven Whitehouse 1-2/+0
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-06-02GFS2: smbd proccess hangs with flock() call.Gravatar Abhijith Das 1-2/+2
GFS2 currently does not support mandatory flocks. An flock() call with LOCK_MAND triggers unexpected behavior because gfs2 is not checking for this lock type. This patch corrects that. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-26GFS2: Remove args subdir from gfs2 sysfs filesGravatar Steven Whitehouse 1-51/+1
Since we can cat /proc/mounts there is no need to have this subdirectory in the gfs2 sysfs files. In fact this does not reflect the full range of possible mount argumenmts, where as /proc/mounts does. There was only one userland user of this set of sysfs files and it will function perfectly well without these files being present (in fact that subcommand of gfs2_tool is obsolete anyway). The tune/* subdirectory is also considered mostly obsolete, but there are a few uses of this until mount arguments can be added for the last few functions for which there are no equivalents currently. However the tune/* directory is still in my sights and new code should avoid using it. Only the gfs2_quota and gfs2_tool programs are know to use tune/* at the moment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-26GFS2: Remove lockstruct subdir from gfs2 sysfs filesGravatar Steven Whitehouse 1-33/+8
The lockstruct sub directory contained two entries, both of which are duplicated elsewhere in the gfs2 sysfs files as well as being available via /proc/mounts. There is no userland program using either of them, so this patch removes them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-22block: Do away with the notion of hardsect_sizeGravatar Martin K. Petersen 2-3/+3
Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22GFS2: Move gfs2_unlink_ok into ops_inode.cGravatar Steven Whitehouse 3-41/+38
Another function which is only called from one ops_inode.c so we move it and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-22GFS2: Move gfs2_readlinki into ops_inode.cGravatar Steven Whitehouse 3-58/+56
Move gfs2_readlinki into ops_inode.c and make it static Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-22GFS2: Move gfs2_rmdiri into ops_inode.cGravatar Steven Whitehouse 3-54/+53
Move gfs2_rmdiri() into ops_inode.c and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-22GFS2: Merge mount.c and ops_super.c into super.cGravatar Steven Whitehouse 4-956/+903
mount.c only contained a single function, so is not really worth retaining on its own. All of the super related code is now either in super.c or ops_fstype.c Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-22GFS2: Clean up some file namesGravatar Steven Whitehouse 12-55/+34
This patch renames the ops_*.c files which have no counterpart without the ops_ prefix in order to shorten the name and make it more readable. In addition, ops_address.h (which was very small) is moved into inode.h and inode.h is cleaned up by adding extern where required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-21GFS2: Be more aggressive in reclaiming unlinked inodesGravatar Steven Whitehouse 2-3/+4
This patch increases the frequency with which gfs2 looks for unlinked, but still allocated inodes. Its the equivalent operation to ext3's orphan list, but done with bitmaps in the resource groups. This also fixes a bug where a field in the rgrp was too small. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-21GFS2: Add a rgrp bitmap full flagGravatar Steven Whitehouse 2-30/+50
During block allocation, it is useful to know if sections of disk are full on a finer grained basis than a single resource group. This can make a performance difference when resource groups have larger numbers of bitmap blocks, since we no longer have to search them all block by block in each individual bitmap. The full flag is set on a per-bitmap basis when it has been searched and found to have no free space. It is then skipped in subsequent searches until the flag is reset. The resetting occurs if we have to drop the glock on the resource group for any reason, or if we deallocate some blocks within that resource group and thus free up some space. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-20GFS2: Improve resource group error handlingGravatar Steven Whitehouse 7-67/+99
This patch improves the error handling in the case where we discover that the summary information in the resource group doesn't match the bitmap information while in the process of allocating blocks. Originally this resulted in a kernel bug, but this patch changes that so that we return -EIO and print some messages explaining what went wrong, and how to fix it. We also remember locally not to try and allocate from the same rgrp again, so that a subsequent allocation in a different rgrp should succeed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-19GFS2: Don't warn when delete inode fails on ro filesystemGravatar Steven Whitehouse 1-1/+1
If the filesystem is read-only, then we expect that delete inode will fail, so there is no need to warn about it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-19GFS2: Umount recovery race fixGravatar Steven Whitehouse 9-124/+122
This patch fixes a race condition where we can receive recovery requests part way through processing a umount. This was causing problems since the recovery thread had already gone away. Looking in more detail at the recovery code, it was really trying to implement a slight variation on a work queue, and that happens to align nicely with the recently introduced slow-work subsystem. As a result I've updated the code to use slow-work, rather than its own home grown variety of work queue. When using the wait_on_bit() function, I noticed that the wait function that was supplied as an argument was appearing in the WCHAN field, so I've updated the function names in order to produce more meaningful output. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-13GFS2: Remove a couple of unused sysfs entriesGravatar Steven Whitehouse 1-13/+0
These two tunables are pointless and would never need to be changed anyway. There is also a race between them and umount as the deamons which they refer to might have gone away. The easiest way to fix the race is to remove the interface. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>