aboutsummaryrefslogtreecommitdiff
path: root/fs/lockd/svc.c
AgeCommit message (Collapse)AuthorFilesLines
2023-04-13lockd: simplify two-level sysctl registration for nlm_sysctlsGravatar Luis Chamberlain 1-19/+1
There is no need to declare two tables to just create directories, this can be easily be done with a prefix path with register_sysctl(). Simplify this registration. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-02-23Merge tag 'sysctl-6.3-rc1' of ↵Gravatar Linus Torvalds 1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl update from Luis Chamberlain: "Just one fix which just came in. Sadly the eager beavers willing to help with the sysctl moves have slowed" * tag 'sysctl-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: sysctl: fix proc_dobool() usability
2023-02-21sysctl: fix proc_dobool() usabilityGravatar Ondrej Mosnacek 1-1/+1
Currently proc_dobool expects a (bool *) in table->data, but sizeof(int) in table->maxsize, because it uses do_proc_dointvec() directly. This is unsafe for at least two reasons: 1. A sysctl table definition may use { .data = &variable, .maxsize = sizeof(variable) }, not realizing that this makes the sysctl unusable (see the Fixes: tag) and that they need to use the completely counterintuitive sizeof(int) instead. 2. proc_dobool() will currently try to parse an array of values if given .maxsize >= 2*sizeof(int), but will try to write values of type bool by offsets of sizeof(int), so it will not work correctly with neither an (int *) nor a (bool *). There is no .maxsize validation to prevent this. Fix this by: 1. Constraining proc_dobool() to allow only one value and .maxsize == sizeof(bool). 2. Wrapping the original struct ctl_table in a temporary one with .data pointing to a local int variable and .maxsize set to sizeof(int) and passing this one to proc_dointvec(), converting the value to/from bool as needed (using proc_dou8vec_minmax() as an example). 3. Extending sysctl_check_table() to enforce proc_dobool() expectations. 4. Fixing the proc_dobool() docstring (it was just copy-pasted from proc_douintvec, apparently...). 5. Converting all existing proc_dobool() users to set .maxsize to sizeof(bool) instead of sizeof(int). Fixes: 83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled") Fixes: a2071573d634 ("sysctl: introduce new proc handler proc_dobool") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-02-20SUNRPC: Use per-CPU counters to tally server RPC countsGravatar Chuck Lever 1-5/+10
- Improves counting accuracy - Reduces cross-CPU memory traffic Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Refactor RPC server dispatch methodGravatar Chuck Lever 1-2/+2
Currently, svcauth_gss_accept() pre-reserves response buffer space for the RPC payload length and GSS sequence number before returning to the dispatcher, which then adds the header's accept_stat field. The problem is the accept_stat field is supposed to go before the length and seq_num fields. So svcauth_gss_release() has to relocate the accept_stat value (see svcauth_gss_prepare_to_wrap()). To enable these fields to be added to the response buffer in the correct (final) order, the pointer to the accept_stat has to be made available to svcauth_gss_accept() so that it can set it before reserving space for the length and seq_num fields. As a first step, move the pointer to the location of the accept_stat field into struct svc_rqst. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Push svcxdr_init_encode() into svc_process_common()Gravatar Chuck Lever 1-1/+0
Now that all vs_dispatch functions invoke svcxdr_init_encode(), it is common code and can be pushed down into the generic RPC server. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Push svcxdr_init_decode() into svc_process_common()Gravatar Chuck Lever 1-1/+0
Now that all vs_dispatch functions invoke svcxdr_init_decode(), it is common code and can be pushed down into the generic RPC server. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-28NFSD: Move svc_serv_ops::svo_function into struct svc_servGravatar Chuck Lever 1-5/+1
Hoist svo_function back into svc_serv and remove struct svc_serv_ops, since the struct is now devoid of fields. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-28NFSD: Remove svc_serv_ops::svo_moduleGravatar Chuck Lever 1-3/+1
struct svc_serv_ops is about to be removed. Neil Brown says: > I suspect svo_module can go as well - I don't think the thread is > ever the thing that primarily keeps a module active. A random sample of kthread_create() callers shows sunrpc is the only one that manages module reference count in this way. Suggested-by: Neil Brown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-28SUNRPC: Remove svc_shutdown_net()Gravatar Chuck Lever 1-2/+2
Clean up: svc_shutdown_net() now does nothing but call svc_close_net(). Replace all external call sites. svc_close_net() is renamed to be the inverse of svc_xprt_create(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-28SUNRPC: Rename svc_create_xprt()Gravatar Chuck Lever 1-2/+2
Clean up: Use the "svc_xprt_<task>" function naming convention as is used for other external APIs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-28SUNRPC: Remove svo_shutdown methodGravatar Chuck Lever 1-3/+2
Clean up. Neil observed that "any code that calls svc_shutdown_net() knows what the shutdown function should be, and so can call it directly." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de>
2022-02-28SUNRPC: Remove the .svo_enqueue_xprt methodGravatar Chuck Lever 1-1/+0
We have never been able to track down and address the underlying cause of the performance issues with workqueue-based service support. svo_enqueue_xprt is called multiple times per RPC, so it adds instruction path length, but always ends up at the same function: svc_xprt_do_enqueue(). We do not anticipate needing this flexibility for dynamic nfsd thread management support. As a micro-optimization, remove .svo_enqueue_xprt because Spectre/Meltdown makes virtual function calls more costly. This change essentially reverts commit b9e13cdfac70 ("nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-17Merge branch 'signal-for-v5.17' of ↵Gravatar Linus Torvalds 1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal/exit/ptrace updates from Eric Biederman: "This set of changes deletes some dead code, makes a lot of cleanups which hopefully make the code easier to follow, and fixes bugs found along the way. The end-game which I have not yet reached yet is for fatal signals that generate coredumps to be short-circuit deliverable from complete_signal, for force_siginfo_to_task not to require changing userspace configured signal delivery state, and for the ptrace stops to always happen in locations where we can guarantee on all architectures that the all of the registers are saved and available on the stack. Removal of profile_task_ext, profile_munmap, and profile_handoff_task are the big successes for dead code removal this round. A bunch of small bug fixes are included, as most of the issues reported were small enough that they would not affect bisection so I simply added the fixes and did not fold the fixes into the changes they were fixing. There was a bug that broke coredumps piped to systemd-coredump. I dropped the change that caused that bug and replaced it entirely with something much more restrained. Unfortunately that required some rebasing. Some successes after this set of changes: There are few enough calls to do_exit to audit in a reasonable amount of time. The lifetime of struct kthread now matches the lifetime of struct task, and the pointer to struct kthread is no longer stored in set_child_tid. The flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is removed. Issues where task->exit_code was examined with signal->group_exit_code should been examined were fixed. There are several loosely related changes included because I am cleaning up and if I don't include them they will probably get lost. The original postings of these changes can be found at: https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org I trimmed back the last set of changes to only the obviously correct once. Simply because there was less time for review than I had hoped" * 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits) ptrace/m68k: Stop open coding ptrace_report_syscall ptrace: Remove unused regs argument from ptrace_report_syscall ptrace: Remove second setting of PT_SEIZED in ptrace_attach taskstats: Cleanup the use of task->exit_code exit: Use the correct exit_code in /proc/<pid>/stat exit: Fix the exit_code for wait_task_zombie exit: Coredumps reach do_group_exit exit: Remove profile_handoff_task exit: Remove profile_task_exit & profile_munmap signal: clean up kernel-doc comments signal: Remove the helper signal_group_exit signal: Rename group_exit_task group_exec_task coredump: Stop setting signal->group_exit_task signal: Remove SIGNAL_GROUP_COREDUMP signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process signal: Make coredump handling explicit in complete_signal signal: Have prepare_signal detect coredumps using signal->core_state signal: Have the oom killer detect coredumps using signal->core_state exit: Move force_uaccess back into do_exit exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit ...
2021-12-13lockd: use svc_set_num_threads() for thread start and stopGravatar NeilBrown 1-50/+8
svc_set_num_threads() does everything that lockd_start_svc() does, except set sv_maxconn. It also (when passed 0) finds the threads and stops them with kthread_stop(). So move the setting for sv_maxconn, and use svc_set_num_thread() We now don't need nlmsvc_task. Now that we use svc_set_num_threads() it makes sense to set svo_module. This request that the thread exists with module_put_and_exit(). Also fix the documentation for svo_module to make this explicit. svc_prepare_thread is now only used where it is defined, so it can be made static. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: rename lockd_create_svc() to lockd_get()Gravatar NeilBrown 1-6/+4
lockd_create_svc() already does an svc_get() if the service already exists, so it is more like a "get" than a "create". So: - Move the increment of nlmsvc_users into the function as well - rename to lockd_get(). It is now the inverse of lockd_put(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: introduce lockd_put()Gravatar NeilBrown 1-37/+27
There is some cleanup that is duplicated in lockd_down() and the failure path of lockd_up(). Factor these out into a new lockd_put() and call it from both places. lockd_put() does *not* take the mutex - that must be held by the caller. It decrements nlmsvc_users and if that reaches zero, it cleans up. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: move svc_exit_thread() into the threadGravatar NeilBrown 1-11/+12
The normal place to call svc_exit_thread() is from the thread itself just before it exists. Do this for lockd. This means that nlmsvc_rqst is not used out side of lockd_start_svc(), so it can be made local to that function, and renamed to 'rqst'. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: move lockd_start_svc() call into lockd_create_svc()Gravatar NeilBrown 1-12/+10
lockd_start_svc() only needs to be called once, just after the svc is created. If the start fails, the svc is discarded too. It thus makes sense to call lockd_start_svc() from lockd_create_svc(). This allows us to remove the test against nlmsvc_rqst at the start of lockd_start_svc() - it must always be NULL. lockd_up() only held an extra reference on the svc until a thread was created - then it dropped it. The thread - and thus the extra reference - will remain until kthread_stop() is called. Now that the thread is created in lockd_create_svc(), the extra reference can be dropped there. So the 'serv' variable is no longer needed in lockd_up(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: simplify management of network status notifiersGravatar NeilBrown 1-26/+9
Now that the network status notifiers use nlmsvc_serv rather then nlmsvc_rqst the management can be simplified. Notifier unregistration synchronises with any pending notifications so providing we unregister before nlm_serv is freed no further interlock is required. So we move the unregister call to just before the thread is killed (which destroys the service) and just before the service is destroyed in the failure-path of lockd_up(). Then nlm_ntf_refcnt and nlm_ntf_wq can be removed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13lockd: introduce nlmsvc_servGravatar NeilBrown 1-16/+20
lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it wants the 'struct svc_serv', and when it doesn't want it exactly it can get to what it wants from the serv. This patch is a first step to removing nlmsvc_task and nlmsvc_rqst. It introduces nlmsvc_serv to store the 'struct svc_serv*'. This is set as soon as the serv is created, and cleared only when it is destroyed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13SUNRPC: stop using ->sv_nrthreads as a refcountGravatar NeilBrown 1-4/+0
The use of sv_nrthreads as a general refcount results in clumsy code, as is seen by various comments needed to explain the situation. This patch introduces a 'struct kref' and uses that for reference counting, leaving sv_nrthreads to be a pure count of threads. The kref is managed particularly in svc_get() and svc_put(), and also nfsd_put(); svc_destroy() now takes a pointer to the embedded kref, rather than to the serv. nfsd allows the svc_serv to exist with ->sv_nrhtreads being zero. This happens when a transport is created before the first thread is started. To support this, a 'keep_active' flag is introduced which holds a ref on the svc_serv. This is set when any listening socket is successfully added (unless there are running threads), and cleared when the number of threads is set. So when the last thread exits, the nfs_serv will be destroyed. The use of 'keep_active' replaces previous code which checked if there were any permanent sockets. We no longer clear ->rq_server when nfsd() exits. This was done to prevent svc_exit_thread() from calling svc_destroy(). Instead we take an extra reference to the svc_serv to prevent svc_destroy() from being called. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13SUNRPC/NFSD: clean up get/put functions.Gravatar NeilBrown 1-5/+1
svc_destroy() is poorly named - it doesn't necessarily destroy the svc, it might just reduce the ref count. nfsd_destroy() is poorly named for the same reason. This patch: - removes the refcount functionality from svc_destroy(), moving it to a new svc_put(). Almost all previous callers of svc_destroy() now call svc_put(). - renames nfsd_destroy() to nfsd_put() and improves the code, using the new svc_destroy() rather than svc_put() - removes a few comments that explain the important for balanced get/put calls. This should be obvious. The only non-trivial part of this is that svc_destroy() would call svc_sock_update() on a non-final decrement. It can no longer do that, and svc_put() isn't really a good place of it. This call is now made from svc_exit_thread() which seems like a good place. This makes the call *before* sv_nrthreads is decremented rather than after. This is not particularly important as the call just sets a flag which causes sv_nrthreads set be checked later. A subsequent patch will improve the ordering. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-12-13SUNRPC: change svc_get() to return the svc.Gravatar NeilBrown 1-4/+2
It is common for 'get' functions to return the object that was 'got', and there are a couple of places where users of svc_get() would be a little simpler if svc_get() did that. Make it so. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-10-13SUNRPC: Replace the "__be32 *p" parameter to .pc_encodeGravatar Chuck Lever 1-2/+1
The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR encoder, and can be removed. Note also that there is a line in each encoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per encoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-13SUNRPC: Replace the "__be32 *p" parameter to .pc_decodeGravatar Chuck Lever 1-2/+1
The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR decoder, and can be removed. Note also that there is a line in each decoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per decoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-09-04Merge tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfsGravatar Linus Torvalds 1-0/+2
Pull NFS client updates from Anna Schumaker: "New Features: - Better client responsiveness when server isn't replying - Use refcount_t in sunrpc rpc_client refcount tracking - Add srcaddr and dst_port to the sunrpc sysfs info files - Add basic support for connection sharing between servers with multiple NICs` Bugfixes and Cleanups: - Sunrpc tracepoint cleanups - Disconnect after ib_post_send() errors to avoid deadlocks - Fix for tearing down rpcrdma_reps - Fix a potential pNFS layoutget livelock loop - pNFS layout barrier fixes - Fix a potential memory corruption in rpc_wake_up_queued_task_set_status() - Fix reconnection locking - Fix return value of get_srcport() - Remove rpcrdma_post_sends() - Remove pNFS dead code - Remove copy size restriction for inter-server copies - Overhaul the NFS callback service - Clean up sunrpc TCP socket shutdowns - Always provide aligned buffers to RPC read layers" * tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (39 commits) NFS: Always provide aligned buffers to the RPC read layers NFSv4.1 add network transport when session trunking is detected SUNRPC enforce creation of no more than max_connect xprts NFSv4 introduce max_connect mount options SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfs SUNRPC keep track of number of transports to unique addresses NFSv3: Delete duplicate judgement in nfs3_async_handle_jukebox SUNRPC: Tweak TCP socket shutdown in the RPC client SUNRPC: Simplify socket shutdown when not reusing TCP ports NFSv4.2: remove restriction of copy size for inter-server copy. NFS: Clean up the synopsis of callback process_op() NFS: Extract the xdr_init_encode/decode() calls from decode_compound NFS: Remove unused callback void decoder NFS: Add a private local dispatcher for NFSv4 callback operations SUNRPC: Eliminate the RQ_AUTHERR flag SUNRPC: Set rq_auth_stat in the pg_authenticate() callout SUNRPC: Add svc_rqst::rq_auth_stat SUNRPC: Add dst_port to the sysfs xprt info file SUNRPC: Add srcaddr as a file in sysfs sunrpc: Fix return value of get_srcport() ...
2021-08-17lockd: change the proc_handler for nsm_use_hostnamesGravatar Jia He 1-1/+1
nsm_use_hostnames is a module parameter and it will be exported to sysctl procfs. This is to let user sometimes change it from userspace. But the minimal unit for sysctl procfs read/write it sizeof(int). In big endian system, the converting from/to bool to/from int will cause error for proc items. This patch use a new proc_handler proc_dobool to fix it. Signed-off-by: Jia He <hejianet@gmail.com> Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> [thuth: Fix typo in commit message] Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-10SUNRPC: Set rq_auth_stat in the pg_authenticate() calloutGravatar Chuck Lever 1-0/+2
In a few moments, rq_auth_stat will need to be explicitly set to rpc_auth_ok before execution gets to the dispatcher. svc_authenticate() already sets it, but it often gets reset to rpc_autherr_badcred right after that call, even when authentication is successful. Let's ensure that the pg_authenticate callout and svc_set_client() set it properly in every case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-07-06lockd: Create a simplified .vs_dispatch method for NLM requestsGravatar Chuck Lever 1-0/+43
To enable xdr_stream-based encoding and decoding, create a bespoke RPC dispatch function for the lockd service. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-05-21treewide: Add SPDX license identifier for more missed filesGravatar Thomas Gleixner 1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have MODULE_LICENCE("GPL*") inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-24lockd: Pass the user cred from knfsd when starting the lockd serverGravatar Trond Myklebust 1-12/+16
When starting up a new knfsd server, pass the user cred to the supporting lockd server. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24SUNRPC: Cache the process user cred in the RPC server listenerGravatar Trond Myklebust 1-1/+2
In order to be able to interpret uids and gids correctly in knfsd, we should cache the user namespace of the process that created the RPC server's listener. To do so, we refcount the credential of that process. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24SUNRPC: Allow further customisation of RPC program registrationGravatar Trond Myklebust 1-0/+1
Add a callback to allow customisation of the rpcbind registration. When clients have the ability to turn on and off version support, we want to allow them to also prevent registration of those versions with the rpc portmapper. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24SUNRPC: Add a callback to initialise server requestsGravatar Trond Myklebust 1-1/+2
Add a callback to help initialise server requests before they are processed. This will allow us to clean up the NFS server version support, and to make it container safe. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19lockd: make nlm_ntf_refcnt and nlm_ntf_wq staticGravatar Colin Ian King 1-2/+2
The variables nlm_ntf_refcnt and nlm_ntf_wq are local to the source and do not need to be in global scope, so make them static. Cleans up sparse warnings: fs/lockd/svc.c:60:10: warning: symbol 'nlm_ntf_refcnt' was not declared. Should it be static? fs/lockd/svc.c:61:1: warning: symbol 'nlm_ntf_wq' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27race of lockd inetaddr notifiers vs nlmsvc_rqst changeGravatar Vasily Averin 1-2/+14
lockd_inet[6]addr_event use nlmsvc_rqst without taken nlmsvc_mutex, nlmsvc_rqst can be changed during execution of notifiers and crash the host. Patch enables access to nlmsvc_rqst only when it was correctly initialized and delays its cleanup until notifiers are no longer in use. Note that nlmsvc_rqst can be temporally set to ERR_PTR, so the "if (nlmsvc_rqst)" check in notifiers is insufficient on its own. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27lockd: lost rollback of set_grace_period() in lockd_down_net()Gravatar Vasily Averin 1-0/+2
Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect, it removes lockd_manager and disarm grace_period_end for init_net only. If nfsd was started from another net namespace lockd_up_net() calls set_grace_period() that adds lockd_manager into per-netns list and queues grace_period_end delayed work. These action should be reverted in lockd_down_net(). Otherwise it can lead to double list_add on after restart nfsd in netns, and to use-after-free if non-disarmed delayed work will be executed after netns destroy. Fixes: efda760fe95e ("lockd: fix lockd shutdown race") Cc: stable@vger.kernel.org Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27lockd: added cleanup checks in exit_net hookGravatar Vasily Averin 1-0/+11
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27lockd: remove net pointer from messagesGravatar Vasily Averin 1-4/+5
Publishing of net pointer is not safe, use net->ns.inum as net ID in debug messages [ 171.757678] lockd_up_net: per-net data created; net=f00001e7 [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) [ 300.653313] lockd: nuking all hosts in net f00001e7... [ 300.653641] lockd: host garbage collection for net f00001e7 [ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7 [ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7 [ 300.711847] lockd: nuking all hosts in net 0... [ 300.711847] lockd: host garbage collection for net 0 [ 300.711848] lockd: nlmsvc_mark_resources for net 0 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-18Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linuxGravatar Linus Torvalds 1-11/+9
Pull nfsd updates from Bruce Fields: "Lots of good bugfixes, including: - fix a number of races in the NFSv4+ state code - fix some shutdown crashes in multiple-network-namespace cases - relax our 4.1 session limits; if you've an artificially low limit to the number of 4.1 clients that can mount simultaneously, try upgrading" * tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits) SUNRPC: Improve ordering of transport processing nfsd: deal with revoked delegations appropriately svcrdma: Enqueue after setting XPT_CLOSE in completion handlers nfsd: use nfs->ns.inum as net ID rpc: remove some BUG()s svcrdma: Preserve CB send buffer across retransmits nfds: avoid gettimeofday for nfssvc_boot time fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t lockd: double unregister of inetaddr notifiers nfsd4: catch some false session retries nfsd4: fix cached replies to solo SEQUENCE compounds sunrcp: make function _svc_create_xprt static SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status nfsd: use ARRAY_SIZE nfsd: give out fewer session slots as limit approaches nfsd: increase DRC cache limit nfsd: remove unnecessary nofilehandle checks nfs_common: convert int to bool ...
2017-11-07lockd: double unregister of inetaddr notifiersGravatar Vasily Averin 1-11/+9
lockd_up() can call lockd_unregister_notifiers twice: inside lockd_start_svc() when it calls lockd_svc_exit_thread() and then in error path of lockd_up() Patch forces lockd_start_svc() to unregister notifiers in all error cases and removes extra unregister in error path of lockd_up(). Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..." Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-10-31treewide: Fix function prototypes for module_param_call()Gravatar Kees Cook 1-1/+1
Several function prototypes for the set/get functions defined by module_param_call() have a slightly wrong argument types. This fixes those in an effort to clean up the calls when running under type-enforced compiler instrumentation for CFI. This is the result of running the following semantic patch: @match_module_param_call_function@ declarer name module_param_call; identifier _name, _set_func, _get_func; expression _arg, _mode; @@ module_param_call(_name, _set_func, _get_func, _arg, _mode); @fix_set_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._set_func; identifier _val, _param; type _val_type, _param_type; @@ int _set_func( -_val_type _val +const char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } @fix_get_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._get_func; identifier _val, _param; type _val_type, _param_type; @@ int _get_func( -_val_type _val +char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } Two additional by-hand changes are included for places where the above Coccinelle script didn't notice them: drivers/platform/x86/thinkpad_acpi.c fs/lockd/svc.c Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-08-24sunrpc: Const-ify struct sv_serv_opsGravatar Chuck Lever 1-1/+1
Close an attack vector by moving the arrays of per-server methods to read-only memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-05-15sunrpc: mark all struct svc_version instances as constGravatar Christoph Hellwig 1-19/+19
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15sunrpc: move pc_count out of struct svc_procinfoGravatar Christoph Hellwig 1-0/+6
pc_count is the only writeable memeber of struct svc_procinfo, which is a good candidate to be const-ified as it contains function pointers. This patch moves it into out out struct svc_procinfo, and into a separate writable array that is pointed to by struct svc_version. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-08lockd: fix lockd shutdown raceGravatar J. Bruce Fields 1-2/+4
As reported by David Jeffery: "a signal was sent to lockd while lockd was shutting down from a request to stop nfs. The signal causes lockd to call restart_grace() which puts the lockd_net structure on the grace list. If this signal is received at the wrong time, it will occur after lockd_down_net() has called locks_end_grace() but before lockd_down_net() stops the lockd thread. This leads to lockd putting the lockd_net structure back on the grace list, then exiting without anything removing it from the list." So, perform the final locks_end_grace() from the the lockd thread; this ensures it's serialized with respect to restart_grace(). Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Gravatar Ingo Molnar 1-1/+1
<linux/sched/signal.h> We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31lockd: initialize sin6_scope_id in lockd_inet6addr_event()Gravatar Scott Mayhew 1-0/+2
I noticed this was missing when I was testing with link local addresses. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-11-18netns: make struct pernet_operations::id unsigned intGravatar Alexey Dobriyan 1-1/+1
Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>