aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2017-03-02statx: Add a system call to make enhanced file info availableGravatar David Howells 6-15/+185
Add a system call to make extended file information available, including file creation and some attribute flags where available through the underlying filesystem. The getattr inode operation is altered to take two additional arguments: a u32 request_mask and an unsigned int flags that indicate the synchronisation mode. This change is propagated to the vfs_getattr*() function. Functions like vfs_stat() are now inline wrappers around new functions vfs_statx() and vfs_statx_fd() to reduce stack usage. ======== OVERVIEW ======== The idea was initially proposed as a set of xattrs that could be retrieved with getxattr(), but the general preference proved to be for a new syscall with an extended stat structure. A number of requests were gathered for features to be included. The following have been included: (1) Make the fields a consistent size on all arches and make them large. (2) Spare space, request flags and information flags are provided for future expansion. (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an __s64). (4) Creation time: The SMB protocol carries the creation time, which could be exported by Samba, which will in turn help CIFS make use of FS-Cache as that can be used for coherency data (stx_btime). This is also specified in NFSv4 as a recommended attribute and could be exported by NFSD [Steve French]. (5) Lightweight stat: Ask for just those details of interest, and allow a netfs (such as NFS) to approximate anything not of interest, possibly without going to the server [Trond Myklebust, Ulrich Drepper, Andreas Dilger] (AT_STATX_DONT_SYNC). (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks its cached attributes are up to date [Trond Myklebust] (AT_STATX_FORCE_SYNC). And the following have been left out for future extension: (7) Data version number: Could be used by userspace NFS servers [Aneesh Kumar]. Can also be used to modify fill_post_wcc() in NFSD which retrieves i_version directly, but has just called vfs_getattr(). It could get it from the kstat struct if it used vfs_xgetattr() instead. (There's disagreement on the exact semantics of a single field, since not all filesystems do this the same way). (8) BSD stat compatibility: Including more fields from the BSD stat such as creation time (st_btime) and inode generation number (st_gen) [Jeremy Allison, Bernd Schubert]. (9) Inode generation number: Useful for FUSE and userspace NFS servers [Bernd Schubert]. (This was asked for but later deemed unnecessary with the open-by-handle capability available and caused disagreement as to whether it's a security hole or not). (10) Extra coherency data may be useful in making backups [Andreas Dilger]. (No particular data were offered, but things like last backup timestamp, the data version number and the DOS archive bit would come into this category). (11) Allow the filesystem to indicate what it can/cannot provide: A filesystem can now say it doesn't support a standard stat feature if that isn't available, so if, for instance, inode numbers or UIDs don't exist or are fabricated locally... (This requires a separate system call - I have an fsinfo() call idea for this). (12) Store a 16-byte volume ID in the superblock that can be returned in struct xstat [Steve French]. (Deferred to fsinfo). (13) Include granularity fields in the time data to indicate the granularity of each of the times (NFSv4 time_delta) [Steve French]. (Deferred to fsinfo). (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. Note that the Linux IOC flags are a mess and filesystems such as Ext4 define flags that aren't in linux/fs.h, so translation in the kernel may be a necessity (or, possibly, we provide the filesystem type too). (Some attributes are made available in stx_attributes, but the general feeling was that the IOC flags were to ext[234]-specific and shouldn't be exposed through statx this way). (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer, Michael Kerrisk]. (Deferred, probably to fsinfo. Finding out if there's an ACL or seclabal might require extra filesystem operations). (16) Femtosecond-resolution timestamps [Dave Chinner]. (A __reserved field has been left in the statx_timestamp struct for this - if there proves to be a need). (17) A set multiple attributes syscall to go with this. =============== NEW SYSTEM CALL =============== The new system call is: int ret = statx(int dfd, const char *filename, unsigned int flags, unsigned int mask, struct statx *buffer); The dfd, filename and flags parameters indicate the file to query, in a similar way to fstatat(). There is no equivalent of lstat() as that can be emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is also no equivalent of fstat() as that can be emulated by passing a NULL filename to statx() with the fd of interest in dfd. Whether or not statx() synchronises the attributes with the backing store can be controlled by OR'ing a value into the flags argument (this typically only affects network filesystems): (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this respect. (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise its attributes with the server - which might require data writeback to occur to get the timestamps correct. (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a network filesystem. The resulting values should be considered approximate. mask is a bitmask indicating the fields in struct statx that are of interest to the caller. The user should set this to STATX_BASIC_STATS to get the basic set returned by stat(). It should be noted that asking for more information may entail extra I/O operations. buffer points to the destination for the data. This must be 256 bytes in size. ====================== MAIN ATTRIBUTES RECORD ====================== The following structures are defined in which to return the main attribute set: struct statx_timestamp { __s64 tv_sec; __s32 tv_nsec; __s32 __reserved; }; struct statx { __u32 stx_mask; __u32 stx_blksize; __u64 stx_attributes; __u32 stx_nlink; __u32 stx_uid; __u32 stx_gid; __u16 stx_mode; __u16 __spare0[1]; __u64 stx_ino; __u64 stx_size; __u64 stx_blocks; __u64 __spare1[1]; struct statx_timestamp stx_atime; struct statx_timestamp stx_btime; struct statx_timestamp stx_ctime; struct statx_timestamp stx_mtime; __u32 stx_rdev_major; __u32 stx_rdev_minor; __u32 stx_dev_major; __u32 stx_dev_minor; __u64 __spare2[14]; }; The defined bits in request_mask and stx_mask are: STATX_TYPE Want/got stx_mode & S_IFMT STATX_MODE Want/got stx_mode & ~S_IFMT STATX_NLINK Want/got stx_nlink STATX_UID Want/got stx_uid STATX_GID Want/got stx_gid STATX_ATIME Want/got stx_atime{,_ns} STATX_MTIME Want/got stx_mtime{,_ns} STATX_CTIME Want/got stx_ctime{,_ns} STATX_INO Want/got stx_ino STATX_SIZE Want/got stx_size STATX_BLOCKS Want/got stx_blocks STATX_BASIC_STATS [The stuff in the normal stat struct] STATX_BTIME Want/got stx_btime{,_ns} STATX_ALL [All currently available stuff] stx_btime is the file creation time, stx_mask is a bitmask indicating the data provided and __spares*[] are where as-yet undefined fields can be placed. Time fields are structures with separate seconds and nanoseconds fields plus a reserved field in case we want to add even finer resolution. Note that times will be negative if before 1970; in such a case, the nanosecond fields will also be negative if not zero. The bits defined in the stx_attributes field convey information about a file, how it is accessed, where it is and what it does. The following attributes map to FS_*_FL flags and are the same numerical value: STATX_ATTR_COMPRESSED File is compressed by the fs STATX_ATTR_IMMUTABLE File is marked immutable STATX_ATTR_APPEND File is append-only STATX_ATTR_NODUMP File is not to be dumped STATX_ATTR_ENCRYPTED File requires key to decrypt in fs Within the kernel, the supported flags are listed by: KSTAT_ATTR_FS_IOC_FLAGS [Are any other IOC flags of sufficient general interest to be exposed through this interface?] New flags include: STATX_ATTR_AUTOMOUNT Object is an automount trigger These are for the use of GUI tools that might want to mark files specially, depending on what they are. Fields in struct statx come in a number of classes: (0) stx_dev_*, stx_blksize. These are local system information and are always available. (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino, stx_size, stx_blocks. These will be returned whether the caller asks for them or not. The corresponding bits in stx_mask will be set to indicate whether they actually have valid values. If the caller didn't ask for them, then they may be approximated. For example, NFS won't waste any time updating them from the server, unless as a byproduct of updating something requested. If the values don't actually exist for the underlying object (such as UID or GID on a DOS file), then the bit won't be set in the stx_mask, even if the caller asked for the value. In such a case, the returned value will be a fabrication. Note that there are instances where the type might not be valid, for instance Windows reparse points. (2) stx_rdev_*. This will be set only if stx_mode indicates we're looking at a blockdev or a chardev, otherwise will be 0. (3) stx_btime. Similar to (1), except this will be set to 0 if it doesn't exist. ======= TESTING ======= The following test program can be used to test the statx system call: samples/statx/test-statx.c Just compile and run, passing it paths to the files you want to examine. The file is built automatically if CONFIG_SAMPLES is enabled. Here's some example output. Firstly, an NFS directory that crosses to another FSID. Note that the AUTOMOUNT attribute is set because transiting this directory will cause d_automount to be invoked by the VFS. [root@andromeda ~]# /tmp/test-statx -A /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:26 Inode: 1703937 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------) Secondly, the result of automounting on that directory. [root@andromeda ~]# /tmp/test-statx /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:27 Inode: 2 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02Merge branch 'for-linus' of ↵Gravatar Linus Torvalds 1-13/+39
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile two from Al Viro: - orangefs fix - series of fs/namei.c cleanups from me - VFS stuff coming from overlayfs tree * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: orangefs: Use RCU for destroy_inode vfs: use helper for calling f_op->fsync() mm: use helper for calling f_op->mmap() vfs: use helpers for calling f_op->{read,write}_iter() vfs: pass type instead of fn to do_{loop,iter}_readv_writev() vfs: extract common parts of {compat_,}do_readv_writev() vfs: wrap write f_ops with file_{start,end}_write() vfs: deny copy_file_range() for non regular files vfs: deny fallocate() on directory vfs: create vfs helper vfs_tmpfile() namei.c: split unlazy_walk() namei.c: fold the check for DCACHE_OP_REVALIDATE into d_revalidate() lookup_fast(): clean up the logics around the fallback to non-rcu mode namei: fold unlazy_link() into its sole caller
2017-03-02Merge branch 'for-next' of ↵Gravatar Linus Torvalds 4-17/+25
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - enable dual mode (initiator + target) qla2xxx operation. (Quinn + Himanshu) - add a framework for qla2xxx async fabric discovery. (Quinn + Himanshu) - enable iscsi PDU DDP completion offload in cxgbit/T6 NICs. (Varun) - fix target-core handling of aborted failed commands. (Bart) - fix a long standing target-core issue NULL pointer dereference with active I/O LUN shutdown. (Rob Millner + Bryant + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (44 commits) target: Add counters for ABORT_TASK success + failure iscsi-target: Fix early login failure statistics misses target: Fix NULL dereference during LUN lookup + active I/O shutdown target: Delete tmr from list before processing target: Fix handling of aborted failed commands uapi: fix linux/target_core_user.h userspace compilation errors target: export protocol identifier qla2xxx: Fix a warning reported by the "smatch" static checker target/iscsi: Fix unsolicited data seq_end_offset calculation target/cxgbit: add T6 iSCSI DDP completion feature target/cxgbit: Enable DDP for T6 only if data sequence and pdu are in order target/cxgbit: Use T6 specific macros to get ETH/IP hdr len target/cxgbit: use cxgb4_tp_smt_idx() to get smt idx target/iscsi: split iscsit_check_dataout_hdr() target: Remove command flag CMD_T_DEV_ACTIVE target: Remove command flag CMD_T_BUSY target: Move session check from target_put_sess_cmd() into target_release_cmd_kref() target: Inline transport_cmd_check_stop() target: Remove an overly chatty debug message target: Stop execution if CMD_T_STOP has been set ...
2017-03-02Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostGravatar Linus Torvalds 6-6/+20
Pull vhost updates from Michael Tsirkin: "virtio, vhost: optimizations, fixes Looks like a quiet cycle for vhost/virtio, just a couple of minor tweaks. Most notable is automatic interrupt affinity for blk and scsi. Hopefully other devices are not far behind" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-console: avoid DMA from stack vhost: introduce O(1) vq metadata cache virtio_scsi: use virtio IRQ affinity virtio_blk: use virtio IRQ affinity blk-mq: provide a default queue mapping for virtio device virtio: provide a method to get the IRQ affinity mask for a virtqueue virtio: allow drivers to request IRQ affinity when creating VQs virtio_pci: simplify MSI-X setup virtio_pci: don't duplicate the msix_enable flag in struct pci_dev virtio_pci: use shared interrupts for virtqueues virtio_pci: remove struct virtio_pci_vq_info vhost: try avoiding avail index access when getting descriptor virtio_mmio: expose header to userspace
2017-03-02Merge branch 'for-linus' of ↵Gravatar Linus Torvalds 2-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem fixes from James Morris: "Two fixes for the security subsystem: - keys: split both rcu_dereference_key() and user_key_payload() into versions which can be called with or without holding the key semaphore. - SELinux: fix Android init(8) breakage due to new cgroup security labeling support when using older policy" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: selinux: wrap cgroup seclabel support with its own policy capability KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()
2017-03-02give up on gcc ilog2() constant optimizationsGravatar Linus Torvalds 1-11/+2
gcc-7 has an "optimization" pass that completely screws up, and generates the code expansion for the (impossible) case of calling ilog2() with a zero constant, even when the code gcc compiles does not actually have a zero constant. And we try to generate a compile-time error for anybody doing ilog2() on a constant where that doesn't make sense (be it zero or negative). So now gcc7 will fail the build due to our sanity checking, because it created that constant-zero case that didn't actually exist in the source code. There's a whole long discussion on the kernel mailing about how to work around this gcc bug. The gcc people themselevs have discussed their "feature" in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72785 but it's all water under the bridge, because while it looked at one point like it would be solved by the time gcc7 was released, that was not to be. So now we have to deal with this compiler braindamage. And the only simple approach seems to be to just delete the code that tries to warn about bad uses of ilog2(). So now "ilog2()" will just return 0 not just for the value 1, but for any non-positive value too. It's not like I can recall anybody having ever actually tried to use this function on any invalid value, but maybe the sanity check just meant that such code never made it out in public. Reported-by: Laura Abbott <labbott@redhat.com> Cc: John Stultz <john.stultz@linaro.org>, Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-02Merge remote-tracking branch 'ovl/for-viro' into for-linusGravatar Al Viro 1-13/+39
Overlayfs-related series from Miklos and Amir
2017-03-01Merge branch 'core-urgent-for-linus' of ↵Gravatar Linus Torvalds 2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool relocation fixes from Ingo Molnar: "Two fixes related to the module loading regression introduced by the recent objtool changes" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool, modules: Discard objtool annotation sections for modules objtool, compiler.h: Fix __unreachable section relocation size
2017-03-01Merge tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfsGravatar Linus Torvalds 8-39/+235
Pull NFS client updates from Anna Schumaker: "Highlights include: Stable bugfixes: - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state - xprtrdma: Fix Read chunk padding - xprtrdma: Per-connection pad optimization - xprtrdma: Disable pad optimization by default - xprtrdma: Reduce required number of send SGEs - nlm: Ensure callback code also checks that the files match - pNFS/flexfiles: If the layout is invalid, it must be updated before retrying - NFSv4: Fix reboot recovery in copy offload - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE" - NFSv4: fix getacl head length estimation - NFSv4: fix getacl ERANGE for sum ACL buffer sizes Features: - Add and use dprintk_cont macros - Various cleanups to NFS v4.x to reduce code duplication and complexity - Remove unused cr_magic related code - Improvements to sunrpc "read from buffer" code - Clean up sunrpc timeout code and allow changing TCP timeout parameters - Remove duplicate mw_list management code in xprtrdma - Add generic functions for encoding and decoding xdr streams Bugfixes: - Clean up nfs_show_mountd_netid - Make layoutreturn_ops static and use NULL instead of 0 to fix sparse warnings - Properly handle -ERESTARTSYS in nfs_rename() - Check if register_shrinker() failed during rpcauth_init() - Properly clean up procfs/pipefs entries - Various NFS over RDMA related fixes - Silence unititialized variable warning in sunrpc" * tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits) NFSv4: fix getacl ERANGE for some ACL buffer sizes NFSv4: fix getacl head length estimation Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE" NFSv4: Fix reboot recovery in copy offload pNFS/flexfiles: If the layout is invalid, it must be updated before retrying NFSv4: Clean up owner/group attribute decode SUNRPC: Add a helper function xdr_stream_decode_string_dup() NFSv4: Remove bogus "struct nfs_client" argument from decode_ace() NFSv4: Fix the underestimation of delegation XDR space reservation NFSv4: Replace callback string decode function with a generic NFSv4: Replace the open coded decode_opaque_inline() with the new generic NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics SUNRPC: Add generic helpers for xdr_stream encode/decode sunrpc: silence uninitialized variable warning nlm: Ensure callback code also checks that the files match sunrpc: Allow xprt->ops->timer method to sleep xprtrdma: Refactor management of mw_list field xprtrdma: Handle stale connection rejection xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs xprtrdma: Shrink send SGEs array ...
2017-03-01Merge tag 'for-f2fs-4.11' of ↵Gravatar Linus Torvalds 2-56/+103
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This round introduces several interesting features such as on-disk NAT bitmaps, IO alignment, and a discard thread. And it includes a couple of major bug fixes as below. Enhancements: - introduce on-disk bitmaps to avoid scanning NAT blocks when getting free nids - support IO alignment to prepare open-channel SSD integration in future - introduce a discard thread to avoid long latency during checkpoint and fstrim - use SSR for warm node and enable inline_xattr by default - introduce in-memory bitmaps to check FS consistency for debugging - improve write_begin by avoiding needless read IO Bug fixes: - fix broken zone_reset behavior for SMR drive - fix wrong victim selection policy during GC - fix missing behavior when preparing discard commands - fix bugs in atomic write support and fiemap - workaround to handle multiple f2fs_add_link calls having same name ... and it includes a bunch of clean-up patches as well" * tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits) f2fs: avoid to flush nat journal entries f2fs: avoid to issue redundant discard commands f2fs: fix a plint compile warning f2fs: add f2fs_drop_inode tracepoint f2fs: Fix zoned block device support f2fs: remove redundant set_page_dirty() f2fs: fix to enlarge size of write_io_dummy mempool f2fs: fix memory leak of write_io_dummy mempool during umount f2fs: fix to update F2FS_{CP_}WB_DATA count correctly f2fs: use MAX_FREE_NIDS for the free nids target f2fs: introduce free nid bitmap f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint f2fs: update the comment of default nr_pages to skipping f2fs: drop the duplicate pval in f2fs_getxattr f2fs: Don't update the xattr data that same as the exist f2fs: kill __is_extent_same f2fs: avoid bggc->fggc when enough free segments are avaliable after cp f2fs: select target segment with closer temperature in SSR mode f2fs: show simple call stack in fault injection message f2fs: no need lock_op in f2fs_write_inline_data ...
2017-03-02KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()Gravatar David Howells 2-3/+11
rcu_dereference_key() and user_key_payload() are currently being used in two different, incompatible ways: (1) As a wrapper to rcu_dereference() - when only the RCU read lock used to protect the key. (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is used to protect the key and the may be being modified. Fix this by splitting both of the key wrappers to produce: (1) RCU accessors for keys when caller has the key semaphore locked: dereference_key_locked() user_key_payload_locked() (2) RCU accessors for keys when caller holds the RCU read lock: dereference_key_rcu() user_key_payload_rcu() This should fix following warning in the NFS idmapper =============================== [ INFO: suspicious RCU usage. ] 4.10.0 #1 Tainted: G W ------------------------------- ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 1 lock held by mount.nfs/5987: #0: (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4] stack backtrace: CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G W 4.10.0 #1 Call Trace: dump_stack+0xe8/0x154 (unreliable) lockdep_rcu_suspicious+0x140/0x190 nfs_idmap_get_key+0x380/0x420 [nfsv4] nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4] decode_getfattr_attrs+0xfac/0x16b0 [nfsv4] decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4] nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4] rpcauth_unwrap_resp+0xe8/0x140 [sunrpc] call_decode+0x29c/0x910 [sunrpc] __rpc_execute+0x140/0x8f0 [sunrpc] rpc_run_task+0x170/0x200 [sunrpc] nfs4_call_sync_sequence+0x68/0xa0 [nfsv4] _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4] nfs4_lookup_root+0xe0/0x350 [nfsv4] nfs4_lookup_root_sec+0x70/0xa0 [nfsv4] nfs4_find_root_sec+0xc4/0x100 [nfsv4] nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4] nfs4_get_rootfh+0x6c/0x190 [nfsv4] nfs4_server_common_setup+0xc4/0x260 [nfsv4] nfs4_create_server+0x278/0x3c0 [nfsv4] nfs4_remote_mount+0x50/0xb0 [nfsv4] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 nfs_do_root_mount+0xb0/0x140 [nfsv4] nfs4_try_mount+0x60/0x100 [nfsv4] nfs_fs_mount+0x5ec/0xda0 [nfs] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 do_mount+0x254/0xf70 SyS_mount+0x94/0x100 system_call+0x38/0xe0 Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-01objtool, modules: Discard objtool annotation sections for modulesGravatar Josh Poimboeuf 2-2/+2
The '__unreachable' and '__func_stack_frame_non_standard' sections are only used at compile time. They're discarded for vmlinux but they should also be discarded for modules. Since this is a recurring pattern, prefix the section names with ".discard.". It's a nice convention and vmlinux.lds.h already discards such sections. Also remove the 'a' (allocatable) flag from the __unreachable section since it doesn't make sense for a discarded section. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170301180444.lhd53c5tibc4ns77@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01Merge branch 'next' of ↵Gravatar Linus Torvalds 1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: - add thermal driver for R-Car Gen3 thermal sensors. - add thermal driver for ZTE' zx2967 family thermal sensors. - convert thermal ID allocation from IDR to IDA. - fix a possible NULL dereference in imx thermal driver. - fix a ti-soc-thermal driver dependency issue so that critical thermal control is still available when CPU_THERMAL is not defined. - update binding information for QorIQ thermal driver. - a couple of cleanups in thermal core, intel_powerclamp, exynos, dra752-thermal, mtk-thermal driver. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: powerpc/mpc85xx: Update TMU device tree node for T1023/T1024 powerpc/mpc85xx: Update TMU device tree node for T1040/T1042 dt-bindings: Update QorIQ TMU thermal bindings thermal: mtk_thermal: Staticise a number of data variables thermal: arm: dra752: Remove all TSHUT related definitions thermal: arm: dra752: Remove TSHUT configuration thermal: ti-soc-thermal: Remove CPU_THERMAL Dependency from TI_THERMAL thermal: imx: Fix possible NULL dereference. thermal: exynos: Remove parsing unused samsung,tmu_cal_mode property thermal: zx2967: add thermal driver for ZTE's zx2967 family thermal: use cpumask_var_t for on-stack cpu masks dt: bindings: add documentation for zx2967 family thermal sensor thermal/intel_powerclamp: Remove set-but-not-used variables thermal: rcar_gen3_thermal: Add R-Car Gen3 thermal driver thermal: rcar_gen3_thermal: Document the R-Car Gen3 thermal: convert devfreq_cooling to use an IDA thermal: convert cpu_cooling to use an IDA thermal: convert clock cooling to use an IDA thermal core: convert ID allocation to IDA
2017-03-01Merge tag 'pwm/for-4.11-rc1' of ↵Gravatar Linus Torvalds 1-18/+15
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "This set contains mostly fixes to existing drivers as well as cleanup of code that's not been in active use for a while" * tag 'pwm/for-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (27 commits) acpi: lpss: call pwm_add_table() for BSW PWM device pwm: Try to load modules during pwm_get() pwm: Don't hold pwm_lookup_lock longer than necessary pwm: Make the PWM_POLARITY flag in DTB optional pwm: Print error messages with pr_err() instead of pr_debug() pwm: imx: Add polarity inversion support to i.MX's PWMv2 pwm: imx: doc: Update imx-pwm.txt documentation entry pwm: imx: Remove redundant i.MX PWMv2 code pwm: imx: Provide atomic PWM support for i.MX PWMv2 pwm: imx: Move PWMv2 wait for fifo slot code to a separate function pwm: imx: Move PWMv2 software reset code to a separate function pwm: imx: Rewrite v1 code to facilitate switch to atomic PWM pwm: imx: Add separate set of PWM ops for v1 and v2 pwm: imx: Remove ipg clock and enable per clock when required pwm: lpss: Add Intel Gemini Lake PCI ID pwm: lpss: Do not export board infos for different PWM types pwm: lpss: Avoid reconfiguring while UPDATE bit is still enabled pwm: lpss: Switch to new atomic API pwm: lpss: Allow duty cycle to be 0 pwm: lpss: Avoid potential overflow of base_unit ...
2017-03-01objtool, compiler.h: Fix __unreachable section relocation sizeGravatar Josh Poimboeuf 1-1/+1
Linus reported the following commit broke module loading on his laptop: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") It showed errors like the following: module: overflow in relocation type 10 val ffffffffc02afc81 module: 'nvme' likely not compiled with -mcmodel=kernel The problem is that the __unreachable section addresses are stored using the '.long' asm directive, which isn't big enough for .text section kernel addresses. Use relative addresses instead: ".long %c0b - .\t\n" Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170301060504.oltm3iws6fmubnom@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-28Merge tag 'for-linus-4.11' of git://git.code.sf.net/p/openipmi/linux-ipmiGravatar Linus Torvalds 1-1/+1
Pull IPMI updates from Corey Minyard: "This is a few small fixes to the main IPMI driver, make some things const, fix typos, etc. The last patch came in about a week ago, but IMHO it's best to go in now. It is not for the main driver, it's for the bt-bmc driver, which runs on the managment controller side, not on the host side, so the scope is limited and the change is necessary" * tag 'for-linus-4.11' of git://git.code.sf.net/p/openipmi/linux-ipmi: ipmi: bt-bmc: Use a regmap for register access char: ipmi: constify ipmi_smi_handlers structures acpi:ipmi: Make IPMI user handler const ipmi: make ipmi_usr_hndl const Documentation: Fix a typo in IPMI.txt.
2017-02-28Merge branch 'idr-4.11' of git://git.infradead.org/users/willy/linux-daxGravatar Linus Torvalds 2-148/+179
Pull IDR rewrite from Matthew Wilcox: "The most significant part of the following is the patch to rewrite the IDR & IDA to be clients of the radix tree. But there's much more, including an enhancement of the IDA to be significantly more space efficient, an IDR & IDA test suite, some improvements to the IDR API (and driver changes to take advantage of those improvements), several improvements to the radix tree test suite and RCU annotations. The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree for most of the last cycle. Coupled with the IDR test suite, I feel pretty confident that any remaining bugs are quite hard to hit. 0-day did a great job of watching my git tree and pointing out problems; as it hit them, I added new test-cases to be sure not to be caught the same way twice" Willy goes on to expand a bit on the IDR rewrite rationale: "The radix tree and the IDR use very similar data structures. Merging the two codebases lets us share the memory allocation pools, and results in a net deletion of 500 lines of code. It also opens up the possibility of exposing more of the features of the radix tree to users of the IDR (and I have some interesting patches along those lines waiting for 4.12) It also shrinks the size of the 'struct idr' from 40 bytes to 24 which will shrink a fair few data structures that embed an IDR" * 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits) radix tree test suite: Add config option for map shift idr: Add missing __rcu annotations radix-tree: Fix __rcu annotations radix-tree: Add rcu_dereference and rcu_assign_pointer calls radix tree test suite: Run iteration tests for longer radix tree test suite: Fix split/join memory leaks radix tree test suite: Fix leaks in regression2.c radix tree test suite: Fix leaky tests radix tree test suite: Enable address sanitizer radix_tree_iter_resume: Fix out of bounds error radix-tree: Store a pointer to the root in each node radix-tree: Chain preallocated nodes through ->parent radix tree test suite: Dial down verbosity with -v radix tree test suite: Introduce kmalloc_verbose idr: Return the deleted entry from idr_remove radix tree test suite: Build separate binaries for some tests ida: Use exceptional entries for small IDAs ida: Move ida_bitmap to a percpu variable Reimplement IDR and IDA using the radix tree radix-tree: Add radix_tree_iter_delete ...
2017-02-28Merge tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linuxGravatar Linus Torvalds 6-16/+30
Pull nfsd updates from Bruce Fields: "The nfsd update this round is mainly a lot of miscellaneous cleanups and bugfixes. A couple changes could theoretically break working setups on upgrade. I don't expect complaints in practice, but they seem worth calling out just in case: - NFS security labels are now off by default; a new security_label export flag reenables it per export. But, having them on by default is a disaster, as it generally only makes sense if all your clients and servers have similar enough selinux policies. Thanks to Jason Tibbitts for pointing this out. - NFSv4/UDP support is off. It was never really supported, and the spec explicitly forbids it. We only ever left it on out of laziness; thanks to Jeff Layton for finally fixing that" * tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd: Fix display of the version string nfsd: fix configuration of supported minor versions sunrpc: don't register UDP port with rpcbind when version needs congestion control nfs/nfsd/sunrpc: enforce transport requirements for NFSv4 sunrpc: flag transports as having congestion control sunrpc: turn bitfield flags in svc_version into bools nfsd: remove superfluous KERN_INFO nfsd: special case truncates some more nfsd: minor nfsd_setattr cleanup NFSD: Reserve adequate space for LOCKT operation NFSD: Get response size before operation for all RPCs nfsd/callback: Drop a useless data copy when comparing sessionid nfsd/callback: skip the callback tag nfsd/callback: Cleanup callback cred on shutdown nfsd/idmap: return nfserr_inval for 0-length names SUNRPC/Cache: Always treat the invalid cache as unexpired SUNRPC: Drop all entries from cache_detail when cache_purge() svcrdma: Poll CQs in "workqueue" mode svcrdma: Combine list fields in struct svc_rdma_op_ctxt svcrdma: Remove unused sc_dto_q field ...
2017-02-28Merge tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-clientGravatar Linus Torvalds 5-24/+54
Pull ceph updates from Ilya Dryomov: "This time around we have: - support for rbd data-pool feature, which enables rbd images on erasure-coded pools (myself). CEPH_PG_MAX_SIZE has been bumped to allow erasure-coded profiles with k+m up to 32. - a patch for ceph_d_revalidate() performance regression introduced in 4.9, along with some cleanups in the area (Jeff Layton) - a set of fixes for unsafe ->d_parent accesses in CephFS (Jeff Layton) - buffered reads are now processed in rsize windows instead of rasize windows (Andreas Gerstmayr). The new default for rsize mount option is 64M. - ack vs commit distinction is gone, greatly simplifying ->fsync() and MOSDOpReply handling code (myself) ... also a few filesystem bug fixes from Zheng, a CRUSH sync up (CRUSH computations are still serialized though) and several minor fixes and cleanups all over" * tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client: (52 commits) libceph, rbd, ceph: WRITE | ONDISK -> WRITE libceph: get rid of ack vs commit ceph: remove special ack vs commit behavior ceph: tidy some white space in get_nonsnap_parent() crush: fix dprintk compilation crush: do is_out test only if we do not collide ceph: remove req from unsafe list when unregistering it rbd: constify device_type structure rbd: kill obj_request->object_name and rbd_segment_name_cache rbd: store and use obj_request->object_no rbd: RBD_V{1,2}_DATA_FORMAT macros rbd: factor out __rbd_osd_req_create() rbd: set offset and length outside of rbd_obj_request_create() rbd: support for data-pool feature rbd: introduce rbd_init_layout() rbd: use rbd_obj_bytes() more rbd: remove now unused rbd_obj_request_wait() and helpers rbd: switch rbd_obj_method_sync() to ceph_osdc_call() libceph: pass reply buffer length through ceph_osdc_call() rbd: do away with obj_request in rbd_obj_read_sync() ...
2017-02-28Merge branch 'locking-urgent-for-linus' of ↵Gravatar Linus Torvalds 1-265/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "The main change is the uninlining of large refcount_t APIs, plus a header dependency fix. Note that the uninlining allowed us to enable the underflow/overflow warnings unconditionally and remove the debug Kconfig switch: this might trigger new warnings in buggy code and turn crashes/use-after-free bugs into less harmful memory leaks" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/refcounts: Add missing kernel.h header to have UINT_MAX defined locking/refcounts: Out-of-line everything
2017-02-28Merge branch 'core-urgent-for-linus' of ↵Gravatar Linus Torvalds 1-1/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "A handful of objtool fixes related to unreachable code, plus a build fix for out of tree modules" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Enclose contents of unreachable() macro in a block objtool: Prevent GCC from merging annotate_unreachable() objtool: Improve detection of BUG() and other dead ends objtool: Fix CONFIG_STACK_VALIDATION=y warning for out-of-tree modules
2017-02-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGravatar Linus Torvalds 2-1/+1
Pull networking fixes from David Miller: 1) Don't save TIPC header values before the header has been validated, from Jon Paul Maloy. 2) Fix memory leak in RDS, from Zhu Yanjun. 3) We miss to initialize the UID in the flow key in some paths, from Julian Anastasov. 4) Fix latent TOS masking bug in the routing cache removal from years ago, also from Julian. 5) We forget to set the sockaddr port in sctp_copy_local_addr_list(), fix from Xin Long. 6) Missing module ref count drop in packet scheduler actions, from Roman Mashak. 7) Fix RCU annotations in rht_bucket_nested, from Herbert Xu. 8) Fix use after free which happens because L2TP's ipv4 support returns non-zero values from it's backlog_rcv function which ipv4 interprets as protocol values. Fix from Paul Hüber. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) qed: Don't use attention PTT for configuring BW qed: Fix race with multiple VFs l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv xfrm: provide correct dst in xfrm_neigh_lookup rhashtable: Fix RCU dereference annotation in rht_bucket_nested rhashtable: Fix use before NULL check in bucket_table_free net sched actions: do not overwrite status of action creation. rxrpc: Kernel calls get stuck in recvmsg net sched actions: decrement module reference count after table flush. lib: Allow compile-testing of parman ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt sctp: set sin_port for addr param when checking duplicate address net/mlx4_en: fix overflow in mlx4_en_init_timestamp() netfilter: nft_set_bitmap: incorrect bitmap size net: s2io: fix typo argumnet argument net: vxge: fix typo argumnet argument netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. ipv4: mask tos for input route ipv4: add missing initialization for flowi4_uid lib: fix spelling mistake: "actualy" -> "actually" ...
2017-02-27Merge branch 'akpm' (patches from Andrew)Gravatar Linus Torvalds 21-60/+167
Merge yet more updates from Andrew Morton: - a few MM remainders - misc things - autofs updates - signals - affs updates - ipc - nilfs2 - spelling.txt updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits) mm, x86: fix HIGHMEM64 && PARAVIRT build config for native_pud_clear() mm: add arch-independent testcases for RODATA hfs: atomically read inode size mm: clarify mm_struct.mm_{users,count} documentation mm: use mmget_not_zero() helper mm: add new mmget() helper mm: add new mmgrab() helper checkpatch: warn when formats use %Z and suggest %z lib/vsprintf.c: remove %Z support scripts/spelling.txt: add some typo-words scripts/spelling.txt: add "followings" pattern and fix typo instances scripts/spelling.txt: add "therfore" pattern and fix typo instances scripts/spelling.txt: add "overwriten" pattern and fix typo instances scripts/spelling.txt: add "overwritting" pattern and fix typo instances scripts/spelling.txt: add "deintialize(d)" pattern and fix typo instances scripts/spelling.txt: add "disassocation" pattern and fix typo instances scripts/spelling.txt: add "omited" pattern and fix typo instances scripts/spelling.txt: add "explictely" pattern and fix typo instances scripts/spelling.txt: add "applys" pattern and fix typo instances scripts/spelling.txt: add "configuartion" pattern and fix typo instances ...
2017-02-28objtool: Enclose contents of unreachable() macro in a blockGravatar Josh Poimboeuf 1-1/+2
Guenter Roeck reported a boot failure in mips64. It was bisected to the following commit: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") The unreachable() macro was formerly only composed of a single statement. The above commit added a second statement, but neglected to enclose the statements in a block. Suggested-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170228042116.glmwmwiohcix7o4a@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-27Merge branch 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqGravatar Linus Torvalds 1-2/+2
Pull workqueue update from Tejun Heo: "Just one patch to silence clang's warnings" * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: avoid clang warning
2017-02-27Merge branch 'for-4.11' of ↵Gravatar Linus Torvalds 6-29/+113
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Several noteworthy changes. - Parav's rdma controller is finally merged. It is very straight forward and can limit the abosolute numbers of common rdma constructs used by different cgroups. - kernel/cgroup.c got too chubby and disorganized. Created kernel/cgroup/ subdirectory and moved all cgroup related files under kernel/ there and reorganized the core code. This hurts for backporting patches but was long overdue. - cgroup v2 process listing reimplemented so that it no longer depends on allocating a buffer large enough to cache the entire result to sort and uniq the output. v2 has always mangled the sort order to ensure that users don't depend on the sorted output, so this shouldn't surprise anybody. This makes the pid listing functions use the same iterators that are used internally, which have to have the same iterating capabilities anyway. - perf cgroup filtering now works automatically on cgroup v2. This patch was posted a long time ago but somehow fell through the cracks. - misc fixes asnd documentation updates" * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits) kernfs: fix locking around kernfs_ops->release() callback cgroup: drop the matching uid requirement on migration for cgroup v2 cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy cgroup: misc cleanups cgroup: call subsys->*attach() only for subsystems which are actually affected by migration cgroup: track migration context in cgroup_mgctx cgroup: cosmetic update to cgroup_taskset_add() rdmacg: Fixed uninitialized current resource usage cgroup: Add missing cgroup-v2 PID controller documentation. rdmacg: Added documentation for rdmacg IB/core: added support to use rdma cgroup controller rdmacg: Added rdma cgroup controller cgroup: fix a comment typo cgroup: fix RCU related sparse warnings cgroup: move namespace code to kernel/cgroup/namespace.c cgroup: rename functions for consistency cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c cgroup: separate out cgroup1_kf_syscall_ops cgroup: refactor mount path and clearly distinguish v1 and v2 paths cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c ...
2017-02-27Merge tag 'rtc-4.11' of ↵Gravatar Linus Torvalds 2-16/+1
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Subsystem: - constify rtc_class_ops structures New driver: - STM32 Drivers: - armada38x: fix errata, Armada 7K/8K support - ds3232: fix wakeup support - gemini: DT support - m48t86: huge cleanup and platform_data removal - mcp795: alarm support - sun6i: proper oscillator handling - tegra: proper clock handling - tps65910: calibration support" * tag 'rtc-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (44 commits) rtc: ds3232: Call device_init_wakeup before device_register rtc: pcf2127: bulk read only date and time registers. rtc: armada38x: Add support for Armada 7K/8K rtc: armada38x: Prepare driver to manage different versions rtc: ds3232: Add regmap max_register definition. rtc: ds3232: Cleanup whitespace around register and bit definitions. rtc: m48t86: remove unused platform_data ARM: Orion5x: ts78xx: allow rtc-m48t86 to manage it's own resources ARM: Orion5x: ts78xx: remove RTC detection ARM: ep93xx: ts72xx: allow rtc-m48t86 to manage its own resources rtc: m48t86: verify that the RTC is actually present rtc: m48t86: add NVRAM support rtc: m48t86: allow driver to manage its resources rtc: m48t86: shorten register name defines bindings: rtc: correct wrong reference in required properties rtc: sun6i: Fix return value check in sun6i_rtc_clk_init() rtc: sun6i: extend test coverage rtc: sun6i: Fix compatibility with old DT binding rtc: snvs: add a missing write sync rtc: bq32000: add support to enable disable the trickle charge FET bypass ...
2017-02-27mm: add arch-independent testcases for RODATAGravatar Jinbum Park 1-0/+23
This patch makes arch-independent testcases for RODATA. Both x86 and x86_64 already have testcases for RODATA, But they are arch-specific because using inline assembly directly. And cacheflush.h is not a suitable location for rodata-test related things. Since they were in cacheflush.h, If someone change the state of CONFIG_DEBUG_RODATA_TEST, It cause overhead of kernel build. To solve the above issues, write arch-independent testcases and move it to shared location. [jinb.park7@gmail.com: fix config dependency] Link: http://lkml.kernel.org/r/20170209131625.GA16954@pjb1027-Latitude-E5410 Link: http://lkml.kernel.org/r/20170129105436.GA9303@pjb1027-Latitude-E5410 Signed-off-by: Jinbum Park <jinb.park7@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27mm: clarify mm_struct.mm_{users,count} documentationGravatar Vegard Nossum 1-2/+21
Clarify documentation relating to mm_users and mm_count, and switch to kernel-doc syntax. Link: http://lkml.kernel.org/r/20161218123229.22952-4-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27mm: add new mmget() helperGravatar Vegard Nossum 1-0/+21
Apart from adding the helper function itself, the rest of the kernel is converted mechanically using: git grep -l 'atomic_inc.*mm_users' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_users);/mmget\(\1\);/' git grep -l 'atomic_inc.*mm_users' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_users);/mmget\(\&\1\);/' This is needed for a later patch that hooks into the helper, but might be a worthwhile cleanup on its own. (Michal Hocko provided most of the kerneldoc comment.) Link: http://lkml.kernel.org/r/20161218123229.22952-2-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27mm: add new mmgrab() helperGravatar Vegard Nossum 1-0/+22
Apart from adding the helper function itself, the rest of the kernel is converted mechanically using: git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/' git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/' This is needed for a later patch that hooks into the helper, but might be a worthwhile cleanup on its own. (Michal Hocko provided most of the kerneldoc comment.) Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27scripts/spelling.txt: add "followings" pattern and fix typo instancesGravatar Masahiro Yamada 1-1/+1
Fix typos and add the following to the scripts/spelling.txt: followings||following While we are here, add a missing colon in the boilerplate in DT binding documents. The "you SoC" in allwinner,sunxi-pinctrl.txt was fixed as well. I reworded "as the followings:" to "as follows:" for drivers/usb/gadget/udc/renesas_usb3.c. Link: http://lkml.kernel.org/r/1481573103-11329-32-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27scripts/spelling.txt: add "disassocation" pattern and fix typo instancesGravatar Masahiro Yamada 1-1/+1
Fix typos and add the following to the scripts/spelling.txt: disassocation||disassociation Link: http://lkml.kernel.org/r/1481573103-11329-27-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27scripts/spelling.txt: add "partiton" pattern and fix typo instancesGravatar Masahiro Yamada 2-2/+2
Fix typos and add the following to the scripts/spelling.txt: partiton||partition Link: http://lkml.kernel.org/r/1481573103-11329-7-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27scripts/spelling.txt: add "an union" pattern and fix typo instancesGravatar Masahiro Yamada 3-5/+5
Fix typos and add the following to the scripts/spelling.txt: an union||a union Link: http://lkml.kernel.org/r/1481573103-11329-5-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27scripts/spelling.txt: add "an user" pattern and fix typo instancesGravatar Masahiro Yamada 1-1/+1
Fix typos and add the following to the scripts/spelling.txt: an user||a user an userspace||a userspace I also added "userspace" to the list since it is a common word in Linux. I found some instances for "an userfaultfd", but I did not add it to the list. I felt it is endless to find words that start with "user" such as "userland" etc., so must draw a line somewhere. Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27fs: add i_blocksize()Gravatar Fabian Frederick 1-0/+5
Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs branch. This patch also fixes multiple checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Thanks to Andrew Morton for suggesting more appropriate function instead of macro. [geliangtang@gmail.com: truncate: use i_blocksize()] Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27ipc/sem: add hysteresisGravatar Manfred Spraul 1-1/+1
sysv sem has two lock modes: One with per-semaphore locks, one lock mode with a single global lock for the whole array. When switching from the per-semaphore locks to the global lock, all per-semaphore locks must be scanned for ongoing operations. The patch adds a hysteresis for switching from the global lock to the per semaphore locks. This reduces how often the per-semaphore locks must be scanned. Compared to the initial patch, this is a simplified solution: Setting USE_GLOBAL_LOCK_HYSTERESIS to 1 restores the current behavior. In theory, a workload with exactly 10 simple sops and then one complex op now scales a bit worse, but this is pure theory: If there is concurrency, the it won't be exactly 10:1:10:1:10:1:... If there is no concurrency, then there is no need for scalability. Link: http://lkml.kernel.org/r/1476851896-3590-3-git-send-email-manfred@colorfullife.com Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: <1vier1@web.de> Cc: kernel test robot <xiaolong.ye@intel.com> Cc: <felixh@informatik.uni-bremen.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27include/linux/pid.h: use for_each_thread() in do_each_pid_thread()Gravatar Tetsuo Handa 1-2/+2
while_each_pid_thread() is using while_each_thread(), which is unsafe under RCU lock according to commit 0c740d0afc3b ("introduce for_each_thread() to replace the buggy while_each_thread()"). Use for_each_thread() in do_each_pid_thread() which is safe under RCU lock. Link: http://lkml.kernel.org/r/201702011947.DBD56740.OMVHOLOtSJFFFQ@I-love.SAKURA.ne.jp Link: http://lkml.kernel.org/r/1486041779-4401-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27sigaltstack: support SS_AUTODISARM for CONFIG_COMPATGravatar Stas Sergeev 1-1/+3
Currently SS_AUTODISARM is not supported in compatibility mode, but does not return -EINVAL either. This makes dosemu built with -m32 on x86_64 to crash. Also the kernel's sigaltstack selftest fails if compiled with -m32. This patch adds the needed support. Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net> Cc: Milosz Tanski <milosz@adfin.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Waiman Long <Waiman.Long@hpe.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27autofs: remove duplicated AUTOFS_DEV_IOCTL_SIZE definitionGravatar Tomohiro Kusumi 1-2/+2
This macro is already defined in uapi header. Also use this macro where possible. Link: http://lkml.kernel.org/r/148577166656.9801.10322423666945951186.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27autofs: add command enum/macros for root-dir ioctlsGravatar Tomohiro Kusumi 3-14/+29
Sync root-dir ioctl with misc-char-dev ioctl's enum/macro format since these two types of ioctls aren't completely independent of each other in terms of command nr. No functional changes. Link: http://lkml.kernel.org/r/148577166143.9801.15511796506678428145.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27autofs: remove wrong commentGravatar Tomohiro Kusumi 1-4/+0
This format seems to have been taken from device mapper header, but autofs has no such file:function in both kernel and userspace. Link: http://lkml.kernel.org/r/148577164094.9801.4775075118014742496.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27kprobes: move kprobe declarations to asm-generic/kprobes.hGravatar Luis R. Rodriguez 3-24/+28
Often all is needed is these small helpers, instead of compiler.h or a full kprobes.h. This is important for asm helpers, in fact even some asm/kprobes.h make use of these helpers... instead just keep a generic asm file with helpers useful for asm code with the least amount of clutter as possible. Likewise we need now to also address what to do about this file for both when architectures have CONFIG_HAVE_KPROBES, and when they do not. Then for when architectures have CONFIG_HAVE_KPROBES but have disabled CONFIG_KPROBES. Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES, this means most architecture code cannot include asm/kprobes.h safely. Correct this and add guards for architectures missing them. Additionally provide architectures that not have kprobes support with the default asm-generic solution. This lets us force asm/kprobes.h on the header include/linux/kprobes.h always, but most importantly we can now safely include just asm/kprobes.h on architecture code without bringing the full kitchen sink of header files. Two architectures already provided a guard against CONFIG_KPROBES on its kprobes.h: sh, arch. The rest of the architectures needed gaurds added. We avoid including any not-needed headers on asm/kprobes.h unless kprobes have been enabled. In a subsequent atomic change we can try now to remove compiler.h from include/linux/kprobes.h. During this sweep I've also identified a few architectures defining a common macro needed for both kprobes and ftrace, that of the definition of the breakput instruction up. Some refer to this as BREAKPOINT_INSTRUCTION. This must be kept outside of the #ifdef CONFIG_KPROBES guard. [mcgrof@kernel.org: fix arm64 build] Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com [sfr@canb.auug.org.au: fixup for kprobes declarations moving] Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27Merge tag 'trace-v4.11' of ↵Gravatar Linus Torvalds 4-20/+46
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This release has no new tracing features, just clean ups, minor fixes and small optimizations" * tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (25 commits) tracing: Remove outdated ring buffer comment tracing/probes: Fix a warning message to show correct maximum length tracing: Fix return value check in trace_benchmark_reg() tracing: Use modern function declaration jump_label: Reduce the size of struct static_key tracing/probe: Show subsystem name in messages tracing/hwlat: Update old comment about migration timers: Make flags output in the timer_start tracepoint useful tracing: Have traceprobe_probes_write() not access userspace unnecessarily tracing: Have COMM event filter key be treated as a string ftrace: Have set_graph_function handle multiple functions in one write ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock tracing: Reset parser->buffer to allow multiple "puts" ftrace: Have set_graph_functions handle write with RDWR ftrace: Reset fgd->hash in ftrace_graph_write() ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY ftrace: Create a slight optimization on searching the ftrace_hash tracing: Add ftrace_hash_key() helper function ftrace: Convert graph filter to use hash tables ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip ...
2017-02-27virtio_scsi: use virtio IRQ affinityGravatar Christoph Hellwig 1-1/+0
Use automatic IRQ affinity assignment in the virtio layer if available, and build the blk-mq queues based on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27blk-mq: provide a default queue mapping for virtio deviceGravatar Christoph Hellwig 1-0/+10
Similar to the PCI version, just calling into virtio instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27virtio: provide a method to get the IRQ affinity mask for a virtqueueGravatar Christoph Hellwig 1-0/+3
This basically passed up the pci_irq_get_affinity information through virtio through an optional get_vq_affinity method. It is only implemented by the PCI backend for now, and only when we use per-virtqueue IRQs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27virtio: allow drivers to request IRQ affinity when creating VQsGravatar Christoph Hellwig 1-4/+5
Add a struct irq_affinity pointer to the find_vqs methods, which if set is used to tell the PCI layer to create the MSI-X vectors for our I/O virtqueues with the proper affinity from the start. Compared to after the fact affinity hints this gives us an instantly working setup and allows to allocate the irq descritors node-local and avoid interconnect traffic. Last but not least this will allow blk-mq queues are created based on the interrupt affinity for storage drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27virtio_pci: don't duplicate the msix_enable flag in struct pci_devGravatar Christoph Hellwig 1-1/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>