aboutsummaryrefslogtreecommitdiff
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/debugfs.rst8
-rw-r--r--Documentation/filesystems/erofs.rst38
-rw-r--r--Documentation/filesystems/fscrypt.rst7
-rw-r--r--Documentation/filesystems/locking.rst10
-rw-r--r--Documentation/filesystems/mount_api.rst12
-rw-r--r--Documentation/filesystems/porting.rst4
-rw-r--r--Documentation/filesystems/proc.rst20
-rw-r--r--Documentation/filesystems/sysfs.rst41
-rw-r--r--Documentation/filesystems/vfs.rst5
9 files changed, 91 insertions, 54 deletions
diff --git a/Documentation/filesystems/debugfs.rst b/Documentation/filesystems/debugfs.rst
index 71b1fee56d2a..dc35da8b8792 100644
--- a/Documentation/filesystems/debugfs.rst
+++ b/Documentation/filesystems/debugfs.rst
@@ -155,8 +155,8 @@ any code which does so in the mainline. Note that all files created with
debugfs_create_blob() are read-only.
If you want to dump a block of registers (something that happens quite
-often during development, even if little such code reaches mainline.
-Debugfs offers two functions: one to make a registers-only file, and
+often during development, even if little such code reaches mainline),
+debugfs offers two functions: one to make a registers-only file, and
another to insert a register block in the middle of another sequential
file::
@@ -183,7 +183,7 @@ The "base" argument may be 0, but you may want to build the reg32 array
using __stringify, and a number of register names (macros) are actually
byte offsets over a base for the register block.
-If you want to dump an u32 array in debugfs, you can create file with::
+If you want to dump a u32 array in debugfs, you can create a file with::
struct debugfs_u32_array {
u32 *array;
@@ -197,7 +197,7 @@ If you want to dump an u32 array in debugfs, you can create file with::
The "array" argument wraps a pointer to the array's data and the number
of its elements. Note: Once array is created its size can not be changed.
-There is a helper function to create device related seq_file::
+There is a helper function to create a device-related seq_file::
void debugfs_create_devm_seqfile(struct device *dev,
const char *name,
diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
index 05e03d54af1a..067fd1670b1f 100644
--- a/Documentation/filesystems/erofs.rst
+++ b/Documentation/filesystems/erofs.rst
@@ -30,12 +30,18 @@ It is implemented to be a better choice for the following scenarios:
especially for those embedded devices with limited memory and high-density
hosts with numerous containers.
-Here is the main features of EROFS:
+Here are the main features of EROFS:
- Little endian on-disk design;
- - 4KiB block size and 32-bit block addresses, therefore 16TiB address space
- at most for now;
+ - Block-based distribution and file-based distribution over fscache are
+ supported;
+
+ - Support multiple devices to refer to external blobs, which can be used
+ for container images;
+
+ - 4KiB block size and 32-bit block addresses for each device, therefore
+ 16TiB address space at most for now;
- Two inode layouts for different requirements:
@@ -50,28 +56,31 @@ Here is the main features of EROFS:
Metadata reserved 8 bytes 18 bytes
===================== ============ ======================================
- - Metadata and data could be mixed as an option;
-
- - Support extended attributes (xattrs) as an option;
+ - Support extended attributes as an option;
- - Support tailpacking data and xattr inline compared to byte-addressed
- unaligned metadata or smaller block size alternatives;
-
- - Support POSIX.1e ACLs by using xattrs;
+ - Support POSIX.1e ACLs by using extended attributes;
- Support transparent data compression as an option:
LZ4 and MicroLZMA algorithms can be used on a per-file basis; In addition,
inplace decompression is also supported to avoid bounce compressed buffers
and page cache thrashing.
+ - Support chunk-based data deduplication and rolling-hash compressed data
+ deduplication;
+
+ - Support tailpacking inline compared to byte-addressed unaligned metadata
+ or smaller block size alternatives;
+
+ - Support merging tail-end data into a special inode as fragments.
+
+ - Support large folios for uncompressed files.
+
- Support direct I/O on uncompressed files to avoid double caching for loop
devices;
- Support FSDAX on uncompressed images for secure containers and ramdisks in
order to get rid of unnecessary page cache.
- - Support multiple devices for multi blob container images;
-
- Support file-based on-demand loading with the Fscache infrastructure.
The following git tree provides the file system user-space tools under
@@ -259,7 +268,7 @@ By the way, chunk-based files are all uncompressed for now.
Data compression
----------------
-EROFS implements LZ4 fixed-sized output compression which generates fixed-sized
+EROFS implements fixed-sized output compression which generates fixed-sized
compressed data blocks from variable-sized input in contrast to other existing
fixed-sized input solutions. Relatively higher compression ratios can be gotten
by using fixed-sized output compression since nowadays popular data compression
@@ -314,3 +323,6 @@ to understand its delta0 is constantly 1, as illustrated below::
If another HEAD follows a HEAD lcluster, there is no room to record CBLKCNT,
but it's easy to know the size of such pcluster is 1 lcluster as well.
+
+Since Linux v6.1, each pcluster can be used for multiple variable-sized extents,
+therefore it can be used for compressed data deduplication.
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 5ba5817c17c2..ef183387da20 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -338,6 +338,7 @@ Currently, the following pairs of encryption modes are supported:
- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
- Adiantum for both contents and filenames
- AES-256-XTS for contents and AES-256-HCTR2 for filenames (v2 policies only)
+- SM4-XTS for contents and SM4-CTS-CBC for filenames (v2 policies only)
If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair.
@@ -369,6 +370,12 @@ CONFIG_CRYPTO_HCTR2 must be enabled. Also, fast implementations of XCTR and
POLYVAL should be enabled, e.g. CRYPTO_POLYVAL_ARM64_CE and
CRYPTO_AES_ARM64_CE_BLK for ARM64.
+SM4 is a Chinese block cipher that is an alternative to AES. It has
+not seen as much security review as AES, and it only has a 128-bit key
+size. It may be useful in cases where its use is mandated.
+Otherwise, it should not be used. For SM4 support to be available, it
+also needs to be enabled in the kernel crypto API.
+
New encryption modes can be added relatively easily, without changes
to individual filesystems. However, authenticated encryption (AE)
modes are not currently supported because of the difficulty of dealing
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 8f737e76935c..36fa2a83d714 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -70,7 +70,7 @@ prototypes::
const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *);
void (*truncate) (struct inode *);
int (*permission) (struct inode *, int, unsigned int);
- struct posix_acl * (*get_acl)(struct inode *, int, bool);
+ struct posix_acl * (*get_inode_acl)(struct inode *, int, bool);
int (*setattr) (struct dentry *, struct iattr *);
int (*getattr) (const struct path *, struct kstat *, u32, unsigned int);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
@@ -84,13 +84,14 @@ prototypes::
int (*fileattr_set)(struct user_namespace *mnt_userns,
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
+ struct posix_acl * (*get_acl)(struct user_namespace *, struct dentry *, int);
locking rules:
all may block
-============= =============================================
+============== =============================================
ops i_rwsem(inode)
-============= =============================================
+============== =============================================
lookup: shared
create: exclusive
link: exclusive (both)
@@ -104,6 +105,7 @@ readlink: no
get_link: no
setattr: exclusive
permission: no (may not block if called in rcu-walk mode)
+get_inode_acl: no
get_acl: no
getattr: no
listxattr: no
@@ -113,7 +115,7 @@ atomic_open: shared (exclusive if O_CREAT is set in open flags)
tmpfile: no
fileattr_get: no or exclusive
fileattr_set: exclusive
-============= =============================================
+============== =============================================
Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
diff --git a/Documentation/filesystems/mount_api.rst b/Documentation/filesystems/mount_api.rst
index eb358a00be27..63204d2094fd 100644
--- a/Documentation/filesystems/mount_api.rst
+++ b/Documentation/filesystems/mount_api.rst
@@ -562,17 +562,6 @@ or looking up of superblocks.
The following helpers all wrap sget_fc():
- * ::
-
- int vfs_get_super(struct fs_context *fc,
- enum vfs_get_super_keying keying,
- int (*fill_super)(struct super_block *sb,
- struct fs_context *fc))
-
- This creates/looks up a deviceless superblock. The keying indicates how
- many superblocks of this type may exist and in what manner they may be
- shared:
-
(1) vfs_get_single_super
Only one such superblock may exist in the system. Any further
@@ -814,6 +803,7 @@ process the parameters it is given.
int fs_lookup_param(struct fs_context *fc,
struct fs_parameter *value,
bool want_bdev,
+ unsigned int flags,
struct path *_path);
This takes a parameter that carries a string or filename type and attempts
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index df0dc37e6f58..d2d684ae7798 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -462,8 +462,8 @@ ERR_PTR(...).
argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask.
generic_permission() has also lost the check_acl argument; ACL checking
-has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl
-to read an ACL from disk.
+has been taken to VFS and filesystems need to provide a non-NULL
+->i_op->get_inode_acl to read an ACL from disk.
---
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 898c99eae8e4..f4ee84d7b351 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -47,6 +47,7 @@ fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009
3.10 /proc/<pid>/timerslack_ns - Task timerslack value
3.11 /proc/<pid>/patch_state - Livepatch patch operation state
3.12 /proc/<pid>/arch_status - Task architecture specific information
+ 3.13 /proc/<pid>/fd - List of symlinks to open files
4 Configuring procfs
4.1 Mount options
@@ -245,7 +246,8 @@ It's slow but very precise.
Ngid NUMA group ID (0 if none)
Pid process id
PPid process id of the parent process
- TracerPid PID of process tracing this process (0 if not)
+ TracerPid PID of process tracing this process (0 if not, or
+ the tracer is outside of the current pid namespace)
Uid Real, effective, saved set, and file system UIDs
Gid Real, effective, saved set, and file system GIDs
FDSize number of file descriptor slots currently allocated
@@ -2149,6 +2151,22 @@ AVX512_elapsed_ms
the task is unlikely an AVX512 user, but depends on the workload and the
scheduling scenario, it also could be a false negative mentioned above.
+3.13 /proc/<pid>/fd - List of symlinks to open files
+-------------------------------------------------------
+This directory contains symbolic links which represent open files
+the process is maintaining. Example output::
+
+ lr-x------ 1 root root 64 Sep 20 17:53 0 -> /dev/null
+ l-wx------ 1 root root 64 Sep 20 17:53 1 -> /dev/null
+ lrwx------ 1 root root 64 Sep 20 17:53 10 -> 'socket:[12539]'
+ lrwx------ 1 root root 64 Sep 20 17:53 11 -> 'socket:[12540]'
+ lrwx------ 1 root root 64 Sep 20 17:53 12 -> 'socket:[12542]'
+
+The number of open files for the process is stored in 'size' member
+of stat() output for /proc/<pid>/fd for fast access.
+-------------------------------------------------------
+
+
Chapter 4: Configuring procfs
=============================
diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst
index 8bba676b1365..f8187d466b97 100644
--- a/Documentation/filesystems/sysfs.rst
+++ b/Documentation/filesystems/sysfs.rst
@@ -12,10 +12,10 @@ Mike Murphy <mamurph@cs.clemson.edu>
:Original: 10 January 2003
-What it is:
-~~~~~~~~~~~
+What it is
+~~~~~~~~~~
-sysfs is a ram-based filesystem initially based on ramfs. It provides
+sysfs is a RAM-based filesystem initially based on ramfs. It provides
a means to export kernel data structures, their attributes, and the
linkages between them to userspace.
@@ -43,7 +43,7 @@ userspace. Top-level directories in sysfs represent the common
ancestors of object hierarchies; i.e. the subsystems the objects
belong to.
-Sysfs internally stores a pointer to the kobject that implements a
+sysfs internally stores a pointer to the kobject that implements a
directory in the kernfs_node object associated with the directory. In
the past this kobject pointer has been used by sysfs to do reference
counting directly on the kobject whenever the file is opened or closed.
@@ -55,7 +55,7 @@ Attributes
~~~~~~~~~~
Attributes can be exported for kobjects in the form of regular files in
-the filesystem. Sysfs forwards file I/O operations to methods defined
+the filesystem. sysfs forwards file I/O operations to methods defined
for the attributes, providing a means to read and write kernel
attributes.
@@ -72,8 +72,8 @@ you publicly humiliated and your code rewritten without notice.
An attribute definition is simply::
struct attribute {
- char * name;
- struct module *owner;
+ char *name;
+ struct module *owner;
umode_t mode;
};
@@ -138,7 +138,7 @@ __ATTR_WO(name):
assumes a name_store only and is restricted to mode
0200 that is root write access only.
__ATTR_RO_MODE(name, mode):
- fore more restrictive RO access currently
+ for more restrictive RO access; currently
only use case is the EFI System Resource Table
(see drivers/firmware/efi/esrt.c)
__ATTR_RW(name):
@@ -207,7 +207,7 @@ IOW, they should take only an object, an attribute, and a buffer as parameters.
sysfs allocates a buffer of size (PAGE_SIZE) and passes it to the
-method. Sysfs will call the method exactly once for each read or
+method. sysfs will call the method exactly once for each read or
write. This forces the following behavior on the method
implementations:
@@ -221,7 +221,7 @@ implementations:
be called again, rearmed, to fill the buffer.
- On write(2), sysfs expects the entire buffer to be passed during the
- first write. Sysfs then passes the entire buffer to the store() method.
+ first write. sysfs then passes the entire buffer to the store() method.
A terminating null is added after the data on stores. This makes
functions like sysfs_streq() safe to use.
@@ -237,7 +237,7 @@ Other notes:
- Writing causes the show() method to be rearmed regardless of current
file position.
-- The buffer will always be PAGE_SIZE bytes in length. On i386, this
+- The buffer will always be PAGE_SIZE bytes in length. On x86, this
is 4096.
- show() methods should return the number of bytes printed into the
@@ -253,7 +253,7 @@ Other notes:
through, be sure to return an error.
- The object passed to the methods will be pinned in memory via sysfs
- referencing counting its embedded object. However, the physical
+ reference counting its embedded object. However, the physical
entity (e.g. device) the object represents may not be present. Be
sure to have a way to check this, if necessary.
@@ -295,8 +295,12 @@ The top level sysfs directory looks like::
dev/
devices/
firmware/
- net/
fs/
+ hypervisor/
+ kernel/
+ module/
+ net/
+ power/
devices/ contains a filesystem representation of the device tree. It maps
directly to the internal kernel device tree, which is a hierarchy of
@@ -317,15 +321,18 @@ span multiple bus types).
fs/ contains a directory for some filesystems. Currently each
filesystem wanting to export attributes must create its own hierarchy
-below fs/ (see ./fuse.txt for an example).
+below fs/ (see ./fuse.rst for an example).
+
+module/ contains parameter values and state information for all
+loaded system modules, for both builtin and loadable modules.
-dev/ contains two directories char/ and block/. Inside these two
+dev/ contains two directories: char/ and block/. Inside these two
directories there are symlinks named <major>:<minor>. These symlinks
point to the sysfs directory for the given device. /sys/dev provides a
quick way to lookup the sysfs interface for a device from the result of
a stat(2) operation.
-More information can driver-model specific features can be found in
+More information on driver-model specific features can be found in
Documentation/driver-api/driver-model/.
@@ -335,7 +342,7 @@ TODO: Finish this section.
Current Interfaces
~~~~~~~~~~~~~~~~~~
-The following interface layers currently exist in sysfs:
+The following interface layers currently exist in sysfs.
devices (include/linux/device.h)
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 2b55f71e2ae1..2c15e7053113 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -435,7 +435,7 @@ As of kernel 2.6.22, the following members are defined:
const char *(*get_link) (struct dentry *, struct inode *,
struct delayed_call *);
int (*permission) (struct user_namespace *, struct inode *, int);
- struct posix_acl * (*get_acl)(struct inode *, int, bool);
+ struct posix_acl * (*get_inode_acl)(struct inode *, int, bool);
int (*setattr) (struct user_namespace *, struct dentry *, struct iattr *);
int (*getattr) (struct user_namespace *, const struct path *, struct kstat *, u32, unsigned int);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
@@ -443,7 +443,8 @@ As of kernel 2.6.22, the following members are defined:
int (*atomic_open)(struct inode *, struct dentry *, struct file *,
unsigned open_flag, umode_t create_mode);
int (*tmpfile) (struct user_namespace *, struct inode *, struct file *, umode_t);
- int (*set_acl)(struct user_namespace *, struct inode *, struct posix_acl *, int);
+ struct posix_acl * (*get_acl)(struct user_namespace *, struct dentry *, int);
+ int (*set_acl)(struct user_namespace *, struct dentry *, struct posix_acl *, int);
int (*fileattr_set)(struct user_namespace *mnt_userns,
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);