msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
Eric Biggers	8842133ff3	fscrypt: don't check for ENOKEY from fscrypt_get_encryption_info() fscrypt_get_encryption_info() returns 0 if the encryption key is unavailable; it never returns ENOKEY. So remove checks for ENOKEY. Link: https://lore.kernel.org/r/20191209212348.243331-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-02-13 08:24:10 -08:00
Jaegeuk Kim	66ff25e225	f2fs: fix build error on PAGE_KERNEL_RO This fixes build error reported by kbuild test robot. tree: https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable.git linux-4.14.y head: 2945d197414d9732c680ea0b709735d3b0d8ea57 commit: f6574fbf6578e47cfa3cace486ca852979a1e433 [868/885] f2fs: support data compression config: mips-allyesconfig (attached as .config) compiler: mips-linux-gcc (GCC) 7.5.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout f6574fbf6578e47cfa3cace486ca852979a1e433 # save the attached .config to linux build tree GCC_VERSION=7.5.0 make.cross ARCH=mips If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All errors (new ones prefixed by >>): fs/f2fs/compress.c: In function 'f2fs_compress_pages': >> fs/f2fs/compress.c:359:56: error: 'PAGE_KERNEL_RO' undeclared (first use in this function); did you mean +'PAGE_KERNEL_NC'? cc->rbuf = vmap(cc->rpages, cc->cluster_size, VM_MAP, PAGE_KERNEL_RO); ^~~~~~~~~~~~~~ PAGE_KERNEL_NC fs/f2fs/compress.c:359:56: note: each undeclared identifier is reported only once for each function it appears +in fs/f2fs/compress.c: In function 'f2fs_decompress_pages': fs/f2fs/compress.c:456:56: error: 'PAGE_KERNEL_RO' undeclared (first use in this function); did you mean +'PAGE_KERNEL_NC'? dic->cbuf = vmap(dic->cpages, dic->nr_cpages, VM_MAP, PAGE_KERNEL_RO); ^~~~~~~~~~~~~~ PAGE_KERNEL_NC vim +359 fs/f2fs/compress.c Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>	2020-02-07 13:47:07 -08:00
Eric Biggers	2945d19741	f2fs: fix race conditions in ->d_compare() and ->d_hash() Since ->d_compare() and ->d_hash() can be called in RCU-walk mode, ->d_parent and ->d_inode can be concurrently modified, and in particular, ->d_inode may be changed to NULL. For f2fs_d_hash() this resulted in a reproducible NULL dereference if a lookup is done in a directory being deleted, e.g. with: int main() { if (fork()) { for (;;) { mkdir("subdir", 0700); rmdir("subdir"); } } else { for (;;) access("subdir/file", 0); } } ... or by running the 't_encrypted_d_revalidate' program from xfstests. Both repros work in any directory on a filesystem with the encoding feature, even if the directory doesn't actually have the casefold flag. I couldn't reproduce a crash in f2fs_d_compare(), but it appears that a similar crash is possible there. Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and falling back to the case sensitive behavior if the inode is NULL. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:35 -08:00
Eric Biggers	86c6acb233	f2fs: fix dcache lookup of !casefolded directories Do the name comparison for non-casefolded directories correctly. This is analogous to ext4's commit 66883da1eee8 ("ext4: fix dcache lookup of !casefolded directories"). Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Hridya Valsaraju	89df19dc0d	f2fs: Add f2fs stats to sysfs Currently f2fs stats are only available from /d/f2fs/status. This patch adds some of the f2fs stats to sysfs so that they are accessible even when debugfs is not mounted. The following sysfs nodes are added: -/sys/fs/f2fs/<disk>/free_segments -/sys/fs/f2fs/<disk>/cp_foreground_calls -/sys/fs/f2fs/<disk>/cp_background_calls -/sys/fs/f2fs/<disk>/gc_foreground_calls -/sys/fs/f2fs/<disk>/gc_background_calls -/sys/fs/f2fs/<disk>/moved_blocks_foreground -/sys/fs/f2fs/<disk>/moved_blocks_background -/sys/fs/f2fs/<disk>/avg_vblocks Signed-off-by: Hridya Valsaraju <hridya@google.com> [Jaegeuk Kim: allow STAT_FS without DEBUG_FS] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Chao Yu	a0574e03c8	f2fs: change to use rwsem for gc_mutex Mutex lock won't serialize callers, in order to avoid starving of unlucky caller, let's use rwsem lock instead. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Jaegeuk Kim	52e959c6db	f2fs: update f2fs document regarding to fsync_mode This patch adds missing fsync_mode entry in f2fs document. Fixes: 04485987f053 ("f2fs: introduce async IPU policy") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Jaegeuk Kim	75d1a8c497	f2fs: add a way to turn off ipu bio cache Setting 0x40 in /sys/fs/f2fs/dev/ipu_policy gives a way to turn off bio cache, which is useufl to check whether block layer using hardware encryption engine merges IOs correctly. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Chengguang Xu	124544e89c	f2fs: code cleanup for f2fs_statfs_project() Calling min_not_zero() to simplify complicated prjquota limit comparison in f2fs_statfs_project(). Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:34 -08:00
Chengguang Xu	de02073302	f2fs: fix miscounted block limit in f2fs_statfs_project() statfs calculates Total/Used/Avail disk space in block unit, so we should translate soft/hard prjquota limit to block unit as well. Below testing result shows the block/inode numbers of Total/Used/Avail from df command are all correct afer applying this patch. [root@localhost quota-tools]\# ./repquota -P /dev/sdb1 *** Report for project quotas on device /dev/sdb1 Block grace time: 7days; Inode grace time: 7days Block limits File limits Project used soft hard grace used soft hard grace ----------------------------------------------------------- \#0 -- 4 0 0 1 0 0 \#101 -- 0 0 0 2 0 0 \#102 -- 0 10240 0 2 10 0 \#103 -- 0 0 20480 2 0 20 \#104 -- 0 10240 20480 2 10 20 \#105 -- 0 20480 10240 2 20 10 [root@localhost sdb1]\# lsattr -p t{1,2,3,4,5} 101 ----------------N-- t1/a1 102 ----------------N-- t2/a2 103 ----------------N-- t3/a3 104 ----------------N-- t4/a4 105 ----------------N-- t5/a5 [root@localhost sdb1]\# df -hi t{1,2,3,4,5} Filesystem Inodes IUsed IFree IUse% Mounted on /dev/sdb1 2.4M 21 2.4M 1% /mnt/sdb1 /dev/sdb1 10 2 8 20% /mnt/sdb1 /dev/sdb1 20 2 18 10% /mnt/sdb1 /dev/sdb1 10 2 8 20% /mnt/sdb1 /dev/sdb1 10 2 8 20% /mnt/sdb1 [root@localhost sdb1]\# df -h t{1,2,3,4,5} Filesystem Size Used Avail Use% Mounted on /dev/sdb1 10G 489M 9.6G 5% /mnt/sdb1 /dev/sdb1 10M 0 10M 0% /mnt/sdb1 /dev/sdb1 20M 0 20M 0% /mnt/sdb1 /dev/sdb1 10M 0 10M 0% /mnt/sdb1 /dev/sdb1 10M 0 10M 0% /mnt/sdb1 Fixes: 909110c060f2 ("f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project()") Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Eric Biggers	79493bf183	f2fs: fix deadlock allocating bio_post_read_ctx from mempool Without any form of coordination, any case where multiple allocations from the same mempool are needed at a time to make forward progress can deadlock under memory pressure. This is the case for struct bio_post_read_ctx, as one can be allocated to decrypt a Merkle tree page during fsverity_verify_bio(), which itself is running from a post-read callback for a data bio which has its own struct bio_post_read_ctx. Fix this by freeing first bio_post_read_ctx before calling fsverity_verify_bio(). This works because verity (if enabled) is always the last post-read step. This deadlock can be reproduced by trying to read from an encrypted verity file after reducing NUM_PREALLOC_POST_READ_CTXS to 1 and patching mempool_alloc() to pretend that pool->alloc() always fails. Note that since NUM_PREALLOC_POST_READ_CTXS is actually 128, to actually hit this bug in practice would require reading from lots of encrypted verity files at the same time. But it's theoretically possible, as N available objects doesn't guarantee forward progress when > N/2 threads each need 2 objects at a time. Fixes: 95ae251fe828 ("f2fs: add fs-verity support") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Eric Biggers	9daea8e96e	f2fs: remove unneeded check for error allocating bio_post_read_ctx Since allocating an object from a mempool never fails when __GFP_DIRECT_RECLAIM (which is included in GFP_NOFS) is set, the check for failure to allocate a bio_post_read_ctx is unnecessary. Remove it. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Jaegeuk Kim	a44ac57a0b	f2fs: convert inline_dir early before starting rename If we hit an error during rename, we'll get two dentries in different directories. Chao adds to check the room in inline_dir which can avoid needless inversion. This should be done by inode_lock(&old_dir). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Chao Yu	30f5d05f3c	f2fs: fix memleak of kobject If kobject_init_and_add() failed, caller needs to invoke kobject_put() to release kobject explicitly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Chao Yu	bcfd305ad4	f2fs: fix to add swap extent correctly As Youling reported in mailing list: https://www.linuxquestions.org/questions/linux-newbie-8/the-file-system-f2fs-is-broken-4175666043/ https://www.linux.org/threads/the-file-system-f2fs-is-broken.26490/ There is a test case can corrupt f2fs image: - dd if=/dev/zero of=/swapfile bs=1M count=4096 - chmod 600 /swapfile - mkswap /swapfile - swapon --discard /swapfile The root cause is f2fs_swap_activate() intends to return zero value to setup_swap_extents() to enable SWP_FS mode (swap file goes through fs), in this flow, setup_swap_extents() setups swap extent with wrong block address range, result in discard_swap() erasing incorrect address. Because f2fs_swap_activate() has pinned swapfile, its data block address will not change, it's safe to let swap to handle IO through raw device, so we can get rid of SWAP_FS mode and initial swap extents inside f2fs_swap_activate(), by this way, later discard_swap() can trim in right address range. Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:33 -08:00
Jaegeuk Kim	f61ddc1e8e	f2fs: run fsck when getting bad inode during GC This is to avoid inifinite GC when trying to disable checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Chao Yu	f6574fbf65	f2fs: support data compression This patch tries to support compression in f2fs. - New term named cluster is defined as basic unit of compression, file can be divided into multiple clusters logically. One cluster includes 4 << n (n >= 0) logical pages, compression size is also cluster size, each of cluster can be compressed or not. - In cluster metadata layout, one special flag is used to indicate cluster is compressed one or normal one, for compressed cluster, following metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs stores data including compress header and compressed data. - In order to eliminate write amplification during overwrite, F2FS only support compression on write-once file, data can be compressed only when all logical blocks in file are valid and cluster compress ratio is lower than specified threshold. - To enable compression on regular inode, there are three ways: * chattr +c file * chattr +c dir; touch dir/file * mount w/ -o compress_extension=ext; touch file.ext Compress metadata layout: [Dnode Structure] +-----------------------------------------------+ \| cluster 1 \| cluster 2 \| ......... \| cluster N \| +-----------------------------------------------+ . . . . . . . . . Compressed Cluster . . Normal Cluster . +----------+---------+---------+---------+ +---------+---------+---------+---------+ \|compr flag\| block 1 \| block 2 \| block 3 \| \| block 1 \| block 2 \| block 3 \| block 4 \| +----------+---------+---------+---------+ +---------+---------+---------+---------+ . . . . . . +-------------+-------------+----------+----------------------------+ \| data length \| data chksum \| reserved \| compressed data \| +-------------+-------------+----------+----------------------------+ Changelog: 20190326: - fix error handling of read_end_io(). - remove unneeded comments in f2fs_encrypt_one_page(). 20190327: - fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages(). - don't jump into loop directly to avoid uninitialized variables. - add TODO tag in error path of f2fs_write_cache_pages(). 20190328: - fix wrong merge condition in f2fs_read_multi_pages(). - check compressed file in f2fs_post_read_required(). 20190401 - allow overwrite on non-compressed cluster. - check cluster meta before writing compressed data. 20190402 - don't preallocate blocks for compressed file. - add lz4 compress algorithm - process multiple post read works in one workqueue Now f2fs supports processing post read work in multiple workqueue, it shows low performance due to schedule overhead of multiple workqueue executing orderly. 20190921 - compress: support buffered overwrite C: compress cluster flag V: valid block address N: NEW_ADDR One cluster contain 4 blocks before overwrite after overwrite - VVVV -> CVNN - CVNN -> VVVV - CVNN -> CVNN - CVNN -> CVVV - CVVV -> CVNN - CVVV -> CVVV 20191029 - add kconfig F2FS_FS_COMPRESSION to isolate compression related codes, add kconfig F2FS_FS_{LZO,LZ4} to cover backend algorithm. note that: will remove lzo backend if Jaegeuk agreed that too. - update codes according to Eric's comments. 20191101 - apply fixes from Jaegeuk 20191113 - apply fixes from Jaegeuk - split workqueue for fsverity 20191216 - apply fixes from Jaegeuk [Jaegeuk Kim] - add tracepoint for f2fs_{,de}compress_pages() - fix many bugs and add some compression stats - fix overwrite/mmap bugs - address 32bit build error, reported by Geert. - bug fixes when handling errors and i_compressed_blocks Reported-by: <noreply@ellerman.id.au> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Jaegeuk Kim	749e5b5b32	f2fs: free sysfs kobject Detected kmemleak. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Jaegeuk Kim	6f86584828	f2fs: declare nested quota_sem and remove unnecessary sems 1. f2fs_quota_sync -> down_read(&sbi->quota_sem) -> dquot_writeback_dquots -> f2fs_dquot_commit -> down_read(&sbi->quota_sem) 2. f2fs_quota_sync -> down_read(&sbi->quota_sem) -> f2fs_write_data_pages -> f2fs_write_single_data_page -> down_write(&F2FS_I(inode)->i_sem) f2fs_mkdir -> f2fs_do_add_link -> down_write(&F2FS_I(inode)->i_sem) -> f2fs_init_inode_metadata -> f2fs_new_node_page -> dquot_alloc_inode -> f2fs_dquot_mark_dquot_dirty -> down_read(&sbi->quota_sem) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Jaegeuk Kim	a068719285	f2fs: don't put new_page twice in f2fs_rename In f2fs_rename(), new_page is gone after f2fs_set_link(), but it tries to put again when whiteout is failed and jumped to put_out_dir. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Jaegeuk Kim	c643e2b9c9	f2fs: set I_LINKABLE early to avoid wrong access by vfs This patch moves setting I_LINKABLE early in rename2(whiteout) to avoid the below warning. [ 3189.163385] WARNING: CPU: 3 PID: 59523 at fs/inode.c:358 inc_nlink+0x32/0x40 [ 3189.246979] Call Trace: [ 3189.248707] f2fs_init_inode_metadata+0x2d6/0x440 [f2fs] [ 3189.251399] f2fs_add_inline_entry+0x162/0x8c0 [f2fs] [ 3189.254010] f2fs_add_dentry+0x69/0xe0 [f2fs] [ 3189.256353] f2fs_do_add_link+0xc5/0x100 [f2fs] [ 3189.258774] f2fs_rename2+0xabf/0x1010 [f2fs] [ 3189.261079] vfs_rename+0x3f8/0xaa0 [ 3189.263056] ? tomoyo_path_rename+0x44/0x60 [ 3189.265283] ? do_renameat2+0x49b/0x550 [ 3189.267324] do_renameat2+0x49b/0x550 [ 3189.269316] __x64_sys_renameat2+0x20/0x30 [ 3189.271441] do_syscall_64+0x5a/0x230 [ 3189.273410] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 3189.275848] RIP: 0033:0x7f270b4d9a49 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:32 -08:00
Eric Biggers	ff03252043	f2fs: don't keep META_MAPPING pages used for moving verity file blocks META_MAPPING is used to move blocks for both encrypted and verity files. So the META_MAPPING invalidation condition in do_checkpoint() should consider verity too, not just encrypt. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Chao Yu	22bdd23cd6	f2fs: introduce private bioset In low memory scenario, we can allocate multiple bios without submitting any of them. - f2fs_write_checkpoint() - block_operations() - f2fs_sync_node_pages() step 1) flush cold nodes, allocate new bio from mempool - bio_alloc() - mempool_alloc() step 2) flush hot nodes, allocate a bio from mempool - bio_alloc() - mempool_alloc() step 3) flush warm nodes, be stuck in below call path - bio_alloc() - mempool_alloc() - loop to wait mempool element release, as we only reserved memory for two bio allocation, however above allocated two bios may never be submitted. So we need avoid using default bioset, in this patch we introduce a private bioset, in where we enlarg mempool element count to total number of log header, so that we can make sure we have enough backuped memory pool in scenario of allocating/holding multiple bios. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Sahitya Tummala	8eb795d484	f2fs: cleanup duplicate stats for atomic files Remove duplicate sbi->aw_cnt stats counter that tracks the number of atomic files currently opened (it also shows incorrect value sometimes). Use more relit lable sbi->atomic_files to show in the stats. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Jaegeuk Kim	d5d3da042b	f2fs: set GFP_NOFS when moving inline dentries Otherwise, it can cause circular locking dependency reported by mm. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Jaegeuk Kim	396a9ec3a7	f2fs: should avoid recursive filesystem ops We need to use GFP_NOFS, since we did f2fs_lock_op(). Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Jaegeuk Kim	2e2eb2311d	f2fs: keep quota data on write_begin failure This patch avoids some unnecessary locks for quota files when write_begin fails. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Jaegeuk Kim	6ec5762bf1	f2fs: call f2fs_balance_fs outside of locked page Otherwise, we can hit deadlock by waiting for the locked page in move_data_block in GC. Thread A Thread B - do_page_mkwrite - f2fs_vm_page_mkwrite - lock_page - f2fs_balance_fs - mutex_lock(gc_mutex) - f2fs_gc - do_garbage_collect - ra_data_block - grab_cache_page - f2fs_balance_fs - mutex_lock(gc_mutex) Fixes: 39a8695824510 ("f2fs: refactor ->page_mkwrite() flow") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Jaegeuk Kim	9abd1aede4	f2fs: preallocate DIO blocks when forcing buffered_io The previous preallocation and DIO decision like below. allow_outplace_dio !allow_outplace_dio f2fs_force_buffered_io () No_Prealloc / Buffered_IO Prealloc / Buffered_IO !f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIO But, Javier reported Case () where zoned device bypassed preallocation but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data being read. In order to fix the issue, actually we need to preallocate blocks whenever we fall back to buffered IO like this. No change is made in the other cases. allow_outplace_dio !allow_outplace_dio f2fs_force_buffered_io (*) Prealloc / Buffered_IO Prealloc / Buffered_IO !f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIO Reported-and-tested-by: Javier Gonzalez <javier@javigon.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Javier González <javier@javigon.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2020-01-28 10:37:31 -08:00
Eric Biggers	c4948febfd	f2fs: support STATX_ATTR_VERITY Set the STATX_ATTR_VERITY bit when the statx() system call is used on a verity file on f2fs. Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-01-09 14:58:36 -08:00
Eric Biggers	a6a7ff5b18	f2fs: add support for IV_INO_LBLK_64 encryption policies f2fs inode numbers are stable across filesystem resizing, and f2fs inode and file logical block numbers are always 32-bit. So f2fs can always support IV_INO_LBLK_64 encryption policies. Wire up the needed fscrypt_operations to declare support. Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-01-09 14:58:34 -08:00
Jaegeuk Kim	2fe7325a0e	f2fs: stop GC when the victim becomes fully valid We must stop GC, once the segment becomes fully valid. Otherwise, it can produce another dirty segments by moving valid blocks in the segment partially. Ramon hit no free segment panic sometimes and saw this case happens when validating reliable file pinning feature. Signed-off-by: Ramon Pantin <pantin@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 16:17:11 -08:00
Jaegeuk Kim	9b45d1eeae	f2fs: expose main_blkaddr in sysfs Expose in /sys/fs/f2fs/<blockdev>/main_blkaddr the block address where the main area starts. This allows user mode programs to determine: - That pinned files that are made exclusively of fully allocated 2MB segments will never be unpinned by the file system. - Where the main area starts. This is required by programs that want to verify if a file is made exclusively of 2MB f2fs segments, the alignment boundary for segments starts at this address. Testing for 2MB alignment relative to the start of the device is incorrect, because for some filesystems main_blkaddr is not at a 2MB boundary relative to the start of the device. The entry will be used when validating reliable pinning file feature proposed by "f2fs: support aligned pinned file". Signed-off-by: Ramon Pantin <pantin@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 16:17:10 -08:00
Chengguang Xu	c70bcddf94	f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() Setting softlimit larger than hardlimit seems meaningless for disk quota but currently it is allowed. In this case, there may be a bit of comfusion for users when they run df comamnd to directory which has project quota. For example, we set 20M softlimit and 10M hardlimit of block usage limit for project quota of test_dir(project id 123). [root@hades f2fs]# repquota -P -a *** Report for project quotas on device /dev/nvme0n1p8 Block grace time: 7days; Inode grace time: 7days Block limits File limits Project used soft hard grace used soft hard grace ---------------------------------------------------------------------- 0 -- 4 0 0 1 0 0 123 +- 10248 20480 10240 2 0 0 The result of df command as below: [root@hades f2fs]# df -h /mnt/f2fs/test Filesystem Size Used Avail Use% Mounted on /dev/nvme0n1p8 20M 11M 10M 51% /mnt/f2fs Even though it looks like there is another 10M free space to use, if we write new data to diretory test(inherit project id), the write will fail with errno(-EDQUOT). After this patch, the df result looks like below. [root@hades f2fs]# df -h /mnt/f2fs/test Filesystem Size Used Avail Use% Mounted on /dev/nvme0n1p8 10M 10M 0 100% /mnt/f2fs Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 16:17:09 -08:00
Sahitya Tummala	bcee82097f	f2fs: Fix deadlock in f2fs_gc() context during atomic files handling The FS got stuck in the below stack when the storage is almost full/dirty condition (when FG_GC is being done). schedule_timeout io_schedule_timeout congestion_wait f2fs_drop_inmem_pages_all f2fs_gc f2fs_balance_fs __write_node_page f2fs_fsync_node_pages f2fs_do_sync_file f2fs_ioctl The root cause for this issue is there is a potential infinite loop in f2fs_drop_inmem_pages_all() for the case where gc_failure is true and when there an inode whose i_gc_failures[GC_FAILURE_ATOMIC] is not set. Fix this by keeping track of the total atomic files currently opened and using that to exit from this condition. Fix-suggested-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 16:17:08 -08:00
Chao Yu	f8db0be120	f2fs: show f2fs instance in printk_ratelimited As Eric mentioned, bare printk{,_ratelimited} won't show which filesystem instance these message is coming from, this patch tries to show fs instance with sb->s_id field in all places we missed before. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 16:17:04 -08:00
Chao Yu	7c4b00dcb8	f2fs: fix potential overflow We expect 64-bit calculation result from below statement, however in 32-bit machine, looped left shift operation on pgoff_t type variable may cause overflow issue, fix it by forcing type cast. page->index << PAGE_SHIFT; Fixes: 26de9b117130 ("f2fs: avoid unnecessary updating inode during fsync") Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:58 -08:00
Chao Yu	f96f967879	f2fs: fix to update dir's i_pino during cross_rename As Eric reported: RENAME_EXCHANGE support was just added to fsstress in xfstests: commit 65dfd40a97b6bbbd2a22538977bab355c5bc0f06 Author: kaixuxia <xiakaixu1987@gmail.com> Date: Thu Oct 31 14:41:48 2019 +0800 fsstress: add EXCHANGE renameat2 support This is causing xfstest generic/579 to fail due to fsck.f2fs reporting errors. I'm not sure what the problem is, but it still happens even with all the fs-verity stuff in the test commented out, so that the test just runs fsstress. generic/579 23s ... [10:02:25] [ 7.745370] run fstests generic/579 at 2019-11-04 10:02:25 _check_generic_filesystem: filesystem on /dev/vdc is inconsistent (see /results/f2fs/results-default/generic/579.full for details) [10:02:47] Ran: generic/579 Failures: generic/579 Failed 1 of 1 tests Xunit report: /results/f2fs/results-default/result.xml Here's the contents of 579.full: _check_generic_filesystem: filesystem on /dev/vdc is inconsistent * fsck.f2fs output * [ASSERT] (__chk_dots_dentries:1378) --> Bad inode number[0x24] for '..', parent parent ino is [0xd10] The root cause is that we forgot to update directory's i_pino during cross_rename, fix it. Fixes: 32f9bc25cbda0 ("f2fs: support ->rename2()") Signed-off-by: Chao Yu <yuchao0@huawei.com> Tested-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:57 -08:00
Jaegeuk Kim	ce44f2ad72	f2fs: support aligned pinned file This patch supports 2MB-aligned pinned file, which can guarantee no GC at all by allocating fully valid 2MB segment. Check free segments by has_not_enough_free_secs() with large budget. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:56 -08:00
Jaegeuk Kim	52963b54ef	f2fs: avoid kernel panic on corruption test xfstests/generic/475 complains kernel warn/panic while testing corrupted disk. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:55 -08:00
Chao Yu	47c687236f	f2fs: cache global IPU bio In commit 8648de2c581e ("f2fs: add bio cache for IPU"), we added f2fs_submit_ipu_bio() in __write_data_page() as below: __write_data_page() if (!S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode)) { f2fs_submit_ipu_bio(sbi, bio, page); .... } in order to avoid below deadlock: Thread A Thread B - __write_data_page (inode x, page y) - f2fs_do_write_data_page - set_page_writeback ---- set writeback flag in page y - f2fs_inplace_write_data - f2fs_balance_fs - lock gc_mutex - lock gc_mutex - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - f2fs_wait_on_page_writeback - wait_on_page_writeback --- wait writeback of page y However, the bio submission breaks the merge of IPU IOs. So in this patch let's add a global bio cache for merged IPU pages, then f2fs_wait_on_page_writeback() is able to submit bio if a writebacked page is cached in global bio cache. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:52 -08:00
Randall Huang	faa35391f5	f2fs: fix to avoid memory leakage in f2fs_listxattr In f2fs_listxattr, there is no boundary check before memcpy e_name to buffer. If the e_name_len is corrupted, unexpected memory contents may be returned to the buffer. Signed-off-by: Randall Huang <huangrandall@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:51 -08:00
Qiuyang Sun	c5f24de38f	f2fs: check total_segments from devices in raw_super For multi-device F2FS, we should check if the sum of total_segments from all devices matches segment_count. Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:50 -08:00
Qiuyang Sun	7706be6fcb	f2fs: update multi-dev metadata in resize_fs Multi-device metadata should be updated in resize_fs as well. Also, we check that the new FS size still reaches the last device. Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:49 -08:00
Chengguang Xu via Linux-f2fs-devel	71ca56604c	f2fs: mark recovery flag correctly in read_raw_super_block() On the combination of first fail and second success, we will miss to mark recovery flag because currently we reuse err variable in the loop. Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:48 -08:00
Chao Yu	4bcefaebc5	f2fs: fix to update time in lazytime mode generic/018 reports an inconsistent status of atime, the testcase is as below: - open file with O_SYNC - write file to construct fraged space - calc md5 of file - record {a,c,m}time - defrag file --- do nothing - umount & mount - check {a,c,m}time The root cause is, as f2fs enables lazytime by default, atime update will dirty vfs inode, rather than dirtying f2fs inode (by set with FI_DIRTY_INODE), so later f2fs_write_inode() called from VFS will fail to update inode page due to our skip: f2fs_write_inode() if (is_inode_flag_set(inode, FI_DIRTY_INODE)) return 0; So eventually, after evict(), we lose last atime for ever. To fix this issue, we need to check whether {a,c,m,cr}time is consistent in between inode cache and inode page, and only skip f2fs_update_inode() if f2fs inode is not dirty and time is consistent as well. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-12-02 15:56:44 -08:00
Sahitya Tummala	d9b27dd376	f2fs: add a condition to detect overflow in f2fs_ioc_gc_range() end = range.start + range.len; If the range.start/range.len is a very large value, then end can overflow in this operation. It results into a crash in get_valid_blocks() when accessing the invalid range.start segno. This issue is reported in ioctl fuzz testing. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-24 09:11:26 -07:00
Chao Yu	dc27e30bae	f2fs: fix to add missing F2FS_IO_ALIGNED() condition In f2fs_allocate_data_block(), we will reset fio.retry for IO alignment feature instead of IO serialization feature. In addition, spread F2FS_IO_ALIGNED() to check IO alignment feature status explicitly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-24 09:11:26 -07:00
Chao Yu	6e31285dcd	f2fs: fix to fallback to buffered IO in IO aligned mode In LFS mode, we allow OPU for direct IO, however, we didn't consider IO alignment feature, so direct IO can trigger unaligned IO, let's just fallback to buffered IO to keep correct IO alignment semantics in all places. Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-24 09:11:26 -07:00
Chao Yu	1199c3ac7e	f2fs: fix to handle error path correctly in f2fs_map_blocks In f2fs_map_blocks(), we should bail out once __allocate_data_block() failed. Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-24 09:11:25 -07:00

1 2 3 4 5 ...

2727 Commits