2982 Commits

Author SHA1 Message Date
Eric Biggers
46f875355d ANDROID: f2fs: add back compress inode check
f2fs_force_buffered_io() originally had two checks of
f2fs_compressed_file(), when only one was needed.  The one via
f2fs_post_read_required() got removed by ANDROID commit 4f27c8b90bd2
("ANDROID: ext4, f2fs: enable direct I/O with inline encryption").  Then
more recently, the second was removed by upstream commit 73bfe5dfd09d
("f2fs: remove redundant compress inode check"), but this wasn't fixed
up during the merge resolution.

This incorrectly left no checks remaining, so add one back.

Reported at
https://lkml.kernel.org/r/560266ca-0164-c02e-18ea-55564683d13e@huawei.com

Fixes: db50a8301ece ("Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-4.14.y' into android-4.14-stable")
Change-Id: I3f6afdd221b6e44f1caa5a92d3c751a573e284df
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-10 14:08:39 -07:00
Sahitya Tummala
25007e4708 FROMLIST: f2fs: fix use-after-free when accessing bio->bi_crypt_context
There could be a potential race between these two paths below,
leading to use-after-free when accessing bio->bi_crypt_context.

f2fs_write_cache_pages
->f2fs_do_write_data_page on page#1
  ->f2fs_inplace_write_data
    ->f2fs_merge_page_bio
      ->add_bio_entry
->f2fs_do_write_data_page on page#2
  ->f2fs_inplace_write_data
    ->f2fs_merge_page_bio
      ->f2fs_crypt_mergeable_bio
        ->fscrypt_mergeable_bio
                                       f2fs_write_begin on page#1
                                       ->f2fs_wait_on_page_writeback
                                         ->f2fs_submit_merged_ipu_write
                                           ->__submit_bio
                                        The bio gets completed, calling
                                        bio_endio
                                        ->bio_uninit
                                          ->bio_crypt_free_ctx
          ->use-after-free issue

Fix this by moving f2fs_crypt_mergeable_bio() check within
add_ipu_page() so that it's done under bio_list_lock to prevent
the above race.

Bug: 137270441
Link: https://lore.kernel.org/linux-f2fs-devel/1592193588-21701-1-git-send-email-stummala@codeaurora.org/
Fixes: fb710731b64b ("f2fs: add inline encryption support")
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Satya Tangirala <satyat@google.com>
Change-Id: I1bd2cfa430423ba2a8d7c1da505322ded097cd9e
2020-06-22 19:32:27 +00:00
Eric Biggers
ddc9c4db87 Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-4.14.y' into android-4.14-stable
* aosp/upstream-f2fs-stable-linux-4.14.y:
  fscrypt: remove stale definition
  fs-verity: remove unnecessary extern keywords
  fs-verity: fix all kerneldoc warnings
  fscrypt: add support for IV_INO_LBLK_32 policies
  fscrypt: make test_dummy_encryption use v2 by default
  fscrypt: support test_dummy_encryption=v2
  fscrypt: add fscrypt_add_test_dummy_key()
  linux/parser.h: add include guards
  fscrypt: remove unnecessary extern keywords
  fscrypt: name all function parameters
  fscrypt: fix all kerneldoc warnings

Conflicts:
	fs/crypto/fscrypt_private.h
	fs/crypto/keyring.c
	fs/crypto/keysetup.c
	fs/ext4/ext4.h
	fs/ext4/super.c
	fs/f2fs/f2fs.h
	fs/f2fs/super.c
	include/linux/fscrypt.h

Resolved the conflicts as per the corresponding android-mainline change,
I7198edbca759839aceeec2598e7a81305756c4d7.

Bug: 154167995
Test: kvm-xfstests -c ext4,f2fs,ext4/encrypt,f2fs/encrypt \
        -g encrypt -g verity -g casefold
      kvm-xfstests -c ext4,f2fs,ext4/encrypt,f2fs/encrypt \
        -g encrypt -g verity -g casefold -m inlinecrypt
Change-Id: Id12839f7948374575f9d15eee6a9c6a9382eacf3
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-06-22 09:43:05 -07:00
Eric Biggers
11807f3279 fscrypt: support test_dummy_encryption=v2
v1 encryption policies are deprecated in favor of v2, and some new
features (e.g. encryption+casefolding) are only being added for v2.

Therefore, the "test_dummy_encryption" mount option (which is used for
encryption I/O testing with xfstests) needs to support v2 policies.

To do this, extend its syntax to be "test_dummy_encryption=v1" or
"test_dummy_encryption=v2".  The existing "test_dummy_encryption" (no
argument) also continues to be accepted, to specify the default setting
-- currently v1, but the next patch changes it to v2.

To cleanly support both v1 and v2 while also making it easy to support
specifying other encryption settings in the future (say, accepting
"$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a
pointer to the dummy fscrypt_context rather than using mount flags.

To avoid concurrency issues, don't allow test_dummy_encryption to be set
or changed during a remount.  (The former restriction is new, but
xfstests doesn't run into it, so no one should notice.)

Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'.  On ext4,
there are two regressions, both of which are test bugs: ext4/023 and
ext4/028 fail because they set an xattr and expect it to be stored
inline, but the increase in size of the fscrypt_context from
24 to 40 bytes causes this xattr to be spilled into an external block.

Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-06-16 14:08:13 -07:00
Jaegeuk Kim
db50a8301e Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-4.14.y' into android-4.14-stable
* aosp/upstream-f2fs-stable-linux-4.14.y:
  f2fs: attach IO flags to the missing cases
  f2fs: add node_io_flag for bio flags likewise data_io_flag
  f2fs: remove unused parameter of f2fs_put_rpages_mapping()
  f2fs: handle readonly filesystem in f2fs_ioc_shutdown()
  f2fs: avoid utf8_strncasecmp() with unstable name
  f2fs: don't return vmalloc() memory from f2fs_kmalloc()
  f2fs: fix retry logic in f2fs_write_cache_pages()
  f2fs: fix wrong discard space
  f2fs: compress: don't compress any datas after cp stop
  f2fs: remove unneeded return value of __insert_discard_tree()
  f2fs: fix wrong value of tracepoint parameter
  f2fs: protect new segment allocation in expand_inode_data
  f2fs: code cleanup by removing ifdef macro surrounding
  writeback: Avoid skipping inode writeback
  f2fs: avoid inifinite loop to wait for flushing node pages at cp_error
  f2fs: compress: fix zstd data corruption
  f2fs: add compressed/gc data read IO stat
  f2fs: fix potential use-after-free issue
  f2fs: compress: don't handle non-compressed data in workqueue
  f2fs: remove redundant assignment to variable err
  f2fs: refactor resize_fs to avoid meta updates in progress
  f2fs: use round_up to enhance calculation
  f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS
  f2fs: Avoid double lock for cp_rwsem during checkpoint
  f2fs: report delalloc reserve as non-free in statfs for project quota
  f2fs: Fix wrong stub helper update_sit_info
  f2fs: compress: let lz4 compressor handle output buffer budget properly
  f2fs: remove blk_plugging in block_operations
  f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS
  f2fs: shrink spinlock coverage
  f2fs: correctly fix the parent inode number during fsync()
  f2fs: introduce mempool for {,de}compress intermediate page allocation
  f2fs: introduce f2fs_bmap_compress()
  f2fs: support fiemap on compressed inode
  f2fs: support partial truncation on compressed inode
  f2fs: remove redundant compress inode check
  f2fs: use strcmp() in parse_options()
  f2fs: Use the correct style for SPDX License Identifier

 Conflicts:
	fs/f2fs/data.c
	fs/f2fs/dir.c

Bug: 154167995
Change-Id: I04ec97a9cafef2d7b8736f36a2a8d244965cae9a
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
2020-06-15 13:25:41 -07:00
Jaegeuk Kim
a3a7060dbf f2fs: attach IO flags to the missing cases
This adds more IOs to attach flags.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:32 -07:00
Jaegeuk Kim
d9261e4293 f2fs: add node_io_flag for bio flags likewise data_io_flag
This patch adds another way to attach bio flags to node writes.

Description:   Give a way to attach REQ_META|FUA to node writes
               given temperature-based bits. Now the bits indicate:
               *      REQ_META     |      REQ_FUA      |
               *    5 |    4 |   3 |    2 |    1 |   0 |
               * Cold | Warm | Hot | Cold | Warm | Hot |

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:31 -07:00
Chao Yu
6e6ccfd4b7 f2fs: remove unused parameter of f2fs_put_rpages_mapping()
Just cleanup, no logic change.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:30 -07:00
Chao Yu
aa74c1acdc f2fs: handle readonly filesystem in f2fs_ioc_shutdown()
If mountpoint is readonly, we should allow shutdowning filesystem
successfully, this fixes issue found by generic/599 testcase of
xfstest.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:29 -07:00
Eric Biggers
49d7e80073 f2fs: avoid utf8_strncasecmp() with unstable name
If the dentry name passed to ->d_compare() fits in dentry::d_iname, then
it may be concurrently modified by a rename.  This can cause undefined
behavior (possibly out-of-bounds memory accesses or crashes) in
utf8_strncasecmp(), since fs/unicode/ isn't written to handle strings
that may be concurrently modified.

Fix this by first copying the filename to a stack buffer if needed.
This way we get a stable snapshot of the filename.

Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups")
Cc: <stable@vger.kernel.org> # v5.4+
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Daniel Rosenberg <drosen@google.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:28 -07:00
Eric Biggers
5d2f481734 f2fs: don't return vmalloc() memory from f2fs_kmalloc()
kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either
kmalloc'ed or vmalloc'ed memory.  But the f2fs wrappers, f2fs_kmalloc()
and f2fs_kvmalloc(), both return both kinds of memory.

It's redundant to have two functions that do the same thing, and also
breaking the standard naming convention is causing bugs since people
assume it's safe to kfree() memory allocated by f2fs_kmalloc().  See
e.g. the various allocations in fs/f2fs/compress.c.

Fix this by making f2fs_kmalloc() just use kmalloc().  And to avoid
re-introducing the allocation failures that the vmalloc fallback was
intended to fix, convert the largest allocations to use f2fs_kvmalloc().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08 20:41:27 -07:00
Sahitya Tummala
9c3820ab37 f2fs: fix retry logic in f2fs_write_cache_pages()
In case a compressed file is getting overwritten, the current retry
logic doesn't include the current page to be retried now as it sets
the new start index as 0 and new end index as writeback_index - 1.
This causes the corresponding cluster to be uncompressed and written
as normal pages without compression. Fix this by allowing writeback to
be retried for the current page as well (in case of compressed page
getting retried due to index mismatch with cluster index). So that
this cluster can be written compressed in case of overwrite.

Also, align f2fs_write_cache_pages() according to the change -
<64081362e8ff>("mm/page-writeback.c: fix range_cyclic writeback vs
writepages deadlock").

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-04 11:54:32 -07:00
Chao Yu
537c6dad5b f2fs: fix wrong discard space
Under heavy fsstress, we may triggle panic while issuing discard,
because __check_sit_bitmap() detects that discard command may earse
valid data blocks, the root cause is as below race stack described,
since we removed lock when flushing quota data, quota data writeback
may race with write_checkpoint(), so that it causes inconsistency in
between cached discard entry and segment bitmap.

- f2fs_write_checkpoint
 - block_operations
  - set_sbi_flag(sbi, SBI_QUOTA_SKIP_FLUSH)
 - f2fs_flush_sit_entries
  - add_discard_addrs
   - __set_bit_le(i, (void *)de->discard_map);
						- f2fs_write_data_pages
						 - f2fs_write_single_data_page
						   : inode is quota one, cp_rwsem won't be locked
						  - f2fs_do_write_data_page
						   - f2fs_allocate_data_block
						    - f2fs_wait_discard_bio
						      : discard entry has not been added yet.
						    - update_sit_entry
 - f2fs_clear_prefree_segments
  - f2fs_issue_discard
  : add discard entry

In order to fix this, this patch uses node_write to serialize
f2fs_allocate_data_block and checkpoint.

Fixes: 435cbab95e39 ("f2fs: fix quota_sync failure due to f2fs_lock_op")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:08 -07:00
Chao Yu
9787988476 f2fs: compress: don't compress any datas after cp stop
While compressed data writeback, we need to drop dirty pages like we did
for non-compressed pages if cp stops, however it's not needed to compress
any data in such case, so let's detect cp stop condition in
cluster_may_compress() to avoid redundant compressing and let following
f2fs_write_raw_pages() drops dirty pages correctly.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:07 -07:00
Chao Yu
b6be68eec8 f2fs: remove unneeded return value of __insert_discard_tree()
We never use return value of __insert_discard_tree(), so remove it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:06 -07:00
Chao Yu
82f2f742dd f2fs: fix wrong value of tracepoint parameter
In f2fs_lookup(), we should set @err correctly before printing it
in tracepoint.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:05 -07:00
Daeho Jeong
674c17d879 f2fs: protect new segment allocation in expand_inode_data
Found a new segemnt allocation without f2fs_lock_op() in
expand_inode_data(). So, when we do fallocate() for a pinned file
and trigger checkpoint very frequently and simultaneously. F2FS gets
stuck in the below code of do_checkpoint() forever.

  f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
  /* Wait for all dirty meta pages to be submitted for IO */
                                                <= if fallocate() here,
  f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META); <= it'll wait forever.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:04 -07:00
Chengguang Xu
6123fcf9d9 f2fs: code cleanup by removing ifdef macro surrounding
Define f2fs_listxattr and to NULL when CONFIG_F2FS_FS_XATTR is not
enabled, then we can remove many ugly ifdef macros in the code.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-30 08:49:03 -07:00
Jaegeuk Kim
ca504a6c00 Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-4.14.y' into android-4.14-stable
This series addressed merge conflicts based on pa/c/1664425/15, mainly
integrated with a patch "f2fs: Handle casefolding with Encryption" for
casefolding support in ACK only.

* aosp/upstream-f2fs-stable-linux-4.14.y:
  f2fs: flush dirty meta pages when flushing them
  f2fs: fix checkpoint=disable:%u%%
  f2fs: rework filename handling
  f2fs: split f2fs_d_compare() from f2fs_match_name()
  f2fs: don't leak filename in f2fs_try_convert_inline_dir()
  Revert "f2fs: refactor resize_fs to avoid meta updates in progress"
  f2fs: fix missing check for f2fs_unlock_op
  f2fs: refactor resize_fs to avoid meta updates in progress

Conflicts:
	fs/f2fs/dir.c
	fs/f2fs/f2fs.h
	fs/f2fs/hash.c
	fs/f2fs/inline.c
	fs/f2fs/namei.c

Change-Id: Ib5ceb0f2f076d6c215d4c0c6262f3c1d41cde7c8
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
2020-05-27 13:43:59 -07:00
Eric Biggers
09075917fb ANDROID: fscrypt: handle direct I/O with IV_INO_LBLK_32
With the existing fscrypt IV generation methods, each file's data blocks
have contiguous DUNs.  Therefore the direct I/O code "just worked"
because it only submits logically contiguous bios.  But with
IV_INO_LBLK_32, the direct I/O code breaks because the DUN can wrap from
0xffffffff to 0.  We can't submit bios across such boundaries.

This is especially difficult to handle when block_size != PAGE_SIZE,
since in that case the DUN can wrap in the middle of a page.  Punt on
this case for now and just handle block_size == PAGE_SIZE.

Add and use a new function fscrypt_dio_supported() to check whether a
direct I/O request is unsupported due to encryption constraints.

Then, update fs/direct-io.c (used by f2fs, and by ext4 in kernel v5.4
and earlier) and fs/iomap/direct-io.c (used by ext4 in kernel v5.5 and
later) to avoid submitting I/O across a DUN discontinuity.

(This is needed in ACK now because ACK already supports direct I/O with
inline crypto.  I'll be sending this upstream along with the encrypted
direct I/O support itself once its prerequisites are closer to landing.)

(cherry picked from android-mainline commit
 8d6c90c9d68b985fa809626d12f8c9aff3c9dcb1)

Conflicts:
	fs/ext4/file.c
	fs/iomap/direct-io.c

(Dropped the iomap changes because in kernel v5.4 and earlier,
 ext4 doesn't use iomap for direct I/O)

Test: For now, just manually tested direct I/O on ext4 and f2fs in the
      DUN discontinuity case.
Bug: 144046242
Change-Id: I0c0b0b20a73ade35c3660cc6f9c09d49d3853ba5
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-26 17:54:48 +00:00
Jaegeuk Kim
0c436a78eb f2fs: avoid inifinite loop to wait for flushing node pages at cp_error
Shutdown test is somtimes hung, since it keeps trying to flush dirty node pages
in an inifinite loop. Let's drop dirty pages at umount in that case.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-24 21:01:06 -07:00
Chao Yu
e00cdb19a8 f2fs: compress: fix zstd data corruption
During zstd compression, ZSTD_endStream() may return non-zero value
because distination buffer is full, but there is still compressed data
remained in intermediate buffer, it means that zstd algorithm can not
save at last one block space, let's just writeback raw data instead of
compressed one, this can fix data corruption when decompressing
incomplete stored compression data.

Fixes: 50cfa66f0de0 ("f2fs: compress: support zstd compress algorithm")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:59 -07:00
Chao Yu
bdd18b0594 f2fs: add compressed/gc data read IO stat
in order to account data read IOs more accurately.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:58 -07:00
Chao Yu
96d6e3f043 f2fs: fix potential use-after-free issue
In error path of f2fs_read_multi_pages(), it should let last referrer
release decompress io context memory, otherwise, other referrer will
cause use-after-free issue.

Fixes: 4c8ff7095bef ("f2fs: support data compression")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:58 -07:00
Chao Yu
8d8b720ea9 f2fs: compress: don't handle non-compressed data in workqueue
If bio has no compressed data, we don't need to handle end_io work in
workqueue, instead, it should just let interrupter handle it directly
to speed up IO response.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:58 -07:00
Colin Ian King
a884feeebd f2fs: remove redundant assignment to variable err
The variable err is being assigned with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:58 -07:00
Jaegeuk Kim
963d404e88 f2fs: refactor resize_fs to avoid meta updates in progress
Sahitya raised an issue:
- prevent meta updates while checkpoint is in progress

allocate_segment_for_resize() can cause metapage updates if
it requires to change the current node/data segments for resizing.
Stop these meta updates when there is a checkpoint already
in progress to prevent inconsistent CP data.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:57 -07:00
Chao Yu
eb7dd3a62b f2fs: use round_up to enhance calculation
.i_cluster_size should be power of 2, so we can use round_up() instead
of roundup() to enhance the calculation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:57 -07:00
Chao Yu
2cf3a3e7bf f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS
This patch introduces a new ioctl to rollback all compress inode
status:
- add reserved blocks in dnode blocks
- increase i_compr_blocks, i_blocks, total_valid_block_count
- remove immutable flag

Then compress inode can be restored to support overwrite
functionality again.

Signee-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:57 -07:00
Sayali Lokhande
82cc5e607b f2fs: Avoid double lock for cp_rwsem during checkpoint
There could be a scenario where f2fs_sync_node_pages gets
called during checkpoint, which in turn tries to flush
inline data and calls iput(). This results in deadlock as
iput() tries to hold cp_rwsem, which is already held at the
beginning by checkpoint->block_operations().

Call stack :

Thread A		Thread B
f2fs_write_checkpoint()
- block_operations(sbi)
 - f2fs_lock_all(sbi);
  - down_write(&sbi->cp_rwsem);

                        - open()
                         - igrab()
                        - write() write inline data
                        - unlink()
- f2fs_sync_node_pages()
 - if (is_inline_node(page))
  - flush_inline_data()
   - ilookup()
     page = f2fs_pagecache_get_page()
     if (!page)
      goto iput_out;
     iput_out:
			-close()
			-iput()
       iput(inode);
       - f2fs_evict_inode()
        - f2fs_truncate_blocks()
         - f2fs_lock_op()
           - down_read(&sbi->cp_rwsem);

Fixes: 2049d4fcb057 ("f2fs: avoid multiple node page writes due to inline_data")
Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:56 -07:00
Konstantin Khlebnikov
cd74e150ae f2fs: report delalloc reserve as non-free in statfs for project quota
This reserved space isn't committed yet but cannot be used for
allocations. For userspace it has no difference from used space.

See the same fix in ext4 commit f06925c73942 ("ext4: report delalloc
reserve as non-free in statfs for project quota").

Fixes: ddc34e328d06 ("f2fs: introduce f2fs_statfs_project")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:56 -07:00
YueHaibing
9f25700da3 f2fs: Fix wrong stub helper update_sit_info
update_sit_info should be f2fs_update_sit_info,
otherwise build fails while no CONFIG_F2FS_STAT_FS.

Fixes: fc7100ea2a52 ("f2fs: Add f2fs stats to sysfs")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:56 -07:00
Chao Yu
61163cba00 f2fs: compress: let lz4 compressor handle output buffer budget properly
Commonly, in order to handle lz4 worst compress case, caller should
allocate buffer with size of LZ4_compressBound(inputsize) for target
compressed data storing, however in this case, if caller didn't
allocate enough space, lz4 compressor still can handle output buffer
budget properly, and end up compressing when left space in output
buffer is not enough.

So we don't have to allocate buffer with size for worst case, then
we can avoid 2 * 4KB size intermediate buffer allocation when
log_cluster_size is 2, and avoid unnecessary compressing work of
compressor if we can not save at least 4KB space.

Suggested-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:55 -07:00
Jaegeuk Kim
30796f18dc f2fs: remove blk_plugging in block_operations
blk_plugging doesn't seem to give any benefit.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:55 -07:00
Chao Yu
25544ed1f9 f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS
There are still reserved blocks on compressed inode, this patch
introduce a new ioctl to help release reserved blocks back to
filesystem, so that userspace can reuse those freed space.

----
Daeho fixed a bug like below.

Now, if writing pages and releasing compress blocks occur
simultaneously, and releasing cblocks is executed more than one time
to a file, then total block count of filesystem and block count of the
file could be incorrect and damaged.

We have to execute releasing compress blocks only one time for a file
without being interfered by writepages path.
---

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:55 -07:00
Chao Yu
c768e67501 f2fs: shrink spinlock coverage
In f2fs_try_to_free_nids(), .nid_list_lock spinlock critical region will
increase as expected shrink number increase, to avoid spining other CPUs
for long time, we change to release nid caches with small batch each time
under .nid_list_lock coverage.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:53 -07:00
Eric Biggers
3f1146caa3 f2fs: correctly fix the parent inode number during fsync()
fsync() may be called on a deleted file that's still open.  So when
fsync() tries to set the parent inode number when the inode has
LOST_PINO and i_nlink == 1 (to avoid later checkpoints), it needs to
make sure to get the parent directory via a non-deleted alias.

Also remove the unnecessary igrab() and iput(), as the caller already
holds a reference to the inode.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 23:43:53 -07:00
Chao Yu
32328eaa3f f2fs: introduce mempool for {,de}compress intermediate page allocation
If compression feature is on, in scenario of no enough free memory,
page refault ratio is higher than before, the root cause is:
- {,de}compression flow needs to allocate intermediate pages to store
compressed data in cluster, so during their allocation, vm may reclaim
mmaped pages.
- if above reclaimed pages belong to compressed cluster, during its
refault, it may cause more intermediate pages allocation, result in
reclaiming more mmaped pages.

So this patch introduces a mempool for intermediate page allocation,
in order to avoid high refault ratio, by default, number of
preallocated page in pool is 512, user can change the number by
assigning 'num_compress_pages' parameter during module initialization.

Ma Feng found warnings in the original patch and fixed like below.

Fix the following sparse warning:
fs/f2fs/compress.c:501:5: warning: symbol 'num_compress_pages' was not declared.
 Should it be static?
fs/f2fs/compress.c:530:6: warning: symbol 'f2fs_compress_free_page' was not
declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:33 -07:00
Chao Yu
1856ffe96f f2fs: introduce f2fs_bmap_compress()
to support bmap() on compressed inode: if queried block locates in
non-compressed cluster, return its physical block address.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:33 -07:00
Chao Yu
066c14e566 f2fs: support fiemap on compressed inode
Map normal/compressed cluster of compressed inode correctly, and give
the right fiemap flag FIEMAP_EXTENT_ENCODED on mapped compressed extent.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:33 -07:00
Chao Yu
b390210e01 f2fs: support partial truncation on compressed inode
Supports to truncate compressed/normal cluster partially on compressed
inode.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:33 -07:00
Chao Yu
73bfe5dfd0 f2fs: remove redundant compress inode check
due to f2fs_post_read_required() has did that.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:33 -07:00
Jaegeuk Kim
2116cb4204 f2fs: flush dirty meta pages when flushing them
Let's guarantee flusing dirty meta pages to avoid infinite loop.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Eric Biggers
a73baecd77 f2fs: use strcmp() in parse_options()
Remove the pointless string length checks.  Just use strcmp().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Jaegeuk Kim
f8a8368ae7 f2fs: fix checkpoint=disable:%u%%
When parsing the mount option, we don't have sbi->user_block_count.
Should do it after getting it.

Cc: <stable@vger.kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Nishad Kamdar
3289192145 f2fs: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style in
header files related to F2FS File System support.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used).

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Eric Biggers
77b2eab354 f2fs: rework filename handling
Rework f2fs's handling of filenames to use a new 'struct f2fs_filename'.
Similar to 'struct ext4_filename', this stores the usr_fname, disk_name,
dirhash, crypto_buf, and casefolded name.  Some of these names can be
NULL in some cases.  'struct f2fs_filename' differs from
'struct fscrypt_name' mainly in that the casefolded name is included.

For user-initiated directory operations like lookup() and create(),
initialize the f2fs_filename by translating the corresponding
fscrypt_name, then computing the dirhash and casefolded name if needed.

This makes the dirhash and casefolded name be cached for each syscall,
so we don't have to recompute them repeatedly.  (Previously, f2fs
computed the dirhash once per directory level, and the casefolded name
once per directory block.)  This improves performance.

This rework also makes it much easier to correctly handle all
combinations of normal, encrypted, casefolded, and encrypted+casefolded
directories.  (The fourth isn't supported yet but is being worked on.)

The only other cases where an f2fs_filename gets initialized are for two
filesystem-internal operations: (1) when converting an inline directory
to a regular one, we grab the needed disk_name and hash from an existing
f2fs_dir_entry; and (2) when roll-forward recovering a new dentry, we
grab the needed disk_name from f2fs_inode::i_name and compute the hash.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Eric Biggers
89b2283759 f2fs: split f2fs_d_compare() from f2fs_match_name()
Sharing f2fs_ci_compare() between comparing cached dentries
(f2fs_d_compare()) and comparing on-disk dentries (f2fs_match_name())
doesn't work as well as intended, as these actions fundamentally differ
in several ways (e.g. whether the task may sleep, whether the directory
is stable, whether the casefolded name was precomputed, whether the
dentry will need to be decrypted once we allow casefold+encrypt, etc.)

Just make f2fs_d_compare() implement what it needs directly, and rework
f2fs_ci_compare() to be specialized for f2fs_match_name().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:32 -07:00
Eric Biggers
9abaaca262 f2fs: don't leak filename in f2fs_try_convert_inline_dir()
We need to call fscrypt_free_filename() to free the memory allocated by
fscrypt_setup_filename().

Fixes: b06af2aff28b ("f2fs: convert inline_dir early before starting rename")
Cc: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-18 13:23:31 -07:00
Jaegeuk Kim
9bbb918999 Revert "f2fs: refactor resize_fs to avoid meta updates in progress"
This reverts commit 7529068a775e71f7c6f70149f70c6031a79daeb3.
2020-05-15 08:04:29 -07:00