56313 Commits

Author SHA1 Message Date
Josef Bacik
b6a6c52de6 btrfs: clean up our handling of refs == 0 in snapshot delete
[ Upstream commit b8ccef048354074a548f108e51d0557d6adfd3a3 ]

In reada we BUG_ON(refs == 0), which could be unkind since we aren't
holding a lock on the extent leaf and thus could get a transient
incorrect answer.  In walk_down_proc we also BUG_ON(refs == 0), which
could happen if we have extent tree corruption.  Change that to return
-EUCLEAN.  In do_walk_down() we catch this case and handle it correctly,
however we return -EIO, which -EUCLEAN is a more appropriate error code.
Finally in walk_up_proc we have the same BUG_ON(refs == 0), so convert
that to proper error handling.  Also adjust the error message so we can
actually do something with the information.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c847b28a799733b04574060ab9d00f215970627d)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:24 +00:00
Josef Bacik
8f9488fe41 btrfs: replace BUG_ON with ASSERT in walk_down_proc()
[ Upstream commit 1f9d44c0a12730a24f8bb75c5e1102207413cc9b ]

We have a couple of areas where we check to make sure the tree block is
locked before looking up or messing with references.  This is old code
so it has this as BUG_ON().  Convert this to ASSERT() for developers.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 2df0e48615f438cdf93f1469ed9f289f71440d2d)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:24 +00:00
Ryusuke Konishi
faf0987376 nilfs2: fix state management in error path of log writing function
commit 6576dd6695f2afca3f4954029ac4a64f82ba60ab upstream.

After commit a694291a6211 ("nilfs2: separate wait function from
nilfs_segctor_write") was applied, the log writing function
nilfs_segctor_do_construct() was able to issue I/O requests continuously
even if user data blocks were split into multiple logs across segments,
but two potential flaws were introduced in its error handling.

First, if nilfs_segctor_begin_construction() fails while creating the
second or subsequent logs, the log writing function returns without
calling nilfs_segctor_abort_construction(), so the writeback flag set on
pages/folios will remain uncleared.  This causes page cache operations to
hang waiting for the writeback flag.  For example,
truncate_inode_pages_final(), which is called via nilfs_evict_inode() when
an inode is evicted from memory, will hang.

Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared.
As a result, if the next log write involves checkpoint creation, that's
fine, but if a partial log write is performed that does not, inodes with
NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files"
list, and their data and b-tree blocks may not be written to the device,
corrupting the block mapping.

Fix these issues by uniformly calling nilfs_segctor_abort_construction()
on failure of each step in the loop in nilfs_segctor_do_construct(),
having it clean up logs and segment usages according to progress, and
correcting the conditions for calling nilfs_redirty_inodes() to ensure
that the NILFS_I_COLLECTED flag is cleared.

Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com
Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 40a2757de2c376ef8a08d9ee9c81e77f3c750adf)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:22 +00:00
Ryusuke Konishi
812eaaab95 nilfs2: fix missing cleanup on rollforward recovery error
commit 5787fcaab9eb5930f5378d6a1dd03d916d146622 upstream.

In an error injection test of a routine for mount-time recovery, KASAN
found a use-after-free bug.

It turned out that if data recovery was performed using partial logs
created by dsync writes, but an error occurred before starting the log
writer to create a recovered checkpoint, the inodes whose data had been
recovered were left in the ns_dirty_files list of the nilfs object and
were not freed.

Fix this issue by cleaning up inodes that have read the recovery data if
the recovery routine fails midway before the log writer starts.

Link: https://lkml.kernel.org/r/20240810065242.3701-1-konishi.ryusuke@gmail.com
Fixes: 0f3e1c7f23f8 ("nilfs2: recovery functions")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 35a9a7a7d94662146396199b0cfd95f9517cdd14)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:22 +00:00
Jann Horn
a5a11287ea fuse: use unsigned type for getxattr/listxattr size truncation
commit b18915248a15eae7d901262f108d6ff0ffb4ffc1 upstream.

The existing code uses min_t(ssize_t, outarg.size, XATTR_LIST_MAX) when
parsing the FUSE daemon's response to a zero-length getxattr/listxattr
request.
On 32-bit kernels, where ssize_t and outarg.size are the same size, this is
wrong: The min_t() will pass through any size values that are negative when
interpreted as signed.
fuse_listxattr() will then return this userspace-supplied negative value,
which callers will treat as an error value.

This kind of bug pattern can lead to fairly bad security bugs because of
how error codes are used in the Linux kernel. If a caller were to convert
the numeric error into an error pointer, like so:

    struct foo *func(...) {
      int len = fuse_getxattr(..., NULL, 0);
      if (len < 0)
        return ERR_PTR(len);
      ...
    }

then it would end up returning this userspace-supplied negative value cast
to a pointer - but the caller of this function wouldn't recognize it as an
error pointer (IS_ERR_VALUE() only detects values in the narrow range in
which legitimate errno values are), and so it would just be treated as a
kernel pointer.

I think there is at least one theoretical codepath where this could happen,
but that path would involve virtio-fs with submounts plus some weird
SELinux configuration, so I think it's probably not a concern in practice.

Cc: stable@vger.kernel.org # v4.9
Fixes: 63401ccdb2ca ("fuse: limit xattr returned size")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 13d787bb4f21b6dbc8d8291bf179d36568893c25)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:21 +00:00
Jan Kara
a47099495b udf: Limit file size to 4TB
commit c2efd13a2ed4f29bf9ef14ac2fbb7474084655f8 upstream.

UDF disk format supports in principle file sizes up to 1<<64-1. However
the file space (including holes) is described by a linked list of
extents, each of which can have at most 1GB. Thus the creation and
handling of extents gets unusably slow beyond certain point. Limit the
file size to 4TB to avoid locking up the kernel too easily.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a6211d4d3df3a5f90d8bcd11acd91baf7a3c2b5d)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-30 14:08:21 +00:00
Richard Raya
2cd059fb56 This is the 4.14.354 OpenELA-Extended LTS stable release
-----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmcgko0ZHHZlZ2FyZC5u
 b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WL/GD/0em+uP/O8QiPYqeGrEECpW
 bgRsBiN3XnyEsghAjplWX12G/zjxA0PY0u2zh9K9sdPw60n8nVZ1OxvPHINwuSC9
 kE9N60SCpJ88ju9OtU+4xz/nxtEmlel8fWy5elagB5wqbWbvsjT52ceZXqSxqhy7
 pQdIDHSiUUwx9JL6vDuJSL+Z/Y216qvBETZLnDSo90raFp/MDa5JmQsh81lLeUt8
 wGKwC/Olnbd21QTStNK34aQGyX5b+3YeACFVPud66Zs9airz9EE6Yq78gwL29L2k
 4jxzihXxSkkfa66eR63ap53+/mEqOZX72m2qEMVOvAcAwU0XsNDTdkXN7z8YQ5T3
 E1rJwr4Ox0hmM+hHBA20w9xRDXZoZmdrcjsU1aNKuK2zTJ0h9DBIvMM2XY5n5sWK
 I4F8E15KyKmu4nXBETreXZixqVLZMgjNFncRLf8XBIL1kxXm65LYCHypp3AgdVgo
 Ccdq5PbC6LAyNPrIOaftIaS9VlU15cqcalu7A+gSoWq55LGWAa3G9vX0ZtYQB9QX
 0R18fbzyjqG6Wa5J5KRDJ+HyS4IvdnEWS8hMR3jfosjMNgJhfDlDeev8NARBiDpX
 d26xogNA7xOOvtdpuwEbnxD5kR0zUdnC73pC4wxdMptYSK6ULKNPmTkA0dKE9qvl
 TDgw4DML8vXQqJ4P+w3Njw==
 =gX2R
 -----END PGP SIGNATURE-----

Merge tag 'v4.14.354-openela' of https://github.com/openela/kernel-lts

This is the 4.14.354 OpenELA-Extended LTS stable release

* tag 'v4.14.354-openela' of https://github.com/openela/kernel-lts: (90 commits)
  LTS: Update to 4.14.354
  drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var
  ipc: remove memcg accounting for sops objects in do_semtimedop()
  scsi: aacraid: Fix double-free on probe failure
  usb: core: sysfs: Unmerge @usb3_hardware_lpm_attr_group in remove_power_attributes()
  usb: dwc3: st: fix probed platform device ref count on probe error path
  usb: dwc3: core: Prevent USB core invalid event buffer address access
  usb: dwc3: omap: add missing depopulate in probe error path
  USB: serial: option: add MeiG Smart SRM825L
  cdc-acm: Add DISABLE_ECHO quirk for GE HealthCare UI Controller
  net: busy-poll: use ktime_get_ns() instead of local_clock()
  gtp: fix a potential NULL pointer dereference
  net: prevent mss overflow in skb_segment()
  ida: Fix crash in ida_free when the bitmap is empty
  net:rds: Fix possible deadlock in rds_message_put
  fbmem: Check virtual screen sizes in fb_set_var()
  fbcon: Prevent that screen size is smaller than font size
  printk: Export is_console_locked
  memcg: enable accounting of ipc resources
  cgroup/cpuset: Prevent UAF in proc_cpuset_show()
  ...

Change-Id: I7da4d8d188dec9d2833216e5d6580dbd72b99240
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-29 20:17:04 -03:00
Long Li
3ae8e60551 filelock: Correct the filelock owner in fcntl_setlk/fcntl_setlk64
The locks_remove_posix() function in fcntl_setlk/fcntl_setlk64 is designed
to reliably remove locks when an fcntl/close race is detected. However, it
was passing in the wrong filelock owner, it looks like a mistake and
resulting in a failure to remove locks. More critically, if the lock
removal fails, it could lead to a uaf issue while traversing the locks.

This problem occurs only in the 4.19/5.4 stable version.

Fixes: a561145f3ae9 ("filelock: Fix fcntl/close race recovery compat path")
Fixes: d30ff3304083 ("filelock: Remove locks reliably when fcntl/close race is detected")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a1177ea83da87a87cc352aa41f24d61c08c80b5d)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:40 +00:00
Baokun Li
90ef045711 ext4: set the type of max_zeroout to unsigned int to avoid overflow
[ Upstream commit 261341a932d9244cbcd372a3659428c8723e5a49 ]

The max_zeroout is of type int and the s_extent_max_zeroout_kb is of
type uint, and the s_extent_max_zeroout_kb can be freely modified via
the sysfs interface. When the block size is 1024, max_zeroout may
overflow, so declare it as unsigned int to avoid overflow.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20240319113325.3110393-9-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 2f64ae32831e5a2bfd0e404c6e63b399eb180a0a)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:39 +00:00
NeilBrown
cfe972295c NFS: avoid infinite loop in pnfs_update_layout.
[ Upstream commit 2fdbc20036acda9e5694db74a032d3c605323005 ]

If pnfsd_update_layout() is called on a file for which recovery has
failed it will enter a tight infinite loop.

NFS_LAYOUT_INVALID_STID will be set, nfs4_select_rw_stateid() will
return -EIO, and nfs4_schedule_stateid_recovery() will do nothing, so
nfs4_client_recover_expired_lease() will not wait.  So the code will
loop indefinitely.

Break the loop by testing the validity of the open stateid at the top of
the loop.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 4980d45cca2b1135a1ab3dea101425cf44da72cd)
[Vegard: fix conflict in context due to missing commit
 d03360aaf5ccac49581960bd736258c62972b88b ("pNFS: Ensure we return the
 error if someone kills a waiting layoutget").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
Zhiguo Niu
0070565082 f2fs: fix to do sanity check in update_sit_entry
[ Upstream commit 36959d18c3cf09b3c12157c6950e18652067de77 ]

If GET_SEGNO return NULL_SEGNO for some unecpected case,
update_sit_entry will access invalid memory address,
cause system crash. It is better to do sanity check about
GET_SEGNO just like update_segment_mtime & locate_dirty_segment.

Also remove some redundant judgment code.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 3c2c864f19490da6e892290441ba7dcc7bae2576)
[Vegard: drop hunk in {f2fs_,}allocate_data_block due to missing commit
 65f1b80b33378501ea552ef085e9c31739af356c ('Revert "f2fs: handle dirty
 segments inside refresh_sit_entry"') -- the important part of the patch
 is the addition of the segno check.]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
David Sterba
0a7ed5945f btrfs: delete pointless BUG_ON check on quota root in btrfs_qgroup_account_extent()
[ Upstream commit f40a3ea94881f668084f68f6b9931486b1606db0 ]

The BUG_ON is deep in the qgroup code where we can expect that it
exists. A NULL pointer would cause a crash.

It was added long ago in 550d7a2ed5db35 ("btrfs: qgroup: Add new qgroup
calculation function btrfs_qgroup_account_extents()."). It maybe made
sense back then as the quota enable/disable state machine was not that
robust as it is nowadays, so we can just delete it.

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5ae1493c5eac1a7a7ced34970a24cb3a5680a63b)
[Vegard: fix conflict in context due to missing commit
 c9f6f3cd1c6fc4df959ce2bce15e5e6ce660bfd4 ("btrfs: qgroup: Allow
 trace_btrfs_qgroup_account_extent() to record its transid").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
David Sterba
b141038981 btrfs: send: handle unexpected data in header buffer in begin_cmd()
[ Upstream commit e80e3f732cf53c64b0d811e1581470d67f6c3228 ]

Change BUG_ON to a proper error handling in the unlikely case of seeing
data when the command is started. This is supposed to be reset when the
command is finished (send_cmd, send_encoded_extent).

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit f0b54836bf2ff59b866a6db481f9ad46fa30b642)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
David Sterba
4ae1ece59e btrfs: handle invalid root reference found in may_destroy_subvol()
[ Upstream commit 6fbc6f4ac1f4907da4fc674251527e7dc79ffbf6 ]

The may_destroy_subvol() looks up a root by a key, allowing to do an
inexact search when key->offset is -1.  It's never expected to find such
item, as it would break the allowed range of a root id.

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit ebce7d482d1a08392362ddf936ffdd9244fb1ece)
[Vegard: move changes to ioctl.c due to missing commit
 ec42f167348a1949ac309532aa34760cfc96c92f ("btrfs: Move
 may_destroy_subvol() from ioctl.c to inode.c").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
David Sterba
b52a80ebbf btrfs: change BUG_ON to assertion when checking for delayed_node root
[ Upstream commit be73f4448b607e6b7ce41cd8ef2214fdf6e7986f ]

The pointer to root is initialized in btrfs_init_delayed_node(), no need
to check for it again. Change the BUG_ON to assertion.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit be9ce497c7cb293f93cf98ef563b6456bac75686)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:38 +00:00
Max Filippov
eba1af4c43 fs: binfmt_elf_efpic: don't use missing interpreter's properties
[ Upstream commit 15fd1dc3dadb4268207fa6797e753541aca09a2a ]

Static FDPIC executable may get an executable stack even when it has
non-executable GNU_STACK segment. This happens when STACK segment has rw
permissions, but does not specify stack size. In that case FDPIC loader
uses permissions of the interpreter's stack, and for static executables
with no interpreter it results in choosing the arch-default permissions
for the stack.

Fix that by using the interpreter's properties only when the interpreter
is actually used.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Link: https://lore.kernel.org/r/20240118150637.660461-1-jcmvbkbc@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 8ca5b21fa9b2c13aad93a97992b92f9360988fe9)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:37 +00:00
Jan Kara
df321d5b02 quota: Remove BUG_ON from dqget()
[ Upstream commit 249f374eb9b6b969c64212dd860cc1439674c4a8 ]

dqget() checks whether dquot->dq_sb is set when returning it using
BUG_ON. Firstly this doesn't work as an invalidation check for quite
some time (we release dquot with dq_sb set these days), secondly using
BUG_ON is quite harsh. Use WARN_ON_ONCE and check whether dquot is still
hashed instead.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c08d02053b9e98dffea9b9b378dc90547e4621e8)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:37 +00:00
Baokun Li
0c2fd9f775 ext4: do not trim the group with corrupted block bitmap
[ Upstream commit 172202152a125955367393956acf5f4ffd092e0d ]

Otherwise operating on an incorrupted block bitmap can lead to all sorts
of unknown problems.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20240104142040.2835097-3-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit cac7c9fcd15e92184c8e621b1f33d97d99505366)
[Vegard: fix conflict in context due to missing commit
 cac7c9fcd15e92184c8e621b1f33d97d99505366 ("ext4: do not trim the group
 with corrupted block bitmap").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:37 +00:00
Andreas Gruenbacher
431cbbd124 gfs2: setattr_chown: Add missing initialization
[ Upstream commit 2d8d7990619878a848b1d916c2f936d3012ee17d ]

Add a missing initialization of variable ap in setattr_chown().
Without, chown() may be able to bypass quotas.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 686ef69ca191dcba8d325334c65a04a2589383e6)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:37 +00:00
Christian Brauner
0f44374924 binfmt_misc: cleanup on filesystem umount
[ Upstream commit 1c5976ef0f7ad76319df748ccb99a4c7ba2ba464 ]

Currently, registering a new binary type pins the binfmt_misc
filesystem. Specifically, this means that as long as there is at least
one binary type registered the binfmt_misc filesystem survives all
umounts, i.e. the superblock is not destroyed. Meaning that a umount
followed by another mount will end up with the same superblock and the
same binary type handlers. This is a behavior we tend to discourage for
any new filesystems (apart from a few special filesystems such as e.g.
configfs or debugfs). A umount operation without the filesystem being
pinned - by e.g. someone holding a file descriptor to an open file -
should usually result in the destruction of the superblock and all
associated resources. This makes introspection easier and leads to
clearly defined, simple and clean semantics. An administrator can rely
on the fact that a umount will guarantee a clean slate making it
possible to reinitialize a filesystem. Right now all binary types would
need to be explicitly deleted before that can happen.

This allows us to remove the heavy-handed calls to simple_pin_fs() and
simple_release_fs() when creating and deleting binary types. This in
turn allows us to replace the current brittle pinning mechanism abusing
dget() which has caused a range of bugs judging from prior fixes in [2]
and [3]. The additional dget() in load_misc_binary() pins the dentry but
only does so for the sake to prevent ->evict_inode() from freeing the
node when a user removes the binary type and kill_node() is run. Which
would mean ->interpreter and ->interp_file would be freed causing a UAF.

This isn't really nicely documented nor is it very clean because it
relies on simple_pin_fs() pinning the filesystem as long as at least one
binary type exists. Otherwise it would cause load_misc_binary() to hold
on to a dentry belonging to a superblock that has been shutdown.
Replace that implicit pinning with a clean and simple per-node refcount
and get rid of the ugly dget() pinning. A similar mechanism exists for
e.g. binderfs (cf. [4]). All the cleanup work can now be done in
->evict_inode().

In a follow-up patch we will make it possible to use binfmt_misc in
sandboxes. We will use the cleaner semantics where a umount for the
filesystem will cause the superblock and all resources to be
deallocated. In preparation for this apply the same semantics to the
initial binfmt_misc mount. Note, that this is a user-visible change and
as such a uapi change but one that we can reasonably risk. We've
discussed this in earlier versions of this patchset (cf. [1]).

The main user and provider of binfmt_misc is systemd. Systemd provides
binfmt_misc via autofs since it is configurable as a kernel module and
is used by a few exotic packages and users. As such a binfmt_misc mount
is triggered when /proc/sys/fs/binfmt_misc is accessed and is only
provided on demand. Other autofs on demand filesystems include EFI ESP
which systemd umounts if the mountpoint stays idle for a certain amount
of time. This doesn't apply to the binfmt_misc autofs mount which isn't
touched once it is mounted meaning this change can't accidently wipe
binary type handlers without someone having explicitly unmounted
binfmt_misc. After speaking to systemd folks they don't expect this
change to affect them.

In line with our general policy, if we see a regression for systemd or
other users with this change we will switch back to the old behavior for
the initial binfmt_misc mount and have binary types pin the filesystem
again. But while we touch this code let's take the chance and let's
improve on the status quo.

[1]: https://lore.kernel.org/r/20191216091220.465626-2-laurent@vivier.eu
[2]: commit 43a4f2619038 ("exec: binfmt_misc: fix race between load_misc_binary() and kill_node()"
[3]: commit 83f918274e4b ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()")
[4]: commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")

Link: https://lore.kernel.org/r/20211028103114.2849140-1-brauner@kernel.org (v1)
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Jann Horn <jannh@google.com>
Cc: Henning Schild <henning.schild@siemens.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: linux-fsdevel@vger.kernel.org
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 263bcebf5c2ab1fe949517225157f34015124620)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:37 +00:00
Alexander Lobakin
5e78e68997 btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
commit 4ca532d64648d4776d15512caed3efea05ca7195 upstream.

bitmap_set_bits() does not start with the FS' prefix and may collide
with a new generic helper one day. It operates with the FS-specific
types, so there's no change those two could do the same thing.
Just add the prefix to exclude such possible conflict.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: David Sterba <dsterba@suse.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit eeca0881c04b07e053cd24b455012b6abd164328)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:36 +00:00
Jann Horn
fcb5aec66c fuse: Initialize beyond-EOF page contents before setting uptodate
commit 3c0da3d163eb32f1f91891efaade027fa9b245b9 upstream.

fuse_notify_store(), unlike fuse_do_readpage(), does not enable page
zeroing (because it can be used to change partial page contents).

So fuse_notify_store() must be more careful to fully initialize page
contents (including parts of the page that are beyond end-of-file)
before marking the page uptodate.

The current code can leave beyond-EOF page contents uninitialized, which
makes these uninitialized page contents visible to userspace via mmap().

This is an information leak, but only affects systems which do not
enable init-on-alloc (via CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y or the
corresponding kernel command line parameter).

Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2574
Cc: stable@kernel.org
Fixes: a1d75f258230 ("fuse: add store request")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 49934861514d36d0995be8e81bb3312a499d8d9a)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-24 10:07:35 +00:00
Richard Raya
1b9f971175 This is the 4.14.353 OpenELA-Extended LTS stable release
-----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmcHrG8ZHHZlZ2FyZC5u
 b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WIQuEACCYf9xCGBALlKFb0pXX3eF
 oiRkceNyy5NWSndD7t9p/3d2g4YrVptGxtTZN12IltfG4wfCQ+qC/0g2Mu4ho0Yp
 2ExKVaIli1t2csIjXCUUyjh3jU0JOkDwJap9n5QemACsX8zrDfKVwdlj9hw+e7vi
 fBWwdfl1duK5cfVbbyvL74It4WeMnjuAYrBnMTxhYBTq56xFLrbBILl8BLxAV5NN
 5wGoNCeUtj8LxUrL2qs5QoT3Bf7uoDlLnu1Ly7jDMMX34/oNh5huOjZdDFbQYxS3
 DsEe6ljOYOyB/awdUhScERfxVPimumN3nHWnRJbsQhX36uXT6U7HNJah4zauchRk
 UlKUSfG3YyOqKIwFH+8oGmkuCm6wZbVjVsNNkYhT804BCCHrasJ1SHXsSB9R0MpU
 x3IQOoiuc33bUYrSqWAO7utvt+PwG++3GHz0XQwPfZn4DHY18/e+VNsGtQTPqzRG
 tsywZVTN0DC0nO7L772nkQDb7z2mhmJGgN8q3FPbMTfp/I1phIh9C17pckfpHKAl
 ippTmTMaIYDU3Rlc1g/cu363GOaXWRN4t03VSEu/BLV0IElRktUnmuBU3B/rMb+F
 ItaBmhnZGXHUrulMTxDtzItrYMwx00USw6IrG3iYjob0MhhxhLVxEh0vKc7Te2w5
 2FZEjj2BxinK66mJgAolZw==
 =BQd/
 -----END PGP SIGNATURE-----

Merge tag 'v4.14.353-openela' of https://github.com/openela/kernel-lts

This is the 4.14.353 OpenELA-Extended LTS stable release

* tag 'v4.14.353-openela' of https://github.com/openela/kernel-lts: (173 commits)
  LTS: Update to 4.14.353
  net: fix __dst_negative_advice() race
  selftests: make order checking verbose in msg_zerocopy selftest
  selftests: fix OOM in msg_zerocopy selftest
  Revert "selftests/net: reap zerocopy completions passed up as ancillary data."
  Revert "selftests: fix OOM in msg_zerocopy selftest"
  Revert "selftests: make order checking verbose in msg_zerocopy selftest"
  nvme/pci: Add APST quirk for Lenovo N60z laptop
  exec: Fix ToCToU between perm check and set-uid/gid usage
  drm/i915/gem: Fix Virtual Memory mapping boundaries calculation
  drm/i915: Try GGTT mmapping whole object as partial
  netfilter: nf_tables: set element extended ACK reporting support
  kbuild: Fix '-S -c' in x86 stack protector scripts
  drm/mgag200: Set DDC timeout in milliseconds
  drm/bridge: analogix_dp: properly handle zero sized AUX transactions
  drm/bridge: analogix_dp: Properly log AUX CH errors
  drm/bridge: analogix_dp: Reset aux channel if an error occurred
  drm/bridge: analogix_dp: Check AUX_EN status when doing AUX transfer
  x86/mtrr: Check if fixed MTRRs exist before saving them
  tracing: Fix overflow in get_free_elt()
  ...

Change-Id: I0e92a979e31d4fa6c526c6b70a1b61711d9747bb
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-11 00:50:20 -03:00
Kees Cook
02acb3b20d exec: Fix ToCToU between perm check and set-uid/gid usage
commit f50733b45d865f91db90919f8311e2127ce5a0cb upstream.

When opening a file for exec via do_filp_open(), permission checking is
done against the file's metadata at that moment, and on success, a file
pointer is passed back. Much later in the execve() code path, the file
metadata (specifically mode, uid, and gid) is used to determine if/how
to set the uid and gid. However, those values may have changed since the
permissions check, meaning the execution may gain unintended privileges.

For example, if a file could change permissions from executable and not
set-id:

---------x 1 root root 16048 Aug  7 13:16 target

to set-id and non-executable:

---S------ 1 root root 16048 Aug  7 13:16 target

it is possible to gain root privileges when execution should have been
disallowed.

While this race condition is rare in real-world scenarios, it has been
observed (and proven exploitable) when package managers are updating
the setuid bits of installed programs. Such files start with being
world-executable but then are adjusted to be group-exec with a set-uid
bit. For example, "chmod o-x,u+s target" makes "target" executable only
by uid "root" and gid "cdrom", while also becoming setuid-root:

-rwxr-xr-x 1 root cdrom 16048 Aug  7 13:16 target

becomes:

-rwsr-xr-- 1 root cdrom 16048 Aug  7 13:16 target

But racing the chmod means users without group "cdrom" membership can
get the permission to execute "target" just before the chmod, and when
the chmod finishes, the exec reaches brpm_fill_uid(), and performs the
setuid to root, violating the expressed authorization of "only cdrom
group members can setuid to root".

Re-check that we still have execute permissions in case the metadata
has changed. It would be better to keep a copy from the perm-check time,
but until we can do that refactoring, the least-bad option is to do a
full inode_permission() call (under inode lock). It is understood that
this is safe against dead-locks, but hardly optimal.

Reported-by: Marco Vanotti <mvanotti@google.com>
Tested-by: Marco Vanotti <mvanotti@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d5c3c7e26275a2d83b894d30f7582a42853a958f)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:56 +00:00
Kemeng Shi
fb37e57b6e ext4: fix wrong unit use in ext4_mb_find_by_goal
[ Upstream commit 99c515e3a860576ba90c11acbc1d6488dfca6463 ]

We need start in block unit while fe_start is in cluster unit. Use
ext4_grp_offs_to_block helper to convert fe_start to get start in
block unit.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://lore.kernel.org/r/20230603150327.3596033-4-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 585b8d86c39882425f737b800e7552fb42a4785f)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:28 +00:00
Kemeng Shi
82f1f40db0 jbd2: avoid memleak in jbd2_journal_write_metadata_buffer
[ Upstream commit cc102aa24638b90e04364d64e4f58a1fa91a1976 ]

The new_bh is from alloc_buffer_head, we should call free_buffer_head to
free it in error case.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-2-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 831db95409cc12589c14a71b9bf6c3e7f70bf5a0)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:28 +00:00
Filipe Manana
dd102bb94b btrfs: fix bitmap leak when loading free space cache on duplicate entry
[ Upstream commit 320d8dc612660da84c3b70a28658bb38069e5a9a ]

If we failed to link a free space entry because there's already a
conflicting entry for the same offset, we free the free space entry but
we don't free the associated bitmap that we had just allocated before.
Fix that by freeing the bitmap before freeing the entry.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit fad0bb34cfcea693903409356693988f04715b8e)
[Vegard: use kfree() due to missing commit
 4874c6fe1c9efe704bf155afab268ead7c364c9b ("btrfs: fix allocation of
 free space cache v1 bitmap pages").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:28 +00:00
Roman Smirnov
12ce9c96b1 udf: prevent integer overflow in udf_bitmap_free_blocks()
[ Upstream commit 56e69e59751d20993f243fb7dd6991c4e522424c ]

An overflow may occur if the function is called with the last
block and an offset greater than zero. It is necessary to add
a check to avoid this.

Found by Linux Verification Center (linuxtesting.org) with Svace.

[JK: Make test cover also unalloc table freeing]

Link: https://patch.msgid.link/20240620072413.7448-1-r.smirnov@omp.ru
Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 097420e48e30f51e8f4f650b5c946f5af63ec1a3)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:28 +00:00
Steve Magnani
ed09bb9292 udf: Fix signed/unsigned format specifiers
Fix problems noted in compilion with -Wformat=2 -Wformat-signedness.
In particular, a mismatch between the signedness of a value and the
signedness of its format specifier can result in unsigned values being
printed as negative numbers, e.g.:

  Partition (0 type 1511) starts at physical 460, block length -1779968542

...which occurs when mounting a large (> 1 TiB) UDF partition.

Changes since V1:
* Fixed additional issues noted in udf_bitmap_free_blocks(),
  udf_get_fileident(), udf_show_options()

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Jan Kara <jack@suse.cz>
(cherry picked from commit fcbf7637e6647e00de04d4b2e05ece2484bb3062)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:28 +00:00
Al Viro
f4c005cc38 protect the fetch of ->fd[fd] in do_dup2() from mispredictions
commit 8aa37bde1a7b645816cda8b80df4753ecf172bf1 upstream.

both callers have verified that fd is not greater than ->max_fds;
however, misprediction might end up with
        tofree = fdt->fd[fd];
being speculatively executed.  That's wrong for the same reasons
why it's wrong in close_fd()/file_close_fd_locked(); the same
solution applies - array_index_nospec(fd, fdt->max_fds) could differ
from fd only in case of speculative execution on mispredicted path.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ed42e8ff509d2a61c6642d1825032072dab79f26)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:27 +00:00
Jeongjun Park
4c2dc9502e jfs: Fix array-index-out-of-bounds in diFree
[ Upstream commit f73f969b2eb39ad8056f6c7f3a295fa2f85e313a ]

Reported-by: syzbot+241c815bda521982cb49@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 55b732c8b09b41148eaab2fa8e31b0af47671e00)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:25 +00:00
Ryusuke Konishi
0e318baa08 nilfs2: handle inconsistent state in nilfs_btnode_create_block()
commit 4811f7af6090e8f5a398fbdd766f903ef6c0d787 upstream.

Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.

It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.

So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.

Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 19cce46238ffe3546e44b9c74057103ff8b24c62)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:25 +00:00
Chao Yu
27f9505abc f2fs: fix to don't dirty inode for readonly filesystem
commit 192b8fb8d1c8ca3c87366ebbef599fa80bb626b8 upstream.

syzbot reports f2fs bug as below:

kernel BUG at fs/f2fs/inode.c:933!
RIP: 0010:f2fs_evict_inode+0x1576/0x1590 fs/f2fs/inode.c:933
Call Trace:
 evict+0x2a4/0x620 fs/inode.c:664
 dispose_list fs/inode.c:697 [inline]
 evict_inodes+0x5f8/0x690 fs/inode.c:747
 generic_shutdown_super+0x9d/0x2c0 fs/super.c:675
 kill_block_super+0x44/0x90 fs/super.c:1667
 kill_f2fs_super+0x303/0x3b0 fs/f2fs/super.c:4894
 deactivate_locked_super+0xc1/0x130 fs/super.c:484
 cleanup_mnt+0x426/0x4c0 fs/namespace.c:1256
 task_work_run+0x24a/0x300 kernel/task_work.c:180
 ptrace_notify+0x2cd/0x380 kernel/signal.c:2399
 ptrace_report_syscall include/linux/ptrace.h:411 [inline]
 ptrace_report_syscall_exit include/linux/ptrace.h:473 [inline]
 syscall_exit_work kernel/entry/common.c:251 [inline]
 syscall_exit_to_user_mode_prepare kernel/entry/common.c:278 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x15c/0x280 kernel/entry/common.c:296
 do_syscall_64+0x50/0x110 arch/x86/entry/common.c:88
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

The root cause is:
- do_sys_open
 - f2fs_lookup
  - __f2fs_find_entry
   - f2fs_i_depth_write
    - f2fs_mark_inode_dirty_sync
     - f2fs_dirty_inode
      - set_inode_flag(inode, FI_DIRTY_INODE)

- umount
 - kill_f2fs_super
  - kill_block_super
   - generic_shutdown_super
    - sync_filesystem
    : sb is readonly, skip sync_filesystem()
    - evict_inodes
     - iput
      - f2fs_evict_inode
       - f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE))
       : trigger kernel panic

When we try to repair i_current_depth in readonly filesystem, let's
skip dirty inode to avoid panic in later f2fs_evict_inode().

Cc: stable@vger.kernel.org
Reported-by: syzbot+31e4659a3fe953aec2f4@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000e890bc0609a55cff@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2d2916516577f2239b3377d9e8d12da5e6ccdfcf)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:24 +00:00
Daeho Jeong
24eb54283c f2fs: prevent newly created inode from being dirtied incorrectly
Now, we invoke f2fs_mark_inode_dirty_sync() to make an inode dirty in
advance of creating a new node page for the inode. By this, some inodes
whose node page is not created yet can be linked into the global dirty
list.

If the checkpoint is executed at this moment, the inode will be written
back by writeback_single_inode() and finally update_inode_page() will
fail to detach the inode from the global dirty list because the inode
doesn't have a node page.

The problem is that the inode's state in VFS layer will become clean
after execution of writeback_single_inode() and it's still linked in
the global dirty list of f2fs and this will cause a kernel panic.

So, we will prevent the newly created inode from being dirtied during
the FI_NEW_INODE flag of the inode is set. We will make it dirty
right after the flag is cleared.

Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Tested-by: Hobin Woo <hobin.woo@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 9ac1e2d88d076aa1ae9e33d44a9bbc8ae3bfa791)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:24 +00:00
Baokun Li
839f300001 ext4: make sure the first directory block is not a hole
commit f9ca51596bbfd0f9c386dd1c613c394c78d9e5e6 upstream.

The syzbot constructs a directory that has no dirblock but is non-inline,
i.e. the first directory block is a hole. And no errors are reported when
creating files in this directory in the following flow.

    ext4_mknod
     ...
      ext4_add_entry
        // Read block 0
        ext4_read_dirblock(dir, block, DIRENT)
          bh = ext4_bread(NULL, inode, block, 0)
          if (!bh && (type == INDEX || type == DIRENT_HTREE))
          // The first directory block is a hole
          // But type == DIRENT, so no error is reported.

After that, we get a directory block without '.' and '..' but with a valid
dentry. This may cause some code that relies on dot or dotdot (such as
make_indexed_dir()) to crash.

Therefore when ext4_read_dirblock() finds that the first directory block
is a hole report that the filesystem is corrupted and return an error to
avoid loading corrupted data from disk causing something bad.

Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
Fixes: 4e19d6b65fb4 ("ext4: allow directory holes")
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240702132349.2600605-3-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d81d7e347d1f1f48a5634607d39eb90c161c8afe)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:23 +00:00
Baokun Li
4ed99f550b ext4: check dot and dotdot of dx_root before making dir indexed
commit 50ea741def587a64e08879ce6c6a30131f7111e7 upstream.

Syzbot reports a issue as follows:
============================================
BUG: unable to handle page fault for address: ffffed11022e24fe
PGD 23ffee067 P4D 23ffee067 PUD 0
Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 PID: 5079 Comm: syz-executor306 Not tainted 6.10.0-rc5-g55027e689933 #0
Call Trace:
 <TASK>
 make_indexed_dir+0xdaf/0x13c0 fs/ext4/namei.c:2341
 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2451
 ext4_rename fs/ext4/namei.c:3936 [inline]
 ext4_rename2+0x26e5/0x4370 fs/ext4/namei.c:4214
[...]
============================================

The immediate cause of this problem is that there is only one valid dentry
for the block to be split during do_split, so split==0 results in out of
bounds accesses to the map triggering the issue.

    do_split
      unsigned split
      dx_make_map
       count = 1
      split = count/2 = 0;
      continued = hash2 == map[split - 1].hash;
       ---> map[4294967295]

The maximum length of a filename is 255 and the minimum block size is 1024,
so it is always guaranteed that the number of entries is greater than or
equal to 2 when do_split() is called.

But syzbot's crafted image has no dot and dotdot in dir, and the dentry
distribution in dirblock is as follows:

  bus     dentry1          hole           dentry2           free
|xx--|xx-------------|...............|xx-------------|...............|
0   12 (8+248)=256  268     256     524 (8+256)=264 788     236     1024

So when renaming dentry1 increases its name_len length by 1, neither hole
nor free is sufficient to hold the new dentry, and make_indexed_dir() is
called.

In make_indexed_dir() it is assumed that the first two entries of the
dirblock must be dot and dotdot, so bus and dentry1 are left in dx_root
because they are treated as dot and dotdot, and only dentry2 is moved
to the new leaf block. That's why count is equal to 1.

Therefore add the ext4_check_dx_root() helper function to add more sanity
checks to dot and dotdot before starting the conversion to avoid the above
issue.

Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3")
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240702132349.2600605-2-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b80575ffa98b5bb3a5d4d392bfe4c2e03e9557db)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:23 +00:00
Chao Yu
26722f1171 hfs: fix to initialize fields of hfs_inode_info after hfs_alloc_inode()
commit 26a2ed107929a855155429b11e1293b83e6b2a8b upstream.

Syzbot reports uninitialized value access issue as below:

loop0: detected capacity change from 0 to 64
=====================================================
BUG: KMSAN: uninit-value in hfs_revalidate_dentry+0x307/0x3f0 fs/hfs/sysdep.c:30
 hfs_revalidate_dentry+0x307/0x3f0 fs/hfs/sysdep.c:30
 d_revalidate fs/namei.c:862 [inline]
 lookup_fast+0x89e/0x8e0 fs/namei.c:1649
 walk_component fs/namei.c:2001 [inline]
 link_path_walk+0x817/0x1480 fs/namei.c:2332
 path_lookupat+0xd9/0x6f0 fs/namei.c:2485
 filename_lookup+0x22e/0x740 fs/namei.c:2515
 user_path_at_empty+0x8b/0x390 fs/namei.c:2924
 user_path_at include/linux/namei.h:57 [inline]
 do_mount fs/namespace.c:3689 [inline]
 __do_sys_mount fs/namespace.c:3898 [inline]
 __se_sys_mount+0x66b/0x810 fs/namespace.c:3875
 __x64_sys_mount+0xe4/0x140 fs/namespace.c:3875
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

BUG: KMSAN: uninit-value in hfs_ext_read_extent fs/hfs/extent.c:196 [inline]
BUG: KMSAN: uninit-value in hfs_get_block+0x92d/0x1620 fs/hfs/extent.c:366
 hfs_ext_read_extent fs/hfs/extent.c:196 [inline]
 hfs_get_block+0x92d/0x1620 fs/hfs/extent.c:366
 block_read_full_folio+0x4ff/0x11b0 fs/buffer.c:2271
 hfs_read_folio+0x55/0x60 fs/hfs/inode.c:39
 filemap_read_folio+0x148/0x4f0 mm/filemap.c:2426
 do_read_cache_folio+0x7c8/0xd90 mm/filemap.c:3553
 do_read_cache_page mm/filemap.c:3595 [inline]
 read_cache_page+0xfb/0x2f0 mm/filemap.c:3604
 read_mapping_page include/linux/pagemap.h:755 [inline]
 hfs_btree_open+0x928/0x1ae0 fs/hfs/btree.c:78
 hfs_mdb_get+0x260c/0x3000 fs/hfs/mdb.c:204
 hfs_fill_super+0x1fb1/0x2790 fs/hfs/super.c:406
 mount_bdev+0x628/0x920 fs/super.c:1359
 hfs_mount+0xcd/0xe0 fs/hfs/super.c:456
 legacy_get_tree+0x167/0x2e0 fs/fs_context.c:610
 vfs_get_tree+0xdc/0x5d0 fs/super.c:1489
 do_new_mount+0x7a9/0x16f0 fs/namespace.c:3145
 path_mount+0xf98/0x26a0 fs/namespace.c:3475
 do_mount fs/namespace.c:3488 [inline]
 __do_sys_mount fs/namespace.c:3697 [inline]
 __se_sys_mount+0x919/0x9e0 fs/namespace.c:3674
 __ia32_sys_mount+0x15b/0x1b0 fs/namespace.c:3674
 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
 __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
 entry_SYSENTER_compat_after_hwframe+0x70/0x82

Uninit was created at:
 __alloc_pages+0x9a6/0xe00 mm/page_alloc.c:4590
 __alloc_pages_node include/linux/gfp.h:238 [inline]
 alloc_pages_node include/linux/gfp.h:261 [inline]
 alloc_slab_page mm/slub.c:2190 [inline]
 allocate_slab mm/slub.c:2354 [inline]
 new_slab+0x2d7/0x1400 mm/slub.c:2407
 ___slab_alloc+0x16b5/0x3970 mm/slub.c:3540
 __slab_alloc mm/slub.c:3625 [inline]
 __slab_alloc_node mm/slub.c:3678 [inline]
 slab_alloc_node mm/slub.c:3850 [inline]
 kmem_cache_alloc_lru+0x64d/0xb30 mm/slub.c:3879
 alloc_inode_sb include/linux/fs.h:3018 [inline]
 hfs_alloc_inode+0x5a/0xc0 fs/hfs/super.c:165
 alloc_inode+0x83/0x440 fs/inode.c:260
 new_inode_pseudo fs/inode.c:1005 [inline]
 new_inode+0x38/0x4f0 fs/inode.c:1031
 hfs_new_inode+0x61/0x1010 fs/hfs/inode.c:186
 hfs_mkdir+0x54/0x250 fs/hfs/dir.c:228
 vfs_mkdir+0x49a/0x700 fs/namei.c:4126
 do_mkdirat+0x529/0x810 fs/namei.c:4149
 __do_sys_mkdirat fs/namei.c:4164 [inline]
 __se_sys_mkdirat fs/namei.c:4162 [inline]
 __x64_sys_mkdirat+0xc8/0x120 fs/namei.c:4162
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

It missed to initialize .tz_secondswest, .cached_start and .cached_blocks
fields in struct hfs_inode_info after hfs_alloc_inode(), fix it.

Cc: stable@vger.kernel.org
Reported-by: syzbot+3ae6be33a50b5aae4dab@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-fsdevel/0000000000005ad04005ee48897f@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20240616013841.2217-1-chao@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f7316b2b2f11cf0c6de917beee8d3de728be24db)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:23 +00:00
Ryusuke Konishi
440e5d6b0d nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
[ Upstream commit 0f3819e8c483771a59cf9d3190cd68a7a990083c ]

According to the C standard 3.4.3p3, the result of signed integer overflow
is undefined.  The macro nilfs_cnt32_ge(), which compares two sequence
numbers, uses signed integer subtraction that can overflow, and therefore
the result of the calculation may differ from what is expected due to
undefined behavior in different environments.

Similar to an earlier change to the jiffies-related comparison macros in
commit 5a581b367b5d ("jiffies: Avoid undefined behavior from signed
overflow"), avoid this potential issue by changing the definition of the
macro to perform the subtraction as unsigned integers, then cast the
result to a signed integer for comparison.

Link: https://lkml.kernel.org/r/20130727225828.GA11864@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/20240702183512.6390-1-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit d2b9bc7dfd6b0fa1a37eb91e68bca3175cb5ef50)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:23 +00:00
Alex Shi
2891e08c6f fs/nilfs2: remove some unused macros to tame gcc
[ Upstream commit e7920b3e9d9f5470d5ff7d883e72a47addc0a137 ]

There some macros are unused and cause gcc warning. Remove them.

  fs/nilfs2/segment.c:137:0: warning: macro "nilfs_cnt32_gt" is not used [-Wunused-macros]
  fs/nilfs2/segment.c:144:0: warning: macro "nilfs_cnt32_le" is not used [-Wunused-macros]
  fs/nilfs2/segment.c:143:0: warning: macro "nilfs_cnt32_lt" is not used [-Wunused-macros]

Link: https://lkml.kernel.org/r/1607552733-24292-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stable-dep-of: 0f3819e8c483 ("nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro")
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 175ac70d8af52bc0f5b100901702fdb2bc662885)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:22 +00:00
Jan Kara
5e8bf66151 ext4: avoid writing unitialized memory to disk in EA inodes
[ Upstream commit 65121eff3e4c8c90f8126debf3c369228691c591 ]

If the extended attribute size is not a multiple of block size, the last
block in the EA inode will have uninitialized tail which will get
written to disk. We will never expose the data to userspace but still
this is not a good practice so just zero out the tail of the block as it
isn't going to cause a noticeable performance overhead.

Fixes: e50e5129f384 ("ext4: xattr-in-inode support")
Reported-by: syzbot+9c1fe13fcb51574b249b@syzkaller.appspotmail.com
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240613150234.25176-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 282e8d4e9d33182a5ca25fe6333beafdc5282946)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-10-10 10:27:21 +00:00
Richard Raya
40ea4ab57a f2fs: Force nobarrier fsync mode
Change-Id: I6b2a3a1e0be5d7e81202a317acd5bdaeb7de4350
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-08 22:41:39 -03:00
Sultan Alsawaf
a15b181bbb debugfs: Always compile core debugfs driver for Android kernels
Unfortunately, Android userspace is very dependent on debugfs for several
unrelated things; however, it definitely doesn't require *everything*
that is included in the kernel when CONFIG_DEBUG_FS is enabled.

Therefore, in order to be able to selectively whitelist drivers that
Android needs debugfs for (by passing -DCONFIG_DEBUG_FS to every object
that's desired to be whitelisted), always compile the core debugfs drivers
even when CONFIG_DEBUG_FS is disabled so that debugfs can still be used
where it's necessary.

Change-Id: Ie90681fdbbbe6169b27b517b9ecc430d3230d15b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-06 00:37:28 -03:00
Jebaitedneko
6b94bfa83f pstore: Always execute ramoops_pstore_write()
Change-Id: I57e866cb1b49e729843391f7d91d93883b545573
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-06 00:37:25 -03:00
trautamaki
8c219406d3 pstore: Dump ramoops even when kernel doesn't crash
Change-Id: Iaefddcfd37304e12e5ca1ec34fc8cd4edcd5cebb
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-06 00:37:25 -03:00
Adithya R
1cb3530d6b pstore: Import xiaomi changes
Change-Id: I5518ef2c1b66a4c21e63a62ad7949574982a5eaa
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-10-06 00:37:23 -03:00
Richard Raya
be1ff8e638 msm-4.14: Revert some unsafe optimizations
Change-Id: I2c268f87ab8d9154758384c7a7639046c3784eb8
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-09-29 17:32:34 -03:00
Richard Raya
2a888d5292 f2fs: Update congestion timeout for 250Hz
Change-Id: I2b2208705f5eebb7a2fc4af8e1941cabb241b09f
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-09-09 00:04:07 -03:00
Sultan Alsawaf
9d446ea713 xattr: Avoid dynamically allocating memory in getxattr
Although the maximum xattr size is too big to fit on the stack (64 KiB),
we can still fulfill most getxattr requests with a 4 KiB stack
allocation, thereby improving performance.

Change-Id: I36798b527d7641597c44c1d7035981a7477e5d4b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-08-27 11:40:02 -03:00
Park Ju Hyung
bcaa3ffd5d quota_tree: Avoid dynamic memory allocations
Most allocations done here are rather small and can fit on the stack,
eliminating the need to allocate them dynamically. Reserve a 1024B
stack buffer for this purpose to avoid the overhead of dynamic
memory allocation.

1024B covers most use cases, and higher values were observed to cause
stack corruptions.

Change-Id: I3413ddc239bc59e87148090ce2a9d4035329ad8b
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-08-27 11:40:01 -03:00
Richard Raya
f15dbc5435 This is the 4.14.352 OpenELA-Extended LTS stable release
-----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmbMN2wZHHZlZ2FyZC5u
 b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WI/OD/0df1vNkreJiEgYmrzoT5rG
 FekNwYxJLpnN2cIk5qZv5SUSrZDfqPEMzhz4tuv0kVMv8gb+k8tXpBGr6rfZlYEw
 UoDAZzAN0Ns0JwxtPoIIJrjfD1a0xq8teSGhbphUDLA/fH/M66pTPtJrDjia/99E
 0Ymhc+8pv77UljmaxjEoQ8a0hXoe/TxoxwJpUdklwhcXZ+u4+FwJfxxHd4webnGs
 XiC1rrq4fYOX5MaomrGhKQBFv/iQAM+U9VHn4m1yLqDXYXn3j7stOa8dPY5g6EJw
 kzPsure193EXj9Tp5Y+sC+aExLGfaO7MB+hlXR0El/cHboTFJshafVuIhHp93zlI
 4NVExnDe3Xmrb7U4hCJd2L2z3Vgq8jSH0R1FwZqfZruOo+B1zG9/SIiM96pD5YR1
 ugk2P9XAXEpDUoMv4XPCNlpMetT7HBdUWIJCZQDCTl1axiGWnqs1MpGVHfavoTLz
 CKuoluR/0vL5lVX7vrc1Pb6+7APaY0U9ehVDlMbumS2PmpkLsoaTbTLlJsnVBrmg
 fkN4sjja0mszL4f5zzPCZ/9G8opf63+LLiheUNdnB4yWn7VoK4+kH3v3dW3e2gzJ
 t5IYygJ8Ys8pCLAueuiRb3pce59HrEq8mFtoF+MXlLN4fVHFwKbTtJWiEvEl5x3E
 Ks8j03Ywad/Zcl5GiwnnEw==
 =AZG+
 -----END PGP SIGNATURE-----

Merge tag 'v4.14.352-openela' of https://github.com/openela/kernel-lts

This is the 4.14.352 OpenELA-Extended LTS stable release

* tag 'v4.14.352-openela' of https://github.com/openela/kernel-lts: (32 commits)
  LTS: Update to 4.14.352
  filelock: Fix fcntl/close race recovery compat path
  jfs: don't walk off the end of ealist
  ocfs2: add bounds checking to ocfs2_check_dir_entry()
  net: relax socket state check at accept time.
  ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
  ARM: 9324/1: fix get_user() broken with veneer
  filelock: Remove locks reliably when fcntl/close race is detected
  hfsplus: fix uninit-value in copy_name
  selftests/vDSO: fix clang build errors and warnings
  spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices
  fs: better handle deep ancestor chains in is_subdir()
  Bluetooth: hci_core: cancel all works upon hci_unregister_dev()
  net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()
  net: usb: qmi_wwan: add Telit FN912 compositions
  ALSA: dmaengine_pcm: terminate dmaengine before synchronize
  s390/sclp: Fix sclp_init() cleanup on failure
  Input: elantech - fix touchpad state on resume for Lenovo N24
  wifi: cfg80211: wext: add extra SIOCSIWSCAN data check
  mei: demote client disconnect warning on suspend to debug
  ...

Change-Id: I4cbdfa0321bf83d62ac62f386eb77d21c5785dec
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-08-26 18:30:32 -03:00