This reverts commits:
- e06e502c8d6befbd107b0a68333389583907dbfb [proc: add seq_put_decimal_ull_width to speed up /proc/pid/smaps]
- 5aa0772702bef7ebc336f23b49cfe8ff6d4ebf69 [fs/proc/task_mmu.c: do not show VmExe bigger than total executable virtual memory]
This commit addresses a critical issue that led to app crashes due to out-of-memory (OOM) conditions.
The issue arose when certain apps, such as "Livin' by Mandiri", attempted to allocate close to 1 GB of memory,
causing the app to crash.
Change-Id: Ida596ad69c8308d38ca79d0fd51cc8ff87cc09f3
Suggested-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Unfortunately, Android userspace is very dependent on debugfs for several
unrelated things; however, it definitely doesn't require *everything*
that is included in the kernel when CONFIG_DEBUG_FS is enabled.
Therefore, in order to be able to selectively whitelist drivers that
Android needs debugfs for (by passing -DCONFIG_DEBUG_FS to every object
that's desired to be whitelisted), always compile the core debugfs drivers
even when CONFIG_DEBUG_FS is disabled so that debugfs can still be used
where it's necessary.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
When FUSE passthrough is used, the lower file system file is manipulated
directly, but none of mtime, atime, or ctime of the referencing FUSE file
is updated.
Fix by updating the file times when passthrough operations are
performed.
Bug: 200779468
Reported-by: Fengnan Chang <changfengnan@vivo.com>
Reported-by: Ed Tsai <ed.tsai@mediatek.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I35b72196b2cc1d79a9f62ddb32e2cfa934c3b6d3
[cyberknight777: backport to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
The page used to contain the fuse_dentry_canonical_path to be handled in
fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL).
The returned page may contain undefined data that, by chance, may be
interpreted as a valid path name that is not in the cache. In that case,
if the FUSE daemon mistakenly doesn't fill the canonical path buffer,
the FUSE driver may fall into two blocking
request_wait_answer(fuse_dev_write->kern_path->fuse_lookup_name)
causing a deadlock condition.
The stack is as follows:
find S 0 20511 20117 0x00000000
Call trace:
[<ffffff8008085e78>] __switch_to+0xb8/0xd4
[<ffffff8008a0cac4>] __schedule+0x458/0x714
[<ffffff8008a0ce0c>] schedule+0x8c/0xa8
[<ffffff800833865c>] request_wait_answer+0x74/0x220
[<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0
[<ffffff8008339fe4>] fuse_request_send+0x60/0x6c
[<ffffff800833c1a8>] fuse_dentry_canonical_path+0xb8/0x104
[<ffffff800820b14c>] do_sys_open+0x1b4/0x260
[<ffffff800820b27c>] SyS_openat+0x3c/0x4c
[<ffffff8008083540>] el0_svc_naked+0x34/0x38
mount.ntfs-3g S 0 5845 1 0x00000000
Call trace:
[<ffffff8008085e78>] __switch_to+0xb8/0xd4
[<ffffff8008a0cac4>] __schedule+0x458/0x714
[<ffffff8008a0ce0c>] schedule+0x8c/0xa8
[<ffffff800833865c>] request_wait_answer+0x74/0x220
[<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0
[<ffffff8008339fe4>] fuse_request_send+0x60/0x6c
[<ffffff800833bdb0>] fuse_simple_request+0x128/0x16c
[<ffffff800833dddc>] fuse_lookup_name+0x104/0x1b0
[<ffffff800833dee4>] fuse_lookup+0x5c/0x11c
[<ffffff800821861c>] lookup_slow+0xfc/0x174
[<ffffff800821b474>] walk_component+0xf0/0x290
[<ffffff800821bbac>] path_lookupat+0xa0/0x128
[<ffffff800821c7f4>] filename_lookup+0x84/0x124
[<ffffff800821c8d8>] kern_path+0x44/0x54
[<ffffff800833b0c8>] fuse_dev_do_write+0x828/0xa0c
[<ffffff800833b610>] fuse_dev_write+0x90/0xb4
[<ffffff800820b770>] do_iter_readv_writev+0xf4/0x13c
[<ffffff800820cc88>] do_readv_writev+0xec/0x220
[<ffffff800820d05c>] vfs_writev+0x60/0x74
[<ffffff800820d0ec>] do_writev+0x7c/0x100
[<ffffff800820e348>] SyS_writev+0x38/0x48
[<ffffff8008083540>] el0_svc_naked+0x34/0x38
Fix by ensuring that the page allocated for the canonical path is zeroed.
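As a rough userspace analogy of the fix (not the kernel change itself), the
sketch below shows why a zeroed buffer is safe when the daemon never writes
a path into it. The names demo_alloc_path_buf and demo_path_was_filled are
hypothetical, and calloc() only stands in for a zeroing page allocation in
the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_PATH_BUF_SIZE 4096 /* stands in for one page */

/* Allocate a buffer that is guaranteed to start out as an empty string. */
static char *demo_alloc_path_buf(void)
{
	/* calloc() zeroes the memory, unlike malloc(). */
	return calloc(1, DEMO_PATH_BUF_SIZE);
}

/* A zeroed buffer that the daemon never wrote to reads as "no path". */
static bool demo_path_was_filled(const char *buf)
{
	assert(buf != NULL);
	return buf[0] != '\0';
}
```

With an uninitialized buffer the second function could return true for
leftover garbage, which is the situation that led to the nested lookup.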
Bug: 194856119
Bug: 196051870
Fixes: 24ab59f6bb42 ("ANDROID: fuse: Add support for d_canonical_path")
Signed-off-by: Biao Li <libiao@allwinnertech.com>
Signed-off-by: Shuosheng Huang <huangshuosheng@allwinnertech.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I400815dc1049d90c308f5cf87ce60de97ff82131
[cyberknight777: backport to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
With commit f8425c939663 ("fuse: 32-bit user space ioctl compat for fuse
device") the matching constraints for the FUSE_DEV_IOC_CLONE ioctl command
are relaxed, limited to the testing of command type and number. As Arnd
noticed, this is wrong as it wouldn't ensure the correctness of the data
size or direction for the received FUSE device ioctl.
Fix by bringing back the comparison of the ioctl received by the FUSE
device to the originally generated FUSE_DEV_IOC_CLONE.
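The difference between the relaxed and the strict check can be sketched in
userspace as follows. DEMO_IOC_MAGIC and DEMO_IOC_CLONE are hypothetical
stand-ins for the definitions in the FUSE UAPI header, and the helper names
are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Hypothetical stand-ins; the real definitions live in the FUSE UAPI header. */
#define DEMO_IOC_MAGIC 229
#define DEMO_IOC_CLONE _IOR(DEMO_IOC_MAGIC, 0, uint32_t)

/* The argument size is part of the encoded command value. */
static_assert(_IOC_SIZE(DEMO_IOC_CLONE) == sizeof(uint32_t),
	      "size is encoded in the command");

/* Relaxed check: only type and number are compared. */
static bool demo_match_relaxed(unsigned int cmd)
{
	return _IOC_TYPE(cmd) == _IOC_TYPE(DEMO_IOC_CLONE) &&
	       _IOC_NR(cmd) == _IOC_NR(DEMO_IOC_CLONE);
}

/* Strict check: type, number, size and direction must all match. */
static bool demo_match_strict(unsigned int cmd)
{
	return cmd == DEMO_IOC_CLONE;
}
```

A command built with the wrong direction or argument size passes the
relaxed check but is rejected by the strict one.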
Fixes: f8425c939663 ("fuse: 32-bit user space ioctl compat for fuse device")
Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I372d8399db6d603ba20ef50528acf6645e4d3c66
(cherry picked from commit 6076f5f341e612152879bfda99f0b76c1953bf0b)
[cyberknight777: backport to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
The initial FUSE passthrough interface has the issue of introducing an
ioctl which receives as a parameter a data structure containing a
pointer. Depending on the architecture, the size of this struct might
change, and especially for 32-bit userspace running on a 64-bit kernel,
the size mismatch results in different command identifiers for a single
ioctl, the behavior of which depends on the data that is passed (e.g.,
with an enum). This is just poor ioctl design, as mentioned by Arnd
Bergmann [1].
Introduce the new FUSE_PASSTHROUGH_OPEN ioctl which only gets the fd of
the lower file system, which is a fixed-size __u32, dropping the
confusing fuse_passthrough_out data structure.
[1] https://lore.kernel.org/lkml/CAK8P3a2K2FzPvqBYL9W=Yut58SFXyetXwU4Fz50G5O3TsS0pPQ@mail.gmail.com/
Bug: 175195837
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I486d71cbe20f3c0c87544fa75da4e2704fe57c7c
[cyberknight777: backport to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
If the system doesn't have enough memory when fuse_passthrough_read_iter
is requested in asynchronous IO, an error is directly returned without
restoring the caller's credentials.
Fix by always ensuring credentials are restored.
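The pattern behind the fix can be sketched in userspace like this. All
names are hypothetical: demo_override_creds and demo_revert_creds only
mimic the shape of a credential override, and alloc_fails simulates the
out-of-memory case:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simplified stand-in for the task's current credentials. */
struct demo_cred { int uid; };

static const struct demo_cred *demo_current_cred;

/* Install new credentials and hand back the old ones. */
static const struct demo_cred *
demo_override_creds(const struct demo_cred *new_cred)
{
	const struct demo_cred *old = demo_current_cred;

	demo_current_cred = new_cred;
	return old;
}

static void demo_revert_creds(const struct demo_cred *old)
{
	demo_current_cred = old;
}

/* alloc_fails simulates the out-of-memory case described above. */
static int demo_passthrough_read_async(const struct demo_cred *daemon_cred,
				       int alloc_fails)
{
	const struct demo_cred *old_cred;
	void *aio_req;
	int ret = 0;

	assert(daemon_cred != NULL);
	old_cred = demo_override_creds(daemon_cred);

	aio_req = alloc_fails ? NULL : malloc(64);
	if (!aio_req) {
		ret = -ENOMEM;
		goto out; /* do not return directly: creds must be restored */
	}
	/* ... the lower file system read would be issued here ... */
	free(aio_req);
out:
	demo_revert_creds(old_cred);
	return ret;
}
```

Routing every exit through the single "out" label keeps the override and
the revert paired regardless of which step fails.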
Fixes: aa29f32988c1f84c96e2457b049dea437601f2cc ("FROMLIST: fuse: Use daemon creds in passthrough mode")
Link: https://lore.kernel.org/lkml/YB0qPHVORq7bJy6G@google.com/
Reported-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I4aff43f5dd8ddab2cc8871cd9f81438963ead5b6
(cherry picked from commit 79a47db66416232bbc5b9fce8f417c9fad025fb1)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Enabling FUSE passthrough for mmap-ed operations not only affects
performance, but has also been shown as mandatory for the correct
functioning of FUSE passthrough.
yanwu noticed [1] that a FUSE file with passthrough enabled may suffer
data inconsistencies if the same file is also accessed with mmap. What
happens is that read/write operations are directly applied to the lower
file system (and its cache), while mmap-ed operations are affecting the
FUSE cache.
Extend the FUSE passthrough implementation to also handle memory-mapped
FUSE file, to both fix the cache inconsistencies and extend the
passthrough performance benefits to mmap-ed operations.
[1] https://lore.kernel.org/lkml/20210119110654.11817-1-wu-yan@tcl.com/
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-9-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: Ifad4698b0380f6e004c487940ac6907b9a9f2964
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit bf5cb932f0e0dd028dcebf3a6c2fcfedb4fd8265)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
When using FUSE passthrough, read/write operations are directly
forwarded to the lower file system file through VFS, but there is no
guarantee that the process that is triggering the request has the right
permissions to access the lower file system. This would cause the
read/write access to fail.
In passthrough file systems, where the FUSE daemon is responsible for
the enforcement of the lower file system access policies, it often happens
that the process dealing with the FUSE file system doesn't have access
to the lower file system.
Since the FUSE daemon is in charge of implementing the FUSE file
operations, which in the case of read/write operations usually simply
result in the copy of memory buffers from/to the lower file system
respectively, these operations are executed with the FUSE daemon
privileges.
This patch adds a reference to the FUSE daemon credentials, referenced
at FUSE_DEV_IOC_PASSTHROUGH_OPEN ioctl() time so that they can be used
to temporarily raise the user credentials when accessing lower file
system files in passthrough.
The process accessing the FUSE file with passthrough enabled temporarily
receives the privileges of the FUSE daemon while performing read/write
operations. Similar behavior is implemented in overlayfs.
These privileges will be reverted as soon as the IO operation completes.
This feature does not provide any higher security privileges to those
processes accessing the FUSE file system with passthrough enabled. This
is because it is still the FUSE daemon responsible for enabling or not
the passthrough feature at file open time, and should enable the feature
only after appropriate access policy checks.
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-8-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: Idb4f03a2ce7c536691e5eaf8fadadfcf002e1677
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit 5f3d78268b21d381310574af1c16882c7680ceb1)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Extend the passthrough feature by handling asynchronous IO both for read
and write operations.
When an AIO request is received, if the request targets a FUSE file with
the passthrough functionality enabled, a new identical AIO request is
created. The new request targets the lower file system file and gets
assigned a special FUSE passthrough AIO completion callback.
When the lower file system AIO request is completed, the FUSE
passthrough AIO completion callback is executed and propagates the
completion signal to the FUSE AIO request by triggering its completion
callback as well.
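The chaining of completion callbacks described above can be sketched as
follows. struct demo_aio_req and all function names are hypothetical
simplifications and do not mirror the kernel's kiocb layout:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an asynchronous IO request. */
struct demo_aio_req {
	void (*complete)(struct demo_aio_req *req, long res);
	long result;
	int done;
	struct demo_aio_req *upper; /* set only on the lower-file request */
};

/* Completion of the original (FUSE-side) request: record the result. */
static void demo_fuse_complete(struct demo_aio_req *req, long res)
{
	req->result = res;
	req->done = 1;
}

/* Completion of the lower request: forward the result to the FUSE request. */
static void demo_passthrough_complete(struct demo_aio_req *lower, long res)
{
	assert(lower->upper != NULL);
	lower->result = res;
	lower->done = 1;
	lower->upper->complete(lower->upper, res);
}

/* Build the lower request that mirrors the FUSE request. */
static void demo_prepare_lower(struct demo_aio_req *lower,
			       struct demo_aio_req *fuse_req)
{
	lower->complete = demo_passthrough_complete;
	lower->result = 0;
	lower->done = 0;
	lower->upper = fuse_req;
}
```

When the lower request completes, its callback runs first and then triggers
the callback of the FUSE request with the same result.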
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-7-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I47671ef36211102da6dd3ee8b2f226d1e6cd9d5c
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit ea2b7a36847b14dee60d1f5dbf2aa26cf101c426)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
All the read and write operations performed on fuse_files which have the
passthrough feature enabled are forwarded to the associated lower file
system file via VFS.
Sending the request directly to the lower file system avoids the
userspace round-trip that, because of possible context switches and
additional operations might reduce the overall performance, especially
in those cases where caching doesn't help, for example in reads at
random offsets.
Verifying if a fuse_file has a lower file system file associated with it
can be done by checking the validity of its passthrough_filp pointer.
This pointer is not NULL only if passthrough has been successfully
enabled via the appropriate ioctl().
When a read/write operation is requested for a FUSE file with
passthrough enabled, a new equivalent VFS request is generated, which
instead targets the lower file system file.
The VFS layer performs additional checks that allow for safer operations
but may cause the operation to fail if the process accessing the FUSE
file system does not have access to the lower file system.
This change only implements synchronous requests in passthrough,
returning an error in the case of asynchronous operations, yet covering
the majority of the use cases.
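A minimal userspace sketch of the dispatch on passthrough_filp follows. The
structures are hypothetical in-memory stand-ins, and the regular path is
not modelled beyond returning an error:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* In-memory stand-in for a lower file system file. */
struct demo_lower_file {
	const char *data;
	size_t size;
};

/* Stand-in for fuse_file; the pointer is NULL unless passthrough is on. */
struct demo_fuse_file {
	struct demo_lower_file *passthrough_filp;
};

/* Read directly from the lower file, with no daemon round-trip. */
static ssize_t demo_passthrough_read(struct demo_lower_file *lower,
				     char *buf, size_t len, size_t pos)
{
	if (pos >= lower->size)
		return 0;
	if (len > lower->size - pos)
		len = lower->size - pos;
	memcpy(buf, lower->data + pos, len);
	return (ssize_t)len;
}

/* Placeholder for the regular path, which would go through the daemon. */
static ssize_t demo_regular_read(char *buf, size_t len, size_t pos)
{
	(void)buf;
	(void)len;
	(void)pos;
	return -ENOSYS; /* not modelled in this sketch */
}

static ssize_t demo_fuse_read(struct demo_fuse_file *ff, char *buf,
			      size_t len, size_t pos)
{
	assert(ff != NULL && buf != NULL);
	if (ff->passthrough_filp)
		return demo_passthrough_read(ff->passthrough_filp, buf, len, pos);
	return demo_regular_read(buf, len, pos);
}
```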
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-6-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: Ifbe6a247fe7338f87d078fde923f0252eeaeb668
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit ea9685a7f9cb16b30e25386386274fdd30627c3a)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Implement the FUSE passthrough ioctl that associates the lower
(passthrough) file system file with the fuse_file.
The file descriptor passed to the ioctl by the FUSE daemon is used to
access the corresponding file pointer, which is copied to the fuse_file
data structure to consolidate the link between the FUSE and lower file
system.
To enable the passthrough mode, user space triggers the
FUSE_DEV_IOC_PASSTHROUGH_OPEN ioctl and, if the call succeeds, receives
back an identifier that will be used at open/create response time in the
fuse_open_out field to associate the FUSE file to the lower file system
file.
The value returned by the ioctl to user space can be:
- > 0: success, the identifier can be used as part of an open/create
reply.
- <= 0: an error occurred.
The value 0 represents an error to preserve backward compatibility: the
fuse_open_out field that is used to pass the passthrough_fh back to the
kernel uses the same bits that were previously used as struct padding, and is
commonly zero-initialized (e.g., in the libfuse implementation).
Removing 0 from the correct values fixes the ambiguity between the case
in which 0 corresponds to a real passthrough_fh, a missing
implementation of FUSE passthrough or a request for a normal FUSE file,
simplifying the user space implementation.
For the passthrough mode to be successfully activated, the lower file
system file must implement both read_iter and write_iter file
operations. This extra check prevents special pseudo files from being
targeted by this feature.
Passthrough comes with another limitation: no further file system
stacking is allowed for those FUSE file systems using passthrough.
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-5-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I4d8290012302fb4547bce9bb261a03cc4f66b5aa
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit 28e86146c501a0f943fe9dc0ec0252df066a2b3d)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Expose the FUSE_PASSTHROUGH interface to user space and declare all the
basic data structures and functions as the skeleton on top of which the
FUSE passthrough functionality will be built.
As part of this, introduce the new FUSE passthrough ioctl, which allows
the FUSE daemon to specify a direct connection between a FUSE file and a
lower file system file. Such ioctl requires user space to pass the file
descriptor of one of its opened files through the fuse_passthrough_out
data structure introduced in this patch. This structure includes extra
fields for possible future extensions.
Also, add the passthrough functions for the set-up and tear-down of the
data structures and locks that will be used both when fuse_conns and
fuse_files are created/deleted.
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-4-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I732532581348adadda5b5048a9346c2b0868d539
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit d02368d67989781a3484cd8dd71e0079d0d1bda2)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
With a 64-bit kernel build the FUSE device cannot handle ioctl requests
coming from 32-bit user space.
This is due to the ioctl command translation that generates different
command identifiers that thus cannot be used for direct comparisons
without proper manipulation.
Explicitly extract type and number from the ioctl command to enable
32-bit user space compatibility on 64-bit kernel builds.
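As an illustration of why direct comparison fails, the sketch below builds
two commands whose argument layouts differ in size, as could happen between
32-bit and 64-bit user space. The magic number, the command number and both
structures are assumptions made for this example only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/ioctl.h>

#define DEMO_IOC_MAGIC 229 /* hypothetical magic number */

/* The same logical argument as laid out by 32-bit and 64-bit user space. */
struct demo_arg32 { uint32_t fd; uint32_t len; uint32_t vec; };
struct demo_arg64 { uint32_t fd; uint32_t len; uint64_t vec; };

#define DEMO_IOC_OPEN32 _IOW(DEMO_IOC_MAGIC, 1, struct demo_arg32)
#define DEMO_IOC_OPEN64 _IOW(DEMO_IOC_MAGIC, 1, struct demo_arg64)

/* The two layouts encode different sizes, hence different command values. */
static_assert(DEMO_IOC_OPEN32 != DEMO_IOC_OPEN64,
	      "commands differ by encoded size");

/* Compare only the type and number fields of two commands. */
static bool demo_same_type_and_nr(unsigned int a, unsigned int b)
{
	return _IOC_TYPE(a) == _IOC_TYPE(b) && _IOC_NR(a) == _IOC_NR(b);
}
```

Extracting type and number lets both variants be recognized as the same
request even though the raw command values are not equal.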
Bug: 179164095
Link: https://lore.kernel.org/lkml/20210125153057.3623715-3-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I595517c54d551be70e83c7fcb4b62397a3615004
Signed-off-by: Alessio Balsini <balsini@google.com>
(cherry picked from commit af4048924e191bda0bb85b4bf127f22cf3c70fba)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
The description of this flag says "Don't sync attributes with the server".
In other words: always use the attributes cached in the kernel and don't
send network or local messages to refresh the attributes.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit ff1b89f389a8e64d0a583ce0b0308696f4ab5860)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Unprivileged users are normally restricted from mounting with the
allow_other option by system policy, but this could be bypassed for a mount
done with user namespace root permissions. In such cases allow_other should
not allow users outside the userns to access the mount as doing so would
give the unprivileged user the ability to manipulate processes it would
otherwise be unable to manipulate. Restrict allow_other to apply to users
in the same userns used at mount or a descendant of that namespace. Also
export current_in_userns() for use by fuse when built as a module.
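The "same userns or a descendant" rule can be sketched as a walk up the
namespace parent chain. struct demo_userns and both helpers are
hypothetical simplifications; the ownership check for the non-allow_other
case is reduced to a boolean here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user namespace: a parent link and a nesting depth. */
struct demo_userns {
	const struct demo_userns *parent;
	int level;
};

/* True if child is ancestor itself or one of its descendants. */
static bool demo_in_userns(const struct demo_userns *ancestor,
			   const struct demo_userns *child)
{
	assert(ancestor != NULL);
	for (; child && child->level >= ancestor->level; child = child->parent) {
		if (child == ancestor)
			return true;
	}
	return false;
}

/* allow_other only opens the mount to callers inside the mount's userns. */
static bool demo_allow_access(bool allow_other,
			      const struct demo_userns *mount_ns,
			      const struct demo_userns *caller_ns,
			      bool caller_is_mount_owner)
{
	if (allow_other)
		return demo_in_userns(mount_ns, caller_ns);
	return caller_is_mount_owner;
}
```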
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Dongsu Park <dongsu@kinvolk.io>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 73f03c2b4b527346778c711c2734dbff3442b139)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
In order to support mounts from namespaces other than init_user_ns, fuse
must translate uids and gids to/from the userns of the process servicing
requests on /dev/fuse. This patch does that, with a couple of restrictions
on the namespace:
- The userns for the fuse connection is fixed to the namespace
from which /dev/fuse is opened.
- The namespace must be the same as s_user_ns.
These restrictions simplify the implementation by avoiding the need to pass
around userns references and by allowing fuse to rely on the checks in
setattr_prepare for ownership changes. Either restriction could be relaxed
in the future if needed.
For cuse the userns used is the opener of /dev/cuse. Semantically the cuse
support does not appear safe for unprivileged users. Practically the
permissions on /dev/cuse only make it accessible to the global root user.
If something slips through the cracks in a user namespace the only users
who will be able to use the cuse device are those users mapped into the
user namespace.
Translation in the posix acl is updated to use the user namespace of the
filesystem. Avoiding cases which might bypass this translation is handled
in a following change.
This change is strongly based on a similar change from Seth Forshee and
Dongsu Park.
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 8cb08329b0809453722bc12aa912be34355bcb66)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Upon a cursory examination the uid and gid of a fuse request are
necessary for correct operation. Failing a fuse request where those
values are not reliable seems a straightforward and reliable means of
ensuring that fuse requests with bad data are not sent or processed.
In most cases the vfs will avoid actions it suspects will cause
an inode write back of an inode with an invalid uid or gid. But that does
not map precisely to what fuse is doing, so test for this and solve
this at the fuse level as well.
Performing this work in fuse_req_init_context is cheap as the code is
already performing the translation here and only needs to check the
result of the translation to see if things are not representable in
a form the fuse server can handle.
[SzM] Don't zero the context for the nofail case, just keep using the
munging version (makes sense for debugging and doesn't hurt).
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit c9582eb0ff7d2b560be60eafab29183882cdc82b)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
At the point of fuse_dev_do_read the user space process that initiated the
action on the fuse filesystem may no longer exist. The process may have been
killed or may have fired an asynchronous request and exited.
If the initial process has exited, the code "pid_vnr(find_pid_ns(in->h.pid,
fc->pid_ns))" will either return a pid of 0, or in the unlikely event that
the pid has been reallocated it can return practically any pid. Any pid is
possible as the pid allocator allocates pid numbers in different pid
namespaces independently.
The only way to make translation in fuse_dev_do_read reliable is to call
get_pid in fuse_req_init_context, and pid_vnr followed by put_pid in
fuse_dev_do_read. That reference counting in other contexts has been shown
to bounce cache lines between processors and in general be slow. So that
is not desirable.
The only known user of running the fuse server in a different pid namespace
from the filesystem does not care what the pids are in the fuse messages so
removing this code should not matter.
Getting the translation to a server running outside of the pid namespace of
a container can still be achieved by playing setns games at mount time. It
is also possible to add an option to pass a pid namespace into the fuse
filesystem at mount time.
Fixes: 5d6d3a301c4e ("fuse: allow server to run in different pid_ns")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit dbf107b2a7f36fa635b40e0b554514f599c75b33)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Currently the userspace has no way of knowing whether the fuse
connection ended because of umount or abort via sysfs. It makes it hard
for filesystems to free the mountpoint after abort without worrying
about removing some new mount.
The patch fixes it by returning different errors when userspace reads
from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).
Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
gone because of sysfs abort, reading from the device will return
-ECONNABORTED.
Signed-off-by: Szymon Lukasz <noh4hss@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 3b7008b226f3de811d4ac34238e9cf670f7c9fe7)
Signed-off-by: alk3pInjection <webmaster@raspii.tech>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
In general, as of now, in FUSE, direct writes on the same file are
serialized over the inode lock, i.e. we hold the inode lock for the full
duration of the write request. I could not find a comment in the fuse code
which clearly explains why this exclusive lock is taken for direct writes.
Our guess is that some user space fuse implementations might be relying
on this lock for serialization, and that it also protects against issues
arising from file size assumptions or write failures. This patch
relaxes this exclusive lock in some cases of direct writes.
With these changes, we allow non-extending parallel direct writes
on the same file with the help of a flag called FOPEN_PARALLEL_WRITES.
If this flag is set on the file (the flag is passed from libfuse to the
fuse kernel as part of file open/create), we do not take the exclusive lock
and instead use a shared lock so that all non-extending writes can run in
parallel.
Best practice would be to enable parallel direct writes of all kinds,
including extending writes, but we see some issues, such as how we should
truncate the file (if needed) when one write completes and another fails
if the underlying file system does not support holes. (For file systems
which support holes, there might be a possibility of enabling parallel
writes for all cases.)
FUSE implementations which rely on this inode lock for serialisation
can continue to do so, and this is the default behaviour, i.e. no parallel
direct writes.
Signed-off-by: Dharmendra Singh <dsingh@ddn.com>
[cyberknight777: backport and adapt to 4.14 fuse implementation]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
We no longer use MAYA storage on P NPI, and SDCARD_FS_PARTIAL_RELATIME/
SDCARD_FS_DIR_WRITER depend on SDCARD_FS, so these two configs
are "n" in out/.../obj/KERNEL_OBJ/.config if we don't set them in
defconfig. We use "#ifdef CONFIG_SDCARD_FS_PARTIAL_RELATIME" to enable/
disable the feature in code, so unused code will run and print some error
logs if we don't enable these two features.
.config
CONFIG_SDCARD_FS_PARTIAL_RELATIME="n"
CONFIG_SDCARD_FS_DIR_WRITER="n"
kernel log
82574.033828: <6> sdcardfs: 102 user.relatime: invalid format
82574.036783: <6> sdcardfs: 102 user.relatime: invalid format
82574.060422: <6> sdcardfs: 102 user.relatime: invalid format
82574.118677: <6> sdcardfs: 102 user.relatime: invalid format
82574.121654: <6> sdcardfs: 102 user.relatime: invalid format
82574.168275: <6> sdcardfs: 102 user.relatime: invalid format
82574.172752: <6> sdcardfs: 102 user.relatime: invalid format
Change-Id: Ib1f8c3743fa5e1ae1ffcad8dfb8f7b46923de8a4
Reviewed-on: https://gerrit.mot.com/1296500
SME-Granted: SME Approvals Granted
SLTApproved: Slta Waiver
Tested-by: Jira Key
Reviewed-by: Igor Kovalenko <igork@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Even if reserving nothing in sdcardfs, lower fs statfs info is still
needed to avoid APP overwriting media storage.
Change-Id: I63e87c3b34c739aeff2dcdd5e6fb3f34c4270cfa
Signed-off-by: Shiyong Li <a22381@motorola.com>
Reviewed-on: https://gerrit.mot.com/1249381
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Users can create zero-byte files even when sdcardfs hits the reserved limit.
These files consume blocks for the file name
and eventually use up all memory.
Check the available free space before creating files.
Change-Id: Ib902844f2e6e6c638184895e3f4fa7282acc8333
Signed-off-by: a17671 <a17671@motorola.com>
Reviewed-on: https://gerrit.mot.com/1249379
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Copy the inode size from the lower file system while getting attributes;
otherwise the file size is wrong for files that already existed.
Change-Id: Ic7fe8d976cb010695eda7e134c3aca2d3c0d3198
Reviewed-on: https://gerrit.mot.com/1248830
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Reviewed-by: Huosheng Liao <liaohs@motorola.com>
Tested-by: Jira Key
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
- Check on every write whether the directory writer name is already listed
in the xattr. If not, append it to the writer name list in the xattr. Put
"overrun" there if there are too many writers.
- Rename xattr from "user.firstwriter" to "user.dwriter" accordingly.
Change-Id: I7f02dd76201ab66b18ca43fa8370c0e8be6b3411
Reviewed-on: https://gerrit.mot.com/1249368
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Format of "user.relatime" value: {wildcard-path}:{exception-app-name}
- support wildcard path:
* /%s/: update all files or directory under 1st level sub-directories
* /%s/%s/: update all files or directory under 2nd level sub-directories
* /%s/{folder-name}: update all files or directories under the folder
{folder-name} of 1st level sub-directory
* 0: disable
- don't update relatime for the specified APP or non-APP access
Change-Id: I79dcf69618b91eeb8bd9e36f077f243107e32183
Reviewed-on: https://gerrit.mot.com/1249367
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Support setting the writer APP name in xattr "user.firstwriter.name" when
a directory is created or written to for the first time. To enable this
feature, the extended attribute "user.firstwriter" of the parent directory
should be set to specify sub-directory logging xattr.
- Enable xattr for sub-directories of specific path:
"setfattr -n user.firstwriter -v {wildcard-path} {sdcardfs-path}"
(wildcard-path examples: /%s, /%s/%s, /%s/Download, ...)
- Disable xattr:
"setfattr -n user.firstwriter -v 0 {sdcardfs-path}"
User space checks xattr "user.firstwriter.name" of the specified directory
to determine the owner.
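One way to read the wildcard-path examples is that each "%s" stands for exactly one path component. Under that assumption (the commit message does not spell out the matching rules), a component-wise matcher could look like this user-space sketch; the function name is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if path matches pattern component by component, where a
 * "%s" component in pattern matches any single component of path.
 * Both strings must have the same number of components.
 */
static int wildcard_path_match(const char *pattern, const char *path)
{
	for (;;) {
		size_t plen, len;

		/* Skip the separators in front of the next component. */
		while (*pattern == '/')
			pattern++;
		while (*path == '/')
			path++;
		if (*pattern == '\0' || *path == '\0')
			return *pattern == '\0' && *path == '\0';

		plen = strcspn(pattern, "/");
		len = strcspn(path, "/");
		if (!(plen == 2 && memcmp(pattern, "%s", 2) == 0) &&
		    !(plen == len && memcmp(pattern, path, len) == 0))
			return 0;
		pattern += plen;
		path += len;
	}
}
```

With this reading, "/%s/Download" matches "/Alice/Download" but not "/Alice/Music" or "/Alice/Download/tmp".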
Change-Id: Id76d94a1243d67f25b85af68a098f5c89c5986cd
Reviewed-on: https://gerrit.mot.com/1249366
SME-Granted: SME Approvals Granted
SLTApproved: Slta Waiver
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
Support updating relative access time for a specific path instead of the
whole mounted partition. The extended attribute "user.relatime" is
introduced to toggle this update. As long as relatime is enabled on the
top parent directory, relatime is updated for all of its sub-directories
and files regardless of their own xattr.
- Enable relatime for specific path:
adb shell "setfattr -n user.relatime -v 1 {sdcardfs-path}"
- Disable relatime:
adb shell "setfattr -n user.relatime -v 0 {sdcardfs-path}"
Additionally, the access time is updated only if one of two conditions
holds:
- The previous access time is earlier than the current modify or change
time.
- The last access time is more than 1 day old.
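The two conditions can be expressed as a small predicate. The sketch below follows the wording of this message in user-space C with second granularity; the function name and the use of time_t are assumptions, and the in-kernel check may treat the boundary cases differently.

```c
#include <time.h>

#define RELATIME_MAX_AGE (24 * 60 * 60)	/* 1 day, in seconds */

/*
 * Return 1 if the access time should be refreshed:
 *  - atime is earlier than the modify or change time, or
 *  - atime is more than one day older than now.
 */
static int atime_needs_refresh(time_t atime, time_t mtime,
			       time_t chg_time, time_t now)
{
	if (atime < mtime || atime < chg_time)
		return 1;
	if (now - atime > RELATIME_MAX_AGE)
		return 1;
	return 0;
}
```

A file read repeatedly within the same day therefore causes at most one access-time write, unless it is modified in between.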
Change-Id: Ib442b929d94206278135edd6ce3e654da02f24e0
Reviewed-on: https://gerrit.mot.com/1249365
SLTApproved: Slta Waiver
SME-Granted: SME Approvals Granted
Tested-by: Jira Key
Reviewed-by: Hujun Liao <liaohj@motorola.com>
Submit-Approved: Jira Key
Signed-off-by: Forenche <prahul2003@gmail.com>
The big pcluster feature was merged a year ago and has been mostly
stable since.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20220407050505.12683-1-huyue2@coolpad.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
This results in no change in structure size on 64-bit machines as it
fits in the padding between the gfp_t and the void *. 32-bit machines
will grow the structure from 8 to 12 bytes. Almost all radix trees are
protected with (at least) a spinlock, so as they are converted from
radix trees to xarrays, the data structures will shrink again.
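The size claim can be checked with a user-space mock-up of the two layouts. The sketch assumes a 4-byte gfp_t and a 4-byte lock (that is, a spinlock without debugging options); the typedefs and struct names below are stand-ins, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t mock_gfp_t;			/* stand-in for gfp_t */
typedef struct { uint32_t raw; } mock_spinlock_t;	/* stand-in for spinlock_t */

struct radix_tree_node;				/* opaque for this sketch */

/* Layout before the change: flags followed by the root pointer. */
struct root_before {
	mock_gfp_t gfp_mask;
	struct radix_tree_node *rnode;
};

/* Layout after the change: the lock is placed next to the flags. */
struct root_after {
	mock_spinlock_t xa_lock;
	mock_gfp_t gfp_mask;
	struct radix_tree_node *rnode;
};

/* Bytes added by the lock: 0 with 8-byte pointers, 4 with 4-byte ones. */
static size_t root_growth(void)
{
	return sizeof(struct root_after) - sizeof(struct root_before);
}
```

With 8-byte pointers the pointer must be 8-byte aligned, so the 4 bytes after gfp_mask were padding before and now hold the lock; with 4-byte pointers there is no padding to reuse.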
Initialising the spinlock requires a name for the benefit of lockdep, so
RADIX_TREE_INIT() now needs to know the name of the radix tree it's
initialising, and so do IDR_INIT() and IDA_INIT().
Also add the xa_lock() and xa_unlock() family of wrappers to make it
easier to use the lock. If we could rely on -fplan9-extensions in the
compiler, we could avoid all of this syntactic sugar, but that wasn't
added until gcc 4.6.
Link: http://lkml.kernel.org/r/20180313132639.17387-8-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[@RealJohnGalt: adapt to 4.14]
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
XFS currently contains a copy-and-paste of __set_page_dirty(). Export
it from buffer.c instead.
Link: http://lkml.kernel.org/r/20180313132639.17387-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
As Xiang mentioned, this path has no real impact on our current
decompression strategy, so remove it directly. Also, update the return
value of z_erofs_lz4_decompress() to 0 on success, for consistency
with LZMA, which also returns 0 in that case.
Link: https://lore.kernel.org/r/20211014065744.1787-1-zbestahu@gmail.com
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
Currently, if compacted_4b_initial > totalidx, all of the indexes are
compacted as 4B only, so the calculated compacted_2b is worthless
in that case and computing it may waste CPU resources.
There is no need to update compacted_4b_initial as mkfs does, since it
is used to fulfill the alignment of the 1st compacted_2b pack and
already handles the case above.
We also need to clarify compacted_4b_end here. It's used for the
last lclusters which don't fit in the previous compacted_2b
packs.
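The split into the three regions can be sketched as below. This is an illustration under stated assumptions, not the erofs code: it assumes 2B packs must start on a 32-byte boundary and cover 16 lclusters each, and the struct, function, and parameter names are hypothetical.

```c
#include <stdint.h>

struct compact_split {
	unsigned int initial_4b;	/* leading 4B indexes used for alignment */
	unsigned int packs_2b;		/* lclusters covered by 2B packs */
	unsigned int end_4b;		/* trailing 4B indexes that fit no 2B pack */
};

/*
 * ebase: byte offset where the compacted indexes start.
 * totalidx: total number of lcluster indexes.
 * allow_2b: nonzero if 2B compaction is enabled for the inode.
 */
static struct compact_split split_compacted(uint64_t ebase,
					    unsigned int totalidx,
					    int allow_2b)
{
	struct compact_split s;

	/* 4B indexes needed to reach the next 32-byte boundary. */
	s.initial_4b = (unsigned int)((32 - ebase % 32) / 4);
	if (s.initial_4b == 32 / 4)
		s.initial_4b = 0;

	/* Only compute the 2B part when some index is left for it. */
	if (allow_2b && s.initial_4b < totalidx)
		s.packs_2b = (totalidx - s.initial_4b) / 16 * 16;
	else
		s.packs_2b = 0;

	/* Whatever remains after the 2B packs is stored as 4B again. */
	if (s.initial_4b < totalidx)
		s.end_4b = totalidx - s.initial_4b - s.packs_2b;
	else
		s.end_4b = 0;
	return s;
}
```

When initial_4b is not smaller than totalidx, every index falls in the leading 4B region, so the 2B computation is skipped entirely.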
Some messages are from Xiang.
Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
The mapping is not used at all; remove it and update the related code.
Link: https://lore.kernel.org/r/20210810072416.1392-1-zbestahu@gmail.com
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
The variable occupied has no effect in z_erofs_attach_page(), which
is the only caller of z_erofs_pagevec_enqueue().
Link: https://lore.kernel.org/r/20210419102623.2015-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>
If the 1st NONHEAD lcluster of a pcluster isn't of the CBLKCNT lcluster
type but of a HEAD or PLAIN type instead, its pclustersize _must_ be
1 lcluster (since its uncompressed size < 2 lclusters), as illustrated
below:
       HEAD     HEAD / PLAIN    lcluster type
   ____________ ____________
  |_:__________|_________:__|   file data (uncompressed)
   .                .
  .____________.
  |____________|                pcluster data (compressed)
This on-disk case was explained before [1] but was not handled
properly in the runtime implementation.
It can be observed by manually generating a 1 lcluster-sized pcluster
with 2 lclusters (thus CBLKCNT doesn't exist). Let's fix it now.
[1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org
Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org
Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Forenche <prahul2003@gmail.com>