[ Upstream commit 0934ad42bb2c5df90a1b9de690f93de735b622fe ]
syzbot is reporting a use-after-free at cipso_v4_doi_search() [1],
because smk_cipso_doi() calls kfree() without removing the entry from
cipso_v4_doi_list after netlbl_cfg_cipsov4_map_add() returns an error.
Use netlbl_cfg_cipsov4_del() to remove the entry from the list and wait
for an RCU grace period before kfree().
Link: https://syzkaller.appspot.com/bug?extid=93dba5b91f0fed312cbd [1]
Reported-by: syzbot <syzbot+93dba5b91f0fed312cbd@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 6c2e8ac0953fccdd ("netlabel: Update kernel configuration API")
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f91488ee15bd3cac467e2d6a361fc2d34d1052ae ]
syzbot is reporting a kernel panic at smk_cipso_doi() due to memory
allocation fault injection [1]. The reason panic() is needed here was
never explained. Since no fix has been proposed for 18 months, use
__GFP_NOFAIL for now so that syzbot resources can be spent on other bugs.
Link: https://syzkaller.appspot.com/bug?extid=89731ccb6fec15ce1c22 [1]
Reported-by: syzbot <syzbot+89731ccb6fec15ce1c22@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0817534ff9ea809fac1322c5c8c574be8483ea57 ]
Syzkaller reported a use-after-free bug as described in [1]. The bug is
triggered when smk_set_cipso() tries to free stale category bitmaps
while there are concurrent reader(s) using the same bitmaps.
Wait for RCU grace period to finish before freeing the category bitmaps
in smk_set_cipso(). This makes sure that there are no more readers using
the stale bitmaps and freeing them should be safe.
[1] https://lore.kernel.org/netdev/000000000000a814c505ca657a4e@google.com/
Reported-by: syzbot+3f91de0b813cc3d19a80@syzkaller.appspotmail.com
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d14f5c7028eea70760df284057fe198ce7778dd ]
In the smk_access_entry() function, if no matching rule is found
in the rule_list, a negative error code will be used to perform bit
operations with the MAY_ enumeration value. This is semantically
wrong. This patch fixes this issue.
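The problem described above can be illustrated in userspace. This is a
minimal sketch, not the kernel code: the MAY_ values, the helper names
and the use of -ENOENT as the "no rule found" code are all assumptions
made for the example.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative access bits; the values are assumptions of this sketch. */
enum { MAY_WRITE = 0x2, MAY_READ = 0x4 };

/* Naive test: a negative error code has most of its bits set in two's
 * complement, so masking it can look like a granted permission. */
static int access_allows_naive(int may, int request)
{
    return (may & request) == request;
}

/* Guarded test: only a non-negative value is treated as a bit mask. */
static int access_allows(int may, int request)
{
    assert(request > 0);    /* the caller passes a real, non-empty mask */
    return may >= 0 && (may & request) == request;
}
```

With these definitions, access_allows_naive(-ENOENT, MAY_WRITE) reports
success even though no rule matched, while access_allows() does not.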
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7ef4c19d245f3dc233fd4be5acea436edd1d83d8 upstream.
syzbot found WARNINGs in several smackfs write operations where the
byte count passed to memdup_user_nul() exceeds GFP MAX_ORDER. Check
whether count is bigger than PAGE_SIZE.
Per smackfs doc, smk_write_net4addr accepts any label or -CIPSO,
smk_write_net6addr accepts any label or -DELETE. I couldn't find
any general rule for other label lengths except SMK_LABELLEN,
SMK_LONGLABEL, SMK_CIPSOMAX which are documented.
Let's constrain smackfs label lengths to PAGE_SIZE in general, although
the fuzzer crash comes from a write to smackfs/netlabel of length 0x400000.
Here is a quick way to reproduce the WARNING:
    python -c "print('A' * 0x400000)" > /sys/fs/smackfs/netlabel
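The length check can be sketched in userspace as follows. This is only
an illustration of rejecting an oversized count before allocating:
WRITE_MAX stands in for PAGE_SIZE, and dup_bounded() is a hypothetical
helper, not a kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define WRITE_MAX 4096  /* stand-in for PAGE_SIZE; an assumption here */

/* Copy a caller-supplied buffer into a fresh NUL-terminated string, but
 * only after checking the requested length against a fixed limit. */
static int dup_bounded(const char *buf, size_t count, char **out)
{
    char *copy;

    assert(buf != NULL && out != NULL);
    if (count == 0 || count > WRITE_MAX)
        return -EINVAL;     /* refuse before any allocation is attempted */
    copy = malloc(count + 1);
    if (copy == NULL)
        return -ENOMEM;
    memcpy(copy, buf, count);
    copy[count] = '\0';
    *out = copy;
    return 0;
}
```

A request with count 0x400000 returns -EINVAL without touching the
allocator, which is the behaviour the commit aims for.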
Reported-by: syzbot+a71a442385a0b2815497@syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 42a2df3e829f3c5562090391b33714b2e2e5ad4a ]
We have an upper bound on "maplevel" but forgot to check for negative
values.
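A two-sided range check of this kind might look like the sketch below.
LEVEL_MAX and parse_level() are hypothetical names chosen for the
example, and the bound of 255 is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define LEVEL_MAX 255   /* assumed upper bound for this sketch */

/* Parse a decimal level from text; reject values outside [0, LEVEL_MAX]. */
static int parse_level(const char *text, int *level)
{
    int value;

    assert(text != NULL && level != NULL);
    if (sscanf(text, "%d", &value) != 1)
        return -EINVAL;
    /* A signed "%d" conversion accepts "-1", so check both bounds. */
    if (value < 0 || value > LEVEL_MAX)
        return -EINVAL;
    *level = value;
    return 0;
}
```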
Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6bd4f6d9b07452b0b19842044a6c3ea384b0b88 ]
This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in
vsscanf") where we added a bounds check on "rule".
Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com
Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit beb4ee6770a89646659e6a2178538d2b13e2654e upstream.
smk_write_relabel_self() frees memory from the task's credentials with
no locking, which can easily cause a use-after-free because multiple
tasks can share the same credentials structure.
Fix this by using prepare_creds() and commit_creds() to correctly modify
the task's credentials.
Reproducer for "BUG: KASAN: use-after-free in smk_write_relabel_self":
    #include <fcntl.h>
    #include <pthread.h>
    #include <unistd.h>

    static void *thrproc(void *arg)
    {
        int fd = open("/sys/fs/smackfs/relabel-self", O_WRONLY);
        for (;;) write(fd, "foo", 3);
    }

    int main()
    {
        pthread_t t;
        pthread_create(&t, NULL, thrproc, NULL);
        thrproc(NULL);
    }
Reported-by: syzbot+e6416dabb497a650da40@syzkaller.appspotmail.com
Fixes: 38416e53936e ("Smack: limited capability for changing process label")
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5bfad3d7acc5702f32aafeb388362994f4d7bd0 upstream.
inode_smack::smk_lock is taken during smack_d_instantiate(), which is
called during a filesystem transaction when creating a file on ext4.
Therefore to avoid a deadlock, all code that takes this lock must use
GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
transaction to complete.
Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3675f052b43ba51b99b85b073c7070e083f3e6fb upstream.
There is a logic bug in the current smack_bprm_set_creds():
If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
acceptable (e.g. because the ptracer detached in the meantime), the other
->unsafe flags aren't checked. As far as I can tell, this means that
something like the following could work (but I haven't tested it):
 - task A: create task B with fork()
 - task B: set NO_NEW_PRIVS
 - task B: install a seccomp filter that makes open() return 0 under some
   conditions
 - task B: replace fd 0 with a malicious library
 - task A: attach to task B with PTRACE_ATTACH
 - task B: execve() a file with an SMACK64EXEC extended attribute
 - task A: while task B is still in the middle of execve(), exit (which
   destroys the ptrace relationship)
Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
bprm->unsafe, we reject the execve().
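The logic bug can be sketched outside the kernel. The flag values, the
function names and the exact shape of the old check are assumptions of
this example; ptrace_ok stands for "the ptrace state was deemed
acceptable".

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag values; only the names echo the commit text. */
enum { UNSAFE_PTRACE = 0x2, UNSAFE_NO_NEW_PRIVS = 0x4 };

/* Assumed old shape: the "else" skips all other flags once PTRACE is set. */
static int check_unsafe_old(unsigned int unsafe, int ptrace_ok)
{
    if (unsafe & UNSAFE_PTRACE) {
        if (!ptrace_ok)
            return -EPERM;
    } else if (unsafe) {
        return -EPERM;
    }
    return 0;
}

/* Fixed shape: any flag other than PTRACE rejects the exec on its own. */
static int check_unsafe_fixed(unsigned int unsafe, int ptrace_ok)
{
    assert(ptrace_ok == 0 || ptrace_ok == 1);
    if ((unsafe & UNSAFE_PTRACE) && !ptrace_ok)
        return -EPERM;
    if (unsafe & ~(unsigned int)UNSAFE_PTRACE)
        return -EPERM;
    return 0;
}
```

With UNSAFE_PTRACE and UNSAFE_NO_NEW_PRIVS both set and an acceptable
ptrace state, the old shape returns 0 while the fixed shape returns
-EPERM.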
Cc: stable@vger.kernel.org
Fixes: 5663884caab1 ("Smack: unify all ptrace accesses in the smack")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3f4287e7d98a2954f20bf96c567fdffcd2b63eb9 ]
In smack_socket_sock_rcv_skb(), there is an if statement
on line 3920 to check whether skb is NULL:
    if (skb && skb->secmark != 0)
This check indicates skb can be NULL in some cases.
But on lines 3931 and 3932, skb is used:
    ad.a.u.net->netif = skb->skb_iif;
    ipv6_skb_to_auditdata(skb, &ad.a, NULL);
Thus, possible null-pointer dereferences may occur when skb is NULL.
To fix these possible bugs, an if statement is added to check skb.
These bugs are found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5b841bfab695e3b8ae793172a9ff7990f99cc3e2 ]
Function smack_key_permission() only issues smack requests for the
following operations:
- KEY_NEED_READ (issues MAY_READ)
- KEY_NEED_WRITE (issues MAY_WRITE)
- KEY_NEED_LINK (issues MAY_WRITE)
- KEY_NEED_SETATTR (issues MAY_WRITE)
A blank smack request is issued in all other cases, resulting in
smack access being granted if there is any rule defined between
subject and object, or denied with -EACCES otherwise.
Request MAY_READ access for KEY_NEED_SEARCH and KEY_NEED_VIEW.
Fix the logic in the unlikely case when both MAY_READ and
MAY_WRITE are needed. Validate access permission field for valid
contents.
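The mapping described above can be sketched as a small function. The
numeric bit values and the KEY_NEED_ALL mask are assumptions made for
the example; only the KEY_NEED_ and MAY_ names come from the text.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative bit values, assumed for this sketch. */
enum {
    KEY_NEED_VIEW = 0x01, KEY_NEED_READ = 0x02, KEY_NEED_WRITE = 0x04,
    KEY_NEED_SEARCH = 0x08, KEY_NEED_LINK = 0x10, KEY_NEED_SETATTR = 0x20,
    KEY_NEED_ALL = 0x3f     /* hypothetical mask of all valid bits */
};
enum { MAY_WRITE = 0x2, MAY_READ = 0x4 };

/* Translate key permission bits into a Smack-style access request. */
static int key_perm_to_request(unsigned int perm)
{
    int request = 0;

    if (perm & ~(unsigned int)KEY_NEED_ALL)
        return -EINVAL;     /* unknown bits: reject instead of guessing */
    if (perm & (KEY_NEED_READ | KEY_NEED_SEARCH | KEY_NEED_VIEW))
        request |= MAY_READ;
    if (perm & (KEY_NEED_WRITE | KEY_NEED_LINK | KEY_NEED_SETATTR))
        request |= MAY_WRITE;
    /* Only these two bits are ever produced by the mapping. */
    assert((request & ~(MAY_READ | MAY_WRITE)) == 0);
    return request;
}
```

Because the two tests are independent, a request that needs both read
and write access yields MAY_READ | MAY_WRITE.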
Signed-off-by: Zoran Markovic <zmarkovic@sierrawireless.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 129a99890936766f4b69b9da7ed88366313a9210 ]
A socket which has sk_family set to PF_INET6 is able to receive not
only IPv6 but also IPv4 traffic (IPv4-mapped IPv6 addresses).
Prior to this patch, smk_skb_to_addr_ipv6() could be called for
socket buffers containing IPv4 packets; as a result, such traffic
was allowed.
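One way to picture the fix is to choose the address parser from the
packet rather than from the socket. The sketch below assumes the packet
type is available as an EtherType value; parser_family() and the
FAMILY_ constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { FAMILY_V4 = 4, FAMILY_V6 = 6 };  /* hypothetical labels */
#define ETHERTYPE_V4 0x0800u            /* standard EtherType for IPv4 */
#define ETHERTYPE_V6 0x86DDu            /* standard EtherType for IPv6 */

/* Decide which address parser to use from the packet, not the socket. */
static int parser_family(int socket_family, uint16_t ethertype)
{
    assert(socket_family == FAMILY_V4 || socket_family == FAMILY_V6);
    if (socket_family == FAMILY_V6 && ethertype == ETHERTYPE_V4)
        return FAMILY_V4;   /* IPv4 traffic arriving on a v6 socket */
    return socket_family;
}
```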
Signed-off-by: Piotr Sawicki <p.sawicki2@partner.samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7b4e88434c4e7982fb053c49657e1c8bbb8692d9 ]
Smack: Mark inode instant in smack_task_to_inode
/proc clean-up in commit 1bbc55131e59bd099fdc568d3aa0b42634dbd188
resulted in smack_task_to_inode() being called before smack_d_instantiate.
This resulted in the smk_inode value being ignored, even while present
for files in /proc/self. Marking the inode as instant here fixes that.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
security_inode_getsecurity() provides the text string value
of a security attribute. It does not provide a "secctx".
The code in xattr_getsecurity() that calls security_inode_getsecurity()
and then calls security_release_secctx() happened to work because
SElinux and Smack treat the attribute and the secctx the same way.
It fails for cap_inode_getsecurity(), because that module has no
secctx that ever needs releasing. It turns out that Smack is the
one that's doing things wrong by not allocating memory when instructed
to do so by the "alloc" parameter.
The fix is simple enough. Change the security_release_secctx() to
kfree() because it isn't a secctx being returned by
security_inode_getsecurity(). Change Smack to allocate the string when
told to do so.
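The "alloc" contract can be sketched in userspace. get_attr_text() and
stored_label are hypothetical; the point is only that the callee hands
back a heap copy when asked, so the caller can release it with a plain
free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char stored_label[] = "example";  /* hypothetical value */

/* Return the attribute length. When alloc is nonzero, *buffer receives a
 * heap copy owned by the caller; otherwise only the length is reported. */
static int get_attr_text(char **buffer, int alloc)
{
    size_t len = strlen(stored_label);

    if (alloc) {
        char *copy;

        assert(buffer != NULL);
        copy = malloc(len + 1);
        if (copy == NULL)
            return -1;
        memcpy(copy, stored_label, len + 1);
        *buffer = copy;
    }
    return (int)len;
}
```

Returning a pointer to stored_label itself when alloc is set would make
the caller's free() invalid, which is the kind of mismatch the commit
describes.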
Note: this also fixes memory leaks for LSMs which implement
inode_getsecurity but not release_secctx, such as capabilities.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
Merge tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull secureexec update from Kees Cook:
"This series has the ultimate goal of providing a sane stack rlimit
when running set*id processes.
To do this, the bprm_secureexec LSM hook is collapsed into the
bprm_set_creds hook so the secureexec-ness of an exec can be
determined early enough to make decisions about rlimits and the
resulting memory layouts. Other logic acting on the secureexec-ness of
an exec is similarly consolidated. Capabilities needed some special
handling, but the refactoring removed other special handling, so that
was a wash"
* tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
exec: Consolidate pdeath_signal clearing
exec: Use sane stack rlimit under secureexec
exec: Consolidate dumpability logic
smack: Remove redundant pdeath_signal clearing
exec: Use secureexec for clearing pdeath_signal
exec: Use secureexec for setting dumpability
LSM: drop bprm_secureexec hook
commoncap: Move cap_elevated calculation into bprm_set_creds
commoncap: Refactor to remove bprm_secureexec hook
smack: Refactor to remove bprm_secureexec hook
selinux: Refactor to remove bprm_secureexec hook
apparmor: Refactor to remove bprm_secureexec hook
binfmt: Introduce secureexec flag
exec: Correct comments about "point of no return"
exec: Rename bprm->cred_prepared to called_set_creds
This removes the redundant pdeath_signal clearing in Smack: the check in
smack_bprm_committing_creds() matches the check in smack_bprm_set_creds()
(which used to be in the now-removed smack_bprm_secureexec() hook) and
since secureexec is now being checked for clearing pdeath_signal, this
is redundant to the common exec code.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
The Smack bprm_secureexec hook can be merged with the bprm_set_creds
hook since it's dealing with the same information, and all of the details
are finalized during the first call to the bprm_set_creds hook via
prepare_binprm() (subsequent calls due to binfmt_script, etc, are ignored
via bprm->called_set_creds).
Here, the test can just happen at the end of the bprm_set_creds hook,
and the bprm_secureexec hook can be dropped.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
The cred_prepared bprm flag has a misleading name. It has nothing to do
with the bprm_prepare_cred hook, and actually tracks if bprm_set_creds has
been called. Rename this flag and improve its comment.
Cc: David Howells <dhowells@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
We no longer place these on a list so they can be const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Use cap_capable() rather than capable() in the Smack privilege
check as the former does not invoke other security modules'
privilege checks, while the latter does. This becomes important
when stacking. It may be a problem even with minor modules.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
The check of S_ISSOCK() in smack_file_receive() is not
appropriate if the passed descriptor is a socket.
Reported-by: Stephen Smalley <sds@tyco.nsa.gov>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
It will allow us to remove the old netfilter hook api in the near future.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Pull misc vfs updates from Al Viro:
"Assorted bits and pieces from various people. No common topic in this
pile, sorry"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/affs: add rename exchange
fs/affs: add rename2 to prepare multiple methods
Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
fs: don't set *REFERENCED on single use objects
fs: compat: Remove warning from COMPATIBLE_IOCTL
remove pointless extern of atime_need_update_rcu()
fs: completely ignore unknown open flags
fs: add a VALID_OPEN_FLAGS
fs: remove _submit_bh()
fs: constify tree_descr arrays passed to simple_fill_super()
fs: drop duplicate header percpu-rwsem.h
fs/affs: bugfix: Write files greater than page size on OFS
fs/affs: bugfix: enable writes on OFS disks
fs/affs: remove node generation check
fs/affs: import amigaffs.h
fs/affs: bugfix: make symbolic links work again
simple_fill_super() is passed an array of tree_descr structures which
describe the files to create in the filesystem's root directory. Since
these arrays are never modified intentionally, they should be 'const' so
that they are placed in .rodata and benefit from memory protection.
This patch updates the function signature and all users, and also
constifies tree_descr.name.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since all callers of smk_netlbl_mls() are GFP_KERNEL context
(smk_set_cipso() calls memdup_user_nul(), init_smk_fs() calls
__kernfs_new_node(), smk_import_entry() calls kzalloc(GFP_KERNEL)),
it is safe to use GFP_KERNEL from netlbl_catmap_setbit().
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
smack_parse_opts_str() calls kfree(opts->mnt_opts) when kcalloc() for
opts->mnt_opts_flags failed. But it should not have called it because
security_free_mnt_opts() will call kfree(opts->mnt_opts).
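The ownership rule behind this fix can be sketched as follows. The
structure and both functions are hypothetical stand-ins; fail_second
exists only to simulate the second allocation failing.

```c
#include <assert.h>
#include <stdlib.h>

struct mnt_opts_sketch {    /* hypothetical stand-in structure */
    char **opts;
    int *flags;
};

/* The one place that releases both arrays; safe on partial state. */
static void free_opts(struct mnt_opts_sketch *o)
{
    assert(o != NULL);
    free(o->opts);
    free(o->flags);
    o->opts = NULL;
    o->flags = NULL;
}

/* Allocate both arrays. On failure nothing is freed here: the caller is
 * expected to call free_opts(), so freeing here as well would free twice. */
static int alloc_opts(struct mnt_opts_sketch *o, size_t n, int fail_second)
{
    assert(o != NULL);
    o->opts = calloc(n, sizeof(*o->opts));
    if (o->opts == NULL)
        return -1;
    o->flags = fail_second ? NULL : calloc(n, sizeof(*o->flags));
    if (o->flags == NULL)
        return -1;          /* leave o->opts for free_opts() */
    return 0;
}
```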
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Fixes: 3bf2789cad9e6573 ("smack: allow mount opts setting over filesystems with binary mount data")
Cc: Vivek Trivedi <t.vivek@samsung.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Mark all of the registration hooks as __ro_after_init (via the
__lsm_ro_after_init macro).
Signed-off-by: James Morris <james.l.morris@oracle.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Pull namespace updates from Eric Biederman:
"There is a lot here. A lot of these changes result in subtle user
visible differences in kernel behavior. I don't expect anything will
care but I will revert/fix things immediately if any regressions show
up.
From Seth Forshee there is a continuation of the work to make the vfs
ready for unprivileged mounts. We had thought the previous changes
prevented the creation of files outside of s_user_ns of a filesystem,
but it turns out we missed the O_CREAT path. Ooops.
Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
standing bug in the implementation of PR_SET_CHILD_SUBREAPER where only
children that are forked after the prctl are considered and not
children forked before the prctl. The only known user of this prctl,
systemd, forks all children after the prctl. So no userspace
regressions will occur. Holding earlier forked children to the same
rules as later forked children creates a semantic that is sane enough
to allow checkpointing of processes that use this feature.
There is a long delayed change by Nikolay Borisov to limit inotify
instances inside a user namespace.
Michael Kerrisk extends the API for files used to manipulate
namespaces with two new trivial ioctls to allow discovery of the
hierarchy and properties of namespaces.
Konstantin Khlebnikov with the help of Al Viro adds code that when a
network namespace exits purges its sysctl entries from the dcache, as
in some circumstances this could use a lot of memory.
Vivek Goyal fixed a bug with stacked filesystems where the permissions
on the wrong inode were being checked.
I continue previous work on ptracing across exec, allowing a file to
be setuid across exec while being ptraced if the tracer has enough
credentials in the user namespace, and if the process has CAP_SETUID
in its own namespace. Proc files for setuid or otherwise undumpable
executables are now owned by the root in the user namespace of their
mm, allowing debugging of setuid applications in containers to work
better.
A bug I introduced with permission checking and automount is now
fixed. The big change is to mark the mounts that the kernel initiates
as a result of an automount. This allows the permission checks in sget
to be safely suppressed for this kind of mount, as the permission
check happened when the original filesystem was mounted.
Finally a special case in the mount namespace is removed preventing
unbounded chains in the mount hash table, and making the semantics
simpler which benefits CRIU.
The vfs fix along with related work in ima and evm I believe makes us
ready to finish developing and merge fully unprivileged mounts of the
fuse filesystem. The cleanups of the mount namespace make it easier to
discuss how to fix the worst case complexity of umount. The stacked
filesystem fixes pave the way for adding multiple mappings for the
filesystem uids so that efficient and safer containers can be
implemented"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc/sysctl: Don't grab i_lock under sysctl_lock.
vfs: Use upper filesystem inode in bprm_fill_uid()
proc/sysctl: prune stale dentries during unregistering
mnt: Tuck mounts under others instead of creating shadow/side mounts.
prctl: propagate has_child_subreaper flag to every descendant
introduce the walk_process_tree() helper
nsfs: Add an ioctl() to return owner UID of a userns
fs: Better permission checking for submounts
exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
vfs: open() with O_CREAT should not create inodes with unknown ids
nsfs: Add an ioctl() to return the namespace type
proc: Better ownership of files for non-dumpable tasks in user namespaces
exec: Remove LSM_UNSAFE_PTRACE_CAP
exec: Test the ptracer's saved cred to see if the tracee can gain caps
exec: Don't reset euid and egid when the tracee has CAP_SETUID
inotify: Convert to using per-namespace limits
With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
I am still tired of having to find indirect ways to determine
what security modules are active on a system. I have added
/sys/kernel/security/lsm, which contains a comma separated
list of the active security modules. No more groping around
in /proc/filesystems or other clever hacks.
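Userspace can then test for a module with a simple split on commas. A
minimal sketch follows; lsm_listed() is a hypothetical helper, and the
example strings are made up rather than taken from a real system.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if name appears as a whole entry in a comma-separated list. */
static int lsm_listed(const char *list, const char *name)
{
    size_t want;

    assert(list != NULL && name != NULL);
    want = strlen(name);
    while (*list != '\0') {
        /* Length of the current entry, up to a comma or newline. */
        size_t len = strcspn(list, ",\n");

        if (len == want && strncmp(list, name, len) == 0)
            return 1;
        list += len;
        if (*list != '\0')
            list++;         /* skip the separator */
    }
    return 0;
}
```

A caller would read the contents of /sys/kernel/security/lsm into a
buffer (for example with fopen() and fgets()) and pass that buffer as
list.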
Unchanged from previous versions except for being updated
to the latest security next branch.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
As reported by yangshukui, a permission denial from security_task_wait()
can lead to a soft lockup in zap_pid_ns_processes() since it only expects
sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can
in general lead to zombies; in the absence of some way to automatically
reparent a child process upon a denial, the hook is not useful. Remove
the security hook and its implementations in SELinux and Smack. Smack
already removed its check from its hook.
Reported-by: yangshukui <yangshukui@huawei.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Access to an fd from anon_inode always fails because there are no
setxattr operations for it. Fix this by ignoring private inodes,
including anon_inode, in the file functions.
Previously they were ignored only in smack_file_receive(), to allow
sharing a dma-buf fd, but dma-buf has other operations such as ioctl
and mmap.
Reference: https://lkml.org/lkml/2015/4/17/16
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Since 4b936885a (v2.6.32) all inodes on sockfs and pipefs are disconnected.
It caused filesystem specific code in smack_d_instantiate to be skipped,
because all inodes on those pseudo filesystems were treated as root inodes.
As a result all sockfs inodes had the Smack label set to floor.
In most cases access checks for sockets use socket_smack data so the inode
label is not important. But there are special cases that were broken.
One example would be calling fcntl with F_SETOWN command on a socket fd.
Now smack_d_instantiate expects all pipefs and sockfs inodes to be
disconnected and has the logic in the appropriate place.
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
smack_file_open() first checks the capability of the calling subject;
this check skips the SMACK logging in the success case. Use smk_tskacc()
for proper logging and SMACK access checking.
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
In the smack_from_secattr() function, "smack_known_list" is traversed
using the list_for_each_entry macro, although it is an RCU-protected
structure. It should be traversed using the "list_for_each_entry_rcu"
macro to fetch the RCU-protected entry.
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
There is a race condition when freeing the i_security blob in the SMACK
module: i_security can be freed while inode_permission() is called from
path lookup on a second CPU. A page fault has been observed in this
situation. The VFS code and the SELinux module handle this by freeing
the inode and the i_security field using RCU via call_rcu(), but SMACK
frees the i_security blob directly. Use call_rcu() to fix this race
condition.
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
smk_copy_rules() and smk_copy_relabel() initialize the list_head even
though it has already been initialized in new_task_smack(). Delete the
repeated initialization.
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
The smk_set_access() function adds a new rule entry both to the subject
label specific list (rule_list) and to the global rule list
(smack_rule_list). A mutex (rule_lock) is used to avoid simultaneous
updates, but this lock is specific to the subject label. If two processes
try to add different rules (i.e. with different subject labels)
simultaneously, each can take its own "rule_lock", so the entries they
add to the master rule list are not serialized.
A new mutex (smack_master_list_lock) is now taken when adding an entry to
smack_rule_list, to avoid simultaneous updates from different rules.
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Fix a wrong SMACK label (SMACK64IPIN) update when a second bind call is
made to the same IP address and port, but with a different SMACK label
(SMACK64IPIN), by a second instance of the server. In this case the server
returns with a "Bind:Address already in use" error, but before returning,
the SMACK label is updated in the SMACK port-label mapping list inside the
smack_socket_bind() hook.
To fix this, a new check has been added in the smk_ipv6_port_label()
function before updating the existing port entry. It checks whether the
socket for the matching port entry is closed. If it is closed, the port
is not bound and it is safe to update the existing port entry; otherwise
the function returns, because the port is still in use. To track whether
the socket is closed, a new field "smk_can_reuse" has been added to the
"smk_port_label" structure. This field is set to '1' in
smack_sk_free_security(), which is called to free the socket security
blob when the socket is being closed: the port entry for the closing
socket is searched for in the SMACK port-label mapping list and, if it is
found, its "smk_can_reuse" field is set to '1'. The field is initially set
to '0' in smk_ipv6_port_label() after a new entry is created in the list,
which indicates that the socket is in use.
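The reuse flag can be illustrated with a small table in userspace. This
is a sketch under simplifying assumptions: a fixed array replaces the
kernel list, the helper names are hypothetical, and returning
-EADDRINUSE is a choice made for the example only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct port_entry {         /* hypothetical stand-in for a list node */
    unsigned short port;
    const char *label;
    int can_reuse;          /* 1 once the owning socket has closed */
    int used;
};

#define PORT_SLOTS 8
static struct port_entry port_table[PORT_SLOTS];

/* Record a label for a port, refusing to overwrite an entry still in use. */
static int port_bind(unsigned short port, const char *label)
{
    struct port_entry *free_slot = NULL;
    int i;

    assert(label != NULL);
    for (i = 0; i < PORT_SLOTS; i++) {
        struct port_entry *e = &port_table[i];

        if (!e->used) {
            if (free_slot == NULL)
                free_slot = e;
            continue;
        }
        if (e->port != port)
            continue;
        if (!e->can_reuse)
            return -EADDRINUSE;     /* owner still open: keep its label */
        e->label = label;
        e->can_reuse = 0;
        return 0;
    }
    if (free_slot == NULL)
        return -ENOMEM;
    free_slot->port = port;
    free_slot->label = label;
    free_slot->can_reuse = 0;       /* a new entry starts out in use */
    free_slot->used = 1;
    return 0;
}

/* Called when the owning socket goes away: allow the entry to be reused. */
static void port_close(unsigned short port)
{
    int i;

    for (i = 0; i < PORT_SLOTS; i++)
        if (port_table[i].used && port_table[i].port == port)
            port_table[i].can_reuse = 1;
}

/* Return 1 if the port is currently recorded with the given label. */
static int port_has_label(unsigned short port, const char *label)
{
    int i;

    for (i = 0; i < PORT_SLOTS; i++)
        if (port_table[i].used && port_table[i].port == port)
            return strcmp(port_table[i].label, label) == 0;
    return 0;
}
```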
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
A permission denied error occurs when two IPv6 servers are running and a
client tries to connect to one of them. In this scenario both servers use
the same IP and port but different protocols (UDP and TCP), and they use
different SMACK64IPIN labels: the TCP server uses "test" and the UDP
server uses "test-in". When the TCP client is run with the SMACK64IPOUT
label "test", the connection is denied. This should not happen, since
the TCP server and client labels are the same. It happens because
smk_ipv6_port_label() does not check the protocol while searching for an
earlier port entry; it looks for an existing port entry on the basis of
the port only, and so updates the earlier port entry in the list. As a
result the smack label of the earlier entry in the "smk_ipv6_port_list"
list is changed and the permission denied error occurs.
A check for the socket type is now added as well. If two processes use
the same port but different protocols (TCP or UDP), two different port
entries are added to the list. Similarly, when checking smack access in
smk_ipv6_port_check(), the port entry is searched for on the basis of
both port and protocol.
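Keying the lookup on both port and protocol can be sketched like this.
The structure, the enum and find_port() are hypothetical; the labels
"test" and "test-in" are the ones from the scenario above.

```c
#include <assert.h>
#include <stddef.h>

enum proto { PROTO_TCP, PROTO_UDP };    /* stands in for the socket type */

struct port_rec {           /* hypothetical stand-in for a list entry */
    unsigned short port;
    enum proto proto;
    const char *label;
};

/* Find an entry by port and protocol. Matching on the port alone would
 * let a UDP bind overwrite the label recorded for a TCP socket. */
static const struct port_rec *
find_port(const struct port_rec *tbl, size_t n,
          unsigned short port, enum proto proto)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tbl[i].port == port && tbl[i].proto == proto)
            return &tbl[i];
    return NULL;
}
```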
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <Himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Add RCU synchronization for accessing smk_ipv6_port_list in the smack
IPv6 hooks. Access to the port list is vulnerable to a race condition,
because no synchronization is applied in the critical section. While one
thread is reading the list, another thread can modify the same port
list, which can cause major problems.
To ensure proper synchronization between the two threads, RCU is now used
when accessing and modifying the port list. RCU should not hurt
performance either, since reads outnumber modifications, which is the
case where RCU is most effective.
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Processes can only alter their own security attributes via
/proc/pid/attr nodes. This is presently enforced by each individual
security module and is also imposed by the Linux credentials
implementation, which only allows a task to alter its own credentials.
Move the check enforcing this restriction from the individual
security modules to proc_pid_attr_write() before calling the security hook,
and drop the unnecessary task argument to the security hook since it can
only ever be the current task.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull vfs updates from Al Viro:
- more ->d_init() stuff (work.dcache)
- pathname resolution cleanups (work.namei)
- a few missing iov_iter primitives - copy_from_iter_full() and
friends. Either copy the full requested amount, advance the iterator
and return true, or fail, return false and do _not_ advance the
iterator. Quite a few open-coded callers converted (and became more
readable and harder to fuck up that way) (work.iov_iter)
- several assorted patches, the big one being logfs removal
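The all-or-nothing contract described for copy_from_iter_full() can be
sketched with a toy iterator. struct byte_iter and copy_full() are
hypothetical and far simpler than the kernel's iov_iter; they only
illustrate "advance on success, leave untouched on failure".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct byte_iter {          /* hypothetical, much simpler than iov_iter */
    const unsigned char *pos;
    size_t remaining;
};

/* Copy exactly n bytes or nothing: on success advance and return true,
 * on a short source leave the iterator untouched and return false. */
static bool copy_full(void *dst, size_t n, struct byte_iter *it)
{
    assert(dst != NULL && it != NULL);
    if (n > it->remaining)
        return false;
    memcpy(dst, it->pos, n);
    it->pos += n;
    it->remaining -= n;
    return true;
}
```

Callers can then write "if (!copy_full(...)) return error;" without
having to undo a partial advance.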
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
logfs: remove from tree
vfs: fix put_compat_statfs64() does not handle errors
namei: fold should_follow_link() with the step into not-followed link
namei: pass both WALK_GET and WALK_MORE to should_follow_link()
namei: invert WALK_PUT logics
namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
namei: saner calling conventions for mountpoint_last()
namei.c: get rid of user_path_parent()
switch getfrag callbacks to ..._full() primitives
make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
[iov_iter] new primitives - copy_from_iter_full() and friends
don't open-code file_inode()
ceph: switch to use of ->d_init()
ceph: unify dentry_operations instances
lustre: switch to use of ->d_init()
The invalid Smack label ("") and the Huh ("?") Smack label
serve the same purpose and having both is unnecessary.
While pulling out the invalid label it became clear that
the use of smack_from_secid() was inconsistent, so that
is repaired. The setting of inode labels to the invalid
label could never happen in a functional system, has
never been observed in the wild and is not what you'd
really want for a failure behavior in any case. That is
removed.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>