Highlights include:
- Fix NFSv4 recovery so that it doesn't recover lost locks in cases such as
lease loss due to a network partition, where doing so may result in data
corruption. Add a kernel parameter to control choice of legacy behaviour
or not.
- Performance improvements when 2 processes are writing to the same file.
- Flush data to disk when an RPCSEC_GSS session timeout is imminent.
- Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
NFS clients from being able to manipulate our lease and file lockingr
state.
- Allow sharing of RPCSEC_GSS caches between different rpc clients
- Fix the broken NFSv4 security auto-negotiation between client and server
- Fix rmdir() to wait for outstanding sillyrename unlinks to complete
- Add a tracepoint framework for debugging NFSv4 state recovery issues.
- Add tracing to the generic NFS layer.
- Add tracing for the SUNRPC socket connection state.
- Clean up the rpc_pipefs mount/umount event management.
- Merge more patches from Chuck in preparation for NFSv4 migration support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSLelVAAoJEGcL54qWCgDyo2IQAKOfRJyZVnf4ipxi3xLNl1QF
w/70DVSIF1S1djWN7G3vgkxj/R8KCvJ8CcvkAD2BEgRDeZJ9TtyKAdM/jYLZ+W05
7k2QKk8fkwZmc1Y2qDqFwKHzP5ZgP5L2nGx7FNhi/99wEAe47yFG3qd3rUWKrcOf
mnd863zgGDE2Q10slhoq/bywwMJo6tKZNeaIE8kPjgFbBEh/jslpAWr8dSA4QgvJ
nZ8VB5XU8L+XJ0GpHHdjYm9LvQ51DbQ6omOF+0P4fI093azKmf4ZsrjMDWT8+iu3
XkXlnQmKLGTi7yB43hHtn2NiRqwGzCcZ1Amo9PpCFaHUt1RP9cc37UhG1T+x1xWJ
STEKDbvCdQ3FU9FvbgrGEwBR0e8fNS4fZY3ToDBflIcfwre0aWs5RCodZMUD0nUI
4wY5J9NsQR/bL+v8KeUR4V4cXK8YrgL0zB4u4WYzH5Npxr5KD0NEKDNqRPhrB9l2
LLF9Haql8j76Ff0ek6UGFIZjDE0h6Fs71wLBpLj+ZWArOJ7vBuLMBSOVqNpld9+9
f2fEG7qoGF4FGTY4myH/eakMPaWnk9Ol4Ls/svSIapJ9+rePD+a93e/qnmdofIMf
4TuEYk6ERib1qXgaeDRQuCsm2YE1Co5skGMaOsRFWgReE1c12QoJQVst2nMtEKp3
uV2w8LgX18aZOZXJVkCM
=ZuW+
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- Fix NFSv4 recovery so that it doesn't recover lost locks in cases
such as lease loss due to a network partition, where doing so may
result in data corruption. Add a kernel parameter to control
choice of legacy behaviour or not.
- Performance improvements when 2 processes are writing to the same
file.
- Flush data to disk when an RPCSEC_GSS session timeout is imminent.
- Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
NFS clients from being able to manipulate our lease and file
locking state.
- Allow sharing of RPCSEC_GSS caches between different rpc clients.
- Fix the broken NFSv4 security auto-negotiation between client and
server.
- Fix rmdir() to wait for outstanding sillyrename unlinks to complete
- Add a tracepoint framework for debugging NFSv4 state recovery
issues.
- Add tracing to the generic NFS layer.
- Add tracing for the SUNRPC socket connection state.
- Clean up the rpc_pipefs mount/umount event management.
- Merge more patches from Chuck in preparation for NFSv4 migration
support"
* tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (107 commits)
NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set
NFSv4: Allow security autonegotiation for submounts
NFSv4: Disallow security negotiation for lookups when 'sec=' is specified
NFSv4: Fix security auto-negotiation
NFS: Clean up nfs_parse_security_flavors()
NFS: Clean up the auth flavour array mess
NFSv4.1 Use MDS auth flavor for data server connection
NFS: Don't check lock owner compatability unless file is locked (part 2)
NFS: Don't check lock owner compatibility in writes unless file is locked
nfs4: Map NFS4ERR_WRONG_CRED to EPERM
nfs4.1: Add SP4_MACH_CRED write and commit support
nfs4.1: Add SP4_MACH_CRED stateid support
nfs4.1: Add SP4_MACH_CRED secinfo support
nfs4.1: Add SP4_MACH_CRED cleanup support
nfs4.1: Add state protection handler
nfs4.1: Minimal SP4_MACH_CRED implementation
SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
SUNRPC: Add an identifier for struct rpc_clnt
SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints
...
Pull fuse bugfixes from Miklos Szeredi:
"Just a bunch of bugfixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: use list_for_each_entry() for list traversing
fuse: readdir: check for slash in names
fuse: hotfix truncate_pagecache() issue
fuse: invalidate inode attributes on xattr modification
fuse: postpone end_page_writeback() in fuse_writepage_locked()
window. Also, most of them are bug fixes this time. Two of my
three patches (moving gfs2_sync_meta and merging the two writepage
implementations) are clean ups with the third (taking the glock ref
in examine_bucket) being a fix for a difficult to hit race condition.
The removal of an unused memory barrier is a clean up from Bob Peterson,
and the "spectator" relates to a rarely used mount option. Ben
Marzinski's patch fixes a corner case where the incorrect inode
flags were being set, resulting in incorrect behaviour on fsync.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSLYWjAAoJEMrg3m4a/8jSHBUQAIBXyrLypCJNJNpYreRlg4Te
YZOqXMxqrfsWD5jkfWvxV/vZyVu6FEJRJRRwKoO8foZhnVIy2HiGdpFvwk4NMzGs
Ah5cCaySgDhFIKXNq1CnhVZNpRU8N+Lf+8U2MwkpPUrT+KnZlDdG8COuHbWJ3/+t
prqHy0N1oa8hgj3P0oDf3kyQKiB2MQVhORTBVOWaas1mzw8vsKRjvO7k5g0VPcfV
VnIYzvQ033V6pW6ymWGcbz6us7hXeFJmXo4grAdLIclK6QDt1zPLVKlvk7X/Me52
PTxO5AP/Nw6AlJABTNy8UZ3uJO3QaiqhcIz+OzIXlYpsICqma30qkHHrWzOh0Q5F
HrDqh03A2zuqigamVmsJr2+tbdLWxz72D6N3eFlIBLAw2JxILosCPcGc4UNzG/7g
mFr6+LDnZXzFpjT6OBdepjVH06ZqfcV+nr3KhvmKQT+tkzTizArbt8JutOdKjahH
3+pXx/bA0B7LRVYymK5/xrfbjnQp629GG3QnGiI23gtDZ6eEzm663bQz6MTgjR88
H0nigLi8d2KmM8UnrZqq5nI+LirFD/IZ5DO47Vv6w3NCZW/ZpNOodlWkd1igvhHR
ugTvKqyw2O9hImM8jPoRWbqL01RGTKTMMn/3cCCO8Uww7DpI4GLb48a39buYh4+2
kijIxCtntfQAOiR6MCvA
=rsQv
-----END PGP SIGNATURE-----
Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw
Pull GFS2 updates from Steven Whitehouse:
"This is possibly the smallest ever set of GFS2 patches for a merge
window. Also, most of them are bug fixes this time.
Two of my three patches (moving gfs2_sync_meta and merging the two
writepage implementations) are clean ups with the third (taking the
glock ref in examine_bucket) being a fix for a difficult to hit race
condition.
The removal of an unused memory barrier is a clean up from Bob
Peterson, and the "spectator" relates to a rarely used mount option.
Ben Marzinski's patch fixes a corner case where the incorrect inode
flags were being set, resulting in incorrect behaviour on fsync"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: dirty inode correctly in gfs2_write_end
GFS2: Don't flag consistency error if first mounter is a spectator
GFS2: Remove unnecessary memory barrier
GFS2: Merge ordered and writeback writepage
GFS2: Take glock reference in examine_bucket()
GFS2: Move gfs2_sync_meta to lops.c
Pull ceph updates from Sage Weil:
"This includes both the first pile of Ceph patches (which I sent to
torvalds@vger, sigh) and a few new patches that add support for
fscache for Ceph. That includes a few fscache core fixes that David
Howells asked go through the Ceph tree. (Thanks go to Milosz Tanski
for putting this feature together)
This first batch of patches (included here) had (has) several
important RBD bug fixes, hole punch support, several different
cleanups in the page cache interactions, improvements in the truncate
code (new truncate mutex to avoid shenanigans with i_mutex), and a
series of fixes in the synchronous striping read/write code.
On top of that is a random collection of small fixes all across the
tree (error code checks and error path cleanup, obsolete wq flags,
etc)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (43 commits)
ceph: use d_invalidate() to invalidate aliases
ceph: remove ceph_lookup_inode()
ceph: trivial buildbot warnings fix
ceph: Do not do invalidate if the filesystem is mounted nofsc
ceph: page still marked private_2
ceph: ceph_readpage_to_fscache didn't check if marked
ceph: clean PgPrivate2 on returning from readpages
ceph: use fscache as a local presisent cache
fscache: Netfs function for cleanup post readpages
FS-Cache: Fix heading in documentation
CacheFiles: Implement interface to check cache consistency
FS-Cache: Add interface to check consistency of a cached object
rbd: fix null dereference in dout
rbd: fix buffer size for writes to images with snapshots
libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc
rbd: fix I/O error propagation for reads
ceph: use vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem
ceph: allow sync_read/write return partial successed size of read/write.
ceph: fix bugs about handling short-read for sync read mode.
ceph: remove useless variable revoked_rdcache
...
Pull UML updates from Richard Weinberger:
"This pile contains mostly fixes and improvements for issues identified
by Richard W M Jones while adding UML as backend to libguestfs"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Add irq chip um/mask handlers
um: prctl: Do not include linux/ptrace.h
um: Run UML in it's own session.
um: Cleanup SIGTERM handling
um: ubd: Introduce submit_request()
um: ubd: Add REQ_FLUSH suppport
um: Implement probe_kernel_read()
um: hostfs: Fix writeback
Prior to the introduction of page migration support in "fs/aio: Add support
to aio ring pages migration" / 36bc08cc01709b4a9bb563b35aa530241ddc63e3,
mapping of the ring buffer pages was done via get_user_pages() while
retaining mmap_sem held for write. This avoided possible races with userland
racing an munmap() or mremap(). The page migration patch, however, switched
to using mm_populate() to prime the page mapping. mm_populate() cannot be
called with mmap_sem held.
Instead of dropping the mmap_sem, revert to the old behaviour and simply
drop the use of mm_populate() since get_user_pages() will cause the pages to
get mapped anyways. Thanks to Al Viro for spotting this issue.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
When reconnecting to automounts at startup an autofs ioctl is used
to find the device and inode of existing mounts so they can be used
to open a file descriptor of possibly covered mounts.
At this time the the caller might not yet "own" the mount so it can
trigger calling ->d_automount(). This causes automount to hang when
trying to reconnect to direct or offset mount types.
Consequently kern_path() can't be used but kern_path_mountpoint() can be.
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is the fix that the last two commits indirectly led up to - making
sure that we don't call dput() in a bad context on the dentries we've
looked up in RCU mode after the sequence count validation fails.
This basically expands d_rcu_to_refcount() into the callers, and then
fixes the callers to delay the dput() in the failure case until _after_
we've dropped all locks and are no longer in an RCU-locked region.
The case of 'complete_walk()' was trivial, since its failure case did
the unlock_rcu_walk() directly after the call to d_rcu_to_refcount(),
and as such that is just a pure expansion of the function with a trivial
movement of the resulting dput() to after 'unlock_rcu_walk()'.
In contrast, the unlazy_walk() case was much more complicated, because
not only does convert two different dentries from RCU to be reference
counted, but it used to not call unlock_rcu_walk() at all, and instead
just returned an error and let the caller clean everything up in
"terminate_walk()".
Happily, one of the dentries in question (called "parent" inside
unlazy_walk()) is the dentry of "nd->path", which terminate_walk() wants
a refcount to anyway for the non-RCU case.
So what the new and improved unlazy_walk() does is to first turn that
dentry into a refcounted one, and once that is set up, the error cases
can continue to use the terminate_walk() helper for cleanup, but for the
non-RCU case. Which makes it possible to drop out of RCU mode if we
actually hit the sequence number failure case.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
that prepare the code to handle different types of SMB2 leases.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
This simplifies the RCU to refcounting code in particular.
I was originally intending to leave this for later, but walking through
all the dput() logic (see previous commit), I realized that the dput()
"might_sleep()" check was misleadingly weak. And I removed it as
misleading, both for performance profiling and for debugging.
However, the might_sleep() debugging case is actually true: the final
dput() can indeed sleep, if the inode of the dentry that you are
releasing ends up sleeping at iput time (see dentry_iput()). So the
problem with the might_sleep() in dput() wasn't that it wasn't true, it
was that it wasn't actually testing and triggering on the interesting
case.
In particular, just about *any* dput() can indeed sleep, if you happen
to race with another thread deleting the file in question, and you then
lose the race to the be the last dput() for that file. But because it's
a very rare race, the debugging code would never trigger it in practice.
Why is this problematic? The new d_rcu_to_refcount() (see commit
15570086b590: "vfs: reimplement d_rcu_to_refcount() using
lockref_get_or_lock()") does a dput() for the failure case, and it does
it under the RCU lock. So potentially sleeping really is a bug.
But there's no way I'm going to fix this with the previous complicated
"lockref_get_or_lock()" interface. And rather than revert to the old
and crufty nested dentry locking code (which did get this right by
delaying the reference count updates until they were verified to be
safe), let's make forward progress.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is me being a bit OCD after all the dentry optimization work this
merge window: profiles end up showing 'dput()' as a rather expensive
operation, and there were two unrelated bad reasons for that.
The first reason was reading d_lockref.count for debugging purposes,
which touches the lockref cacheline (for reads) before really need to.
More importantly, the debugging test in question is _wrong_, and has
hidden bugs. It's true that we can only sleep when the count goes down
to zero, but the test as-is hides the much more subtle bug that happens
if we race with somebody else deleting the file.
Anyway we _will_ touch that cacheline, but let's do it for a write and
in the right routine (ie in "lockref_put_or_lock()") which annotates the
costs better. So remove the misleading debug code.
The other was an unnecessary access to the cacheline that contains the
d_lru list, just to check whether we already were on the LRU list or
not. This is exactly what we have d_flags for, so that we can avoid
touching extra cache lines for the common case. So just add another bit
for "is this dentry on the LRU".
Finally, mark the tests properly likely/unlikely, so that the common
fast-paths are dense in the instruction stream.
This makes the profiles look much saner.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff's patchset introduced trivial sparse warning on new cifs toupper routine
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Jeff Layton <jlayton@redhat.com>
Switch smb2 code to use per session session key and smb3 code to
use per session signing key instead of per connection key to
generate signatures.
For that, we need to find a session to fetch the session key to
generate signature to match for every request and response packet.
We also forgo checking signature for a session setup response
from the server.
Acked-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Add a variable specific to NTLMSSP authentication to determine
whether to exchange keys during negotiation and authentication phases.
Since session key for smb1 is per smb connection, once a very first
sesion is established, there is no need for key exchange during
subsequent session setups. As a result, smb1 session setup code sets this
variable as false.
Since session key for smb2 and smb3 is per smb connection, we need to
exchange keys to generate session key for every sesion being established.
As a result, smb2/3 session setup code sets this variable as true.
Acked-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Move the post (successful) session setup code to respective dialect routines.
For smb1, session key is per smb connection.
For smb2/smb3, session key is per smb session.
If client and server do not require signing, free session key for smb1/2/3.
If client and server require signing
smb1 - Copy (kmemdup) session key for the first session to connection.
Free session key of that and subsequent sessions on this connection.
smb2 - For every session, keep the session key and free it when the
session is being shutdown.
smb3 - For every session, generate the smb3 signing key using the session key
and then free the session key.
There are two unrelated line formatting changes as well.
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
If a server sends a lease break to a connection that doesn't have
opens with a lease key specified in the server response, we can't
find an open file to send an ack. Fix this by walking through
all connections we have.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
This happens when we receive a lease break from a server, then
find an appropriate lease key in opened files and schedule the
oplock_break slow work. lw pointer isn't freed in this case.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Have the case-insensitive d_compare and d_hash routines convert each
character in the filenames to wchar_t's and then use the new
cifs_toupper routine to convert those into uppercase.
With this scheme we should more closely emulate the case conversion that
the servers will do.
Reported-and-Tested-by: Jan-Marek Glogowski <glogow@fbihome.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The existing NLS case conversion routines do not appropriately handle
the (now common) case where the local host is using UTF8. This is
because nls_utf8 has no support at all for converting a utf8 string
between cases and the NLS infrastructure in general cannot handle
a multibyte input character.
In any case, what we really need for cifs is to emulate how we expect
the server to convert the character to upper or lowercase. Thus, even
if we had routines that could handle utf8 case conversion, we likely
would end up with the wrong result if the name ends up being in the
upper planes.
This patch adds a new scheme for doing unicode case conversion. The
case conversion tables that Microsoft has published for Windows 8
have been converted to a set of lookup tables, and a routine is
added to convert a wchar_t from lower to uppercase using those
tables.
Reported-and-Tested-by: Jan-Marek Glogowski <glogow@fbihome.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
MAX_SERVER_SIZE has been moved to cifs_mount.h and renamed
CIFS_NI_MAXHOST for clarity. It has been expanded to 1024 as the
previous value of 16 was very short.
Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The max string length definitions for user name, domain name, password,
and share name have been moved into their own header file in uapi so the
mount helper can use autoconf to define them instead of keeping the
kernel side and userland side definitions in sync manually. The names
have also been standardized with a "CIFS" prefix and "LEN" suffix.
Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Steve French <smfrench@gmail.com>
by using a query reparse ioctl request.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
that allows to access files through symlink created on a server.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Currently, we have a number of documentation files that live under
fs/cifs/. Generally, these don't get picked up by distro packagers,
since they're in a non-standard location. Move them to a new spot
under Documentation/ instead.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
On some filesystems it's impossible even with fs corruption, but we'd
better not rely on that, what with memcpy() into on-stack array we
are doing there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
No dentry can get to directory modification methods without
having passed either ->lookup() or ->atomic_open(); if name is
rejected by those two (or by ->d_hash()) with an error, it won't
be seen by anything else.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 97431204ea005ec8070ac94bc3251e836daa7ca7 introduced a regression
that causes SECINFO_NO_NAME to fail without sending an RPC if:
1) the nfs_client's rpc_client is using krb5i/p (now tried by default)
2) the current user doesn't have valid kerberos credentials
This situation is quite common - as of now a sec=sys mount would use
krb5i for the nfs_client's rpc_client and a user would hardly be faulted
for not having run kinit.
The solution is to use the machine cred when trying to use an integrity
protected auth flavor for SECINFO_NO_NAME.
Older servers may not support using the machine cred or an integrity
protected auth flavor for SECINFO_NO_NAME in every circumstance, so we fall
back to using the user's cred and the filesystem's auth flavor in this case.
We run into another problem when running against linux nfs servers -
they return NFS4ERR_WRONGSEC when using integrity auth flavor (unless the
mount is also that flavor) even though that is not a valid error for
SECINFO*. Even though it's against spec, handle WRONGSEC errors on
SECINFO_NO_NAME by falling back to using the user cred and the
filesystem's auth flavor.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In cases where the parent super block was not mounted with a 'sec=' line,
allow autonegotiation of security for the submounts.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull vfs pile 2 (of many) from Al Viro:
"Mostly Miklos' series this time"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify dcache.c inlined helpers where possible
fuse: drop dentry on failed revalidate
fuse: clean up return in fuse_dentry_revalidate()
fuse: use d_materialise_unique()
sysfs: use check_submounts_and_drop()
nfs: use check_submounts_and_drop()
gfs2: use check_submounts_and_drop()
afs: use check_submounts_and_drop()
vfs: check unlinked ancestors before mount
vfs: check submounts and drop atomically
vfs: add d_walk()
vfs: restructure d_genocide()
Pull namespace changes from Eric Biederman:
"This is an assorted mishmash of small cleanups, enhancements and bug
fixes.
The major theme is user namespace mount restrictions. nsown_capable
is killed as it encourages not thinking about details that need to be
considered. A very hard to hit pid namespace exiting bug was finally
tracked and fixed. A couple of cleanups to the basic namespace
infrastructure.
Finally there is an enhancement that makes per user namespace
capabilities usable as capabilities, and an enhancement that allows
the per userns root to nice other processes in the user namespace"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Kill nsown_capable it makes the wrong thing easy
capabilities: allow nice if we are privileged
pidns: Don't have unshare(CLONE_NEWPID) imply CLONE_THREAD
userns: Allow PR_CAPBSET_DROP in a user namespace.
namespaces: Simplify copy_namespaces so it is clear what is going on.
pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup
sysfs: Restrict mounting sysfs
userns: Better restrictions on when proc and sysfs can be mounted
vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces
kernel/nsproxy.c: Improving a snippet of code.
proc: Restrict mounting the proc filesystem
vfs: Lock in place mounts from more privileged users
Pull security subsystem updates from James Morris:
"Nothing major for this kernel, just maintenance updates"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
apparmor: add the ability to report a sha1 hash of loaded policy
apparmor: export set of capabilities supported by the apparmor module
apparmor: add the profile introspection file to interface
apparmor: add an optional profile attachment string for profiles
apparmor: add interface files for profiles and namespaces
apparmor: allow setting any profile into the unconfined state
apparmor: make free_profile available outside of policy.c
apparmor: rework namespace free path
apparmor: update how unconfined is handled
apparmor: change how profile replacement update is done
apparmor: convert profile lists to RCU based locking
apparmor: provide base for multiple profiles to be replaced at once
apparmor: add a features/policy dir to interface
apparmor: enable users to query whether apparmor is enabled
apparmor: remove minimum size check for vmalloc()
Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
Smack: network label match fix
security: smack: add a hash table to quicken smk_find_entry()
security: smack: fix memleak in smk_write_rules_list()
xattr: Constify ->name member of "struct xattr".
...
NFSv4 security auto-negotiation has been broken since
commit 4580a92d44e2b21c2254fa5fef0f1bfb43c82318 (NFS:
Use server-recommended security flavor by default (NFSv3))
because nfs4_try_mount() will automatically select AUTH_SYS
if it sees no auth flavours.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
What is the point of having a 'auth_flavor_len' field, if it is
always set to 1, and can't be used to determine if the user has
selected an auth flavour?
This cleanup goes back to using auth_flavor_len for its original
intended purpose, and gets rid of the ad-hoc replacements.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We have to implement ->release() and trigger writeback from it.
Otherwise we might lose dirty pages at munmap().
Signed-off-by: Richard Weinberger <richard@nod.at>
It might be possible for two callers to race the mutex lock after the
NULL ctx check. Instead, move the lock above the check so there isn't
the possibility of leaking a crypto ctx. Additionally, report the full
algo name when failing.
Signed-off-by: Kees Cook <keescook@chromium.org>
[tyhicks: remove out label, which is no longer used]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
It doesn't make sense to check if an array is NULL. The compiler just
removes the check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
d_invalidate() is the standard VFS method to invalidate dentry.
compare to d_delete(), it also try shrinking children dentries.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
commit 6f60f889 (ceph: fix freeing inode vs removing session caps race)
introduced ceph_lookup_inode(). But there is already a ceph_find_inode()
which provides similar function. So remove ceph_lookup_inode(), use
ceph_find_inode() instead.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <alex.elder@linary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all state management operations, and
will use krb5i or auth_sys with no regard to the mount command authflavor
choice.
The MDS, as any NFSv4.1 mount point, uses the nfs_server rpc client for all
non-state management operations with a different nfs_server for each fsid
encountered traversing the mount point, each with a potentially different
auth flavor.
pNFS data servers are not mounted in the normal sense as there is no associated
nfs_server structure. Data servers can also export multiple fsids, each with
a potentially different auth flavor.
Data servers need to use the same authflavor as the MDS server rpc client for
non-state management operations. Populate a list of rpc clients with the MDS
server rpc client auth flavor for the DS to use.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The linux-next build bot found a three of warnings, this addresses all of them.
* non-ANSI function declaration of function 'ceph_fscache_register' and
'ceph_fscache_unregister'
* symbol 'ceph_cache_netfs' was not declared, now it's extern in the header.
* warning: "pr_fmt" redefined
Signed-off-by: Milosz Tanski <milosz@adfin.com>