[ Upstream commit 994fd530a512597ffcd713b0f6d5bc916c5698f0 ]
Use the IOCB_DIRECT indicator flag on the I/O context rather than checking to
see if the file was opened O_DIRECT.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d6cc9898efdfb062efb74dc18cfc700e082f5d5 ]
When cifs_get_root() fails during cifs_smb3_do_mount() we call
deactivate_locked_super() which eventually will call delayed_free() which
will free the context.
In this situation we should not proceed to enter the out: section in
cifs_smb3_do_mount() and free the same resources a second time.
[Thu Feb 10 12:59:06 2022] BUG: KASAN: use-after-free in rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] Read of size 8 at addr ffff888364f4d110 by task swapper/1/0
[Thu Feb 10 12:59:06 2022] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G OE 5.17.0-rc3+ #4
[Thu Feb 10 12:59:06 2022] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
[Thu Feb 10 12:59:06 2022] Call Trace:
[Thu Feb 10 12:59:06 2022] <IRQ>
[Thu Feb 10 12:59:06 2022] dump_stack_lvl+0x5d/0x78
[Thu Feb 10 12:59:06 2022] print_address_description.constprop.0+0x24/0x150
[Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] kasan_report.cold+0x7d/0x117
[Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] __asan_load8+0x86/0xa0
[Thu Feb 10 12:59:06 2022] rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] rcu_core+0x547/0xca0
[Thu Feb 10 12:59:06 2022] ? call_rcu+0x3c0/0x3c0
[Thu Feb 10 12:59:06 2022] ? __this_cpu_preempt_check+0x13/0x20
[Thu Feb 10 12:59:06 2022] ? lock_is_held_type+0xea/0x140
[Thu Feb 10 12:59:06 2022] rcu_core_si+0xe/0x10
[Thu Feb 10 12:59:06 2022] __do_softirq+0x1d4/0x67b
[Thu Feb 10 12:59:06 2022] __irq_exit_rcu+0x100/0x150
[Thu Feb 10 12:59:06 2022] irq_exit_rcu+0xe/0x30
[Thu Feb 10 12:59:06 2022] sysvec_hyperv_stimer0+0x9d/0xc0
...
[Thu Feb 10 12:59:07 2022] Freed by task 58179:
[Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50
[Thu Feb 10 12:59:07 2022] kasan_set_track+0x25/0x30
[Thu Feb 10 12:59:07 2022] kasan_set_free_info+0x24/0x40
[Thu Feb 10 12:59:07 2022] ____kasan_slab_free+0x137/0x170
[Thu Feb 10 12:59:07 2022] __kasan_slab_free+0x12/0x20
[Thu Feb 10 12:59:07 2022] slab_free_freelist_hook+0xb3/0x1d0
[Thu Feb 10 12:59:07 2022] kfree+0xcd/0x520
[Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0x149/0xbe0 [cifs]
[Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs]
[Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140
[Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0
[Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210
[Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0
[Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae
[Thu Feb 10 12:59:07 2022] Last potentially related work creation:
[Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50
[Thu Feb 10 12:59:07 2022] __kasan_record_aux_stack+0xb6/0xc0
[Thu Feb 10 12:59:07 2022] kasan_record_aux_stack_noalloc+0xb/0x10
[Thu Feb 10 12:59:07 2022] call_rcu+0x76/0x3c0
[Thu Feb 10 12:59:07 2022] cifs_umount+0xce/0xe0 [cifs]
[Thu Feb 10 12:59:07 2022] cifs_kill_sb+0xc8/0xe0 [cifs]
[Thu Feb 10 12:59:07 2022] deactivate_locked_super+0x5d/0xd0
[Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0xab9/0xbe0 [cifs]
[Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs]
[Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140
[Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0
[Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210
[Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0
[Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 14302ee3301b3a77b331cc14efb95bf7184c73cc upstream.
In cifs_statfs(), if server->ops->queryfs is not NULL, then we should
use its return value rather than always returning 0. Instead, use rc
variable as it is properly set to 0 in case there is no
server->ops->queryfs.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8c6c9bed8773375b1d54ccca2911ec892c59db5d ]
There is a null check on dst_file->private data which suggests
it can be potentially null. However, before this check, pointer
smb_file_target is derived from dst_file->private and dereferenced
in the call to tlink_tcon, hence there is a potential null pointer
deference.
Fix this by assigning smb_file_target and target_tcon after the
null pointer sanity checks.
Detected by CoverityScan, CID#1475302 ("Dereference before null check")
Fixes: 04b38d601239 ("vfs: pull btrfs clone API to vfs layer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 21ba3845b59c733a79ed4fe1c4f3732e7ece9df7 upstream.
Fil in the correct namelen (typically 255 not 4096) in the
statfs response and also fill in a reasonably unique fsid
(in this case taken from the volume id, and the creation time
of the volume).
In the case of the POSIX statfs all fields are now filled in,
and in the case of non-POSIX mounts, all fields are filled
in which can be.
Signed-off-by: Steve French <stfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e70c267e68d77679534dcf4aaf84e66f2cf1425 upstream.
As with NFS, which ignores sync on directory handles,
fsync on a directory handle is a noop for CIFS/SMB3.
Do not return an error on it. It breaks some database
apps otherwise.
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fcd7f3f966f37f3f9a215af4cc1597fe338d0d5 upstream.
* prepare for SMB3.11 pre-auth integrity
* enable sha512 when SMB311 is enabled in Kconfig
* add sha512 as a soft dependency
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
memory leak was found by kmemleak. exit_cifs_spnego
should be called before cifs module removed, or
cifs root_cred will not be released.
kmemleak report:
unreferenced object 0xffff880070a3ce40 (size 192):
backtrace:
kmemleak_alloc+0x4a/0xa0
kmem_cache_alloc+0xc7/0x1d0
prepare_kernel_cred+0x20/0x120
init_cifs_spnego+0x2d/0x170 [cifs]
0xffffffffc07801f3
do_one_initcall+0x51/0x1b0
do_init_module+0x60/0x1fd
load_module+0x161e/0x1b60
SYSC_finit_module+0xa9/0x100
SyS_finit_module+0xe/0x10
Signed-off-by: Shu Wang <shuwang@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllqXNUACgkQ8vlZVpUN
gaPtAgf/aUbXZuWYsDQzslHsbzEWi+qz4QgL885/w4L00pEImTTp91Q06SDxWhtB
KPvGnZHS3IofxBh2DC+6AwN6dPMoWDCfYhhO6po3FSz0DiPRIQCTuvOb8fhKY1X7
rTdDq2xtDxPGxJ25bMJtlrgzH2XlXPpVyPUeoc9uh87zUK5aesXpUn9kBniRexoz
ume+M/cDzPKkwNQpbLq8vzhNjoWMVv0FeW2akVvrjkkWko8nZLZ0R/kIyKQlRPdG
LZDXcz0oTHpDS6+ufEo292ZuWm2IGer2YtwHsKyCAsyEWsUqBz2yurtkSj3mAVyC
hHafyS+5WNaGdgBmg0zJxxwn5qxxLg==
=ua7p
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random updates from Ted Ts'o:
"Add wait_for_random_bytes() and get_random_*_wait() functions so that
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: reorder READ_ONCE() in get_random_uXX
random: suppress spammy warnings about unseeded randomness
random: warn when kernel uses unseeded randomness
net/route: use get_random_int for random counter
net/neighbor: use get_random_u32 for 32-bit hash random
rhashtable: use get_random_u32 for hash_rnd
ceph: ensure RNG is seeded before using
iscsi: ensure RNG is seeded before use
cifs: use get_random_u32 for 32-bit lock random
random: add get_random_{bytes,u32,u64,int,long,once}_wait family
random: add wait_for_random_bytes() API
Remove the CONFIG_CIFS_SMB2 ifdef and Kconfig option since they
must always be on now.
For various security reasons, SMB3 and later are STRONGLY preferred
over CIFS and older dialects, and SMB3 (and later) will now be
the default dialects so we do not want to allow them to be
ifdeffed out.
In the longer term, we may be able to make older CIFS support
disableable in Kconfig with a new set of #ifdef, but we always
want SMB3 and later support enabled.
Signed-off-by: Steven French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Using get_random_u32 here is faster, more fitting of the use case, and
just as cryptographically secure. It also has the benefit of providing
better randomness at early boot, which is sometimes when this is used.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull cifs fixes from Steve French:
"Various fixes for stable for CIFS/SMB3 especially for better
interoperability for SMB3 to Macs.
It also includes Pavel's improvements to SMB3 async i/o support
(which is much faster now)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: add misssing SFM mapping for doublequote
SMB3: Work around mount failure when using SMB3 dialect to Macs
cifs: fix CIFS_IOC_GET_MNT_INFO oops
CIFS: fix mapping of SFM_SPACE and SFM_PERIOD
CIFS: fix oplock break deadlocks
cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops
cifs: fix leak in FSCTL_ENUM_SNAPS response handling
Set unicode flag on cifs echo request to avoid Mac error
CIFS: Add asynchronous write support through kernel AIO
CIFS: Add asynchronous read support through kernel AIO
CIFS: Add asynchronous context to support kernel AIO
cifs: fix IPv6 link local, with scope id, address parsing
cifs: small underflow in cnvrtDosUnixTm()
When the final cifsFileInfo_put() is called from cifsiod and an oplock
break work is queued, lockdep complains loudly:
=============================================
[ INFO: possible recursive locking detected ]
4.11.0+ #21 Not tainted
---------------------------------------------
kworker/0:2/78 is trying to acquire lock:
("cifsiod"){++++.+}, at: flush_work+0x215/0x350
but task is already holding lock:
("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock("cifsiod");
lock("cifsiod");
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by kworker/0:2/78:
#0: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
#1: ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0
stack backtrace:
CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21
Workqueue: cifsiod cifs_writev_complete
Call Trace:
dump_stack+0x85/0xc2
__lock_acquire+0x17dd/0x2260
? match_held_lock+0x20/0x2b0
? trace_hardirqs_off_caller+0x86/0x130
? mark_lock+0xa6/0x920
lock_acquire+0xcc/0x260
? lock_acquire+0xcc/0x260
? flush_work+0x215/0x350
flush_work+0x236/0x350
? flush_work+0x215/0x350
? destroy_worker+0x170/0x170
__cancel_work_timer+0x17d/0x210
? ___preempt_schedule+0x16/0x18
cancel_work_sync+0x10/0x20
cifsFileInfo_put+0x338/0x7f0
cifs_writedata_release+0x2a/0x40
? cifs_writedata_release+0x2a/0x40
cifs_writev_complete+0x29d/0x850
? preempt_count_sub+0x18/0xd0
process_one_work+0x304/0x8e0
worker_thread+0x9b/0x6a0
kthread+0x1b2/0x200
? process_one_work+0x8e0/0x8e0
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x31/0x40
This is a real warning. Since the oplock is queued on the same
workqueue this can deadlock if there is only one worker thread active
for the workqueue (which will be the case during memory pressure when
the rescuer thread is handling it).
Furthermore, there is at least one other kind of hang possible due to
the oplock break handling if there is only worker. (This can be
reproduced without introducing memory pressure by having passing 1 for
the max_active parameter of cifsiod.) cifs_oplock_break() can wait
indefintely in the filemap_fdatawait() while the cifs_writev_complete()
work is blocked:
sysrq: SysRq : Show Blocked State
task PC stack pid father
kworker/0:1 D 0 16 2 0x00000000
Workqueue: cifsiod cifs_oplock_break
Call Trace:
__schedule+0x562/0xf40
? mark_held_locks+0x4a/0xb0
schedule+0x57/0xe0
io_schedule+0x21/0x50
wait_on_page_bit+0x143/0x190
? add_to_page_cache_lru+0x150/0x150
__filemap_fdatawait_range+0x134/0x190
? do_writepages+0x51/0x70
filemap_fdatawait_range+0x14/0x30
filemap_fdatawait+0x3b/0x40
cifs_oplock_break+0x651/0x710
? preempt_count_sub+0x18/0xd0
process_one_work+0x304/0x8e0
worker_thread+0x9b/0x6a0
kthread+0x1b2/0x200
? process_one_work+0x8e0/0x8e0
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x31/0x40
dd D 0 683 171 0x00000000
Call Trace:
__schedule+0x562/0xf40
? mark_held_locks+0x29/0xb0
schedule+0x57/0xe0
io_schedule+0x21/0x50
wait_on_page_bit+0x143/0x190
? add_to_page_cache_lru+0x150/0x150
__filemap_fdatawait_range+0x134/0x190
? do_writepages+0x51/0x70
filemap_fdatawait_range+0x14/0x30
filemap_fdatawait+0x3b/0x40
filemap_write_and_wait+0x4e/0x70
cifs_flush+0x6a/0xb0
filp_close+0x52/0xa0
__close_fd+0xdc/0x150
SyS_close+0x33/0x60
entry_SYSCALL_64_fastpath+0x1f/0xbe
Showing all locks held in the system:
2 locks held by kworker/0:1/16:
#0: ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0
#1: ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0
Showing busy workqueues and worker pools:
workqueue cifsiod: flags=0xc
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
in-flight: 16:cifs_oplock_break
delayed: cifs_writev_complete, cifs_echo_request
pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3
Fix these problems by creating a a new workqueue (with a rescuer) for
the oplock break work.
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Pull networking updates from David Millar:
"Here are some highlights from the 2065 networking commits that
happened this development cycle:
1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)
2) Add a generic XDP driver, so that anyone can test XDP even if they
lack a networking device whose driver has explicit XDP support
(me).
3) Sparc64 now has an eBPF JIT too (me)
4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
Starovoitov)
5) Make netfitler network namespace teardown less expensive (Florian
Westphal)
6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)
7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)
8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)
9) Multiqueue support in stmmac driver (Joao Pinto)
10) Remove TCP timewait recycling, it never really could possibly work
well in the real world and timestamp randomization really zaps any
hint of usability this feature had (Soheil Hassas Yeganeh)
11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
Aleksandrov)
12) Add socket busy poll support to epoll (Sridhar Samudrala)
13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
and several others)
14) IPSEC hw offload infrastructure (Steffen Klassert)"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
tipc: refactor function tipc_sk_recv_stream()
tipc: refactor function tipc_sk_recvmsg()
net: thunderx: Optimize page recycling for XDP
net: thunderx: Support for XDP header adjustment
net: thunderx: Add support for XDP_TX
net: thunderx: Add support for XDP_DROP
net: thunderx: Add basic XDP support
net: thunderx: Cleanup receive buffer allocation
net: thunderx: Optimize CQE_TX handling
net: thunderx: Optimize RBDR descriptor handling
net: thunderx: Support for page recycling
ipx: call ipxitf_put() in ioctl error path
net: sched: add helpers to handle extended actions
qed*: Fix issues in the ptp filter config implementation.
qede: Fix concurrency issue in PTP Tx path processing.
stmmac: Add support for SIMATIC IOT2000 platform
net: hns: fix ethtool_get_strings overflow in hns driver
tcp: fix wraparound issue in tcp_lp
bpf, arm64: fix jit branch offset related to ldimm64
bpf, arm64: implement jiting of BPF_XADD
...
Pull block layer updates from Jens Axboe:
- Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
was initially a fork of CFQ, but subsequently changed to implement
fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
to be used on desktop type single drives, providing good fairness.
From Paolo.
- Add Kyber IO scheduler. This is a full multiqueue aware scheduler,
using a scalable token based algorithm that throttles IO based on
live completion IO stats, similary to blk-wbt. From Omar.
- A series from Jan, moving users to separately allocated backing
devices. This continues the work of separating backing device life
times, solving various problems with hot removal.
- A series of updates for lightnvm, mostly from Javier. Includes a
'pblk' target that exposes an open channel SSD as a physical block
device.
- A series of fixes and improvements for nbd from Josef.
- A series from Omar, removing queue sharing between devices on mostly
legacy drivers. This helps us clean up other bits, if we know that a
queue only has a single device backing. This has been overdue for
more than a decade.
- Fixes for the blk-stats, and improvements to unify the stats and user
windows. This both improves blk-wbt, and enables other users to
register a need to receive IO stats for a device. From Omar.
- blk-throttle improvements from Shaohua. This provides a scalable
framework for implementing scalable priotization - particularly for
blk-mq, but applicable to any type of block device. The interface is
marked experimental for now.
- Bucketized IO stats for IO polling from Stephen Bates. This improves
efficiency of polled workloads in the presence of mixed block size
IO.
- A few fixes for opal, from Scott.
- A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
From a variety of folks, mostly Sagi and James Smart.
- A series from Bart, improving our exposed info and capabilities from
the blk-mq debugfs support.
- A series from Christoph, cleaning up how handle WRITE_ZEROES.
- A series from Christoph, cleaning up the block layer handling of how
we track errors in a request. On top of being a nice cleanup, it also
shrinks the size of struct request a bit.
- Removal of mg_disk and hd (sorry Linus) by Christoph. The former was
never used by platforms, and the latter has outlived it's usefulness.
- Various little bug fixes and cleanups from a wide variety of folks.
* 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
block: hide badblocks attribute by default
blk-mq: unify hctx delay_work and run_work
block: add kblock_mod_delayed_work_on()
blk-mq: unify hctx delayed_run_work and run_work
nbd: fix use after free on module unload
MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
blk-mq-sched: alloate reserved tags out of normal pool
mtip32xx: use runtime tag to initialize command header
scsi: Implement blk_mq_ops.show_rq()
blk-mq: Add blk_mq_ops.show_rq()
blk-mq: Show operation, cmd_flags and rq_flags names
blk-mq: Make blk_flags_show() callers append a newline character
blk-mq: Move the "state" debugfs attribute one level down
blk-mq: Unregister debugfs attributes earlier
blk-mq: Only unregister hctxs for which registration succeeded
blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
blk-mq: Let blk_mq_debugfs_register() look up the queue name
blk-mq: Register <dev>/queue/mq after having registered <dev>/queue
ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
ide-pm: always pass 0 error to __blk_end_request_all
..
Allocate struct backing_dev_info separately instead of embedding it
inside superblock. This unifies handling of bdi among users.
CC: Steve French <sfrench@samba.org>
CC: linux-cifs@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts were simply overlapping changes. In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.
In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().
Signed-off-by: David S. Miller <davem@davemloft.net>
The earlier changes to copy range for cifs unintentionally disabled the more
common form of server side copy.
The patch introduces the file_operations helper cifs_copy_file_range()
which is used by the syscall copy_file_range. The new file operations
helper allows us to perform server side copies for SMB2.0 and 2.1
servers as well as SMB 3.0+ servers which do not support the ioctl
FSCTL_DUPLICATE_EXTENTS_TO_FILE.
The new helper uses the ioctl FSCTL_SRV_COPYCHUNK_WRITE to perform
server side copies. The helper is called by vfs_copy_file_range() only
once an attempt to clone the file using the ioctl
FSCTL_DUPLICATE_EXTENTS_TO_FILE has failed.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
There is an include loop between netdevice.h, dsa.h, devlink.h because
of NETDEV_ALIGN, making it impossible to use devlink structures in
dsa.h.
Break this loop by taking dsa.h out of netdevice.h, add a forward
declaration of dsa_switch_tree and netdev_set_default_ethtool_ops()
function, which is what netdevice.h requires.
No longer having dsa.h in netdevice.h means the includes in dsa.h no
longer get included. This breaks a few other files which depend on
these includes. Add these directly in the affected file.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change allows to encrypt packets if it is required by a server
for SMB sessions or tree connections.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
List soft dependencies of cifs so that mkinitrd and dracut can include
the required helper modules.
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Steve French <sfrench@samba.org>
Pull cifs fixes from Steve French:
"This ncludes various cifs/smb3 bug fixes, mostly for stable as well.
In the next week I expect that Germano will have some reconnection
fixes, and also I expect to have the remaining pieces of the snapshot
enablement and SMB3 ACLs, but wanted to get this set of bug fixes in"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_get_root shouldn't use path with tree name
Fix default behaviour for empty domains and add domainauto option
cifs: use %16phN for formatting md5 sum
cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack
CIFS: Fix a possible double locking of mutex during reconnect
CIFS: Fix a possible memory corruption during reconnect
CIFS: Fix a possible memory corruption in push locks
CIFS: Fix missing nls unload in smb2_reconnect()
CIFS: Decrease verbosity of ioctl call
SMB3: parsing for new snapshot timestamp mount parm
When a server returns the optional flag SMB_SHARE_IS_IN_DFS in response
to a tree connect, cifs_build_path_to_root() will return a pathname
which includes the hostname. This causes problems with cifs_get_root()
which separates each component and does a lookup for each component of
the path which in this case will incorrectly include looking up the
hostname component as a path component.
We encountered a problem with dfs shares hosted by a Netapp. When
connecting to nodes pointed to by the DFS share. The tree connect for
these nodes return SMB_SHARE_IS_IN_DFS resulting failures in lookup
in cifs_get_root().
RH bz: 1373153
The patch was tested against a Netapp simulator and by a user using an
actual Netapp server.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Pierguido Lambri <plambri@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
If .readlink == NULL implies generic_readlink().
Generated by:
to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Add "idsfromsid" mount option to indicate to cifs.ko that it should
try to retrieve the uid and gid owner fields from special sids in the
ACL if present. This first patch just adds the parsing for the mount
option.
Signed-off-by: Steve French <steve.french@primarydata.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
/sys/module/cifs/parameters should display the three
other module load time configuration settings for cifs.ko
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Signed-off-by: Steve French <steve.french@primarydata.com>
Remove the global file_list_lock to simplify cifs/smb3 locking and
have spinlocks that more closely match the information they are
protecting.
Add new tcon->open_file_lock and file->file_info_lock spinlocks.
Locks continue to follow a heirachy,
cifs_socket --> cifs_ses --> cifs_tcon --> cifs_file
where global tcp_ses_lock still protects socket and cifs_ses, while the
the newer locks protect the lower level structure's information
(tcon and cifs_file respectively).
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Germano Percossi <germano.percossi@citrix.com>
GUIDs although random, and 16 bytes, need to be generated as
proper uuids.
Signed-off-by: Steve French <steve.french@primarydata.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reported-by: David Goebels <davidgoe@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
These inode operations are no longer used; remove them.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix memory leaks introduced by the patch
fs/cifs: make share unaccessible at root level mountable
Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb().
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
if, when mounting //HOST/share/sub/dir/foo we can query /sub/dir/foo but
not any of the path components above:
- store the /sub/dir/foo prefix in the cifs super_block info
- in the superblock, set root dentry to the subpath dentry (instead of
the share root)
- set a flag in the superblock to remember it
- use prefixpath when building path from a dentry
fixes bso#8950
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Right now, we send the tgid cross the wire. What we really want to send
though is a hashed fl_owner_t since samba treats this field as a generic
lockowner.
It turns out that because we enforce and release locks locally before
they are ever sent to the server, this patch makes no difference in
behavior. Still, setting OFD locks on the server using the process
pid seems wrong, so I think this patch still makes sense.
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Pavel Shilovsky <pshilovsky@samba.org>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Remove some obsolete comments in the cifs inode_operations
structs that were pointed out by Stephen Rothwell.
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <steve.french@primarydata.com>
The session key is the default keyring set for request_key operations.
This session key is revoked when the user owning the session logs out.
Any long running daemon processes started by this session ends up with
revoked session keyring which prevents these processes from using the
request_key mechanism from obtaining the krb5 keys.
The problem has been reported by a large number of autofs users. The
problem is also seen with multiuser mounts where the share may be used
by processes run by a user who has since logged out. A reproducer using
automount is available on the Red Hat bz.
The patch creates a new keyring which is used to cache cifs spnego
upcalls.
Red Hat bz: 1267754
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Pull cifs xattr updates from Al Viro:
"This is the remaining parts of the xattr work - the cifs bits"
* 'for-cifs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
cifs: Switch to generic xattr handlers
cifs: Fix removexattr for os2.* xattrs
cifs: Check for equality with ACL_TYPE_ACCESS and ACL_TYPE_DEFAULT
cifs: Fix xattr name checks
Use xattr handlers for resolving attribute names. The amount of setup
code required on cifs is nontrivial, so use the same get and set
functions for all handlers, with switch statements for the different
types of attributes in them.
The set_EA handler can handle NULL values, so we don't need a separate
removexattr function anymore. Remove the cifs_dbg statements related to
xattr name resolution; they don't add much. Don't build xattr.o when
CONFIG_CIFS_XATTR is not defined.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs updates from Al Viro:
- Preparations of parallel lookups (the remaining main obstacle is the
need to move security_d_instantiate(); once that becomes safe, the
rest will be a matter of rather short series local to fs/*.c
- preadv2/pwritev2 series from Christoph
- assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
splice: handle zero nr_pages in splice_to_pipe()
vfs: show_vfsstat: do not ignore errors from show_devname method
dcache.c: new helper: __d_add()
don't bother with __d_instantiate(dentry, NULL)
untangle fsnotify_d_instantiate() a bit
uninline d_add()
replace d_add_unique() with saner primitive
quota: use lookup_one_len_unlocked()
cifs_get_root(): use lookup_one_len_unlocked()
nfs_lookup: don't bother with d_instantiate(dentry, NULL)
kill dentry_unhash()
ceph_fill_trace(): don't bother with d_instantiate(dn, NULL)
autofs4: don't bother with d_instantiate(dentry, NULL) in ->lookup()
configfs: move d_rehash() into configfs_create() for regular files
ceph: don't bother with d_rehash() in splice_dentry()
namei: teach lookup_slow() to skip revalidate
namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()
lookup_one_len_unlocked(): use lookup_dcache()
namei: simplify invalidation logics in lookup_dcache()
namei: change calling conventions for lookup_{fast,slow} and follow_managed()
...
Some callers of strtobool() were passing a pointer to unterminated
strings. In preparation of adding multi-character processing to
kstrtobool(), update the callers to not pass single-character pointers,
and switch to using the new kstrtobool_from_user() helper where
possible.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Amitkumar Karwar <akarwar@marvell.com>
Cc: Nishant Sarmukadam <nishants@marvell.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Steve French <sfrench@samba.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 04b38d601239b4 ("vfs: pull btrfs clone API to vfs layer")
added a duplicated line (in cifsfs.c) which causes a sparse compile
warning.
Signed-off-by: Steve French <steve.french@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Pull SMB3 fixes from Steve French:
"A collection of CIFS/SMB3 fixes.
It includes a couple bug fixes, a few for improved debugging of
cifs.ko and some improvements to the way cifs does key generation.
I do have some additional bug fixes I expect in the next week or two
(to address a problem found by xfstest, and some fixes for SMB3.11
dialect, and a couple patches that just came in yesterday that I am
reviewing)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_dbg() outputs an uninitialized buffer in cifs_readdir()
cifs: fix race between call_async() and reconnect()
Prepare for encryption support (first part). Add decryption and encryption key generation. Thanks to Metze for helping with this.
cifs: Allow using O_DIRECT with cache=loose
cifs: Make echo interval tunable
cifs: Check uniqueid for SMB2+ and return -ESTALE if necessary
Print IP address of unresponsive server
cifs: Ratelimit kernel log messages
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).
Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:
- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.
The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently O_DIRECT is supported with cache=none and cache=strict, but
not cache=loose. Add support for using O_DIRECT when mounted with
cache=loose.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Steve French <smfrench@gmail.com>