msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
Paul Lawrence	02b2463554	ANDROID: dm-bow: Add dm-bow feature Based on https://www.redhat.com/archives/dm-devel/2019-March/msg00025.html Third version of dm-bow. Key changes: Free list added; support for block sizes other than 4k; handles writes during trim phase, and overlapping trims; integer overflow error; support trims even if underlying device doesn't; numerous small bug fixes. bow == backup on write. USE CASE: dm-bow takes a snapshot of an existing file system before mounting. The user may, before removing the device, commit the snapshot. Alternatively the user may remove the device and then run a command line utility to restore the device to its original state. dm-bow does not require an external device. dm-bow efficiently uses all the available free space on the file system. IMPLEMENTATION: dm-bow can be in one of three states. In state one, the free blocks on the device are identified by issuing an FSTRIM to the filesystem. In state two, any writes cause the overwritten data to be backed up to the available free space. While in this state, the device can be restored by unmounting the filesystem, removing the dm-bow device and running a usermode tool over the underlying device. In state three, the changes are committed, dm-bow is in pass-through mode and the drive can no longer be restored. It is planned to use this driver to enable restoration of a failed update attempt on Android devices using ext4. Test: Can boot Android with userdata mounted on this device. Can commit userdata after SUW has run. Can then reboot, make changes and roll back. Known issues: Mutex is held around entire flush operation, including lengthy I/O. Plan is to convert to state machine with pending queues. Interaction with block encryption is unknown, especially with respect to sector 0. Bug: 119769411 Change-Id: Id70988bbd797ebe3e76fc175094388b423c8da8c Signed-off-by: Paul Lawrence <paullawrence@google.com>	2019-03-25 13:57:28 -07:00
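The three dm-bow states described in the commit above can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: the class `BowModel`, its method names, and the in-memory block list are hypothetical and do not reflect the kernel driver's data structures or on-disk format. In the real driver the restore step is done by a separate usermode tool.

```python
from enum import Enum


class BowState(Enum):
    TRIM = 1        # state one: free blocks are identified (FSTRIM)
    CHECKPOINT = 2  # state two: overwritten data is backed up to free space
    COMMITTED = 3   # state three: pass-through, no restore possible


class BowModel:
    """Hypothetical in-memory model of backup-on-write; not the driver."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.state = BowState.TRIM
        self.free = set()   # indices reported free during the trim phase
        self.log = []       # (original_index, backup_index) pairs

    def trim(self, index):
        if self.state is not BowState.TRIM:
            raise RuntimeError("trims are only recorded in state one")
        self.free.add(index)

    def start_checkpoint(self):
        self.state = BowState.CHECKPOINT

    def write(self, index, data):
        if self.state is BowState.CHECKPOINT:
            if index in self.free:
                # A free block holds no data worth preserving.
                self.free.discard(index)
            elif all(orig != index for orig, _ in self.log):
                # First overwrite of a used block: copy the old data away.
                backup = self.free.pop()  # KeyError if free space runs out
                self.blocks[backup] = self.blocks[index]
                self.log.append((index, backup))
        self.blocks[index] = data

    def commit(self):
        self.state = BowState.COMMITTED
        self.log.clear()

    def restore(self):
        if self.state is not BowState.CHECKPOINT:
            raise RuntimeError("restore is only possible in state two")
        # Undo in reverse order so chained backups unwind correctly.
        for orig, backup in reversed(self.log):
            self.blocks[orig] = self.blocks[backup]
        self.log.clear()
```

The model keeps the backups inside the same block array, mirroring the point that dm-bow needs no external device.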
Greg Kroah-Hartman	bb60f28e48	This is the 4.14.37 stable release Merge 4.14.37 into android-4.14 Changes in 4.14.37 cifs: do not allow creating sockets except with SMB1 posix exensions btrfs: fix unaligned access in readdir x86/acpi: Prevent X2APIC id 0xffffffff from being accounted clocksource/imx-tpm: Correct -ETIME return condition check x86/tsc: Prevent 32bit truncation in calc_hpet_ref() drm/vc4: Fix memory leak during BO teardown drm/i915/gvt: throw error on unhandled vfio ioctls drm/i915/audio: Fix audio detection issue on GLK drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing usb: musb: fix enumeration after resume usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers usb: musb: Fix external abort in musb_remove on omap2430 firewire-ohci: work around oversized DMA reads on JMicron controllers x86/tsc: Allow TSC calibration without PIT NFSv4: always set NFS_LOCK_LOST when a lock is lost. 
ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources ALSA: hda - Use IS_REACHABLE() for dependency on input ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read() kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl RDMA/core: Clarify rdma_ah_find_type KVM: PPC: Book3S HV: Enable migration of decrementer register netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460 tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account KVM: s390: use created_vcpus in more places platform/x86: dell-laptop: Filter out spurious keyboard backlight change events xprtrdma: Fix backchannel allocation of extra rpcrdma_reps selftest: ftrace: Fix to pick text symbols for kprobes PCI: Add function 1 DMA alias quirk for Marvell 9128 Input: psmouse - fix Synaptics detection when protocol is disabled libbpf: Makefile set specified permission mode Input: synaptics - reset the ABS_X/Y fuzz after initializing MT axes i40iw: Free IEQ resources i40iw: Zero-out consumer key on allocate stag for FMR scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout() perf unwind: Do not look just at the global callchain_param.record_mode tools lib traceevent: Simplify pointer print logic and fix %pF perf callchain: Fix attr.sample_max_stack setting tools lib traceevent: Fix get_field_str() for dynamic strings perf record: Fix failed memory allocation for get_cpuid_str iommu/exynos: Don't unconditionally steal bus ops powerpc: System reset avoid interleaving oops using die synchronisation iommu/vt-d: Use domain instead of cache fetching dm thin: fix documentation relative to low water mark threshold dm mpath: return DM_MAPIO_REQUEUE on blk-mq rq allocation failure blk-mq: turn WARN_ON in __blk_mq_run_hw_queue into printk ubifs: Fix uninitialized variable in search_dh_cookie() net: stmmac: dwmac-meson8b: fix setting the RGMII TX clock on Meson8b net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock spi: a3700: Clear DATA_OUT when performing a read 
IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct nfs: Do not convert nfs_idmap_cache_timeout to jiffies MIPS: Fix clean of vmlinuz.{32,ecoff,bin,srec} PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build watchdog: sp5100_tco: Fix watchdog disable bit kconfig: Don't leak main menus during parsing kconfig: Fix automatic menu creation mem leak kconfig: Fix expr_free() E_NOT leak mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl() ipmi/powernv: Fix error return code in ipmi_powernv_probe() Btrfs: set plug for fsync btrfs: Fix out of bounds access in btrfs_search_slot Btrfs: fix scrub to repair raid6 corruption btrfs: fail mount when sb flag is not in BTRFS_SUPER_FLAG_SUPP Btrfs: fix unexpected EEXIST from btrfs_get_extent Btrfs: raid56: fix race between merge_bio and rbio_orig_end_io RDMA/cma: Check existence of netdevice during port validation f2fs: avoid hungtask when GC encrypted block if io_bits is set scsi: devinfo: fix format of the device list scsi: fas216: fix sense buffer initialization Input: stmfts - set IRQ_NOAUTOEN to the irq flag HID: roccat: prevent an out of bounds read in kovaplus_profile_activated() nfp: fix error return code in nfp_pci_probe() block: Set BIO_TRACE_COMPLETION on new bio during split bpf: test_maps: cleanup sockmaps when test ends i40evf: Don't schedule reset_task when device is being removed i40evf: ignore link up if not running platform/x86: thinkpad_acpi: suppress warning about palm detection KVM: s390: vsie: use READ_ONCE to access some SCB fields blk-mq-debugfs: don't allow write on attributes with seq_operations set ASoC: rockchip: Use dummy_dai for rt5514 dsp dailink igb: Allow to remove administratively set MAC on VFs igb: Clear TXSTMP when ptp_tx_work() is timeout fm10k: fix "failed to kill vid" message for VF x86/hyperv: Stop suppressing X86_FEATURE_PCID tty: serial: exar: Relocate sleep wake-up handling device property: Define type of PROPERTY_ENRTY_*() macros crypto: artpec6 - 
remove select on non-existing CRYPTO_SHA384 RDMA/uverbs: Use an unambiguous errno for method not supported jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path ixgbe: don't set RXDCTL.RLPML for 82599 i40e: program fragmented IPv4 filter input set i40e: fix reported mask for ntuple filters samples/bpf: Partially fixes the bpf.o build powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes powerpc/numa: Ensure nodes initialized for hotplug RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure ntb_transport: Fix bug with max_mw_size parameter gianfar: prevent integer wrapping in the rx handler x86/hyperv: Check for required priviliges in hyperv_init() netfilter: x_tables: fix pointer leaks to userspace tcp_nv: fix potential integer overflow in tcpnv_acked kvm: Map PFN-type memory regions as writable (if possible) x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested fs/dax.c: release PMD lock even when there is no PMD support in DAX ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute ocfs2: return error when we attempt to access a dirty bh in jbd2 mm/mempolicy: fix the check of nodemask from user mm/mempolicy: add nodes_empty check in SYSC_migrate_pages asm-generic: provide generic_pmdp_establish() sparc64: update pmdp_invalidate() to return old pmd value mm: thp: use down_read_trylock() in khugepaged to avoid long block mm: pin address_space before dereferencing it while isolating an LRU page mm/fadvise: discard partial page if endbyte is also EOF openvswitch: Remove padding from packet before L3+ conntrack processing blk-mq: fix discard merge with scheduler attached IB/hfi1: Re-order IRQ cleanup to address driver cleanup race IB/hfi1: Fix for potential refcount leak in hfi1_open_file() IB/ipoib: Fix for potential no-carrier state IB/core: Map iWarp AH type to undefined in rdma_ah_find_type drm/nouveau/pmu/fuc: don't 
use movw directly anymore s390/eadm: fix CONFIG_BLOCK include dependency netfilter: ipv6: nf_defrag: Kill frag queue on RFC2460 failure x86/power: Fix swsusp_arch_resume prototype x86/dumpstack: Avoid uninitlized variable firmware: dmi_scan: Fix handling of empty DMI strings ACPI: processor_perflib: Do not send _PPC change notification if not ready ACPI / bus: Do not call _STA on battery devices with unmet dependencies ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS perf record: Fix period option handling MIPS: Generic: Support GIC in EIC mode perf evsel: Fix period/freq terms setup xen-netfront: Fix race between device setup and open xen/grant-table: Use put_page instead of free_page bpf: sockmap, fix leaking maps with attached but not detached progs RDS: IB: Fix null pointer issue arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics proc: fix /proc/*/map_files lookup PM / domains: Fix up domain-idle-states OF parsing cifs: silence compiler warnings showing up with gcc-8.0.0 bcache: properly set task state in bch_writeback_thread() bcache: fix for allocator and register thread race bcache: fix for data collapse after re-attaching an attached device bcache: return attach error when no cache set exist cpufreq: intel_pstate: Enable HWP during system resume on CPU0 selftests/ftrace: Add some missing glob checks rxrpc: Don't put crypto buffers on the stack svcrdma: Fix Read chunk round-up net: Extra '_get' in declaration of arch_get_platform_mac_address tools/libbpf: handle issues with bpf ELF objects containing .eh_frames KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context net: stmmac: discard disabled flags in interrupt status register bpf: fix rlimit in reuseport net selftest ACPI / EC: Restore polling during noirq 
suspend/resume phases PM / wakeirq: Fix unbalanced IRQ enable for wakeirq vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page powerpc/mm/hash64: Zero PGD pages on allocation x86/platform/UV: Fix GAM Range Table entries less than 1GB locking/qspinlock: Ensure node->count is updated before initialising node powerpc/powernv: IMC fix out of bounds memory access at shutdown perf test: Fix test trace+probe_libc_inet_pton.sh for s390x irqchip/gic-v3: Ignore disabled ITS nodes cpumask: Make for_each_cpu_wrap() available on UP as well irqchip/gic-v3: Change pr_debug message to pr_devel RDMA/core: Reduce poll batch for direct cq polling alarmtimer: Init nanosleep alarm timer on stack netfilter: x_tables: cap allocations at 512 mbyte netfilter: x_tables: add counters allocation wrapper netfilter: compat: prepare xt_compat_init_offsets to return errors netfilter: compat: reject huge allocation requests netfilter: x_tables: limit allocation requests for blob rule heads perf: Fix sample_max_stack maximum check perf: Return proper values for user stack errors RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown" mac80211_hwsim: fix use-after-free bug in hwsim_exit_net Linux 4.14.37 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2018-04-26 11:37:46 +02:00
mulhern	223ed638e9	dm thin: fix documentation relative to low water mark threshold [ Upstream commit 9b28a1102efc75d81298198166ead87d643a29ce ] Fixes: 1. The use of "exceeds" when the opposite of exceeds, falls below, was meant. 2. Properly speaking, a table can not exceed a threshold. It emphasizes the important point, which is that it is the userspace daemon's responsibility to check for low free space when a device is resumed, since it won't get a special event indicating low free space in that situation. Signed-off-by: mulhern <amulhern@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-26 11:02:07 +02:00
Patrik Torstensson	a73c9bca68	BACKPORT: dm verity: add 'check_at_most_once' option to only validate hashes once This allows platforms that are CPU/memory constrained to verify data blocks only the first time they are read from the data device, rather than every time. As such, it provides a reduced level of security because only offline tampering of the data device's content will be detected, not online tampering. Hash blocks are still verified each time they are read from the hash device, since verification of hash blocks is less performance critical than data blocks, and a hash block will not be verified any more after all the data blocks it covers have been verified anyway. This option introduces a bitset that is used to check if a block has been validated before or not. A block can be validated more than once as there is no thread protection for the bitset. These changes were developed and tested on entry-level Android Go devices. Bug: 72664474 Change-Id: Ie5f1ffda93c7f48e95b90ca80fe3f896c11f7baf Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 843f38d382b1ca2f6f4ae2ef7c35933e6319ffbb) Signed-off-by: Patrik Torstensson <totte@google.com>	2018-04-23 14:36:05 +00:00
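The 'check_at_most_once' idea in the commit above, a bitset that records which data blocks have already been validated, might be sketched as follows. This is a simplified Python illustration, not the kernel code: the class `VerifyOnce`, the flat list of SHA-256 digests (instead of a hash tree) and the `hash_calls` counter are assumptions made for the example.

```python
import hashlib


class VerifyOnce:
    """Hypothetical model: verify each data block only on its first read."""

    def __init__(self, expected_digests):
        self.expected = list(expected_digests)
        # One bit per data block, all initially "not yet validated".
        self.validated = bytearray((len(self.expected) + 7) // 8)
        self.hash_calls = 0

    def _is_validated(self, block):
        return bool(self.validated[block >> 3] & (1 << (block & 7)))

    def _mark_validated(self, block):
        self.validated[block >> 3] |= 1 << (block & 7)

    def read(self, block, data):
        if self._is_validated(block):
            # Already checked once: skip hashing, so later (online)
            # tampering of this block goes undetected.
            return data
        self.hash_calls += 1
        if hashlib.sha256(data).digest() != self.expected[block]:
            raise IOError(f"block {block} failed verification")
        self._mark_validated(block)
        return data
```

The second read of a validated block returns without hashing, which is the trade-off the commit message describes: less CPU work, but only tampering present at first read is caught.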
Will Drewry	0cc78aab28	CHROMIUM: dm: boot time specification of dm= This is a wrap-up of three patches pending upstream approval. I'm bundling them because they are interdependent, and it'll be easier to drop it on rebase later. 1. dm: allow a dm-fs-style device to be shared via dm-ioctl Integrates feedback from Alasdair, Mike, and Kiyoshi. Two main changes occur here: - One function is added which allows for a programmatically created mapped device to be inserted into the dm-ioctl hash table. This binds the device to a name and, optionally, a uuid which is needed by udev and allows for userspace management of the mapped device. - dm_table_complete() was extended to handle all of the final functional changes required for the table to be operational once called. 2. init: boot to device-mapper targets without an initr* Add a dm= kernel parameter modeled after the md= parameter from do_mounts_md. It allows for device-mapper targets to be configured at boot time for use early in the boot process (as the root device or otherwise). It also replaces /dev/XXX calls with major:minor opportunistically. The format is dm="name uuid ro,table line 1,table line 2,...". The parser expects the comma to be safe to use as a newline substitute but, otherwise, uses the normal separator of space. Some attempt has been made to make it forgiving of additional spaces (using skip_spaces()). A mapped device created during boot will be assigned a minor of 0 and may be accessed via /dev/dm-0. An example dm-linear root with no uuid may look like: root=/dev/dm-0 dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dev/ubdc 0" Once udev is started, /dev/dm-0 will become /dev/mapper/lroot. 
Older upstream threads: http://marc.info/?l=dm-devel&m=127429492521964&w=2 http://marc.info/?l=dm-devel&m=127429499422096&w=2 http://marc.info/?l=dm-devel&m=127429493922000&w=2 Latest upstream threads: https://patchwork.kernel.org/patch/104859/ https://patchwork.kernel.org/patch/104860/ https://patchwork.kernel.org/patch/104861/ Bug: 27175947 Signed-off-by: Will Drewry <wad@chromium.org> Review URL: http://codereview.chromium.org/2020011 Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31 [AmitP: Refactored the original changes based on upstream changes, commit e52347bd66f6 ("Documentation/admin-guide: split the kernel parameter list to a separate file")] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2017-12-18 21:11:22 +05:30
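The dm= format quoted in the commit above ("name uuid ro,table line 1,table line 2,...") can be illustrated with a small parser. This is a Python sketch of the format only, not the kernel's parser: the function name `parse_dm_param`, the returned dictionary layout, and treating the literal `none` as "no uuid" are assumptions drawn from the example in the message.

```python
def parse_dm_param(value):
    """Split a dm= value into device header and table lines (illustrative)."""
    # Commas stand in for newlines; the first field is the device header.
    head, *table_lines = value.split(",")
    name, uuid, mode = head.split()

    table = []
    for line in table_lines:
        # str.split() with no argument tolerates extra spaces.
        start, length, target, *args = line.split()
        table.append({
            "start": int(start),    # first sector of the segment
            "length": int(length),  # segment length in sectors
            "target": target,       # e.g. "linear"
            "args": args,           # target-specific arguments
        })

    return {
        "name": name,
        "uuid": None if uuid == "none" else uuid,
        "read_only": mode == "ro",
        "table": table,
    }


example = parse_dm_param(
    "lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dev/ubdc 0"
)
```

Here `example["table"]` holds two linear segments of 4096 sectors each, which together form the 8192-sector device that would appear as /dev/dm-0.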
Jonathan Brassow	41dcf197ad	dm raid: fix incorrect status output at the end of a "recover" process There are three important fields that indicate the overall health and status of an array: dev_health, sync_ratio, and sync_action. They tell us the condition of the devices in the array, and the degree to which the array is synchronized. This commit fixes a condition that is reported incorrectly. When a member of the array is being rebuilt or a new device is added, the "recover" process is used to synchronize it with the rest of the array. When the process is complete, but the sync thread hasn't yet been reaped, it is possible for the state of MD to be: mddev->recovery = [ MD_RECOVERY_RUNNING MD_RECOVERY_RECOVER MD_RECOVERY_DONE ] curr_resync_completed = <max dev size> (but not MaxSector) and all rdevs to be In_sync. This causes the 'array_in_sync' output parameter that is passed to rs_get_progress() to be computed incorrectly and reported as 'false' -- or not in-sync. This in turn causes the dev_health status characters to be reported as all 'a', rather than the proper 'A'. This can cause erroneous output for several seconds at a time when tools will want to be checking the condition due to events that are raised at the end of a sync process. Fix this by properly calculating the 'array_in_sync' return parameter in rs_get_progress(). Also, remove an unnecessary intermediate 'recovery_cp' variable in rs_get_progress(). Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-10-05 16:21:30 -04:00
Heinz Mauelshagen	ac6a318888	dm raid: bump target version Bump dm-raid target version to 1.12.1 to reflect that commit cc27b0c78c ("md: fix deadlock between mddev_suspend() and md_write_start()") is available. This version change allows userspace to detect that the MD fix is available. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-07-25 14:54:20 -04:00
Damien Le Moal	3b1a94c88b	dm zoned: drive-managed zoned block device target The dm-zoned device mapper target provides transparent write access to zoned block devices (ZBC and ZAC compliant block devices). dm-zoned hides from the device user (a file system or an application doing raw block device accesses) any constraint imposed on write requests by the device, equivalent to a drive-managed zoned block device model. Write requests are processed using a combination of on-disk buffering using the device conventional zones and direct in-place processing for requests aligned to a zone sequential write pointer position. A background reclaim process implemented using dm_kcopyd_copy ensures that conventional zones are always available for executing unaligned write requests. The reclaim process overhead is minimized by managing buffer zones in a least-recently-written order and first targeting the oldest buffer zones. Doing so, blocks under regular write access (such as metadata blocks of a file system) remain stored in conventional zones, resulting in no apparent overhead. The dm-zoned implementation focuses on simplicity and on minimizing overhead (CPU, memory and storage overhead). For a 14TB host-managed disk with 256 MB zones, dm-zoned memory usage per disk instance is at most about 3 MB and as little as 5 zones will be used internally for storing metadata and performing buffer zone reclaim operations. This is achieved using zone level indirection rather than a full block indirection system for managing block movement between zones. The primary target of dm-zoned is host-managed zoned block devices but it can also be used with host-aware device models to mitigate potential device-side performance degradation due to excessive random writing. 
Zoned block devices can be formatted and checked for use with the dm-zoned target using the dmzadm utility available at: https://github.com/hgst/dm-zoned-tools Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer partly refactored Damien's original work to cleanup the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-06-19 11:05:20 -04:00
Linus Torvalds	d35a878ae1	Merge tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - A major update for DM cache that reduces the latency for deciding whether blocks should migrate to/from the cache. The bio-prison-v2 interface supports this improvement by enabling direct dispatch of work to workqueues rather than having to delay the actual work dispatch to the DM cache core. So the dm-cache policies are much more nimble by being able to drive IO as they see fit. One immediate benefit from the improved latency is a cache that should be much more adaptive to changing workloads. - Add a new DM integrity target that emulates a block device that has additional per-sector tags that can be used for storing integrity information. - Add a new authenticated encryption feature to the DM crypt target that builds on the capabilities provided by the DM integrity target. - Add MD interface for switching the raid4/5/6 journal mode and update the DM raid target to use it to enable raid4/5/6 journal write-back support. - Switch the DM verity target over to using the asynchronous hash crypto API (this helps work better with architectures that have access to off-CPU algorithm providers, which should reduce CPU utilization). - Various request-based DM and DM multipath fixes and improvements from Bart and Christoph. - A DM thinp target fix for a bio structure leak that occurs for each discard IFF discard passdown is enabled. 
- A fix for a possible deadlock in DM bufio and a fix to re-check the new buffer allocation watermark in the face of competing admin changes to the 'max_cache_size_bytes' tunable. - A couple DM core cleanups. * tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (50 commits) dm bufio: check new buffer allocation watermark every 30 seconds dm bufio: avoid a possible ABBA deadlock dm mpath: make it easier to detect unintended I/O request flushes dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit() dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH dm: introduce enum dm_queue_mode to cleanup related code dm mpath: verify __pg_init_all_paths locking assumptions at runtime dm: verify suspend_locking assumptions at runtime dm block manager: remove an unused argument from dm_block_manager_create() dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue() dm mpath: delay requeuing while path initialization is in progress dm mpath: avoid that path removal can trigger an infinite loop dm mpath: split and rename activate_path() to prepare for its expanded use dm ioctl: prevent stack leak in dm ioctl call dm integrity: use previously calculated log2 of sectors_per_block dm integrity: use hex2bin instead of open-coded variant dm crypt: replace custom implementation of hex2bin() dm crypt: remove obsolete references to per-CPU state dm verity: switch to using asynchronous hash crypto API dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues ...	2017-05-03 10:31:20 -07:00
Mikulas Patocka	9d609f85b7	dm integrity: support larger block sizes The DM integrity block size can now be 512, 1k, 2k or 4k. Using larger blocks reduces metadata handling overhead. The block size can be configured at table load time using the "block_size:<value>" option, where <value> is expressed in bytes (default is still 512 bytes). It is safe to use larger block sizes with DM integrity, because the DM integrity journal makes sure that the whole block is updated atomically even if the underlying device doesn't support atomic writes of that size (e.g. 4k block on top of a 512b device). Depends-on: 2859323e ("block: fix blk_integrity_register to use template's interval_exp if not 0") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-04-24 12:04:33 -04:00
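The block size rule in the commit above (512, 1k, 2k or 4k bytes, passed as "block_size:<value>", default 512) can be expressed as a small check. This Python sketch only illustrates the rule as stated in the message; the helper names `block_size_option` and `sectors_per_block` are hypothetical and are not part of any dm-integrity tooling.

```python
SECTOR_SIZE = 512  # bytes per sector, as used throughout device mapper tables
ALLOWED_BLOCK_SIZES = (512, 1024, 2048, 4096)


def block_size_option(block_size=512):
    """Return the table option string for a permitted block size."""
    if block_size not in ALLOWED_BLOCK_SIZES:
        raise ValueError(f"unsupported block size: {block_size}")
    return f"block_size:{block_size}"


def sectors_per_block(block_size):
    """Number of 512-byte sectors covered by one integrity block."""
    block_size_option(block_size)  # reuse the validation above
    return block_size // SECTOR_SIZE
```

A 4k block spans 8 sectors, so one integrity tag covers eight times as much data as with the 512-byte default, which is where the reduced metadata overhead comes from.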
Mikulas Patocka	56b67a4f29	dm integrity: various small changes and cleanups Some coding style changes. Fix a bug that the array test_tag has insufficient size if the digest size of internal hash is bigger than the tag size. The function __fls is undefined for zero argument, this patch fixes undefined behavior if the user sets zero interleave_sectors. Fix the limit of optional arguments to 8. Don't allocate crypt_data on the stack to avoid a BUG with debug kernel. Rename all optional argument names to have underscores rather than dashes. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-04-24 12:04:32 -04:00
Heinz Mauelshagen	6e53636fe8	dm raid: add raid4/5/6 journal write-back support via journal_mode option Commit 63c32ed4afc ("dm raid: add raid4/5/6 journaling support") added journal support to close the raid4/5/6 "write hole" -- in terms of writethrough caching. Introduce a "journal_mode" feature and use the new r5c_journal_mode_set() API to add support for switching the journal device's cache mode between write-through (the current default) and write-back. NOTE: If the journal device is not layered on resilient storage and it fails, write-through mode will cause the "write hole" to reoccur. But if the journal fails while in write-back mode it will cause data loss for any dirty cache entries unless resilient storage is used for the journal. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-27 12:08:07 -04:00
Heinz Mauelshagen	4464e36e06	dm raid: fix table line argument order in status Commit 3a1c1ef2f ("dm raid: enhance status interface and fixup takeover/raid0") added new table line arguments and introduced an ordering flaw. The sequence of the raid10_copies and raid10_format raid parameters got reversed which causes lvm2 userspace to fail by falsely assuming a changed table line. Sequence those 2 parameters as before so that old lvm2 can function properly with new kernels by adjusting the table line output as documented in Documentation/device-mapper/dm-raid.txt. Also, add missing version 1.10.1 highlight to the documentation. Fixes: 3a1c1ef2f ("dm raid: enhance status interface and fixup takeover/raid0") Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-27 11:45:26 -04:00
Mikulas Patocka	c2bcb2b702	dm integrity: add recovery mode In recovery mode, we don't: - replay the journal - check checksums - allow writes to the device This mode can be used as a last resort for data recovery. The motivation for recovery mode is that when there is a single error in the journal, the user should not lose access to the whole device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-24 15:54:23 -04:00
Milan Broz	8f0009a225	dm crypt: optionally support larger encryption sector size Add optional "sector_size" parameter that specifies encryption sector size (atomic unit of block device encryption). The parameter can be in the range 512 - 4096 bytes and must be a power of two. For compatibility reasons, the maximal IO must fit into the page limit, so the limit is set to the minimal page size possible (4096 bytes). NOTE: this device cannot yet be handled by cryptsetup if this parameter is set. IV for the sector is calculated from the 512 bytes sector offset unless the iv_large_sectors option is used. Test script using dmsetup: DEV="/dev/sdb" DEV_SIZE=$(blockdev --getsz $DEV) KEY="9c1185a5c5e9fc54612808977ee8f548b2258d31ddadef707ba62c166051b9e3cd0294c27515f2bccee924e8823ca6e124b8fc3167ed478bca702babe4e130ac" BLOCK_SIZE=4096 # dmsetup create test_crypt --table "0 $DEV_SIZE crypt aes-xts-plain64 $KEY 0 $DEV 0 1 sector_size:$BLOCK_SIZE" # dmsetup table --showkeys test_crypt Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-24 15:54:21 -04:00
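The "sector_size" constraint in the commit above (512 to 4096 bytes, power of two) and the table line used in its test script can be sketched as follows. This Python helper only assembles the string that the script passes to `dmsetup create --table`; the function name `crypt_table` and its parameters are hypothetical, and nothing here calls dmsetup or touches a device.

```python
def crypt_table(dev, dev_size_sectors, key_hex,
                sector_size=None, cipher="aes-xts-plain64"):
    """Build a dm-crypt table line in the layout shown in the test script."""
    options = []
    if sector_size is not None:
        in_range = 512 <= sector_size <= 4096
        # A power of two has exactly one bit set.
        power_of_two = sector_size & (sector_size - 1) == 0
        if not (in_range and power_of_two):
            raise ValueError(f"invalid sector_size: {sector_size}")
        options.append(f"sector_size:{sector_size}")

    # <start> <length> crypt <cipher> <key> <iv_offset> <device> <offset>
    line = f"0 {dev_size_sectors} crypt {cipher} {key_hex} 0 {dev} 0"
    if options:
        # Optional parameters are preceded by their count.
        line += f" {len(options)} " + " ".join(options)
    return line
```

For example, `crypt_table("/dev/sdb", 2048, key, sector_size=4096)` yields a line ending in `0 /dev/sdb 0 1 sector_size:4096`, matching the shape of the table in the commit's script.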
Milan Broz	33d2f09fcb	dm crypt: introduce new format of cipher with "capi:" prefix For the new authenticated encryption we have to support generic composed modes (combination of encryption algorithm and authenticator) because this is how the kernel crypto API accesses such algorithms. To simplify the interface, we accept an algorithm directly in crypto API format. The new format is recognised by the "capi:" prefix. The dmcrypt internal IV specification is the same as for the old format. The crypto API cipher specifications format is: capi:cipher_api_spec-ivmode[:ivopts] Examples: capi:cbc(aes)-essiv:sha256 (equivalent to old aes-cbc-essiv:sha256) capi:xts(aes)-plain64 (equivalent to old aes-xts-plain64) Examples of authenticated modes: capi:gcm(aes)-random capi:authenc(hmac(sha256),xts(aes))-random capi:rfc7539(chacha20,poly1305)-random Authenticated modes can only be configured using the new cipher format. Note that this format allows the user to specify arbitrary combinations that can be insecure. (Policy decision is done in cryptsetup userspace.) Authenticated encryption algorithms can be of two types, either native modes (like GCM) that perform both encryption and authentication internally, or composed modes where the user can compose AEAD with separate specification of encryption algorithm and authenticator. For composed mode with HMAC (length-preserving encryption mode like an XTS and HMAC as an authenticator) we have to calculate HMAC digest size (the separate authentication key is the same size as the HMAC digest). Introduce crypt_ctr_auth_cipher() to parse the crypto API string to get HMAC algorithm and retrieve digest size from it. Also, for HMAC composed mode we need to parse the crypto API string to get the cipher mode nested in the specification. For native AEAD mode (like GCM), we can use crypto_tfm_alg_name() API to get the cipher specification. 
Because the HMAC composed mode is not processed the same as the native AEAD mode, the CRYPT_MODE_INTEGRITY_HMAC flag is no longer needed and "hmac" specification for the table integrity argument is removed. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-24 15:54:20 -04:00
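To make the `capi:cipher_api_spec-ivmode[:ivopts]` format concrete, here is a minimal parsing sketch. It is an assumption-laden illustration, not the kernel's parser: the function name `parse_capi` is hypothetical, and splitting at the last '-' is a choice made for this example.

```python
def parse_capi(spec):
    """Split 'capi:cipher_api_spec-ivmode[:ivopts]' into its three parts.

    Illustrative only; the kernel's own parsing may split differently.
    """
    prefix = "capi:"
    if not spec.startswith(prefix):
        raise ValueError("not a capi: cipher specification")
    body = spec[len(prefix):]

    # Assumption for this sketch: the IV part follows the last '-'.
    cipher_api, sep, iv = body.rpartition("-")
    if not sep or not cipher_api or not iv:
        raise ValueError("expected cipher_api_spec-ivmode[:ivopts]")

    # ivopts is optional and separated from ivmode by the first ':'.
    ivmode, _, ivopts = iv.partition(":")
    return cipher_api, ivmode, (ivopts or None)
```

For instance, `parse_capi("capi:cbc(aes)-essiv:sha256")` yields the crypto API name `cbc(aes)`, IV mode `essiv` and IV option `sha256`, matching the old-format equivalent `aes-cbc-essiv:sha256` given in the commit message.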
Milan Broz	ef43aa3806	dm crypt: add cryptographic data integrity protection (authenticated encryption) Allow the use of per-sector metadata, provided by the dm-integrity module, for integrity protection and persistently stored per-sector Initialization Vector (IV). The underlying device must support the "DM-DIF-EXT-TAG" dm-integrity profile. The per-bio integrity metadata is allocated by dm-crypt for every bio. Example of low-level mapping table for various types of use: DEV=/dev/sdb SIZE=417792 # Additional HMAC with CBC-ESSIV, key is concatenated encryption key + HMAC key SIZE_INT=389952 dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 32 J 0" dmsetup create y --table "0 $SIZE_INT crypt aes-cbc-essiv:sha256 \ 11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \ 00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff \ 0 /dev/mapper/x 0 1 integrity:32:hmac(sha256)" # AEAD (Authenticated Encryption with Additional Data) - GCM with random IVs # GCM in kernel uses 96bits IV and we store 128bits auth tag (so 28 bytes metadata space) SIZE_INT=393024 dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 28 J 0" dmsetup create y --table "0 $SIZE_INT crypt aes-gcm-random \ 11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \ 0 /dev/mapper/x 0 1 integrity:28:aead" # Random IV only for XTS mode (no integrity protection but provides atomic random sector change) SIZE_INT=401272 dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 16 J 0" dmsetup create y --table "0 $SIZE_INT crypt aes-xts-random \ 11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \ 0 /dev/mapper/x 0 1 integrity:16:none" # Random IV with XTS + HMAC integrity protection SIZE_INT=377656 dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 48 J 0" dmsetup create y --table "0 $SIZE_INT crypt aes-xts-random \ 11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \ 00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff \ 0 
/dev/mapper/x 0 1 integrity:48:hmac(sha256)" Both AEAD and HMAC protection authenticate not only data but also sector metadata. HMAC protection is implemented through the authenc wrapper (so it is processed the same way as an authenticated mode). In HMAC mode there are two keys (concatenated in dm-crypt mapping table). First is the encryption key and the second is the key for authentication (HMAC). (It is userspace decision if these keys are independent or somehow derived.) The sector request for AEAD/HMAC authenticated encryption looks like this: |----- AAD -------|------ DATA -------|-- AUTH TAG --| | (authenticated) | (auth+encryption) | | | sector_LE | IV | sector in/out | tag in/out | For writes, the integrity fields are calculated during AEAD encryption of every sector and stored in bio integrity fields and sent to underlying dm-integrity target for storage. For reads, the integrity metadata is verified during AEAD decryption of every sector (they are filled in by dm-integrity, but the integrity fields are pre-allocated in dm-crypt). There is also experimental support in the cryptsetup utility for more friendly configuration (part of LUKS2 format). Because the integrity fields are not valid on initial creation, the device must be "formatted". This can be done by direct-io writes to the device (e.g. dd in direct-io mode). For now, a trivial tool is available to do this, see: https://github.com/mbroz/dm_int_tools Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Vashek Matyas <matyas@fi.muni.cz> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-24 15:49:41 -04:00
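The per-sector metadata sizes used in the four mapping examples of this commit (32, 28, 16 and 48 bytes) follow from adding the stored IV size to the authentication tag size. The sketch below reproduces that arithmetic; the helper name `tag_bytes` is hypothetical, and the IV and tag sizes are the ones quoted in the commit message (96-bit GCM IV, 128-bit GCM tag, 16-byte random IV for XTS, SHA-256 digest for HMAC).

```python
import hashlib

def tag_bytes(stored_iv_bytes, auth_tag_bytes):
    """Per-sector metadata that dm-integrity must reserve (sketch)."""
    return stored_iv_bytes + auth_tag_bytes

HMAC_SHA256 = hashlib.sha256().digest_size  # 32 bytes

# CBC-ESSIV + HMAC: the IV is derived, not stored, so only the HMAC is kept.
cbc_essiv_hmac = tag_bytes(0, HMAC_SHA256)
# GCM with random IV: 96-bit IV plus 128-bit authentication tag.
gcm_random = tag_bytes(96 // 8, 128 // 8)
# XTS with random IV, no integrity: only the 16-byte IV is stored.
xts_random = tag_bytes(16, 0)
# XTS with random IV plus HMAC-SHA256.
xts_random_hmac = tag_bytes(16, HMAC_SHA256)

print(cbc_essiv_hmac, gcm_random, xts_random, xts_random_hmac)
```

These values correspond to the `integrity:32:hmac(sha256)`, `integrity:28:aead`, `integrity:16:none` and `integrity:48:hmac(sha256)` arguments in the table lines above.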
Mikulas Patocka	7eada909bf	dm: add integrity target The dm-integrity target emulates a block device that has additional per-sector tags that can be used for storing integrity information. A general problem with storing integrity tags with every sector is that writing the sector and the integrity tag must be atomic - i.e. in case of crash, either both sector and integrity tag or none of them is written. To guarantee write atomicity the dm-integrity target uses a journal. It writes sector data and integrity tags into a journal, commits the journal and then copies the data and integrity tags to their respective location. The dm-integrity target can be used with the dm-crypt target - in this situation the dm-crypt target creates the integrity data and passes them to the dm-integrity target via bio_integrity_payload attached to the bio. In this mode, the dm-crypt and dm-integrity targets provide authenticated disk encryption - if the attacker modifies the encrypted device, an I/O error is returned instead of random data. The dm-integrity target can also be used as a standalone target, in this mode it calculates and verifies the integrity tag internally. In this mode, the dm-integrity target can be used to detect silent data corruption on the disk or in the I/O path. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-03-24 15:49:07 -04:00
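The journal-based atomicity described in this commit (write data and tag to the journal, commit, then copy to the final location) can be illustrated with a toy in-memory model. Everything here is hypothetical: the dictionary layout, the function names and the `crash_at` switch exist only for this sketch and say nothing about dm-integrity's on-disk format.

```python
def new_device(sectors):
    """Toy device: parallel lists of sector data and integrity tags."""
    return {"data": [b""] * sectors, "tags": [b""] * sectors, "journal": None}

def replay(dev):
    """Apply a committed journal; discard an uncommitted one."""
    journal = dev["journal"]
    if journal and journal["committed"]:
        for sector, data, tag in journal["entries"]:
            dev["data"][sector] = data
            dev["tags"][sector] = tag
    dev["journal"] = None

def journaled_write(dev, sector, data, tag, crash_at=None):
    """Write data and tag atomically; crash_at simulates power loss."""
    dev["journal"] = {"entries": [(sector, data, tag)], "committed": False}
    if crash_at == "before_commit":
        return  # journal entry never became valid
    dev["journal"]["committed"] = True
    if crash_at == "after_commit":
        return  # final locations not yet updated
    replay(dev)  # normal path: copy to the final locations

dev = new_device(4)
journaled_write(dev, 1, b"old", b"t-old")
journaled_write(dev, 1, b"new", b"t-new", crash_at="before_commit")
replay(dev)  # recovery after the simulated crash
print(dev["data"][1], dev["tags"][1])
```

In this model a crash before the commit leaves both the old data and the old tag in place, while a crash after the commit is repaired by replaying the journal, so data and tag never disagree.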
sayli karnik	3f816bac24	Documentation: device-mapper: cache.txt: Fix typos Fix a spelling error (hexidecimal->hexadecimal). Signed-off-by: sayli karnik <karniksayli1995@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2017-03-19 09:16:07 -06:00
Masahiro Yamada	34dcaf40c1	scripts/spelling.txt: add "explictely" pattern and fix typo instances Fix typos and add the following to the scripts/spelling.txt: explictely||explicitly Link: http://lkml.kernel.org/r/1481573103-11329-25-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Joe Thornber	629d0a8a1a	dm cache metadata: add "metadata2" feature If "metadata2" is provided as a table argument when creating/loading a cache target a more compact metadata format, with separate dirty bits, is used. "metadata2" improves speed of shutting down a cache target. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-02-16 13:12:47 -05:00
Heinz Mauelshagen	63c32ed4af	dm raid: add raid4/5/6 journaling support Add md raid4/5/6 journaling support (upstream commit bac624f3f86a started the implementation) which closes the write hole (i.e. non-atomic updates to stripes) using a dedicated journal device. Background: raid4/5/6 stripes hold N data payloads per stripe plus one parity raid4/5 or two raid6 P/Q syndrome payloads in an in-memory stripe cache. Parity or P/Q syndromes used to recover any data payloads in case of a disk failure are calculated from the N data payloads and need to be updated on the different component devices of the raid device. Those are non-atomic, persistent updates. Hence a crash can cause failure to update all stripe payloads persistently and thus cause data loss during stripe recovery. This problem gets addressed by writing whole stripe cache entries (together with journal metadata) to a persistent journal entry on a dedicated journal device. Only if that journal entry is written successfully, the stripe cache entry is updated on the component devices of the raid device (i.e. writethrough type). In case of a crash, the entry can be recovered from the journal and be written again thus ensuring consistent stripe payload suitable to data recovery. Future dependencies: once writeback caching being worked on to compensate for the throughput implications involved with writethrough overhead is supported with journaling in upstream, an additional patch based on this one will support it in dm-raid. Journal resilience related remarks: because stripes are recovered from the journal in case of a crash, the journal device better be resilient. Resilience becomes mandatory with future writeback support, because losing the working set in the log means data loss as opposed to writethrough, where the loss of the journal device 'only' reintroduces the write hole. Fix comment on data offsets in parse_dev_params() and initialize new_data_offset as well. 
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-01-25 12:49:06 +01:00
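The write hole mentioned in this commit can be shown with plain XOR parity, as used by raid4/5. The sketch below is a toy with two data payloads and one parity payload; the helper name `xor_bytes` and the sample values are made up for illustration and do not reflect md's stripe layout.

```python
from functools import reduce

def xor_bytes(*chunks):
    """XOR equally sized byte strings column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

# A consistent stripe: parity is the XOR of all data payloads.
d0, d1 = b"\x01\x02", b"\x10\x20"
parity = xor_bytes(d0, d1)

# Any single lost payload can be rebuilt from the others plus parity.
assert xor_bytes(d0, parity) == d1

# Write hole: d0 is rewritten, then a crash prevents the parity update.
new_d0 = b"\xff\xff"
stale_parity = parity

# If the disk holding d1 now fails, rebuilding from the inconsistent
# stripe returns wrong data for a payload that was never written to.
rebuilt_d1 = xor_bytes(new_d0, stale_parity)
print(rebuilt_d1 == d1)
```

Writing the whole stripe to a journal first, as the commit describes, lets recovery rewrite both the data and the parity so that they agree again.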
Heinz Mauelshagen	c63ede3b42	dm raid: fix transient device failure processing This fix addresses the following 3 failure scenarios: 1) If a (transiently) inaccessible metadata device is being passed into the constructor (e.g. a device tuple '254:4 254:5'), it is processed as if '- -' was given. This erroneously results in a status table line containing '- -', which mistakenly differs from what has been passed in. As a result, userspace libdevmapper puts the device tuple separate from the RAID device thus not processing the dependencies properly. 2) False health status char 'A' instead of 'D' is emitted on the status info line for the meta/data device tuple in this metadata device failure case. 3) If the metadata device is accessible when passed into the constructor but the data device (partially) isn't, that leg may be set faulty by the raid personality on access to the (partially) unavailable leg. Restore tried in a second raid device resume on such failed leg (status char 'D') fails after the (partial) leg returned. Fixes for aforementioned failure scenarios: - don't release passed in devices in the constructor thus allowing the status table line to e.g. contain '254:4 254:5' rather than '- -' - emit device status char 'D' rather than 'A' for the device tuple with the failed metadata device on the status info line - when attempting to restore faulty devices in a second resume, allow the device hot remove function to succeed by setting the device to not in-sync In case userspace intentionally passes '- -' into the constructor to avoid that device tuple (e.g. to split off a raid1 leg temporarily for later re-addition), the status table line will correctly show '- -' and the status info line will provide a '-' device health character for the non-defined device tuple. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-01-25 12:49:06 +01:00
Linus Torvalds	a9042defa2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: NTB: correct ntb_spad_count comment typo misc: ibmasm: fix typo in error message Remove references to dead make variable LINUX_INCLUDE Remove last traces of ikconfig.h treewide: Fix printk() message errors Documentation/device-mapper: s/getsize/getsz/	2016-12-14 11:12:25 -08:00
Michael Witten	95f21c5c6d	Documentation/device-mapper: s/getsize/getsz/ According to `man blockdev': --getsize Print device size (32-bit!) in sectors. Deprecated in favor of the --getsz option. ... --getsz Get size in 512-byte sectors. Hence, occurrences of `--getsize' should be replaced with `--getsz', which this commit has achieved as follows: $ cd "$repo" $ git grep -l -e --getsz Documentation/device-mapper/delay.txt Documentation/device-mapper/dm-crypt.txt Documentation/device-mapper/linear.txt Documentation/device-mapper/log-writes.txt Documentation/device-mapper/striped.txt Documentation/device-mapper/switch.txt $ cd Documentation/device-mapper $ sed -i s/getsize/getsz/g * Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-12-14 10:54:27 +01:00
Heinz Mauelshagen	58fc4fedee	Documentation: dm raid: define data_offset status field Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-12-08 14:13:13 -05:00
Ondrej Kozina	c538f6ec9f	dm crypt: add ability to use keys from the kernel key retention service The kernel key service is a generic way to store keys for the use of other subsystems. Currently there is no way to use kernel keys in dm-crypt. This patch aims to fix that. Instead of a key, userspace may pass a key description with a preceding ':'. So the message that constructs the encryption mapping now looks like this: <cipher> [<key>|:<key_string>] <iv_offset> <dev_path> <start> [<#opt_params> <opt_params>] where <key_string> is in format: <key_size>:<key_type>:<key_description> Currently we only support two elementary key types: 'user' and 'logon'. Keys may be loaded in dm-crypt either via <key_string> or using the classical method of passing the key in hex representation directly. A dm-crypt device initialised with a key passed in hex representation may have it replaced with a key passed in key_string format and vice versa. (Based on original work by Andrey Ryabinin) Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-12-08 14:13:09 -05:00
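The two accepted forms of the key field in this commit (a hex key, or ':' followed by `<key_size>:<key_type>:<key_description>`) can be told apart with a few lines of parsing. This is a hedged sketch: `parse_key_field` and the returned dictionary layout are hypothetical, and the assumption that a key description may itself contain ':' is made for this example only.

```python
def parse_key_field(field):
    """Classify a dm-crypt key field as hex key or keyring reference."""
    if field.startswith(":"):
        # Keyring form: <key_size>:<key_type>:<key_description>.
        # maxsplit=2 keeps any further ':' inside the description
        # (an assumption of this sketch).
        size, key_type, description = field[1:].split(":", 2)
        if key_type not in ("user", "logon"):
            raise ValueError("only 'user' and 'logon' key types are supported")
        return {"kind": "keyring", "key_size": int(size),
                "key_type": key_type, "description": description}
    # Classical form: the key itself in hex representation.
    return {"kind": "hex", "key": bytes.fromhex(field)}
```

A field such as `:32:logon:my_prefix:my_key` (a made-up description) would be treated as a 32-byte 'logon' key looked up in the kernel keyring, while a field without the leading ':' is decoded as a hex key.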
Masanari Iida	bb1423a96f	dm raid: fix typos in Documentation/device-mapper/dm-raid.txt Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-11-21 09:52:04 -05:00
Heinz Mauelshagen	b052b07c39	dm raid: fix activation of existing raid4/10 devices dm-raid 1.9.0 fails to activate existing RAID4/10 devices that have the old superblock format (which does not have takeover/reshaping support that was added via commit 33e53f06850f). Fix validation path for old superblocks by reverting to the old raid4 layout and basing checks on mddev->new_{level,layout,...} members in super_init_validation(). Cc: stable@vger.kernel.org # 4.8 Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-10-17 16:41:31 -04:00
Jens Axboe	1eff9d322a	block: rename bio bi_rw to bi_opf Since commit 63a4cc24867d, bio->bi_rw contains flags in the lower portion and the op code in the higher portions. This means that old code that relies on manually setting bi_rw is most likely going to be broken. Instead of letting that brokenness linger, rename the member, to force old and out-of-tree code to break at compile time instead of at runtime. No intended functional changes in this commit. Signed-off-by: Jens Axboe <axboe@fb.com>	2016-08-07 14:41:02 -06:00
Heinz Mauelshagen	d41bfed091	dm raid: update Documentation about reshaping/takeover/additonal RAID types Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-06-14 18:52:12 -04:00
Mike Christie	28a8f0d317	block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH To avoid confusion between REQ_OP_FLUSH, which is handled by request_fn drivers, and upper layers requesting the block layer perform a flush sequence along with possibly a WRITE, this patch renames REQ_FLUSH to REQ_PREFLUSH. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-06-07 13:41:38 -06:00
Eric Engestrom	52813d4046	dm stats: fix spelling mistake in Documentation Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-05-05 15:25:54 -04:00
Mike Snitzer	492d48db8d	dm cache: update cache-policies.txt now that mq is an alias for smq Also fix some typos and make all "smq" and "mq" references consistently lowercase. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-05-05 15:25:53 -04:00
Joe Thornber	9ed84698fd	dm cache: make the 'mq' policy an alias for 'smq' smq seems to be performing better than the old mq policy in all situations, as well as using a quarter of the memory. Make 'mq' an alias for 'smq' when choosing a cache policy. The tunables that were present for the old mq are faked, and have no effect. mq should be considered deprecated now. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2016-03-10 17:12:08 -05:00
Sami Tolvanen	0cc37c2df4	dm verity: add ignore_zero_blocks feature If ignore_zero_blocks is enabled dm-verity will return zeroes for blocks matching a zero hash without validating the content. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-12-10 10:39:03 -05:00
Sami Tolvanen	a739ff3f54	dm verity: add support for forward error correction Add support for correcting corrupted blocks using Reed-Solomon. This code uses RS(255, N) interleaved across data and hash blocks. Each error-correcting block covers N bytes evenly distributed across the combined total data, so that each byte is a maximum distance away from the others. This makes it possible to recover from several consecutive corrupted blocks with relatively small space overhead. In addition, using verity hashes to locate erasures nearly doubles the effectiveness of error correction. Being able to detect corrupted blocks also improves performance, because only corrupted blocks need to be corrected. For a 2 GiB partition, RS(255, 253) (two parity bytes for each 253-byte block) can correct up to 16 MiB of consecutive corrupted blocks if erasures can be located, and 8 MiB if they cannot, with 16 MiB space overhead. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-12-10 10:39:03 -05:00
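The 2 GiB figures quoted in this commit can be checked with a short calculation. The sketch below assumes ideal interleaving, so that a consecutive corrupted run is spread evenly over all codewords, and uses the standard Reed-Solomon property that r parity bytes correct r erasures or r/2 errors at unknown positions per codeword. The function name `fec_budget` is hypothetical.

```python
def fec_budget(data_bytes, rs_n=255, rs_k=253):
    """Return (overhead, max_with_erasures, max_without) in bytes."""
    parity_per_codeword = rs_n - rs_k
    # Number of rs_k-byte codewords needed to cover the data (ceiling).
    codewords = -(-data_bytes // rs_k)
    overhead = codewords * parity_per_codeword
    # With located erasures every parity byte repairs one data byte.
    with_erasures = codewords * parity_per_codeword
    # Without locations, two parity bytes are needed per corrected byte.
    without_erasures = codewords * (parity_per_codeword // 2)
    return overhead, with_erasures, without_erasures

MIB = 1 << 20
overhead, located, unlocated = fec_budget(2 << 30)
print(overhead / MIB, located / MIB, unlocated / MIB)
```

Under these assumptions the results round to the 16 MiB overhead, 16 MiB correctable with located erasures, and 8 MiB correctable without them that the commit message states.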
Linus Torvalds	e0700ce709	Merge tag 'dm-4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: "Smaller set of DM changes for this merge. I've based these changes on Jens' for-4.4/reservations branch because the associated DM changes required it. - Revert a dm-multipath change that caused a regression for unprivileged users (e.g. kvm guests) that issued ioctls when a multipath device had no available paths. - Include Christoph's refactoring of DM's ioctl handling and add support for passing through persistent reservations with DM multipath. - All other changes are very simple cleanups" * tag 'dm-4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm switch: simplify conditional in alloc_region_table() dm delay: document that offsets are specified in sectors dm delay: capitalize the start of an delay_ctr() error message dm delay: Use DM_MAPIO macros instead of open-coded equivalents dm linear: remove redundant target name from error messages dm persistent data: eliminate unnecessary return values dm: eliminate unused "bioset" process for each bio-based DM device dm: convert ffs to __ffs dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() dm: add support for passing through persistent reservations dm: refactor ioctl handling Revert "dm mpath: fix stalls when handling invalid ioctls" dm: initialize non-blk-mq queue data before queue is used	2015-11-04 21:19:53 -08:00
Tomohiro Kusumi	f49e869a61	dm delay: document that offsets are specified in sectors Only delay params are mentioned in delay.txt. Mention offsets just like documents for linear and flakey do. Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-10-31 19:06:05 -04:00
Mike Snitzer	b0d3cc011e	dm snapshot: add new persistent store option to support overflow Commit 76c44f6d80 introduced the possibility for "Overflow" to be reported by the snapshot device's status. Older userspace (e.g. lvm2) does not handle the "Overflow" status response. Fix this incompatibility by requiring newer userspace code, that can cope with "Overflow", request the persistent store with overflow support by using "PO" (Persistent with Overflow) for the snapshot store type. Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Fixes: 76c44f6d80 ("dm snapshot: don't invalidate on-disk image on snapshot write overflow") Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-10-09 16:57:03 -04:00
Heinz Mauelshagen	f15f4d7200	dm raid: document RAID 4/5/6 discard support For RAID 4/5/6 data integrity reasons 'discard_zeroes_data' must work properly. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-08-31 15:05:31 -04:00
Mikulas Patocka	bd49784fd1	dm stats: report precise_timestamps and histogram in @stats_list output If the user selected the precise_timestamps or histogram options, report it in the @stats_list message output. If the user didn't select these options, no extra tokens are reported, thus it is backward compatible with old software that doesn't know about precise timestamps and histogram. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 4.2	2015-08-18 17:20:03 -04:00
Mike Snitzer	255eac2005	dm cache: display 'needs_check' in status if it is set There is currently no way to see that the needs_check flag has been set in the metadata. Display 'needs_check' in the cache status if it is set in the cache metadata. Also, update cache documentation. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-07-16 10:23:50 -04:00
Mike Snitzer	e4c78e210d	dm thin: display 'needs_check' in status if it is set There is currently no way to see that the needs_check flag has been set in the metadata. Display 'needs_check' in the thin-pool status if it is set in the thinp metadata. Also, update thinp documentation. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-07-16 10:23:50 -04:00
Mikulas Patocka	dfcfac3e4c	dm stats: collect and report histogram of IO latencies Add an option to dm statistics to collect and report a histogram of IO latencies. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-06-17 12:40:40 -04:00
Mikulas Patocka	c96aec344d	dm stats: support precise timestamps Make it possible to use precise timestamps with nanosecond granularity in dm statistics. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-06-17 12:40:40 -04:00
Mike Snitzer	bccab6a01a	dm cache: switch the "default" cache replacement policy from mq to smq The Stochastic multiqueue (SMQ) policy (vs MQ) offers the promise of less memory utilization, improved performance and increased adaptability in the face of changing workloads. SMQ also does not have any cumbersome tuning knobs. Users may switch from "mq" to "smq" simply by appropriately reloading a DM table that is using the cache target. Doing so will cause all of the mq policy's hints to be dropped. Also, performance of the cache may degrade slightly until smq recalculates the origin device's hotspots that should be cached. In the future the "mq" policy will just silently make use of "smq" and the mq code will be removed. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com>	2015-06-17 12:40:38 -04:00
Joe Thornber	028ae9f76f	dm cache: add fail io mode and needs_check flag If a cache metadata operation fails (e.g. transaction commit) the cache's metadata device will abort the current transaction, set a new needs_check flag, and the cache will transition to "read-only" mode. If aborting the transaction or setting the needs_check flag fails the cache will transition to "fail-io" mode. Once needs_check is set the cache device will not be allowed to activate. Activation requires write access to metadata. Future work is needed to add proper support for running the cache in read-only mode. Once in fail-io mode the cache will report a status of "Fail". Also, add commit() wrapper that will disallow commits if in read_only or fail mode. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-06-11 17:13:00 -04:00
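The mode transitions described in this commit can be summarised as a small decision function. This is a sketch of the stated rules only: the names `mode_after_metadata_failure` and `commit_allowed` and the boolean arguments are hypothetical, not the kernel's API.

```python
def mode_after_metadata_failure(abort_ok, set_needs_check_ok):
    """Mode the cache enters after a failed metadata operation (sketch).

    abort_ok: aborting the current transaction succeeded.
    set_needs_check_ok: setting the needs_check flag succeeded.
    """
    if abort_ok and set_needs_check_ok:
        return "read-only"
    # Failure of either recovery step escalates to fail-io mode.
    return "fail-io"

def commit_allowed(mode):
    """Commits are disallowed in read-only and fail-io modes."""
    return mode not in ("read-only", "fail-io")

print(mode_after_metadata_failure(True, True))
print(mode_after_metadata_failure(True, False))
```

Per the commit message, a cache in fail-io mode reports a status of "Fail", and a cache with needs_check set is not allowed to activate.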
Heinz Mauelshagen	0cf4503174	dm raid: add support for the MD RAID0 personality Add dm-raid access to the MD RAID0 personality to enable single zone striping. The following changes enable that access: - add type definition to raid_types array - make bitmap creation conditional in super_validate(), because bitmaps are not allowed in raid0 - set rdev->sectors to the data image size in super_validate() to allow the raid0 personality to calculate the MD array size properly - use mddev_(un)lock() functions instead of direct mutex_(un)lock() (wrapped in here because it's a trivial change) - enhance raid_status() to always report full sync for raid0 so that userspace checks for 100% sync will succeed and allow for resize (and takeover/reshape once added in future patches) - enhance raid_resume() to not load bitmap in case of raid0 - add merge function to avoid data corruption (seen with readahead) that resulted from bio payloads that grew too large. This problem did not occur with the other raid levels because it either did not apply without striping (raid1) or was avoided via stripe caching. - raise version to 1.7.0 because of the raid0 API change Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:00 -04:00
Heinz Mauelshagen	0f4106b32f	dm raid: fixup documentation for discard support Remove comment above parse_raid_params() that claims "devices_handle_discard_safely" is a table line argument when it actually is a module parameter. Also, backfill dm-raid target version 1.6.0 documentation. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:59 -04:00

134 Commits