msm-4.14/drivers/md/dm-table.c

2104 lines
51 KiB
C
Raw Normal View History

/*
* Copyright (C) 2001 Sistina Software (UK) Limited.
dm table: rework reference counting Rework table reference counting. The existing code uses a reference counter. When the last reference is dropped and the counter reaches zero, the table destructor is called. Table reference counters are acquired/released from upcalls from other kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all). If the reference counter reaches zero in one of the upcalls, the table destructor is called from almost random kernel code. This leads to various problems: * dm_any_congested being called under a spinlock, which calls the destructor, which calls some sleeping function. * the destructor attempting to take a lock that is already taken by the same process. * stale reference from some other kernel code keeps the table constructed, which keeps some devices open, even after successful return from "dmsetup remove". This can confuse lvm and prevent closing of underlying devices or reusing device minor numbers. The patch changes reference counting so that the table destructor can be called only at predetermined places. The table has always exactly one reference from either mapped_device->map or hash_cell->new_map. After this patch, this reference is not counted in table->holders. A pair of dm_create_table/dm_destroy_table functions is used for table creation/destruction. Temporary references from the other code increase table->holders. A pair of dm_table_get/dm_table_put functions is used to manipulate it. When the table is about to be destroyed, we wait for table->holders to reach 0. Then, we call the table destructor. We use active waiting with msleep(1), because the situation happens rarely (to one user in 5 years) and removing the device isn't performance-critical task: the user doesn't care if it takes one tick more or not. This way, the destructor is called only at specific points (dm_table_destroy function) and the above problems associated with lazy destruction can't happen. Finally remove the temporary protection added to dm_any_congested(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-01-06 03:05:10 +00:00
* Copyright (C) 2004-2008 Red Hat, Inc. All rights reserved.
*
* This file is released under the GPL.
*/
#include "dm-core.h"
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/blkdev.h>
#include <linux/namei.h>
CHROMIUM: dm: boot time specification of dm= This is a wrap-up of three patches pending upstream approval. I'm bundling them because they are interdependent, and it'll be easier to drop it on rebase later. 1. dm: allow a dm-fs-style device to be shared via dm-ioctl Integrates feedback from Alisdair, Mike, and Kiyoshi. Two main changes occur here: - One function is added which allows for a programmatically created mapped device to be inserted into the dm-ioctl hash table. This binds the device to a name and, optional, uuid which is needed by udev and allows for userspace management of the mapped device. - dm_table_complete() was extended to handle all of the final functional changes required for the table to be operational once called. 2. init: boot to device-mapper targets without an initr* Add a dm= kernel parameter modeled after the md= parameter from do_mounts_md. It allows for device-mapper targets to be configured at boot time for use early in the boot process (as the root device or otherwise). It also replaces /dev/XXX calls with major:minor opportunistically. The format is dm="name uuid ro,table line 1,table line 2,...". The parser expects the comma to be safe to use as a newline substitute but, otherwise, uses the normal separator of space. Some attempt has been made to make it forgiving of additional spaces (using skip_spaces()). A mapped device created during boot will be assigned a minor of 0 and may be access via /dev/dm-0. An example dm-linear root with no uuid may look like: root=/dev/dm-0 dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dv/ubdc 0" Once udev is started, /dev/dm-0 will become /dev/mapper/lroot. Older upstream threads: http://marc.info/?l=dm-devel&m=127429492521964&w=2 http://marc.info/?l=dm-devel&m=127429499422096&w=2 http://marc.info/?l=dm-devel&m=127429493922000&w=2 Latest upstream threads: https://patchwork.kernel.org/patch/104859/ https://patchwork.kernel.org/patch/104860/ https://patchwork.kernel.org/patch/104861/ Bug: 27175947 Signed-off-by: Will Drewry <wad@chromium.org> Review URL: http://codereview.chromium.org/2020011 Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31 [AmitP: Refactored the original changes based on upstream changes, commit e52347bd66f6 ("Documentation/admin-guide: split the kernel parameter list to a separate file")] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2010-06-09 17:47:38 -05:00
#include <linux/mount.h>
#include <linux/ctype.h>
tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-14 18:01:06 -08:00
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/interrupt.h>
#include <linux/mutex.h>
dm table: rework reference counting Rework table reference counting. The existing code uses a reference counter. When the last reference is dropped and the counter reaches zero, the table destructor is called. Table reference counters are acquired/released from upcalls from other kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all). If the reference counter reaches zero in one of the upcalls, the table destructor is called from almost random kernel code. This leads to various problems: * dm_any_congested being called under a spinlock, which calls the destructor, which calls some sleeping function. * the destructor attempting to take a lock that is already taken by the same process. * stale reference from some other kernel code keeps the table constructed, which keeps some devices open, even after successful return from "dmsetup remove". This can confuse lvm and prevent closing of underlying devices or reusing device minor numbers. The patch changes reference counting so that the table destructor can be called only at predetermined places. The table has always exactly one reference from either mapped_device->map or hash_cell->new_map. After this patch, this reference is not counted in table->holders. A pair of dm_create_table/dm_destroy_table functions is used for table creation/destruction. Temporary references from the other code increase table->holders. A pair of dm_table_get/dm_table_put functions is used to manipulate it. When the table is about to be destroyed, we wait for table->holders to reach 0. Then, we call the table destructor. We use active waiting with msleep(1), because the situation happens rarely (to one user in 5 years) and removing the device isn't performance-critical task: the user doesn't care if it takes one tick more or not. This way, the destructor is called only at specific points (dm_table_destroy function) and the above problems associated with lazy destruction can't happen. Finally remove the temporary protection added to dm_any_congested(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-01-06 03:05:10 +00:00
#include <linux/delay.h>
#include <linux/atomic.h>
#include <linux/blk-mq.h>
dm table: fall back to getting device using name_to_dev_t() If a device is used as the root filesystem, it can't be built off of devices which are within the root filesystem (just like command line arguments to root=). For this reason, Linux has a pseudo-filesystem for root= and MD initialization (based on the function name_to_dev_t) which handles different ways of specifying devices including PARTUUID and major:minor. Switch to using name_to_dev_t() in dm_get_device(). Rather than having DM assume that all things which are not major:minor are paths in an already-mounted filesystem, change dm_get_device() to first attempt to look up the device in the filesystem, and if not found it will fall back to using name_to_dev_t(). In terms of backwards compatibility, there are some cases where behavior will be different: - If you have a file in the current working directory named 1:2 and you initialze DM there, then it will try to use that file rather than the disk with that major:minor pair as a backing device. - Similarly for other bdev types which name_to_dev_t() knows how to interpret, the previous behavior was to repeatedly check for the existence of the file (e.g., while waiting for rootfs to come up) but the new behavior is to use the name_to_dev_t() interpretation. For example, if you have a file named /dev/ubiblock0_0 which is a symlink to /dev/sda3, but it is not yet present when DM starts to initialize, then the name_to_dev_t() interpretation will take precedence. These incompatibilities would only show up in really strange setups with bad practices so we shouldn't have to worry about them. Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-02-10 15:20:51 -08:00
#include <linux/mount.h>
#include <linux/dax.h>
Integrate the new file encryption framework These changes integrate new file encryption framework to use new V2 encryption policies. These changes were earlier reverted in 'commit 4211691d298c ("Reverting crypto and incrementalfs changes")', as part of android-4.14.171 merge from Android common kernel. This patch attempts to bring them back post validation. commit a9a5450 ANDROID: dm: prevent default-key from being enabled without needed hooks commit e1a94e6 ANDROID: dm: add dm-default-key target for metadata encryption commit commit 232fd35 ANDROID: dm: enable may_passthrough_inline_crypto on some targets commit 53bc059 ANDROID: dm: add support for passing through inline crypto support commit aeed6db ANDROID: block: Introduce passthrough keyslot manager commit 4f27c8b ANDROID: ext4, f2fs: enable direct I/O with inline encryption commit c91db46 BACKPORT: FROMLIST: scsi: ufs: add program_key() variant op commit f9a8e4a ANDROID: block: export symbols needed for modules to use inline crypto commit 75fea5f ANDROID: block: fix some inline crypto bugs commit 2871f73 ANDROID: fscrypt: add support for hardware-wrapped keys commit bb5a657 ANDROID: block: add KSM op to derive software secret from wrapped key commit d42ba87 ANDROID: block: provide key size as input to inline crypto APIs commit 86646eb ANDROID: ufshcd-crypto: export cap find API commit 83bc20e ANDROID: scsi: ufs-qcom: Enable BROKEN_CRYPTO quirk flag commit c266a13 ANDROID: scsi: ufs: Add quirk bit for controllers that don't play well with inline crypto commit ea09b99 ANDROID: cuttlefish_defconfig: Enable blk-crypto fallback commit e12563c BACKPORT: FROMLIST: Update Inline Encryption from v5 to v6 of patch series commit 8e8f55d ANDROID: scsi: ufs: UFS init should not require inline crypto commit dae9899 ANDROID: scsi: ufs: UFS crypto variant operations API commit a69516d ANDROID: cuttlefish_defconfig: enable inline encryption commit b8f7b23 BACKPORT: FROMLIST: ext4: add inline encryption support commit e64327f BACKPORT: FROMLIST: f2fs: add inline encryption support commit a0dc8da BACKPORT: FROMLIST: fscrypt: add inline encryption support commit 19c3c62 BACKPORT: FROMLIST: scsi: ufs: Add inline encryption support to UFS commit f858a99 BACKPORT: FROMLIST: scsi: ufs: UFS crypto API commit 011b834 BACKPORT: FROMLIST: scsi: ufs: UFS driver v2.1 spec crypto additions commit ec0b569 BACKPORT: FROMLIST: block: blk-crypto for Inline Encryption commit 760b328 ANDROID: block: Fix bio_crypt_should_process WARN_ON commit 138adbb BACKPORT: FROMLIST: block: Add encryption context to struct bio commit 66b5609 BACKPORT: FROMLIST: block: Keyslot Manager for Inline Encryption Git-repo: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.14-stable Git-commit: a9a545067a93d9821f965989b8eaea6fba7d27f7 Git-commit: e1a94e6b17e2610b56c5740b763df7858dad40f0 Git-commit: 232fd353e45d13576d507a011b5dac17e3c320ab Git-commit: 53bc059bc6d98631e8936ab9eeb7ac780c9ab2c3 Git-commit: aeed6db424b22148964d9788d4f9abac6e6cd7d8 Git-commit: 4f27c8b90bd223e967c98dc658961e67b9b864ae Git-commit: c91db466b51479ae761becc233d79c50ca3748a5 Git-commit: f9a8e4a5c5455a6bada70ed6d2f0af8900a872cb Git-commit: 75fea5f6057df78af1655f2f79a9c66a94bc838f Git-commit: 2871f731940165ed4042001a36bbe7d58f9d983b Git-commit: bb5a65771a206ae39086af1a9e78afeaf654cf03 Git-commit: d42ba87e29ab44aac446b5434298d1369c44fe3c Git-commit: 86646ebb1742a663c4c9c39c06d58dcb3f8f89e5 Git-commit: 83bc20ed4ba7dbf76964fd68905fde591b5de8b2 Git-commit: c266a1311e74b3ae1047a9d6abd6c6044059995c Git-commit: ea09b9954cc40b3088b8b2778b2daab12820a7e6 Git-commit: e12563c18d484e6379d03105b4565db7bb3a7975 Git-commit: 8e8f55d1a7e865562d2e3e022a7fcf13753a9c8e Git-commit: dae9899044f320bb119e02b45d816a493b1488ae Git-commit: a69516d0913e7f2c9bdde17c2ea6a793bb474830 Git-commit: b8f7b236748261bec545b69b39d7fb75e519f4ed Git-commit: e64327f5719b4a41e0de341ead7d48ed73216a23 Git-commit: a0dc8da519ccf2040af2dbbd6b4f688b50eb1755 Git-commit: 19c3c62836e5dbc9ceb620ecef0aa0c81578ed43 Git-commit: f858a9981a94a4e1d1b77b00bc05ab61b8431bce Git-commit: 011b8344c36d39255b8057c63d98e593e364ed7f Git-commit: ec0b569b5cc89391d9d6c90d2f76dc0a4db03e57 Git-commit: 760b3283e8056ffa6382722457c2e0cf08328629 Git-commit: 138adbbe5e4bfb6dee0571261f4d96a98f71d228 Git-commit: 66b5609826d60f80623643f1a7a1d865b5233f19 Change-Id: I171d90de41185824e0c7515f3a3b43ab88f4e058 Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-07-18 19:59:05 +05:30
#include <linux/bio.h>
#include <linux/keyslot-manager.h>
#define DM_MSG_PREFIX "table"
#define MAX_DEPTH 16
#define NODE_SIZE L1_CACHE_BYTES
#define KEYS_PER_NODE (NODE_SIZE / sizeof(sector_t))
#define CHILDREN_PER_NODE (KEYS_PER_NODE + 1)
struct dm_table {
struct mapped_device *md;
enum dm_queue_mode type;
/* btree table */
unsigned int depth;
unsigned int counts[MAX_DEPTH]; /* in nodes */
sector_t *index[MAX_DEPTH];
unsigned int num_targets;
unsigned int num_allocated;
sector_t *highs;
struct dm_target *targets;
struct target_type *immutable_target_type;
bool integrity_supported:1;
bool singleton:1;
bool all_blk_mq:1;
unsigned integrity_added:1;
/*
* Indicates the rw permissions for the new logical
* device. This should be a combination of FMODE_READ
* and FMODE_WRITE.
*/
fmode_t mode;
/* a list of devices used by this table */
struct list_head devices;
/* events get handed up using this callback */
void (*event_fn)(void *);
void *event_context;
struct dm_md_mempools *mempools;
struct list_head target_callbacks;
};
/*
* Similar to ceiling(log_size(n))
*/
static unsigned int int_log(unsigned int n, unsigned int base)
{
int result = 0;
while (n > 1) {
n = dm_div_up(n, base);
result++;
}
return result;
}
/*
* Calculate the index of the child node of the n'th node k'th key.
*/
static inline unsigned int get_child(unsigned int n, unsigned int k)
{
return (n * CHILDREN_PER_NODE) + k;
}
/*
* Return the n'th node of level l from table t.
*/
static inline sector_t *get_node(struct dm_table *t,
unsigned int l, unsigned int n)
{
return t->index[l] + (n * KEYS_PER_NODE);
}
/*
* Return the highest key that you could lookup from the n'th
* node on level l of the btree.
*/
static sector_t high(struct dm_table *t, unsigned int l, unsigned int n)
{
for (; l < t->depth - 1; l++)
n = get_child(n, CHILDREN_PER_NODE - 1);
if (n >= t->counts[l])
return (sector_t) - 1;
return get_node(t, l, n)[KEYS_PER_NODE - 1];
}
/*
* Fills in a level of the btree based on the highs of the level
* below it.
*/
static int setup_btree_index(unsigned int l, struct dm_table *t)
{
unsigned int n, k;
sector_t *node;
for (n = 0U; n < t->counts[l]; n++) {
node = get_node(t, l, n);
for (k = 0U; k < KEYS_PER_NODE; k++)
node[k] = high(t, l + 1, get_child(n, k));
}
return 0;
}
void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size)
{
unsigned long size;
void *addr;
/*
* Check that we're not going to overflow.
*/
if (nmemb > (ULONG_MAX / elem_size))
return NULL;
size = nmemb * elem_size;
addr = vzalloc(size);
return addr;
}
EXPORT_SYMBOL(dm_vcalloc);
/*
* highs, and targets are managed as dynamic arrays during a
* table load.
*/
static int alloc_targets(struct dm_table *t, unsigned int num)
{
sector_t *n_highs;
struct dm_target *n_targets;
/*
* Allocate both the target array and offset array at once.
* Append an empty entry to catch sectors beyond the end of
* the device.
*/
n_highs = (sector_t *) dm_vcalloc(num + 1, sizeof(struct dm_target) +
sizeof(sector_t));
if (!n_highs)
return -ENOMEM;
n_targets = (struct dm_target *) (n_highs + num);
memset(n_highs, -1, sizeof(*n_highs) * num);
vfree(t->highs);
t->num_allocated = num;
t->highs = n_highs;
t->targets = n_targets;
return 0;
}
int dm_table_create(struct dm_table **result, fmode_t mode,
unsigned num_targets, struct mapped_device *md)
{
struct dm_table *t;
if (num_targets > DM_MAX_TARGETS)
return -EOVERFLOW;
t = kzalloc(sizeof(*t), GFP_KERNEL);
if (!t)
return -ENOMEM;
INIT_LIST_HEAD(&t->devices);
INIT_LIST_HEAD(&t->target_callbacks);
if (!num_targets)
num_targets = KEYS_PER_NODE;
num_targets = dm_round_up(num_targets, KEYS_PER_NODE);
if (!num_targets) {
kfree(t);
return -EOVERFLOW;
}
if (alloc_targets(t, num_targets)) {
kfree(t);
return -ENOMEM;
}
t->type = DM_TYPE_NONE;
t->mode = mode;
t->md = md;
*result = t;
return 0;
}
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
static void free_devices(struct list_head *devices, struct mapped_device *md)
{
struct list_head *tmp, *next;
list_for_each_safe(tmp, next, devices) {
struct dm_dev_internal *dd =
list_entry(tmp, struct dm_dev_internal, list);
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
DMWARN("%s: dm_table_destroy: dm_put_device call missing for %s",
dm_device_name(md), dd->dm_dev->name);
dm_put_table_device(md, dd->dm_dev);
kfree(dd);
}
}
dm table: rework reference counting Rework table reference counting. The existing code uses a reference counter. When the last reference is dropped and the counter reaches zero, the table destructor is called. Table reference counters are acquired/released from upcalls from other kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all). If the reference counter reaches zero in one of the upcalls, the table destructor is called from almost random kernel code. This leads to various problems: * dm_any_congested being called under a spinlock, which calls the destructor, which calls some sleeping function. * the destructor attempting to take a lock that is already taken by the same process. * stale reference from some other kernel code keeps the table constructed, which keeps some devices open, even after successful return from "dmsetup remove". This can confuse lvm and prevent closing of underlying devices or reusing device minor numbers. The patch changes reference counting so that the table destructor can be called only at predetermined places. The table has always exactly one reference from either mapped_device->map or hash_cell->new_map. After this patch, this reference is not counted in table->holders. A pair of dm_create_table/dm_destroy_table functions is used for table creation/destruction. Temporary references from the other code increase table->holders. A pair of dm_table_get/dm_table_put functions is used to manipulate it. When the table is about to be destroyed, we wait for table->holders to reach 0. Then, we call the table destructor. We use active waiting with msleep(1), because the situation happens rarely (to one user in 5 years) and removing the device isn't performance-critical task: the user doesn't care if it takes one tick more or not. This way, the destructor is called only at specific points (dm_table_destroy function) and the above problems associated with lazy destruction can't happen. Finally remove the temporary protection added to dm_any_congested(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-01-06 03:05:10 +00:00
void dm_table_destroy(struct dm_table *t)
{
unsigned int i;
if (!t)
return;
/* free the indexes */
if (t->depth >= 2)
vfree(t->index[t->depth - 2]);
/* free the targets */
for (i = 0; i < t->num_targets; i++) {
struct dm_target *tgt = t->targets + i;
if (tgt->type->dtr)
tgt->type->dtr(tgt);
dm_put_target_type(tgt->type);
}
vfree(t->highs);
/* free the device list */
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
free_devices(&t->devices, t->md);
dm_free_md_mempools(t->mempools);
kfree(t);
}
/*
* See if we've already got a device in the list.
*/
static struct dm_dev_internal *find_device(struct list_head *l, dev_t dev)
{
struct dm_dev_internal *dd;
list_for_each_entry (dd, l, list)
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
if (dd->dm_dev->bdev->bd_dev == dev)
return dd;
return NULL;
}
/*
* If possible, this checks an area of a destination device is invalid.
*/
static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q;
struct queue_limits *limits = data;
struct block_device *bdev = dev->bdev;
sector_t dev_size =
i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
unsigned short logical_block_size_sectors =
limits->logical_block_size >> SECTOR_SHIFT;
char b[BDEVNAME_SIZE];
/*
* Some devices exist without request functions,
* such as loop devices not yet bound to backing files.
* Forbid the use of such devices.
*/
q = bdev_get_queue(bdev);
if (!q || !q->make_request_fn) {
DMWARN("%s: %s is not yet initialised: "
"start=%llu, len=%llu, dev_size=%llu",
dm_device_name(ti->table->md), bdevname(bdev, b),
(unsigned long long)start,
(unsigned long long)len,
(unsigned long long)dev_size);
return 1;
}
if (!dev_size)
return 0;
if ((start >= dev_size) || (start + len > dev_size)) {
DMWARN("%s: %s too small for target: "
"start=%llu, len=%llu, dev_size=%llu",
dm_device_name(ti->table->md), bdevname(bdev, b),
(unsigned long long)start,
(unsigned long long)len,
(unsigned long long)dev_size);
return 1;
}
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
/*
* If the target is mapped to zoned block device(s), check
* that the zones are not partially mapped.
*/
if (bdev_zoned_model(bdev) != BLK_ZONED_NONE) {
unsigned int zone_sectors = bdev_zone_sectors(bdev);
if (start & (zone_sectors - 1)) {
DMWARN("%s: start=%llu not aligned to h/w zone size %u of %s",
dm_device_name(ti->table->md),
(unsigned long long)start,
zone_sectors, bdevname(bdev, b));
return 1;
}
/*
* Note: The last zone of a zoned block device may be smaller
* than other zones. So for a target mapping the end of a
* zoned block device with such a zone, len would not be zone
* aligned. We do not allow such last smaller zone to be part
* of the mapping here to ensure that mappings with multiple
* devices do not end up with a smaller zone in the middle of
* the sector range.
*/
if (len & (zone_sectors - 1)) {
DMWARN("%s: len=%llu not aligned to h/w zone size %u of %s",
dm_device_name(ti->table->md),
(unsigned long long)len,
zone_sectors, bdevname(bdev, b));
return 1;
}
}
if (logical_block_size_sectors <= 1)
return 0;
if (start & (logical_block_size_sectors - 1)) {
DMWARN("%s: start=%llu not aligned to h/w "
"logical block size %u of %s",
dm_device_name(ti->table->md),
(unsigned long long)start,
limits->logical_block_size, bdevname(bdev, b));
return 1;
}
if (len & (logical_block_size_sectors - 1)) {
DMWARN("%s: len=%llu not aligned to h/w "
"logical block size %u of %s",
dm_device_name(ti->table->md),
(unsigned long long)len,
limits->logical_block_size, bdevname(bdev, b));
return 1;
}
return 0;
}
/*
* This upgrades the mode on an already open dm_dev, being
* careful to leave things as they were if we fail to reopen the
* device and not to touch the existing bdev field in case
* it is accessed concurrently inside dm_table_any_congested().
*/
static int upgrade_mode(struct dm_dev_internal *dd, fmode_t new_mode,
struct mapped_device *md)
{
int r;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
struct dm_dev *old_dev, *new_dev;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
old_dev = dd->dm_dev;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
r = dm_get_table_device(md, dd->dm_dev->bdev->bd_dev,
dd->dm_dev->mode | new_mode, &new_dev);
if (r)
return r;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
dd->dm_dev = new_dev;
dm_put_table_device(md, old_dev);
return 0;
}
dm snapshot: disallow the COW and origin devices from being identical Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-02 12:29:18 +08:00
/*
* Convert the path to a device
*/
dev_t dm_get_dev_t(const char *path)
{
dev_t dev;
dm snapshot: disallow the COW and origin devices from being identical Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-02 12:29:18 +08:00
struct block_device *bdev;
bdev = lookup_bdev(path);
if (IS_ERR(bdev))
dev = name_to_dev_t(path);
else {
dev = bdev->bd_dev;
bdput(bdev);
}
return dev;
}
EXPORT_SYMBOL_GPL(dm_get_dev_t);
/*
* Add a device to the list, or just increment the usage count if
* it's already present.
*/
int dm_get_device(struct dm_target *ti, const char *path, fmode_t mode,
struct dm_dev **result)
{
int r;
dm snapshot: disallow the COW and origin devices from being identical Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-02 12:29:18 +08:00
dev_t dev;
unsigned int major, minor;
char dummy;
struct dm_dev_internal *dd;
struct dm_table *t = ti->table;
BUG_ON(!t);
if (sscanf(path, "%u:%u%c", &major, &minor, &dummy) == 2) {
/* Extract the major/minor numbers */
dev = MKDEV(major, minor);
if (MAJOR(dev) != major || MINOR(dev) != minor)
return -EOVERFLOW;
} else {
dev = dm_get_dev_t(path);
if (!dev)
return -ENODEV;
}
dd = find_device(&t->devices, dev);
if (!dd) {
dd = kmalloc(sizeof(*dd), GFP_KERNEL);
if (!dd)
return -ENOMEM;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
if ((r = dm_get_table_device(t->md, dev, mode, &dd->dm_dev))) {
kfree(dd);
return r;
}
atomic_set(&dd->count, 0);
list_add(&dd->list, &t->devices);
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
} else if (dd->dm_dev->mode != (mode | dd->dm_dev->mode)) {
r = upgrade_mode(dd, mode, t->md);
if (r)
return r;
}
atomic_inc(&dd->count);
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
*result = dd->dm_dev;
return 0;
}
EXPORT_SYMBOL(dm_get_device);
static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct queue_limits *limits = data;
struct block_device *bdev = dev->bdev;
struct request_queue *q = bdev_get_queue(bdev);
char b[BDEVNAME_SIZE];
if (unlikely(!q)) {
DMWARN("%s: Cannot set limits for nonexistent device %s",
dm_device_name(ti->table->md), bdevname(bdev, b));
return 0;
}
if (bdev_stack_limits(limits, bdev, start) < 0)
DMWARN("%s: adding target device %s caused an alignment inconsistency: "
"physical_block_size=%u, logical_block_size=%u, "
"alignment_offset=%u, start=%llu",
dm_device_name(ti->table->md), bdevname(bdev, b),
q->limits.physical_block_size,
q->limits.logical_block_size,
q->limits.alignment_offset,
(unsigned long long) start << SECTOR_SHIFT);
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
limits->zoned = blk_queue_zoned_model(q);
return 0;
}
/*
* Decrement a device's use count and remove it if necessary.
*/
void dm_put_device(struct dm_target *ti, struct dm_dev *d)
{
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
int found = 0;
struct list_head *devices = &ti->table->devices;
struct dm_dev_internal *dd;
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
list_for_each_entry(dd, devices, list) {
if (dd->dm_dev == d) {
found = 1;
break;
}
}
if (!found) {
DMWARN("%s: device %s not in table devices list",
dm_device_name(ti->table->md), d->name);
return;
}
if (atomic_dec_and_test(&dd->count)) {
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
dm_put_table_device(ti->table->md, d);
list_del(&dd->list);
kfree(dd);
}
}
EXPORT_SYMBOL(dm_put_device);
/*
* Checks to see if the target joins onto the end of the table.
*/
static int adjoin(struct dm_table *table, struct dm_target *ti)
{
struct dm_target *prev;
if (!table->num_targets)
return !ti->begin;
prev = &table->targets[table->num_targets - 1];
return (ti->begin == (prev->begin + prev->len));
}
/*
* Used to dynamically allocate the arg array.
*
* We do first allocation with GFP_NOIO because dm-mpath and dm-thin must
* process messages even if some device is suspended. These messages have a
* small fixed number of arguments.
*
* On the other hand, dm-switch needs to process bulk data using messages and
* excessive use of GFP_NOIO could cause trouble.
*/
static char **realloc_argv(unsigned *size, char **old_argv)
{
char **argv;
unsigned new_size;
gfp_t gfp;
if (*size) {
new_size = *size * 2;
gfp = GFP_KERNEL;
} else {
new_size = 8;
gfp = GFP_NOIO;
}
argv = kmalloc(new_size * sizeof(*argv), gfp);
if (argv) {
memcpy(argv, old_argv, *size * sizeof(*argv));
*size = new_size;
}
kfree(old_argv);
return argv;
}
/*
* Destructively splits up the argument list to pass to ctr.
*/
int dm_split_args(int *argc, char ***argvp, char *input)
{
char *start, *end = input, *out, **argv = NULL;
unsigned array_size = 0;
*argc = 0;
if (!input) {
*argvp = NULL;
return 0;
}
argv = realloc_argv(&array_size, argv);
if (!argv)
return -ENOMEM;
while (1) {
/* Skip whitespace */
tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-14 18:01:06 -08:00
start = skip_spaces(end);
if (!*start)
break; /* success, we hit the end */
/* 'out' is used to remove any back-quotes */
end = out = start;
while (*end) {
/* Everything apart from '\0' can be quoted */
if (*end == '\\' && *(end + 1)) {
*out++ = *(end + 1);
end += 2;
continue;
}
if (isspace(*end))
break; /* end of token */
*out++ = *end++;
}
/* have we already filled the array ? */
if ((*argc + 1) > array_size) {
argv = realloc_argv(&array_size, argv);
if (!argv)
return -ENOMEM;
}
/* we know this is whitespace */
if (*end)
end++;
/* terminate the string and put it in the array */
*out = '\0';
argv[*argc] = start;
(*argc)++;
}
*argvp = argv;
return 0;
}
/*
* Impose necessary and sufficient conditions on a devices's table such
* that any incoming bio which respects its logical_block_size can be
* processed successfully. If it falls across the boundary between
* two or more targets, the size of each piece it gets split into must
* be compatible with the logical_block_size of the target processing it.
*/
static int validate_hardware_logical_block_alignment(struct dm_table *table,
struct queue_limits *limits)
{
/*
* This function uses arithmetic modulo the logical_block_size
* (in units of 512-byte sectors).
*/
unsigned short device_logical_block_size_sects =
limits->logical_block_size >> SECTOR_SHIFT;
/*
* Offset of the start of the next table entry, mod logical_block_size.
*/
unsigned short next_target_start = 0;
/*
* Given an aligned bio that extends beyond the end of a
* target, how many sectors must the next target handle?
*/
unsigned short remaining = 0;
treewide: Remove uninitialized_var() usage commit 3f649ab728cda8038259d8f14492fe400fbab911 upstream. Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03 13:09:38 -07:00
struct dm_target *ti;
struct queue_limits ti_limits;
unsigned i;
/*
* Check each entry in the table in turn.
*/
for (i = 0; i < dm_table_get_num_targets(table); i++) {
ti = dm_table_get_target(table, i);
blk_set_stacking_limits(&ti_limits);
/* combine all target devices' limits */
if (ti->type->iterate_devices)
ti->type->iterate_devices(ti, dm_set_device_limits,
&ti_limits);
/*
* If the remaining sectors fall entirely within this
* table entry are they compatible with its logical_block_size?
*/
if (remaining < ti->len &&
remaining & ((ti_limits.logical_block_size >>
SECTOR_SHIFT) - 1))
break; /* Error */
next_target_start =
(unsigned short) ((next_target_start + ti->len) &
(device_logical_block_size_sects - 1));
remaining = next_target_start ?
device_logical_block_size_sects - next_target_start : 0;
}
if (remaining) {
DMWARN("%s: table line %u (start sect %llu len %llu) "
"not aligned to h/w logical block size %u",
dm_device_name(table->md), i,
(unsigned long long) ti->begin,
(unsigned long long) ti->len,
limits->logical_block_size);
return -EINVAL;
}
return 0;
}
int dm_table_add_target(struct dm_table *t, const char *type,
sector_t start, sector_t len, char *params)
{
int r = -EINVAL, argc;
char **argv;
struct dm_target *tgt;
if (t->singleton) {
DMERR("%s: target type %s must appear alone in table",
dm_device_name(t->md), t->targets->type->name);
return -EINVAL;
}
BUG_ON(t->num_targets >= t->num_allocated);
tgt = t->targets + t->num_targets;
memset(tgt, 0, sizeof(*tgt));
if (!len) {
DMERR("%s: zero-length target", dm_device_name(t->md));
return -EINVAL;
}
tgt->type = dm_get_target_type(type);
if (!tgt->type) {
DMERR("%s: %s: unknown target type", dm_device_name(t->md), type);
return -EINVAL;
}
if (dm_target_needs_singleton(tgt->type)) {
if (t->num_targets) {
tgt->error = "singleton target type must appear alone in table";
goto bad;
}
t->singleton = true;
}
if (dm_target_always_writeable(tgt->type) && !(t->mode & FMODE_WRITE)) {
tgt->error = "target type may not be included in a read-only table";
goto bad;
}
if (t->immutable_target_type) {
if (t->immutable_target_type != tgt->type) {
tgt->error = "immutable target type cannot be mixed with other target types";
goto bad;
}
} else if (dm_target_is_immutable(tgt->type)) {
if (t->num_targets) {
tgt->error = "immutable target type cannot be mixed with other target types";
goto bad;
}
t->immutable_target_type = tgt->type;
}
if (dm_target_has_integrity(tgt->type))
t->integrity_added = 1;
tgt->table = t;
tgt->begin = start;
tgt->len = len;
tgt->error = "Unknown error";
/*
* Does this target adjoin the previous one ?
*/
if (!adjoin(t, tgt)) {
tgt->error = "Gap in table";
goto bad;
}
r = dm_split_args(&argc, &argv, params);
if (r) {
tgt->error = "couldn't split parameters (insufficient memory)";
goto bad;
}
r = tgt->type->ctr(tgt, argc, argv);
kfree(argv);
if (r)
goto bad;
t->highs[t->num_targets++] = tgt->begin + tgt->len - 1;
if (!tgt->num_discard_bios && tgt->discards_supported)
DMWARN("%s: %s: ignoring discards_supported because num_discard_bios is zero.",
dm_device_name(t->md), type);
return 0;
bad:
DMERR("%s: %s: %s", dm_device_name(t->md), type, tgt->error);
dm_put_target_type(tgt->type);
return r;
}
/*
* Target argument parsing helpers.
*/
static int validate_next_arg(const struct dm_arg *arg,
struct dm_arg_set *arg_set,
unsigned *value, char **error, unsigned grouped)
{
const char *arg_str = dm_shift_arg(arg_set);
char dummy;
if (!arg_str ||
(sscanf(arg_str, "%u%c", value, &dummy) != 1) ||
(*value < arg->min) ||
(*value > arg->max) ||
(grouped && arg_set->argc < *value)) {
*error = arg->error;
return -EINVAL;
}
return 0;
}
int dm_read_arg(const struct dm_arg *arg, struct dm_arg_set *arg_set,
unsigned *value, char **error)
{
return validate_next_arg(arg, arg_set, value, error, 0);
}
EXPORT_SYMBOL(dm_read_arg);
int dm_read_arg_group(const struct dm_arg *arg, struct dm_arg_set *arg_set,
unsigned *value, char **error)
{
return validate_next_arg(arg, arg_set, value, error, 1);
}
EXPORT_SYMBOL(dm_read_arg_group);
const char *dm_shift_arg(struct dm_arg_set *as)
{
char *r;
if (as->argc) {
as->argc--;
r = *as->argv;
as->argv++;
return r;
}
return NULL;
}
EXPORT_SYMBOL(dm_shift_arg);
void dm_consume_args(struct dm_arg_set *as, unsigned num_args)
{
BUG_ON(as->argc < num_args);
as->argc -= num_args;
as->argv += num_args;
}
EXPORT_SYMBOL(dm_consume_args);
static bool __table_type_bio_based(enum dm_queue_mode table_type)
{
return (table_type == DM_TYPE_BIO_BASED ||
table_type == DM_TYPE_DAX_BIO_BASED);
}
static bool __table_type_request_based(enum dm_queue_mode table_type)
{
return (table_type == DM_TYPE_REQUEST_BASED ||
table_type == DM_TYPE_MQ_REQUEST_BASED);
}
void dm_table_set_type(struct dm_table *t, enum dm_queue_mode type)
{
t->type = type;
}
EXPORT_SYMBOL_GPL(dm_table_set_type);
static int device_not_dax_capable(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
return !bdev_dax_supported(dev->bdev, PAGE_SIZE);
}
static bool dm_table_supports_dax(struct dm_table *t)
{
struct dm_target *ti;
unsigned i;
/* Ensure that all targets support DAX. */
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (!ti->type->direct_access)
return false;
if (!ti->type->iterate_devices ||
ti->type->iterate_devices(ti, device_not_dax_capable, NULL))
return false;
}
return true;
}
static int dm_table_determine_type(struct dm_table *t)
{
unsigned i;
unsigned bio_based = 0, request_based = 0, hybrid = 0;
unsigned sq_count = 0, mq_count = 0;
struct dm_target *tgt;
struct dm_dev_internal *dd;
struct list_head *devices = dm_table_get_devices(t);
enum dm_queue_mode live_md_type = dm_get_md_type(t->md);
if (t->type != DM_TYPE_NONE) {
/* target already set the table's type */
if (t->type == DM_TYPE_BIO_BASED)
return 0;
BUG_ON(t->type == DM_TYPE_DAX_BIO_BASED);
goto verify_rq_based;
}
for (i = 0; i < t->num_targets; i++) {
tgt = t->targets + i;
if (dm_target_hybrid(tgt))
hybrid = 1;
else if (dm_target_request_based(tgt))
request_based = 1;
else
bio_based = 1;
if (bio_based && request_based) {
DMWARN("Inconsistent table: different target types"
" can't be mixed up");
return -EINVAL;
}
}
if (hybrid && !bio_based && !request_based) {
/*
* The targets can work either way.
* Determine the type from the live device.
* Default to bio-based if device is new.
*/
if (__table_type_request_based(live_md_type))
request_based = 1;
else
bio_based = 1;
}
if (bio_based) {
/* We must use this table as bio-based */
t->type = DM_TYPE_BIO_BASED;
if (dm_table_supports_dax(t) ||
(list_empty(devices) && live_md_type == DM_TYPE_DAX_BIO_BASED))
t->type = DM_TYPE_DAX_BIO_BASED;
return 0;
}
BUG_ON(!request_based); /* No targets in this table */
/*
* The only way to establish DM_TYPE_MQ_REQUEST_BASED is by
* having a compatible target use dm_table_set_type.
*/
t->type = DM_TYPE_REQUEST_BASED;
verify_rq_based:
/*
* Request-based dm supports only tables that have a single target now.
* To support multiple targets, request splitting support is needed,
* and that needs lots of changes in the block-layer.
* (e.g. request completion process for partial completion.)
*/
if (t->num_targets > 1) {
DMWARN("Request-based dm doesn't support multiple targets yet");
return -EINVAL;
}
if (list_empty(devices)) {
int srcu_idx;
struct dm_table *live_table = dm_get_live_table(t->md, &srcu_idx);
/* inherit live table's type and all_blk_mq */
if (live_table) {
t->type = live_table->type;
t->all_blk_mq = live_table->all_blk_mq;
}
dm_put_live_table(t->md, srcu_idx);
return 0;
}
/* Non-request-stackable devices can't be used for request-based dm */
list_for_each_entry(dd, devices, list) {
2014-12-17 21:08:12 -05:00
struct request_queue *q = bdev_get_queue(dd->dm_dev->bdev);
if (!blk_queue_stackable(q)) {
DMERR("table load rejected: including"
" non-request-stackable devices");
return -EINVAL;
}
2014-12-17 21:08:12 -05:00
if (q->mq_ops)
mq_count++;
else
sq_count++;
2014-12-17 21:08:12 -05:00
}
if (sq_count && mq_count) {
DMERR("table load rejected: not all devices are blk-mq request-stackable");
return -EINVAL;
}
t->all_blk_mq = mq_count > 0;
if (t->type == DM_TYPE_MQ_REQUEST_BASED && !t->all_blk_mq) {
DMERR("table load rejected: all devices are not blk-mq request-stackable");
return -EINVAL;
}
return 0;
}
enum dm_queue_mode dm_table_get_type(struct dm_table *t)
{
return t->type;
}
struct target_type *dm_table_get_immutable_target_type(struct dm_table *t)
{
return t->immutable_target_type;
}
struct dm_target *dm_table_get_immutable_target(struct dm_table *t)
{
/* Immutable target is implicitly a singleton */
if (t->num_targets > 1 ||
!dm_target_is_immutable(t->targets[0].type))
return NULL;
return t->targets;
}
struct dm_target *dm_table_get_wildcard_target(struct dm_table *t)
{
struct dm_target *ti;
unsigned i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (dm_target_is_wildcard(ti->type))
return ti;
}
return NULL;
}
bool dm_table_bio_based(struct dm_table *t)
{
return __table_type_bio_based(dm_table_get_type(t));
}
bool dm_table_request_based(struct dm_table *t)
{
return __table_type_request_based(dm_table_get_type(t));
2014-12-17 21:08:12 -05:00
}
bool dm_table_all_blk_mq_devices(struct dm_table *t)
2014-12-17 21:08:12 -05:00
{
return t->all_blk_mq;
}
static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *md)
{
enum dm_queue_mode type = dm_table_get_type(t);
unsigned per_io_data_size = 0;
struct dm_target *tgt;
unsigned i;
if (unlikely(type == DM_TYPE_NONE)) {
DMWARN("no table type is set, can't allocate mempools");
return -EINVAL;
}
if (__table_type_bio_based(type))
for (i = 0; i < t->num_targets; i++) {
tgt = t->targets + i;
per_io_data_size = max(per_io_data_size, tgt->per_io_data_size);
}
t->mempools = dm_alloc_md_mempools(md, type, t->integrity_supported, per_io_data_size);
if (!t->mempools)
return -ENOMEM;
return 0;
}
void dm_table_free_md_mempools(struct dm_table *t)
{
dm_free_md_mempools(t->mempools);
t->mempools = NULL;
}
struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t)
{
return t->mempools;
}
static int setup_indexes(struct dm_table *t)
{
int i;
unsigned int total = 0;
sector_t *indexes;
/* allocate the space for *all* the indexes */
for (i = t->depth - 2; i >= 0; i--) {
t->counts[i] = dm_div_up(t->counts[i + 1], CHILDREN_PER_NODE);
total += t->counts[i];
}
indexes = (sector_t *) dm_vcalloc(total, (unsigned long) NODE_SIZE);
if (!indexes)
return -ENOMEM;
/* set up internal nodes, bottom-up */
for (i = t->depth - 2; i >= 0; i--) {
t->index[i] = indexes;
indexes += (KEYS_PER_NODE * t->counts[i]);
setup_btree_index(i, t);
}
return 0;
}
/*
* Builds the btree to index the map.
*/
static int dm_table_build_index(struct dm_table *t)
{
int r = 0;
unsigned int leaf_nodes;
/* how many indexes will the btree have ? */
leaf_nodes = dm_div_up(t->num_targets, KEYS_PER_NODE);
t->depth = 1 + int_log(leaf_nodes, CHILDREN_PER_NODE);
/* leaf layer has already been set up */
t->counts[t->depth - 1] = leaf_nodes;
t->index[t->depth - 1] = t->highs;
if (t->depth >= 2)
r = setup_indexes(t);
return r;
}
static bool integrity_profile_exists(struct gendisk *disk)
{
return !!blk_get_integrity(disk);
}
/*
* Get a disk whose integrity profile reflects the table's profile.
* Returns NULL if integrity support was inconsistent or unavailable.
*/
static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
{
struct list_head *devices = dm_table_get_devices(t);
struct dm_dev_internal *dd = NULL;
struct gendisk *prev_disk = NULL, *template_disk = NULL;
unsigned i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
struct dm_target *ti = dm_table_get_target(t, i);
if (!dm_target_passes_integrity(ti->type))
goto no_integrity;
}
list_for_each_entry(dd, devices, list) {
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
template_disk = dd->dm_dev->bdev->bd_disk;
if (!integrity_profile_exists(template_disk))
goto no_integrity;
else if (prev_disk &&
blk_integrity_compare(prev_disk, template_disk) < 0)
goto no_integrity;
prev_disk = template_disk;
}
return template_disk;
no_integrity:
if (prev_disk)
DMWARN("%s: integrity not set: %s and %s profile mismatch",
dm_device_name(t->md),
prev_disk->disk_name,
template_disk->disk_name);
return NULL;
}
/*
* Register the mapped device for blk_integrity support if the
* underlying devices have an integrity profile. But all devices may
* not have matching profiles (checking all devices isn't reliable
* during table load because this table may use other DM device(s) which
* must be resumed before they will have an initialized integity
* profile). Consequently, stacked DM devices force a 2 stage integrity
* profile validation: First pass during table load, final pass during
* resume.
*/
static int dm_table_register_integrity(struct dm_table *t)
{
struct mapped_device *md = t->md;
struct gendisk *template_disk = NULL;
/* If target handles integrity itself do not register it here. */
if (t->integrity_added)
return 0;
template_disk = dm_table_get_integrity_disk(t);
if (!template_disk)
return 0;
if (!integrity_profile_exists(dm_disk(md))) {
t->integrity_supported = true;
/*
* Register integrity profile during table load; we can do
* this because the final profile must match during resume.
*/
blk_integrity_register(dm_disk(md),
blk_get_integrity(template_disk));
return 0;
}
/*
* If DM device already has an initialized integrity
* profile the new profile should not conflict.
*/
if (blk_integrity_compare(dm_disk(md), template_disk) < 0) {
DMWARN("%s: conflict with existing integrity profile: "
"%s profile mismatch",
dm_device_name(t->md),
template_disk->disk_name);
return 1;
}
/* Preserve existing integrity profile */
t->integrity_supported = true;
return 0;
}
/*
* Prepares the table for use by building the indices,
* setting the type, and allocating mempools.
*/
int dm_table_complete(struct dm_table *t)
{
int r;
r = dm_table_determine_type(t);
if (r) {
DMERR("unable to determine table type");
return r;
}
r = dm_table_build_index(t);
if (r) {
DMERR("unable to build btrees");
return r;
}
r = dm_table_register_integrity(t);
if (r) {
DMERR("could not register integrity profile.");
return r;
}
r = dm_table_alloc_md_mempools(t, t->md);
if (r)
DMERR("unable to allocate mempools");
return r;
}
static DEFINE_MUTEX(_event_lock);
void dm_table_event_callback(struct dm_table *t,
void (*fn)(void *), void *context)
{
mutex_lock(&_event_lock);
t->event_fn = fn;
t->event_context = context;
mutex_unlock(&_event_lock);
}
void dm_table_event(struct dm_table *t)
{
mutex_lock(&_event_lock);
if (t->event_fn)
t->event_fn(t->event_context);
mutex_unlock(&_event_lock);
}
EXPORT_SYMBOL(dm_table_event);
inline sector_t dm_table_get_size(struct dm_table *t)
{
return t->num_targets ? (t->highs[t->num_targets - 1] + 1) : 0;
}
EXPORT_SYMBOL(dm_table_get_size);
struct dm_target *dm_table_get_target(struct dm_table *t, unsigned int index)
{
if (index >= t->num_targets)
return NULL;
return t->targets + index;
}
/*
* Search the btree for the correct target.
*
* Caller should check returned pointer with dm_target_is_valid()
* to trap I/O beyond end of device.
*/
struct dm_target *dm_table_find_target(struct dm_table *t, sector_t sector)
{
unsigned int l, n = 0, k = 0;
sector_t *node;
if (unlikely(sector >= dm_table_get_size(t)))
return &t->targets[t->num_targets];
for (l = 0; l < t->depth; l++) {
n = get_child(n, k);
node = get_node(t, l, n);
for (k = 0; k < KEYS_PER_NODE; k++)
if (node[k] >= sector)
break;
}
return &t->targets[(KEYS_PER_NODE * n) + k];
}
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
/*
* type->iterate_devices() should be called when the sanity check needs to
* iterate and check all underlying data devices. iterate_devices() will
* iterate all underlying data devices until it encounters a non-zero return
* code, returned by whether the input iterate_devices_callout_fn, or
* iterate_devices() itself internally.
*
* For some target type (e.g. dm-stripe), one call of iterate_devices() may
* iterate multiple underlying devices internally, in which case a non-zero
* return code returned by iterate_devices_callout_fn will stop the iteration
* in advance.
*
* Cases requiring _any_ underlying device supporting some kind of attribute,
* should use the iteration structure like dm_table_any_dev_attr(), or call
* it directly. @func should handle semantics of positive examples, e.g.
* capable of something.
*
* Cases requiring _all_ underlying devices supporting some kind of attribute,
* should use the iteration structure like dm_table_supports_nowait() or
* dm_table_supports_discards(). Or introduce dm_table_all_devs_attr() that
* uses an @anti_func that handle semantics of counter examples, e.g. not
* capable of something. So: return !dm_table_any_dev_attr(t, anti_func, data);
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
*/
static bool dm_table_any_dev_attr(struct dm_table *t,
iterate_devices_callout_fn func, void *data)
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
{
struct dm_target *ti;
unsigned int i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (ti->type->iterate_devices &&
ti->type->iterate_devices(ti, func, data))
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
return true;
}
return false;
}
static int count_device(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
unsigned *num_devices = data;
(*num_devices)++;
return 0;
}
/*
* Check whether a table has no data devices attached using each
* target's iterate_devices method.
* Returns false if the result is unknown because a target doesn't
* support iterate_devices.
*/
bool dm_table_has_no_data_devices(struct dm_table *table)
{
struct dm_target *ti;
unsigned i, num_devices;
for (i = 0; i < dm_table_get_num_targets(table); i++) {
ti = dm_table_get_target(table, i);
if (!ti->type->iterate_devices)
return false;
num_devices = 0;
ti->type->iterate_devices(ti, count_device, &num_devices);
if (num_devices)
return false;
}
return true;
}
static int device_not_zoned_model(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
{
struct request_queue *q = bdev_get_queue(dev->bdev);
enum blk_zoned_model *zoned_model = data;
return !q || blk_queue_zoned_model(q) != *zoned_model;
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
}
static bool dm_table_supports_zoned_model(struct dm_table *t,
enum blk_zoned_model zoned_model)
{
struct dm_target *ti;
unsigned i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (zoned_model == BLK_ZONED_HM &&
!dm_target_supports_zoned_hm(ti->type))
return false;
if (!ti->type->iterate_devices ||
ti->type->iterate_devices(ti, device_not_zoned_model, &zoned_model))
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
return false;
}
return true;
}
static int device_not_matches_zone_sectors(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
{
struct request_queue *q = bdev_get_queue(dev->bdev);
unsigned int *zone_sectors = data;
return !q || blk_queue_zone_sectors(q) != *zone_sectors;
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
}
static int validate_hardware_zoned_model(struct dm_table *table,
enum blk_zoned_model zoned_model,
unsigned int zone_sectors)
{
if (zoned_model == BLK_ZONED_NONE)
return 0;
if (!dm_table_supports_zoned_model(table, zoned_model)) {
DMERR("%s: zoned model is not consistent across all devices",
dm_device_name(table->md));
return -EINVAL;
}
/* Check zone size validity and compatibility */
if (!zone_sectors || !is_power_of_2(zone_sectors))
return -EINVAL;
if (dm_table_any_dev_attr(table, device_not_matches_zone_sectors, &zone_sectors)) {
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
DMERR("%s: zone sectors is not consistent across all devices",
dm_device_name(table->md));
return -EINVAL;
}
return 0;
}
/*
* Establish the new table's queue_limits and validate them.
*/
int dm_calculate_queue_limits(struct dm_table *table,
struct queue_limits *limits)
{
struct dm_target *ti;
struct queue_limits ti_limits;
unsigned i;
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
enum blk_zoned_model zoned_model = BLK_ZONED_NONE;
unsigned int zone_sectors = 0;
blk_set_stacking_limits(limits);
for (i = 0; i < dm_table_get_num_targets(table); i++) {
blk_set_stacking_limits(&ti_limits);
ti = dm_table_get_target(table, i);
if (!ti->type->iterate_devices)
goto combine_limits;
/*
* Combine queue limits of all the devices this target uses.
*/
ti->type->iterate_devices(ti, dm_set_device_limits,
&ti_limits);
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
if (zoned_model == BLK_ZONED_NONE && ti_limits.zoned != BLK_ZONED_NONE) {
/*
* After stacking all limits, validate all devices
* in table support this zoned model and zone sectors.
*/
zoned_model = ti_limits.zoned;
zone_sectors = ti_limits.chunk_sectors;
}
/* Set I/O hints portion of queue limits */
if (ti->type->io_hints)
ti->type->io_hints(ti, &ti_limits);
/*
* Check each device area is consistent with the target's
* overall queue limits.
*/
if (ti->type->iterate_devices(ti, device_area_is_invalid,
&ti_limits))
return -EINVAL;
combine_limits:
/*
* Merge this target's queue limits into the overall limits
* for the table.
*/
if (blk_stack_limits(limits, &ti_limits, 0) < 0)
DMWARN("%s: adding target device "
"(start sect %llu len %llu) "
"caused an alignment inconsistency",
dm_device_name(table->md),
(unsigned long long) ti->begin,
(unsigned long long) ti->len);
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
/*
* FIXME: this should likely be moved to blk_stack_limits(), would
* also eliminate limits->zoned stacking hack in dm_set_device_limits()
*/
if (limits->zoned == BLK_ZONED_NONE && ti_limits.zoned != BLK_ZONED_NONE) {
/*
* By default, the stacked limits zoned model is set to
* BLK_ZONED_NONE in blk_set_stacking_limits(). Update
* this model using the first target model reported
* that is not BLK_ZONED_NONE. This will be either the
* first target device zoned model or the model reported
* by the target .io_hints.
*/
limits->zoned = ti_limits.zoned;
}
}
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-08 16:40:43 -07:00
/*
* Verify that the zoned model and zone sectors, as determined before
* any .io_hints override, are the same across all devices in the table.
* - this is especially relevant if .io_hints is emulating a disk-managed
* zoned model (aka BLK_ZONED_NONE) on host-managed zoned block devices.
* BUT...
*/
if (limits->zoned != BLK_ZONED_NONE) {
/*
* ...IF the above limits stacking determined a zoned model
* validate that all of the table's devices conform to it.
*/
zoned_model = limits->zoned;
zone_sectors = limits->chunk_sectors;
}
if (validate_hardware_zoned_model(table, zoned_model, zone_sectors))
return -EINVAL;
return validate_hardware_logical_block_alignment(table, limits);
}
/*
* Verify that all devices have an integrity profile that matches the
* DM device's registered integrity profile. If the profiles don't
* match then unregister the DM device's integrity profile.
*/
static void dm_table_verify_integrity(struct dm_table *t)
{
struct gendisk *template_disk = NULL;
if (t->integrity_added)
return;
if (t->integrity_supported) {
/*
* Verify that the original integrity profile
* matches all the devices in this table.
*/
template_disk = dm_table_get_integrity_disk(t);
if (template_disk &&
blk_integrity_compare(dm_disk(t->md), template_disk) >= 0)
return;
}
if (integrity_profile_exists(dm_disk(t->md))) {
DMWARN("%s: unable to establish an integrity profile",
dm_device_name(t->md));
blk_integrity_unregister(dm_disk(t->md));
}
}
Integrate the new file encryption framework These changes integrate new file encryption framework to use new V2 encryption policies. These changes were earlier reverted in 'commit 4211691d298c ("Reverting crypto and incrementalfs changes")', as part of android-4.14.171 merge from Android common kernel. This patch attempts to bring them back post validation. commit a9a5450 ANDROID: dm: prevent default-key from being enabled without needed hooks commit e1a94e6 ANDROID: dm: add dm-default-key target for metadata encryption commit commit 232fd35 ANDROID: dm: enable may_passthrough_inline_crypto on some targets commit 53bc059 ANDROID: dm: add support for passing through inline crypto support commit aeed6db ANDROID: block: Introduce passthrough keyslot manager commit 4f27c8b ANDROID: ext4, f2fs: enable direct I/O with inline encryption commit c91db46 BACKPORT: FROMLIST: scsi: ufs: add program_key() variant op commit f9a8e4a ANDROID: block: export symbols needed for modules to use inline crypto commit 75fea5f ANDROID: block: fix some inline crypto bugs commit 2871f73 ANDROID: fscrypt: add support for hardware-wrapped keys commit bb5a657 ANDROID: block: add KSM op to derive software secret from wrapped key commit d42ba87 ANDROID: block: provide key size as input to inline crypto APIs commit 86646eb ANDROID: ufshcd-crypto: export cap find API commit 83bc20e ANDROID: scsi: ufs-qcom: Enable BROKEN_CRYPTO quirk flag commit c266a13 ANDROID: scsi: ufs: Add quirk bit for controllers that don't play well with inline crypto commit ea09b99 ANDROID: cuttlefish_defconfig: Enable blk-crypto fallback commit e12563c BACKPORT: FROMLIST: Update Inline Encryption from v5 to v6 of patch series commit 8e8f55d ANDROID: scsi: ufs: UFS init should not require inline crypto commit dae9899 ANDROID: scsi: ufs: UFS crypto variant operations API commit a69516d ANDROID: cuttlefish_defconfig: enable inline encryption commit b8f7b23 BACKPORT: FROMLIST: ext4: add inline encryption support commit e64327f BACKPORT: FROMLIST: f2fs: add inline encryption support commit a0dc8da BACKPORT: FROMLIST: fscrypt: add inline encryption support commit 19c3c62 BACKPORT: FROMLIST: scsi: ufs: Add inline encryption support to UFS commit f858a99 BACKPORT: FROMLIST: scsi: ufs: UFS crypto API commit 011b834 BACKPORT: FROMLIST: scsi: ufs: UFS driver v2.1 spec crypto additions commit ec0b569 BACKPORT: FROMLIST: block: blk-crypto for Inline Encryption commit 760b328 ANDROID: block: Fix bio_crypt_should_process WARN_ON commit 138adbb BACKPORT: FROMLIST: block: Add encryption context to struct bio commit 66b5609 BACKPORT: FROMLIST: block: Keyslot Manager for Inline Encryption Git-repo: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.14-stable Git-commit: a9a545067a93d9821f965989b8eaea6fba7d27f7 Git-commit: e1a94e6b17e2610b56c5740b763df7858dad40f0 Git-commit: 232fd353e45d13576d507a011b5dac17e3c320ab Git-commit: 53bc059bc6d98631e8936ab9eeb7ac780c9ab2c3 Git-commit: aeed6db424b22148964d9788d4f9abac6e6cd7d8 Git-commit: 4f27c8b90bd223e967c98dc658961e67b9b864ae Git-commit: c91db466b51479ae761becc233d79c50ca3748a5 Git-commit: f9a8e4a5c5455a6bada70ed6d2f0af8900a872cb Git-commit: 75fea5f6057df78af1655f2f79a9c66a94bc838f Git-commit: 2871f731940165ed4042001a36bbe7d58f9d983b Git-commit: bb5a65771a206ae39086af1a9e78afeaf654cf03 Git-commit: d42ba87e29ab44aac446b5434298d1369c44fe3c Git-commit: 86646ebb1742a663c4c9c39c06d58dcb3f8f89e5 Git-commit: 83bc20ed4ba7dbf76964fd68905fde591b5de8b2 Git-commit: c266a1311e74b3ae1047a9d6abd6c6044059995c Git-commit: ea09b9954cc40b3088b8b2778b2daab12820a7e6 Git-commit: e12563c18d484e6379d03105b4565db7bb3a7975 Git-commit: 8e8f55d1a7e865562d2e3e022a7fcf13753a9c8e Git-commit: dae9899044f320bb119e02b45d816a493b1488ae Git-commit: a69516d0913e7f2c9bdde17c2ea6a793bb474830 Git-commit: b8f7b236748261bec545b69b39d7fb75e519f4ed Git-commit: e64327f5719b4a41e0de341ead7d48ed73216a23 Git-commit: a0dc8da519ccf2040af2dbbd6b4f688b50eb1755 Git-commit: 19c3c62836e5dbc9ceb620ecef0aa0c81578ed43 Git-commit: f858a9981a94a4e1d1b77b00bc05ab61b8431bce Git-commit: 011b8344c36d39255b8057c63d98e593e364ed7f Git-commit: ec0b569b5cc89391d9d6c90d2f76dc0a4db03e57 Git-commit: 760b3283e8056ffa6382722457c2e0cf08328629 Git-commit: 138adbbe5e4bfb6dee0571261f4d96a98f71d228 Git-commit: 66b5609826d60f80623643f1a7a1d865b5233f19 Change-Id: I171d90de41185824e0c7515f3a3b43ab88f4e058 Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-07-18 19:59:05 +05:30
#ifdef CONFIG_BLK_INLINE_ENCRYPTION
static int device_intersect_crypto_modes(struct dm_target *ti,
struct dm_dev *dev, sector_t start,
sector_t len, void *data)
{
struct keyslot_manager *parent = data;
struct keyslot_manager *child = bdev_get_queue(dev->bdev)->ksm;
keyslot_manager_intersect_modes(parent, child);
return 0;
}
/*
* Update the inline crypto modes supported by 'q->ksm' to be the intersection
* of the modes supported by all targets in the table.
*
* For any mode to be supported at all, all targets must have explicitly
* declared that they can pass through inline crypto support. For a particular
* mode to be supported, all underlying devices must also support it.
*
* Assume that 'q->ksm' initially declares all modes to be supported.
*/
static void dm_calculate_supported_crypto_modes(struct dm_table *t,
struct request_queue *q)
{
struct dm_target *ti;
unsigned int i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (!ti->may_passthrough_inline_crypto) {
keyslot_manager_intersect_modes(q->ksm, NULL);
return;
}
if (!ti->type->iterate_devices)
continue;
ti->type->iterate_devices(ti, device_intersect_crypto_modes,
q->ksm);
}
}
#else /* CONFIG_BLK_INLINE_ENCRYPTION */
static inline void dm_calculate_supported_crypto_modes(struct dm_table *t,
struct request_queue *q)
{
}
#endif /* !CONFIG_BLK_INLINE_ENCRYPTION */
static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
unsigned long flush = (unsigned long) data;
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && (q->queue_flags & flush);
}
static bool dm_table_supports_flush(struct dm_table *t, unsigned long flush)
{
struct dm_target *ti;
unsigned i;
/*
* Require at least one underlying device to support flushes.
* t->devices includes internal dm devices such as mirror logs
* so we need to use iterate_devices here, which targets
* supporting flushes must provide.
*/
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (!ti->num_flush_bios)
continue;
if (ti->flush_supported)
return true;
if (ti->type->iterate_devices &&
ti->type->iterate_devices(ti, device_flush_capable, (void *) flush))
return true;
}
return false;
}
static int device_dax_write_cache_enabled(struct dm_target *ti,
struct dm_dev *dev, sector_t start,
sector_t len, void *data)
{
struct dax_device *dax_dev = dev->dax_dev;
if (!dax_dev)
return false;
if (dax_write_cache_enabled(dax_dev))
return true;
return false;
}
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
static int device_is_rotational(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
return q && !blk_queue_nonrot(q);
}
static int device_is_not_random(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && !blk_queue_add_random(q);
}
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
static int queue_no_sg_merge(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
dm table: propagate QUEUE_FLAG_NO_SG_MERGE Commit 05f1dd5 ("block: add queue flag for disabling SG merging") introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE. This gets set by default in blk_mq_init_queue for mq-enabled devices. The effect of the flag is to bypass the SG segment merging. Instead, the bio->bi_vcnt is used as the number of hardware segments. With a device mapper target on top of a device with QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments than a driver is prepared to handle. I ran into this when backporting the virtio_blk mq support. It triggerred this BUG_ON, in virtio_queue_rq: BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); The queue's max is set here: blk_queue_max_segments(q, vblk->sg_elems-2); Basically, what happens is that a bio is built up for the dm device (which does not have the QUEUE_FLAG_NO_SG_MERGE flag set) using bio_add_page. That path will call into __blk_recalc_rq_segments, so what you end up with is bi_phys_segments being much smaller than bi_vcnt (and bi_vcnt grows beyond the maximum sg elements). Then, when the bio is submitted, it gets cloned. When the cloned bio is submitted, it will end up in blk_recount_segments, here: if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags)) bio->bi_phys_segments = bio->bi_vcnt; and now we've set bio->bi_phys_segments to a number that is beyond what was registered as queue_max_segments by the driver. The right way to fix this is to propagate the queue flag up the stack. The rules for propagating the flag are simple: - if the flag is set for any underlying device, it must be set for the upper device - consequently, if the flag is not set for any underlying device, it should not be set for the upper device. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.16+
2014-08-08 11:03:41 -04:00
{
struct request_queue *q = bdev_get_queue(dev->bdev);
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
return q && test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags);
}
static int device_not_write_same_capable(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && !q->limits.max_write_same_sectors;
}
static bool dm_table_supports_write_same(struct dm_table *t)
{
struct dm_target *ti;
unsigned i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (!ti->num_write_same_bios)
return false;
if (!ti->type->iterate_devices ||
ti->type->iterate_devices(ti, device_not_write_same_capable, NULL))
return false;
}
return true;
}
static int device_not_write_zeroes_capable(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && !q->limits.max_write_zeroes_sectors;
}
static bool dm_table_supports_write_zeroes(struct dm_table *t)
{
struct dm_target *ti;
unsigned i = 0;
while (i < dm_table_get_num_targets(t)) {
ti = dm_table_get_target(t, i++);
if (!ti->num_write_zeroes_bios)
return false;
if (!ti->type->iterate_devices ||
ti->type->iterate_devices(ti, device_not_write_zeroes_capable, NULL))
return false;
}
return true;
}
static int device_not_discard_capable(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && !blk_queue_discard(q);
}
static bool dm_table_supports_discards(struct dm_table *t)
{
struct dm_target *ti;
unsigned i;
for (i = 0; i < dm_table_get_num_targets(t); i++) {
ti = dm_table_get_target(t, i);
if (!ti->num_discard_bios)
return false;
/*
* Either the target provides discard support (as implied by setting
* 'discards_supported') or it relies on _all_ data devices having
* discard support.
*/
if (!ti->discards_supported &&
(!ti->type->iterate_devices ||
ti->type->iterate_devices(ti, device_not_discard_capable, NULL)))
return false;
}
return true;
}
static int device_requires_stable_pages(struct dm_target *ti,
struct dm_dev *dev, sector_t start,
sector_t len, void *data)
{
struct request_queue *q = bdev_get_queue(dev->bdev);
return q && bdi_cap_stable_pages_required(q->backing_dev_info);
}
void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
struct queue_limits *limits)
{
bool wc = false, fua = false;
/*
* Copy table's limits to the DM device's request_queue
*/
q->limits = *limits;
if (!dm_table_supports_discards(t))
blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
else
blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
if (dm_table_supports_flush(t, (1UL << QUEUE_FLAG_WC))) {
wc = true;
if (dm_table_supports_flush(t, (1UL << QUEUE_FLAG_FUA)))
fua = true;
}
blk_queue_write_cache(q, wc, fua);
if (dm_table_supports_dax(t))
blk_queue_flag_set(QUEUE_FLAG_DAX, q);
else
blk_queue_flag_clear(QUEUE_FLAG_DAX, q);
if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
dax_write_cache(t->md->dax_dev, true);
/* Ensure that all underlying devices are non-rotational. */
if (dm_table_any_dev_attr(t, device_is_rotational, NULL))
blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
else
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
if (!dm_table_supports_write_same(t))
q->limits.max_write_same_sectors = 0;
if (!dm_table_supports_write_zeroes(t))
q->limits.max_write_zeroes_sectors = 0;
if (dm_table_any_dev_attr(t, queue_no_sg_merge, NULL))
blk_queue_flag_set(QUEUE_FLAG_NO_SG_MERGE, q);
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
else
blk_queue_flag_clear(QUEUE_FLAG_NO_SG_MERGE, q);
dm table: propagate QUEUE_FLAG_NO_SG_MERGE Commit 05f1dd5 ("block: add queue flag for disabling SG merging") introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE. This gets set by default in blk_mq_init_queue for mq-enabled devices. The effect of the flag is to bypass the SG segment merging. Instead, the bio->bi_vcnt is used as the number of hardware segments. With a device mapper target on top of a device with QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments than a driver is prepared to handle. I ran into this when backporting the virtio_blk mq support. It triggerred this BUG_ON, in virtio_queue_rq: BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); The queue's max is set here: blk_queue_max_segments(q, vblk->sg_elems-2); Basically, what happens is that a bio is built up for the dm device (which does not have the QUEUE_FLAG_NO_SG_MERGE flag set) using bio_add_page. That path will call into __blk_recalc_rq_segments, so what you end up with is bi_phys_segments being much smaller than bi_vcnt (and bi_vcnt grows beyond the maximum sg elements). Then, when the bio is submitted, it gets cloned. When the cloned bio is submitted, it will end up in blk_recount_segments, here: if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags)) bio->bi_phys_segments = bio->bi_vcnt; and now we've set bio->bi_phys_segments to a number that is beyond what was registered as queue_max_segments by the driver. The right way to fix this is to propagate the queue flag up the stack. The rules for propagating the flag are simple: - if the flag is set for any underlying device, it must be set for the upper device - consequently, if the flag is not set for any underlying device, it should not be set for the upper device. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.16+
2014-08-08 11:03:41 -04:00
dm_table_verify_integrity(t);
Integrate the new file encryption framework These changes integrate new file encryption framework to use new V2 encryption policies. These changes were earlier reverted in 'commit 4211691d298c ("Reverting crypto and incrementalfs changes")', as part of android-4.14.171 merge from Android common kernel. This patch attempts to bring them back post validation. commit a9a5450 ANDROID: dm: prevent default-key from being enabled without needed hooks commit e1a94e6 ANDROID: dm: add dm-default-key target for metadata encryption commit commit 232fd35 ANDROID: dm: enable may_passthrough_inline_crypto on some targets commit 53bc059 ANDROID: dm: add support for passing through inline crypto support commit aeed6db ANDROID: block: Introduce passthrough keyslot manager commit 4f27c8b ANDROID: ext4, f2fs: enable direct I/O with inline encryption commit c91db46 BACKPORT: FROMLIST: scsi: ufs: add program_key() variant op commit f9a8e4a ANDROID: block: export symbols needed for modules to use inline crypto commit 75fea5f ANDROID: block: fix some inline crypto bugs commit 2871f73 ANDROID: fscrypt: add support for hardware-wrapped keys commit bb5a657 ANDROID: block: add KSM op to derive software secret from wrapped key commit d42ba87 ANDROID: block: provide key size as input to inline crypto APIs commit 86646eb ANDROID: ufshcd-crypto: export cap find API commit 83bc20e ANDROID: scsi: ufs-qcom: Enable BROKEN_CRYPTO quirk flag commit c266a13 ANDROID: scsi: ufs: Add quirk bit for controllers that don't play well with inline crypto commit ea09b99 ANDROID: cuttlefish_defconfig: Enable blk-crypto fallback commit e12563c BACKPORT: FROMLIST: Update Inline Encryption from v5 to v6 of patch series commit 8e8f55d ANDROID: scsi: ufs: UFS init should not require inline crypto commit dae9899 ANDROID: scsi: ufs: UFS crypto variant operations API commit a69516d ANDROID: cuttlefish_defconfig: enable inline encryption commit b8f7b23 BACKPORT: FROMLIST: ext4: add inline encryption support commit e64327f BACKPORT: FROMLIST: f2fs: add inline encryption support commit a0dc8da BACKPORT: FROMLIST: fscrypt: add inline encryption support commit 19c3c62 BACKPORT: FROMLIST: scsi: ufs: Add inline encryption support to UFS commit f858a99 BACKPORT: FROMLIST: scsi: ufs: UFS crypto API commit 011b834 BACKPORT: FROMLIST: scsi: ufs: UFS driver v2.1 spec crypto additions commit ec0b569 BACKPORT: FROMLIST: block: blk-crypto for Inline Encryption commit 760b328 ANDROID: block: Fix bio_crypt_should_process WARN_ON commit 138adbb BACKPORT: FROMLIST: block: Add encryption context to struct bio commit 66b5609 BACKPORT: FROMLIST: block: Keyslot Manager for Inline Encryption Git-repo: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.14-stable Git-commit: a9a545067a93d9821f965989b8eaea6fba7d27f7 Git-commit: e1a94e6b17e2610b56c5740b763df7858dad40f0 Git-commit: 232fd353e45d13576d507a011b5dac17e3c320ab Git-commit: 53bc059bc6d98631e8936ab9eeb7ac780c9ab2c3 Git-commit: aeed6db424b22148964d9788d4f9abac6e6cd7d8 Git-commit: 4f27c8b90bd223e967c98dc658961e67b9b864ae Git-commit: c91db466b51479ae761becc233d79c50ca3748a5 Git-commit: f9a8e4a5c5455a6bada70ed6d2f0af8900a872cb Git-commit: 75fea5f6057df78af1655f2f79a9c66a94bc838f Git-commit: 2871f731940165ed4042001a36bbe7d58f9d983b Git-commit: bb5a65771a206ae39086af1a9e78afeaf654cf03 Git-commit: d42ba87e29ab44aac446b5434298d1369c44fe3c Git-commit: 86646ebb1742a663c4c9c39c06d58dcb3f8f89e5 Git-commit: 83bc20ed4ba7dbf76964fd68905fde591b5de8b2 Git-commit: c266a1311e74b3ae1047a9d6abd6c6044059995c Git-commit: ea09b9954cc40b3088b8b2778b2daab12820a7e6 Git-commit: e12563c18d484e6379d03105b4565db7bb3a7975 Git-commit: 8e8f55d1a7e865562d2e3e022a7fcf13753a9c8e Git-commit: dae9899044f320bb119e02b45d816a493b1488ae Git-commit: a69516d0913e7f2c9bdde17c2ea6a793bb474830 Git-commit: b8f7b236748261bec545b69b39d7fb75e519f4ed Git-commit: e64327f5719b4a41e0de341ead7d48ed73216a23 Git-commit: a0dc8da519ccf2040af2dbbd6b4f688b50eb1755 Git-commit: 19c3c62836e5dbc9ceb620ecef0aa0c81578ed43 Git-commit: f858a9981a94a4e1d1b77b00bc05ab61b8431bce Git-commit: 011b8344c36d39255b8057c63d98e593e364ed7f Git-commit: ec0b569b5cc89391d9d6c90d2f76dc0a4db03e57 Git-commit: 760b3283e8056ffa6382722457c2e0cf08328629 Git-commit: 138adbbe5e4bfb6dee0571261f4d96a98f71d228 Git-commit: 66b5609826d60f80623643f1a7a1d865b5233f19 Change-Id: I171d90de41185824e0c7515f3a3b43ab88f4e058 Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-07-18 19:59:05 +05:30
dm_calculate_supported_crypto_modes(t, q);
/*
* Some devices don't use blk_integrity but still want stable pages
* because they do their own checksumming.
dm table: fix iterate_devices based device capability checks commit a4c8dd9c2d0987cf542a2a0c42684c9c6d78a04e upstream. According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g. dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. No iterate_devices_callout_fn should return non-zero unless device iteration should stop. Rename dm_table_requires_stable_pages() to dm_table_any_dev_attr() and elevate it for reuse to stop iterating (and return non-zero) on the first device that causes iterate_devices_callout_fn to return non-zero. Use dm_table_any_dev_attr() to properly iterate through devices. Rename device_is_nonrot() to device_is_rotational() and invert logic accordingly to fix improper disposition. [jeffle: backport notes] Also convert the no_sg_merge capability check, which is introduced by commit 200612ec33e5 ("dm table: propagate QUEUE_FLAG_NO_SG_MERGE"), and removed since commit 2705c93742e9 ("block: kill QUEUE_FLAG_NO_SG_MERGE") in v5.1. Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-09 11:27:43 +08:00
* If any underlying device requires stable pages, a table must require
* them as well. Only targets that support iterate_devices are considered:
* don't want error, zero, etc to require stable pages.
*/
if (dm_table_any_dev_attr(t, device_requires_stable_pages, NULL))
q->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES;
else
q->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES;
/*
* Determine whether or not this queue's I/O timings contribute
* to the entropy pool, Only request-based targets use this.
* Clear QUEUE_FLAG_ADD_RANDOM if any underlying device does not
* have it set.
*/
if (blk_queue_add_random(q) &&
dm_table_any_dev_attr(t, device_is_not_random, NULL))
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
/*
* QUEUE_FLAG_STACKABLE must be set after all queue settings are
* visible to other CPUs because, once the flag is set, incoming bios
* are processed by request-based dm, which refers to the queue
* settings.
* Until the flag set, bios are passed to bio-based dm and queued to
* md->deferred where queue settings are not needed yet.
* Those bios are passed to request-based dm at the resume time.
*/
smp_mb();
if (dm_table_request_based(t))
blk_queue_flag_set(QUEUE_FLAG_STACKABLE, q);
/* io_pages is used for readahead */
q->backing_dev_info->io_pages = limits->max_sectors >> (PAGE_SHIFT - 9);
}
unsigned int dm_table_get_num_targets(struct dm_table *t)
{
return t->num_targets;
}
struct list_head *dm_table_get_devices(struct dm_table *t)
{
return &t->devices;
}
fmode_t dm_table_get_mode(struct dm_table *t)
{
return t->mode;
}
EXPORT_SYMBOL(dm_table_get_mode);
enum suspend_mode {
PRESUSPEND,
PRESUSPEND_UNDO,
POSTSUSPEND,
};
static void suspend_targets(struct dm_table *t, enum suspend_mode mode)
{
int i = t->num_targets;
struct dm_target *ti = t->targets;
lockdep_assert_held(&t->md->suspend_lock);
while (i--) {
switch (mode) {
case PRESUSPEND:
if (ti->type->presuspend)
ti->type->presuspend(ti);
break;
case PRESUSPEND_UNDO:
if (ti->type->presuspend_undo)
ti->type->presuspend_undo(ti);
break;
case POSTSUSPEND:
if (ti->type->postsuspend)
ti->type->postsuspend(ti);
break;
}
ti++;
}
}
void dm_table_presuspend_targets(struct dm_table *t)
{
if (!t)
return;
suspend_targets(t, PRESUSPEND);
}
void dm_table_presuspend_undo_targets(struct dm_table *t)
{
if (!t)
return;
suspend_targets(t, PRESUSPEND_UNDO);
}
void dm_table_postsuspend_targets(struct dm_table *t)
{
if (!t)
return;
suspend_targets(t, POSTSUSPEND);
}
int dm_table_resume_targets(struct dm_table *t)
{
int i, r = 0;
lockdep_assert_held(&t->md->suspend_lock);
for (i = 0; i < t->num_targets; i++) {
struct dm_target *ti = t->targets + i;
if (!ti->type->preresume)
continue;
r = ti->type->preresume(ti);
if (r) {
DMERR("%s: %s: preresume failed, error = %d",
dm_device_name(t->md), ti->type->name, r);
return r;
}
}
for (i = 0; i < t->num_targets; i++) {
struct dm_target *ti = t->targets + i;
if (ti->type->resume)
ti->type->resume(ti);
}
return 0;
}
void dm_table_add_target_callbacks(struct dm_table *t, struct dm_target_callbacks *cb)
{
list_add(&cb->list, &t->target_callbacks);
}
EXPORT_SYMBOL_GPL(dm_table_add_target_callbacks);
int dm_table_any_congested(struct dm_table *t, int bdi_bits)
{
struct dm_dev_internal *dd;
struct list_head *devices = dm_table_get_devices(t);
struct dm_target_callbacks *cb;
int r = 0;
list_for_each_entry(dd, devices, list) {
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
struct request_queue *q = bdev_get_queue(dd->dm_dev->bdev);
char b[BDEVNAME_SIZE];
if (likely(q))
r |= bdi_congested(q->backing_dev_info, bdi_bits);
else
DMWARN_LIMIT("%s: any_congested: nonexistent device %s",
dm_device_name(t->md),
dm: allow active and inactive tables to share dm_devs Until this change, when loading a new DM table, DM core would re-open all of the devices in the DM table. Now, DM core will avoid redundant device opens (and closes when destroying the old table) if the old table already has a device open using the same mode. This is achieved by managing reference counts on the table_devices that DM core now stores in the mapped_device structure (rather than in the dm_table structure). So a mapped_device's active and inactive dm_tables' dm_dev lists now just point to the dm_devs stored in the mapped_device's table_devices list. This improvement in DM core's device reference counting has the side-effect of fixing a long-standing limitation of the multipath target: a DM multipath table couldn't include any paths that were unusable (failed). For example: if all paths have failed and you add a new, working, path to the table; you can't use it since the table load would fail due to it still containing failed paths. Now a re-load of a multipath table can include failed devices and when those devices become active again they can be used instantly. The device list code in dm.c isn't a straight copy/paste from the code in dm-table.c, but it's very close (aside from some variable renames). One subtle difference is that find_table_device for the tables_devices list will only match devices with the same name and mode. This is because we don't want to upgrade a device's mode in the active table when an inactive table is loaded. Access to the mapped_device structure's tables_devices list requires a mutex (tables_devices_lock), so that tables cannot be created and destroyed concurrently. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-13 13:53:43 -05:00
bdevname(dd->dm_dev->bdev, b));
}
list_for_each_entry(cb, &t->target_callbacks, list)
if (cb->congested_fn)
r |= cb->congested_fn(cb, bdi_bits);
return r;
}
struct mapped_device *dm_table_get_md(struct dm_table *t)
{
return t->md;
}
EXPORT_SYMBOL(dm_table_get_md);
void dm_table_run_md_queue_async(struct dm_table *t)
{
struct mapped_device *md;
struct request_queue *queue;
unsigned long flags;
if (!dm_table_request_based(t))
return;
md = dm_table_get_md(t);
queue = dm_get_md_queue(md);
if (queue) {
if (queue->mq_ops)
blk_mq_run_hw_queues(queue, true);
else {
spin_lock_irqsave(queue->queue_lock, flags);
blk_run_queue_async(queue);
spin_unlock_irqrestore(queue->queue_lock, flags);
}
}
}
EXPORT_SYMBOL(dm_table_run_md_queue_async);