Since commit 9f3d6d7 chsc_get_channel_measurement_chars is called with
interrupts disabled during resume from hibernate. Since this function
used spin_unlock_irq, interrupts have been enabled accidentally. Fix
this by using the irqsave variant.
Since we can't guarantee the IRQ-enablement state for all (future/
external) callers, change the locking in related functions to prevent
similar bugs in the future.
Fixes: 9f3d6d7 ("s390/cio: update measurement characteristics")
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Trival fix, dev_err messages are missing a \n, so add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the DASD device gets blocked for any reason, e.g. because it is reserved
somewhere, the host_access_count sysfs entry or the host_access_list
debugfs entry may sleep forever. Make it interruptible so that userspace
can use ^C to abort the operation.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A DASD device consists of the device itself and a discipline with a
corresponding private structure. These fields are set up during online
processing right after the device is created and before it is processed by
the state machine and made available for I/O.
During offline processing the discipline pointer and the private data gets
freed within the state machine and without protection of the existing
reference count. This might lead to a kernel panic because a function might
have taken a device reference and accesses the discipline pointer and/or
private data of the device while this is already freed.
Fix by freeing the discipline pointer and the private data after ensuring
that there is no reference to the device left.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Internal I/O is processed by the _sleep_on_function which might wait for a
device to get operational. During offline processing this will never happen
and therefore the refcount of the device will not drop to zero and the
offline processing blocks as well.
Fix by letting requests fail in the _sleep_on function during offline
processing. No further handling of the requests is necessary since this is
internal I/O and the device is thrown away afterwards.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Lazy unmap (defer tlb flush after unmap until dma address reuse) can
greatly reduce the number of RPCIT instructions in the best case. In
reality we are often far away from the best case scenario because our
implementation suffers from the following problem:
To create dma addresses we maintain an iommu bitmap and a pointer into
that bitmap to mark the start of the next search. That pointer moves from
the start to the end of that bitmap and we issue a global tlb flush
once that pointer wraps around. To prevent address reuse before we issue
the tlb flush we even have to move the next pointer during unmaps - when
clearing a bit > next. This could lead to a situation where we only use
the rear part of that bitmap and issue more tlb flushes than expected.
To fix this we no longer clear bits during unmap but maintain a 2nd
bitmap which we use to mark addresses that can't be reused until we issue
the global tlb flush after wrap around.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split dma_update_trans into __dma_update_trans which handles updating
the dma translation tables and __dma_purge_tlb which takes care of
purging associated entries in the dma translation lookaside buffer.
The map_sg API makes use of this split approach by calling
__dma_update_trans once per physically contiguous address range but
__dma_purge_tlb only once per dma contiguous address range.
This results in less invocations of the expensive RPCIT instruction
when using map_sg.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Our map_sg implementation mapped sg entries independently of each other.
For ease of use and possible performance improvements this patch changes
the implementation to try to map as many (likely physically non-contiguous)
sglist entries as possible into a contiguous DMA segment.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Simplify the code we use to calculate dma addresses by putting
everything related in a dma_alloc_address function. Also provide
a dma_free_address counterpart.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We calculate dma addresses using an iommu bitmap. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
we've made sure that addresses created using that bitmap are below
the maximum reported by firmware. Thus the additional check for
that address to be within range can be removed.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a new function is attached to an iommu domain we need to register
I/O address translation parameters. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
start_dma and end_dma correctly describe the range of usable I/O addresses.
Simplify the code by using these values directly.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
These files were only including module.h for exception table
related functions. We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.
The additions of uaccess.h are to deal with implict includes like:
arch/s390/kernel/traps.c: In function 'do_report_trap':
arch/s390/kernel/traps.c:56:4: error: implicit declaration of function 'extable_fixup' [-Werror=implicit-function-declaration]
arch/s390/kernel/traps.c: In function 'illegal_op':
arch/s390/kernel/traps.c:173:3: error: implicit declaration of function 'get_user' [-Werror=implicit-function-declaration]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Export clp.h for usage by userspace.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
"irq" in vmur's int handler can be an error pointer. Don't dereference
this pointer in that case.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The DASD device driver throws change events for the DASD blockdevice
after the online processing is done so that udev rules can take
actions after it.
The change event was missing for unformatted devices.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This enables UBSAN for s390. We have to disable the null sanitizer
as s390 code does access memory via a null pointer (the prefix page).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some architectures use a hardware defined structure at address zero.
Checking for a null pointer will result in many ubsan reports.
Allow users to disable the null sanitizer.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
most unaligned accesses are reasonable efficient (no kernel emulation)
on s390, let's announce it
This also
- removes the ubsan false positives for unaligned accesses on s390 with
default config
- uses simpler arithmetic in several functions in several other areas
of the kernel like ethernet frame classification
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With git commit 0eab11c7e0d30de14a15ccd8269eef238321a8e1
"s390/vx: allow to include vx-insn.h with .include"
and an older gcc we get errors like this:
{standard input}:6: Error: can't open asm/vx-insn.h for reading:
No such file or directory
arch/s390/kernel/fpu.c:57: Error: Unrecognized opcode: `vstm'
To solve this issue simply add the path to arch/s390/include to
all assembler runs.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Static analysis with cppcheck detected that ret is not initialized
and hence garbage is potentially being returned in the case where
prng_data->ppnows.reseed_counter <= prng_reseed_limit.
Thanks to Martin Schwidefsky for spotting a mistake in my original
fix.
Fixes: 0177db01adf26cf9 ("s390/crypto: simplify return code handling")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The workqueue "appldata_wq" has been replaced with an ordered dedicated
workqueue.
WQ_MEM_RECLAIM has not been set since the workqueue is not being used on
a memory reclaim path.
The adapter->work_queue queues multiple work items viz
&adapter->scan_work, &port->rport_work, &adapter->ns_up_work,
&adapter->stat_work, adapter->work_queue, &adapter->events.work,
&port->gid_pn_work, &port->test_link_work. Hence, an ordered
dedicated workqueue has been used.
WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The XC instruction can be used to improve the speed of the raid6
recovery. The loops now operate on blocks of 256 bytes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The double while loops of the CTR mode encryption / decryption functions
are overly complex for little gain. Simplify the functions to a single
while loop at the cost of an additional memcpy of a few bytes for every
4K page worth of data.
Adapt the other crypto functions to make them all look alike.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The CPACF code makes some assumptions about the availablity of hardware
support. E.g. if the machine supports KM(AES-256) without chaining it is
assumed that KMC(AES-256) with chaining is available as well. For the
existing CPUs this is true but the architecturally correct way is to
check each CPACF functions on its own. This is what the query function
of each instructions is all about.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The aes and the des module register multiple crypto algorithms
dependent on the availability of specific CPACF instructions.
To simplify the deregistration with crypto_unregister_alg add
an array with pointers to the successfully registered algorithms
and use it for the error handling in the init function and in
the module exit function.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The CPACF instructions can complete with three different condition codes:
CC=0 for successful completion, CC=1 if the protected key verification
failed, and CC=3 for partial completion.
The inline functions will restart the CPACF instruction for partial
completion, this removes the CC=3 case. The CC=1 case is only relevant
for the protected key functions of the KM, KMC, KMAC and KMCTR
instructions. As the protected key functions are not used by the
current code, there is no need for any kind of return code handling.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use a separate define for the decryption modifier bit instead of
duplicating the function codes for encryption / decrypton.
In addition use an unsigned type for the function code.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The machine check handler will do one of two things if the floating-point
control, a floating point register or a vector register can not be
revalidated:
1) if the PSW indicates user mode the process is terminated
2) if the PSW indicates kernel mode the system is stopped
To unconditionally stop the system for 2) is incorrect.
There are three possible outcomes if the floating-point control, a
floating point register or a vector registers can not be revalidated:
1) The kernel is inside a kernel_fpu_begin/kernel_fpu_end block and
needs the register. The system is stopped.
2) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_CPU bit
is not set. The user space process needs the register and is killed.
3) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
is set. Neither the kernel nor the user space process needs the
lost register. Just revalidate it and continue.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case of nested user of the FPU or vector registers in the kernel
the current code uses the mask of the FPU/vector registers of the
previous contexts to decide which registers to save and restore.
E.g. if the previous context used KERNEL_VXR_V0V7 and the next
context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
are stored to the FPU state structure. But this is not necessary
as the next context does not use these registers.
Rework the FPU/vector register save and restore code. The new code
does a few things differently:
1) A lowcore field is used instead of a per-cpu variable.
2) The kernel_fpu_end function now has two parameters just like
kernel_fpu_begin. The register flags are required by both
functions to save / restore the minimal register set.
3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
update of the register masks. If the user space FPU registers
have already been stored neither save_fpu_regs nor the
__kernel_fpu_begin/__kernel_fpu_end functions have to be called
for the first context. In this case kernel_fpu_begin adds 7
instructions and kernel_fpu_end adds 4 instructions.
3) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
to save / restore the vector registers are simplified a bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To make the vx-insn.h more versatile avoid cpp preprocessor macros
and allow to use plain numbers for vector and general purpose register
operands. With that you can emit an .include from a C file into the
assembler text and then use the vx-insn macros in inline assemblies.
For example:
asm (".include \"asm/vx-insn.h\"");
static inline void xor_vec(int x, int y, int z)
{
asm volatile("VX %0,%1,%2"
: : "i" (x), "i" (y), "i" (z));
}
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The increment might not be atomic and we're not holding the
timekeeper_lock. Therefore we might lose an update to count, resulting in
VDSO being trapped in a loop. As other archs also simply update the
values and count doesn't seem to have an impact on reloading of these
values in VDSO code, let's just remove the update of tb_update_count.
Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
By leaving fixup_cc unset, only the clock comparator of the cpu actually
doing the sync is fixed up until now.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are still some etr leftovers and wrong comments, let's clean that up.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The way we call do_adjtimex() today is broken. It has 0 effect, as
ADJ_OFFSET_SINGLESHOT (0x0001) in the kernel maps to !ADJ_ADJTIME
(in contrast to user space where it maps to ADJ_OFFSET_SINGLESHOT |
ADJ_ADJTIME - 0x8001). !ADJ_ADJTIME will silently ignore all adjustments
without STA_PLL being active. We could switch to ADJ_ADJTIME or turn
STA_PLL on, but still we would run into some problems:
- Even when switching to nanoseconds, we lose accuracy.
- Successive calls to do_adjtimex() will simply overwrite any leftovers
from the previous call (if not fully handled)
- Anything that NTP does using the sysctl heavily interferes with our
use.
- !ADJ_ADJTIME will silently round stuff > or < than 0.5 seconds
Reusing do_adjtimex() here just feels wrong. The whole STP synchronization
works right now *somehow* only, as do_adjtimex() does nothing and our
TOD clock jumps in time, although it shouldn't. This is especially bad
as the clock could jump backwards in time. We will have to find another
way to fix this up.
As leap seconds are also not properly handled yet, let's just get rid of
all this complex logic altogether and use the correct clock_delta for
fixing up the clock comparator and keeping the sched_clock monotonic.
This change should have 0 effect on the current STP mechanism. Once we
know how to best handle sync events and leap second updates, we'll start
with a fresh implementation.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull facility mask patch from the KVM tree.
* tag 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: generate facility mask from readable list
Automatically generate the KVM facility mask out of a readable list.
Manually changing the masks is very error prone, especially if the
special IBM bit numbering has to be considered.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The 'report_error' interface for PCI devices found on s390 can be
used by a user space program to inject an adapter error notification.
Add a new kernel interface zpci_report_error to allow a PCI device
driver to inject these error notifications without a detour over
user space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
cio_cancel was declared twice. Remove one of them.
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge the __p[m|u]xdp_idte and __p[m|u]dp_idte_local functions into a
single __p[m|u]dp_idte function with an additional parameter.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge the __ptep_ipte and __ptep_ipte_local functions into a single
__ptep_ipte function with an additional parameter. The __pte_ipte_range
function is still extra as the while loops makes it hard to merge.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The __tlb_flush_mm() helper uses a global flush if the mm struct
has a gmap structure attached to it. Replace the global flush with
two individual flushes by means of the IDTE instruction if only a
single gmap is attached the the mm.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The local-clearing control of the IDTE instruction does not have any effect
for the clearing-by-ASCE operation. Only the invalidation-and-clearing
operation respects the local-clearing bit.
Remove __tlb_flush_idte_local and simplify the batched TLB flushing code.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
- avoid signed math problems on unexpected compilers
- avoid false positives at very end of kernel text range checks
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIcBAABCgAGBQJXu7MGAAoJEIly9N/cbcAmARgP/ArvNIY66hLgESmerZlvDjJC
Iip8497f9bsKYgz+6DmHyRzJwJafN2hwIrRh2rnS1oHI48KLTFYP5N8RigPGvn8u
ixHpCMlEBV2zIkne1rtPDUkAHFgDc3j5zvg0ra/YGmFbhTlwTci67COYsq0PaSY/
LLMRt7gK5NjHxD+X6H7ORyL34gLtQOgAQw96wPeaO4HZ8YEecOR8LsMUw+IrGbbo
KclXNO+v2t+ROCUOZukKG+2h02EuMzW3BLnX+FAVLeJgwrjgsd/6mRVkBlnjYDRP
GDKlw2X5QlyDj+Kz/mHiYVuAJTMbN18y2kns7MQoPmozmtVet4YYXtGL/MRmHHW3
fh5KLuLyF59HY/1OLqQ/6Nxz7ggm5MuPMCF8brfFPlblFBO/OLKOrry/lptKzvwm
/5Lp2tVmH/w5+WdKsZM6gNbTsaC7HxKMlXodi+kHpCO7BF23j+fJLsCCPgNjwRyH
B6pxN4bk5gK2Xd1yRxSPt64BJ+Jt995EddP0dY6+UIhliSrQPHtilTe9Ht0nTFnG
Ar1pei3iSPpp91euVt6Glc5nLktryJ8AL6OEyp847he6C84k/R8lk2gu14iHO/U6
WLa9nOCVuQBLifHOv/oKQSBHt6dezjvmi93cY9/3B//SYC5SxzA6vqoNgmLaBEle
Hb3ZTQ77FCq3PE7Ty712
=5CiX
-----END PGP SIGNATURE-----
Merge tag 'usercopy-v4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardened usercopy fixes from Kees Cook:
- avoid signed math problems on unexpected compilers
- avoid false positives at very end of kernel text range checks
* tag 'usercopy-v4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
usercopy: fix overlap check for kernel text
usercopy: avoid potentially undefined behavior in pointer math
Pull crypto fixes from Herbert Xu:
"This fixes a number of memory corruption bugs in the newly added
sha256-mb/sha256-mb code"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sha512-mb - fix ctx pointer
crypto: sha256-mb - fix ctx pointer and digest copy