edac: cortex: Snapshot arm64 cortex-a cpu edac driver

This is a snapshot of the arm64 Cortex-A CPU EDAC driver from
msm-4.4 commit a9a0da48eb3f (Merge "edac: cortex: Remove
WARN_ON messages"). The following changes were made on top
of it to get it to compile.

1. #define ARM_CPU_PART_KRYO2XX_GOLD      0x800
   #define ARM_CPU_PART_KRYO2XX_SILVER    0x801

2. register_cpu_notify() has been replaced with the CPU hotplug
state machine.

Change-Id: Ic84ceb52114947f17fc6d0fac87039d4b450f70a
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Mukesh Ojha 2018-09-26 12:37:09 +05:30
parent 26f0535861
commit cd6d26aaae
6 changed files with 1101 additions and 0 deletions
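
The two part-number defines listed in the commit message are presumably matched against the part-number field of MIDR_EL1, which occupies bits [15:4] of the register. The following is a minimal userspace sketch of that decoding, not code taken from the driver (its diff is suppressed on this page). The helper names `midr_part()` and `kryo2xx_name()` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Part numbers quoted from the commit message. */
#define ARM_CPU_PART_KRYO2XX_GOLD	0x800
#define ARM_CPU_PART_KRYO2XX_SILVER	0x801

/* MIDR_EL1 layout: the 12-bit PartNum field sits in bits [15:4]. */
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_PARTNUM_MASK	0xfffu

/* Hypothetical helper: extract the part number from a raw MIDR value. */
static inline unsigned int midr_part(uint32_t midr)
{
	return (midr >> MIDR_PARTNUM_SHIFT) & MIDR_PARTNUM_MASK;
}

/* Hypothetical helper: map a raw MIDR value to a label for log output. */
static inline const char *kryo2xx_name(uint32_t midr)
{
	switch (midr_part(midr)) {
	case ARM_CPU_PART_KRYO2XX_GOLD:
		return "Kryo2xx Gold";
	case ARM_CPU_PART_KRYO2XX_SILVER:
		return "Kryo2xx Silver";
	default:
		return "unknown";
	}
}
```

A caller would pass the raw register value, for example `kryo2xx_name(0x51008010)`, and fall back to a generic path when the result is "unknown".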


@@ -0,0 +1,34 @@
* ARM Cortex A53 / A57 cache error reporting driver
Required properties:
- compatible: Should be "arm,arm64-cpu-erp"
- interrupts: List of hardware interrupts that may indicate an error condition
in the CPU subsystem, or in the L1 / L2 caches. At least one interrupt entry
is required.
- interrupt-names: Must contain one or more of the following IRQ types:
"pri-dbe-irq" - double-bit error interrupt for primary cluster
"sec-dbe-irq" - double-bit error interrupt for secondary cluster
"pri-ext-irq" - external bus error interrupt for primary cluster
"sec-ext-irq" - external bus error interrupt for secondary cluster
"cci-irq" - CCI error interrupt. If this interrupt is present, having
the 'cci' reg-base defined using the 'reg' property is
recommended.
At least one irq entry is required.
Optional properties:
- reg: Should contain physical address of the CCI register space
- reg-names: Should contain 'cci'. Must be present if 'reg' property is present
- poll-delay-msec: Indicates how often the edac check callback should be called. Time in msec.
Example:
cpu_cache_erp {
compatible = "arm,arm64-cpu-erp";
interrupt-names = "pri-dbe-irq",
"sec-dbe-irq",
"pri-ext-irq",
"sec-ext-irq";
interrupts = <0 92 0>,
<0 91 0>,
<0 96 0>,
<0 95 0>;
};
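
The binding rules above (at least one entry, every entry drawn from the five listed IRQ types) can be sketched as a small checker. This is a userspace illustration under stated assumptions, not the driver's parsing code: the names `erp_irq_names`, `erp_irq_name_known()` and `erp_check_irq_names()` are hypothetical, and a real driver would read the strings from the device tree rather than from a C array.

```c
#include <stddef.h>
#include <string.h>

/* IRQ type strings copied from the binding text. */
static const char *const erp_irq_names[] = {
	"pri-dbe-irq", "sec-dbe-irq", "pri-ext-irq", "sec-ext-irq", "cci-irq",
};

/* Hypothetical helper: return 1 if name is a listed IRQ type, else 0. */
static int erp_irq_name_known(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(erp_irq_names) / sizeof(erp_irq_names[0]); i++)
		if (strcmp(name, erp_irq_names[i]) == 0)
			return 1;
	return 0;
}

/*
 * Hypothetical helper: validate an interrupt-names list.
 * Returns 0 if the list has at least one entry and every entry is a
 * listed IRQ type, -1 otherwise.
 */
static int erp_check_irq_names(const char *const *names, size_t count)
{
	size_t i;

	if (count == 0)
		return -1;	/* the binding requires at least one entry */
	for (i = 0; i < count; i++)
		if (!erp_irq_name_known(names[i]))
			return -1;
	return 0;
}
```

Applied to the four names in the example node, the check passes; an empty list or an unlisted name is rejected.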


@@ -0,0 +1,25 @@
/* Copyright (c) 2014-2018, The Linux Foundation. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 and
* only version 2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*/
#ifndef ASM_EDAC_H
#define ASM_EDAC_H
#if defined(CONFIG_EDAC_CORTEX_ARM64) && \
!defined(CONFIG_EDAC_CORTEX_ARM64_DBE_IRQ_ONLY)
void arm64_check_cache_ecc(void *info);
#else
static inline void arm64_check_cache_ecc(void *info) { }
#endif
static inline void atomic_scrub(void *addr, int size) { }
#endif


@@ -532,6 +532,46 @@ config EDAC_XGENE
Support for error detection and correction on the
APM X-Gene family of SOCs.
config EDAC_CORTEX_ARM64
depends on ARM64
bool "ARM Cortex A CPUs L1/L2 Caches"
help
Support for error detection and correction on the
ARM Cortex A53 and A57 CPUs. For debugging issues having to do with
stability and overall system health, you should probably say 'Y'
here.
config EDAC_CORTEX_ARM64_PANIC_ON_CE
depends on EDAC_CORTEX_ARM64
bool "Panic on correctable errors"
help
Forcibly cause a kernel panic if a correctable error (CE) is
detected, even though the error is (by definition) correctable and
would otherwise result in no adverse system effects. This can reduce
debugging times on hardware which may be operating at voltages or
frequencies outside normal specification.
For production builds, you should definitely say 'N' here.
config EDAC_CORTEX_ARM64_DBE_IRQ_ONLY
depends on EDAC_CORTEX_ARM64
bool "Only check for parity errors when an irq is generated"
help
In ARM64, parity errors will cause an interrupt
to be triggered but may also cause a data abort to
occur. Only check for EDAC errors for the interrupt.
If unsure, say no.
config EDAC_CORTEX_ARM64_PANIC_ON_UE
depends on EDAC_CORTEX_ARM64
bool "Panic on uncorrectable errors"
help
Forcibly cause a kernel panic if an uncorrectable error (UE) is
detected. This can reduce debugging times on hardware which may be
operating at voltages or frequencies outside normal specification.
For production builds, you should probably say 'N' here.
config EDAC_QCOM_LLCC
depends on QCOM_LLCC
tristate "QCOM LLCC Caches"


@@ -80,4 +80,5 @@ obj-$(CONFIG_EDAC_THUNDERX) += thunderx_edac.o
obj-$(CONFIG_EDAC_ALTERA) += altera_edac.o
obj-$(CONFIG_EDAC_SYNOPSYS) += synopsys_edac.o
obj-$(CONFIG_EDAC_XGENE) += xgene_edac.o
obj-$(CONFIG_EDAC_CORTEX_ARM64) += cortex_arm64_edac.o
obj-$(CONFIG_EDAC_QCOM_LLCC) += qcom_llcc_edac.o

File diff suppressed because it is too large


@@ -100,6 +100,7 @@ enum cpuhp_state {
CPUHP_AP_RCUTREE_DYING,
CPUHP_AP_KMAP_DYING,
CPUHP_AP_IRQ_GIC_STARTING,
CPUHP_AP_EDAC_PMU_STARTING,
CPUHP_AP_IRQ_HIP04_STARTING,
CPUHP_AP_IRQ_ARMADA_XP_STARTING,
CPUHP_AP_IRQ_BCM2836_STARTING,
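
The new CPUHP_AP_EDAC_PMU_STARTING entry is what the commit message means by replacing register_cpu_notify() with the CPU hotplug state machine. In the kernel, a driver would typically attach a startup/teardown pair to such a state through cpuhp_setup_state(); the driver diff is suppressed on this page, so its actual callbacks are not visible. The following is only a userspace toy model of the idea: callbacks are registered per state, run in enum order when a CPU comes up, and run in reverse order when it goes down. Every `toy_` name is hypothetical and none of this is kernel API.

```c
#include <stddef.h>

/* Toy subset of the states in the hunk, in the same relative order. */
enum toy_cpuhp_state {
	TOY_IRQ_GIC_STARTING,
	TOY_EDAC_PMU_STARTING,
	TOY_IRQ_HIP04_STARTING,
	TOY_STATE_MAX,
};

typedef int (*toy_cpuhp_cb)(unsigned int cpu);

struct toy_cpuhp_step {
	const char *name;
	toy_cpuhp_cb startup;
	toy_cpuhp_cb teardown;
};

static struct toy_cpuhp_step toy_steps[TOY_STATE_MAX];

/* Register one startup/teardown pair; -1 if the state is invalid or taken. */
static int toy_cpuhp_setup_state(enum toy_cpuhp_state state, const char *name,
				 toy_cpuhp_cb startup, toy_cpuhp_cb teardown)
{
	if ((int)state < 0 || state >= TOY_STATE_MAX || toy_steps[state].name)
		return -1;
	toy_steps[state].name = name;
	toy_steps[state].startup = startup;
	toy_steps[state].teardown = teardown;
	return 0;
}

/* Bring a CPU up: run startup callbacks in ascending state order. */
static int toy_cpu_up(unsigned int cpu)
{
	int s;

	for (s = 0; s < TOY_STATE_MAX; s++) {
		if (toy_steps[s].startup && toy_steps[s].startup(cpu)) {
			/* Roll back the states that already ran. */
			while (--s >= 0)
				if (toy_steps[s].teardown)
					toy_steps[s].teardown(cpu);
			return -1;
		}
	}
	return 0;
}

/* Take a CPU down: run teardown callbacks in descending state order. */
static void toy_cpu_down(unsigned int cpu)
{
	int s;

	for (s = TOY_STATE_MAX - 1; s >= 0; s--)
		if (toy_steps[s].teardown)
			toy_steps[s].teardown(cpu);
}

/* Hypothetical per-CPU work: track which CPUs have monitoring enabled. */
static unsigned int toy_edac_enabled_mask;

static int toy_edac_pmu_starting(unsigned int cpu)
{
	toy_edac_enabled_mask |= 1u << cpu;
	return 0;
}

static int toy_edac_pmu_dying(unsigned int cpu)
{
	toy_edac_enabled_mask &= ~(1u << cpu);
	return 0;
}
```

Compared with a notifier, which receives every hotplug event and switches on the action, this shape gives each subsystem a fixed slot in the bring-up order and a matching teardown, which is why a new enum entry has to be added at a specific position.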