This patch includes following fixes and update:
- Only some events in the Sbox and Mbox can use the match/mask
registers, add code to check this.
- The format definitions for xbr_mm_cfg and xbr_match registers
in the Rbox are wrong, xbr_mm_cfg should use 32 bits, xbr_match
should use 64 bits.
- Cleanup the Rbox code. Compute the addresses extra registers in
the enable_event function instead of the hw_config function.
This simplifies the code in nhmex_rbox_alter_er().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344229882-3907-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix the following section mismatch:
WARNING: arch/x86/kernel/cpu/built-in.o(.text+0x7ad9): Section mismatch in reference from the function uncore_types_exit() to the function .init.text:uncore_type_exit()
The function uncore_types_exit() references the function __init
uncore_type_exit(). This is often because uncore_types_exit lacks a
__init annotation or the annotation of uncore_type_exit is wrong.
caused by 14371cce03c2 ("perf: Add generic PCI uncore PMU device
support").
Cc: Zheng Yan <zheng.z.yan@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1339741902-8449-8-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
GCC built with nonstandard options can enable -fpic by default.
We never want this for 32-bit kernels and it will break the build.
[ hpa: Notably the Android toolchain apparently does this. ]
Change-Id: Iaab7d66e598b1c65ac4a4f0229eca2cd3d0d2898
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Link: http://lkml.kernel.org/r/1344624546-29691-1-git-send-email-andrew.p.boie@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Introducing PERF_SAMPLE_STACK_USER sample type bit to trigger the dump
of the user level stack on sample. The size of the dump is specified by
sample_stack_user value.
Being able to dump parts of the user stack, starting from the stack
pointer, will be useful to make a post mortem dwarf CFI based stack
unwinding.
Added HAVE_PERF_USER_STACK_DUMP config option to determine if the
architecture provides user stack dump on perf event samples. This needs
access to the user stack pointer which is not unified across
architectures. Enabling this for x86 architecture.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a generic way to use __output_copy function with specific copy
function via DEFINE_PERF_OUTPUT_COPY macro.
Using this to add new __output_copy_user function, that provides output
copy from user pointers. For x86 the copy_from_user_nmi function is used
and __copy_from_user_inatomic for the rest of the architectures.
This new function will be used in user stack dump on sample, coming in
next patches.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing PERF_SAMPLE_REGS_USER sample type bit to trigger the dump of
user level registers on sample. Registers we want to dump are specified
by sample_regs_user bitmask.
Only user level registers are dumped at the moment. Meaning the register
values of the user space context as it was before the user entered the
kernel for whatever reason (syscall, irq, exception, or a PMI happening
in userspace).
The layout of the sample_regs_user bitmap is described in
asm/perf_regs.h for archs that support register dump.
This is going to be useful to bring Dwarf CFI based stack unwinding on
top of samples.
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
[ Dump registers ABI specification. ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This brings a new API to help the selective dump of registers on event
sampling, and its implementation for x86 arch.
Added HAVE_PERF_REGS config option to determine if the architecture
provides perf registers ABI.
The information about desired registers will be passed in u64 mask.
It's up to the architecture to map the registers into the mask bits.
For the x86 arch implementation, both 32 and 64 bit registers bits are
defined within single enum to ensure 64 bit system can provide register
dump for compat task if needed in the future.
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
[ Added missing linux/errno.h include ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On Intel systems corrected machine check interrupts (CMCI) may be sent to
multiple logical processors; possibly to all processors on the affected
socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface"). This means
that a persistent error (such as a stuck bit in ECC memory) may cause
a storm of interrupts that greatly hinders or prevents forward progress
(probably on many processors).
To solve this we keep track of the rate at which each processor sees
CMCI. If we exceed a threshold, we disable CMCI delivery and switch to
polling the machine check banks. If the storm subsides (none of the
affected processors see any more errors for a complete poll interval) we
re-enable CMCI.
[Tony: Added console messages when storm begins/ends and increased storm
threshold from 5 to 15 so we have a few more logged entries before we
disable interrupts and start dropping reports]
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
cmci_discover() works out which machine check banks support CMCI, and
which of those are shared by multiple logical processors. It uses this
information to ensure that exactly one cpu is designated the owner of
each bank so that when interrupts are broadcast to multiple cpus, only one
of them will look in a shared bank to log the error and clear the bank.
At boot time cmci_discover() performs this task silently. But during
certain cpu hotplug operations it prints out a set of summary lines
like this:
CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11
CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3
The value of these messages seems very low. A user might painstakingly
cross-check against the data sheet for a processor to ensure that all
CMCI supported banks are correctly reported, but this seems improbable.
If users really wanted to do this, we should print the information at
boot time too.
Remove the messages.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Clear AVX, AVX2 features along with clearing XSAVE feature bits,
as part of the parsing "noxsave" parameter.
Fixes the kernel boot panic with "noxsave" boot parameter.
We could have checked cpu_has_osxsave along with cpu_has_avx etc, but Peter
mentioned clearing the feature bits will be better for uses like
static_cpu_has() etc.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343755754.2041.2.camel@sbsiddha-desk.sc.intel.com
Cc: <stable@vger.kernel.org> # v3.5
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Run the mprotect.c microbenchmark on all our families >= K8 and preset
the flushall shift variable accordingly.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344272439-29080-5-git-send-email-bp@amd64.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Push the max CPUID leaf check into the ->detect_tlb function and remove
general test case from the generic path.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344272439-29080-3-git-send-email-bp@amd64.org
Acked-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The TLB characteristics appeared like this in dmesg:
[ 0.065817] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
[ 0.065817] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512
[ 0.065817] tlb_flushall_shift is 0xffffffff
where tlb_flushall_shift is actually -1 but dumped as a hex number.
However, the Kconfig option CONFIG_DEBUG_TLBFLUSH and the rest of the
code treats this as a signed decimal and states "If you set it to -1,
the code flushes the whole TLB unconditionally."
So, fix its formatting in accordance with the other references to it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344272439-29080-2-git-send-email-bp@amd64.org
Acked-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When MSR_KVM_PV_EOI_EN was added to msrs_to_save array
KVM_SAVE_MSRS_BEGIN was not updated accordingly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Pull ACPI and power management fixes from Len Brown:
"A 3.3 sleep regression fixed, numa bugfix, plus some minor cleanups"
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
ACPI processor: Fix tick_broadcast_mask online/offline regression
ACPI: Only count valid srat memory structures
ACPI: Untangle a return statement for better readability
ACPI / PCI: Do not try to acquire _OSC control if that is hopeless
ACPI: delete _GTS/_BFS support
ACPI/x86: revert 'x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler'
ACPI: replace strlen("string") with sizeof("string") -1
ACPI / PM: Fix build warning in sleep.c for CONFIG_ACPI_SLEEP unset
No point in having double cases if we can simply mask the FROZEN bit
out.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Split timer init function into the init and the start part, so the
start part can replace the open coded version in CPU_DOWN_FAILED.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
raise_mce() fiddles with global state, but lacks any kind of
serialization.
Add a mutex around the raise_mce() call, so concurrent writers do not
stomp on each other toes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
raise_mce() has a code path which does not disable preemption when the
raise_local() is called. The per cpu variable access in raise_local()
depends on preemption being disabled to be functional. So that code
path was either never tested or never tested with CONFIG_DEBUG_PREEMPT
enabled.
Add the missing preempt_disable/enable() pair around the call.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pull x86 fixes from Ingo Molnar:
"Various fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86-64, kcmp: The kcmp system call can be common
arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case
x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h>
x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
x86, nops: Missing break resulting in incorrect selection on Intel
x86: CONFIG_CC_STACKPROTECTOR=y is no longer experimental
Pull perf fixes from Ingo Molnar:
"Fix merge window fallout and fix sleep profiling (this was always
broken, so it's not a fix for the merge window - we can skip this one
from the head of the tree)."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/trace: Add ability to set a target task for events
perf/x86: Fix USER/KERNEL tagging of samples properly
perf/x86/intel/uncore: Make UNCORE_PMU_HRTIMER_INTERVAL 64-bit
Otherwise you could run into:
WARN_ON in numa_register_memblks(), because node_possible_map is zero
References: https://bugzilla.novell.com/show_bug.cgi?id=757888
On this machine (ProLiant ML570 G3) the SRAT table contains:
- No processor affinities
- One memory affinity structure (which is set disabled)
CC: Per Jessen <per@opensuse.org>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Pull OLPC platform updates from Andres Salomon:
"These move the OLPC Embedded Controller driver out of
arch/x86/platform and into drivers/platform/olpc.
OLPC machines are now ARM-based (which means lots of x86 and ARM
changes), but are typically pretty self-contained.. so it makes more
sense to go through a separate OLPC tree after getting the appropriate
review/ACKs."
* 'for-linus-3.6' of git://dev.laptop.org/users/dilinger/linux-olpc:
x86: OLPC: move s/r-related EC cmds to EC driver
Platform: OLPC: move global variables into priv struct
Platform: OLPC: move debugfs support from x86 EC driver
x86: OLPC: switch over to using new EC driver on x86
Platform: OLPC: add a suspended flag to the EC driver
Platform: OLPC: turn EC driver into a platform_driver
Platform: OLPC: allow EC cmd to be overridden, and create a workqueue to call it
drivers: OLPC: update various drivers to include olpc-ec.h
Platform: OLPC: add a stub to drivers/platform/ for the OLPC EC driver
When we release pages back during bootup:
Freeing 9d-100 pfn range: 99 pages freed
Freeing 9cf36-9d0d2 pfn range: 412 pages freed
Freeing 9f6bd-9f6bf pfn range: 2 pages freed
Freeing 9f714-9f7bf pfn range: 171 pages freed
Freeing 9f7e0-9f7ff pfn range: 31 pages freed
Freeing 9f800-100000 pfn range: 395264 pages freed
Released 395979 pages of unused memory
We then try to populate those pages back. In the P2M tree however
the space for those leafs must be reserved - as such we use extend_brk.
We reserve 8MB of _brk space, which means we can fit over
1048576 PFNs - which is more than we should ever need.
Without this, on certain compilation of the kernel we would hit:
(XEN) domain_crash_sync called from entry.S
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff818aad3b>]
(XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81a7c000 rbx: 000000000000003d rcx: 0000000000001000
(XEN) rdx: ffffffff81a7b000 rsi: 0000000000001000 rdi: 0000000000001000
(XEN) rbp: ffffffff81801cd8 rsp: ffffffff81801c98 r8: 0000000000100000
(XEN) r9: ffffffff81a7a000 r10: 0000000000000001 r11: 0000000000000003
(XEN) r12: 0000000000000004 r13: 0000000000000004 r14: 000000000000003d
(XEN) r15: 00000000000001e8 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 0000000125803000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff81801c98:
.. which is extend_brk hitting a BUG_ON.
Interestingly enough, most of the time we are not going to hit this
b/c the _brk space is quite large (v3.5):
ffffffff81a25000 B __brk_base
ffffffff81e43000 B __brk_limit
= ~4MB.
vs earlier kernels (with this back-ported), the space is smaller:
ffffffff81a25000 B __brk_base
ffffffff81a7b000 B __brk_limit
= 344 kBytes.
where we would certainly hit this and hit extend_brk.
Note that git commit c3d93f880197953f86ab90d9da4744e926b38e33
(xen: populate correct number of pages when across mem boundary (v2))
exposed this bug).
[v1: Made it 8MB of _brk space instead of 4MB per Jan's suggestion]
CC: stable@vger.kernel.org #only for 3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull UML fixes from Richard Weinberger:
"This patch set contains mostly fixes and cleanups. The UML tty driver
uses now tty_port and is no longer broken like hell :-)"
* 'for-linus-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Add arch/x86/um to MAINTAINERS
um: pass siginfo to guest process
um: fix ubd_file_size for read-only files
um: pull interrupt_end() into userspace()
um: split syscall_trace(), pass pt_regs to it
um: switch UPT_SET_RETURN_VALUE and regs_return_value to pt_regs
um: set BLK_CGROUP=y in defconfig
um: remove count_lock
um: fully use tty_port
um: Remove dead code
um: remove line_ioctl()
TTY: um/line, use tty from tty_port
TTY: um/line, add tty_port
Commit b2da15ac26a0c ("KVM: VMX: Optimize %ds, %es reload") broke i386
in the following scenario:
vcpu_load
...
vmx_save_host_state
vmx_vcpu_run
(ds.rpl, es.rpl cleared by hardware)
interrupt
push ds, es # pushes bad ds, es
schedule
vmx_vcpu_put
vmx_load_host_state
reload ds, es (with __USER_DS)
pop ds, es # of other thread's stack
iret
# other thread runs
interrupt
push ds, es
schedule # back in vcpu thread
pop ds, es # now with rpl=0
iret
...
vcpu_put
resume_userspace
iret # clears ds, es due to mismatched rpl
(instead of resume_userspace, we might return with SYSEXIT and then
take an exception; when the exception IRETs we end up with cleared
ds, es)
Fix by avoiding the optimization on i386 and reloading ds, es on the
lightweight exit path.
Reported-by: Chris Clayron <chris2553@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We already use the same system call handler for i386 and x86-64, there
is absolutely no reason x32 can't use the same system call, too.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@vger.kernel.org> v3.5
Link: http://lkml.kernel.org/n/tip-vwzk3qbcr3yjyxjg2j38vgy9@git.kernel.org
When a guest migrates to a new host, the system time difference from the
previous host is used in the updates to the kvmclock system time visible
to the guest, resulting in a continuation of correct kvmclock based guest
timekeeping.
The wall clock component of the kvmclock provided time is currently not
updated with this same time offset. Since the Linux guest caches the
wall clock based time, this discrepency is not noticed until the guest is
rebooted. After reboot the guest's time calculations are off.
This patch adjusts the wall clock by the kvmclock_offset, resulting in
correct guest time after a reboot.
Cc: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The new EC driver calls platform-specific suspend and resume hooks; run
XO-1-specific EC commands from there, rather than deep in s/r code. If we
attempt to run EC commands after the new EC driver has suspended, it is
refused by the ec->suspended checks.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: Paul Fox <pgf@laptop.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
There's nothing about the debugfs interface for the EC driver that is
architecture-specific, so move it into the arch-independent driver.
The code is mostly unchanged with the exception of renamed variables, coding
style changes, and API updates.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: Paul Fox <pgf@laptop.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
This uses the new EC driver framework in drivers/platform/olpc. The
XO-1 and XO-1.5-specific code is still in arch/x86, but the generic stuff
(including a new workqueue; no more running EC commands with IRQs disabled!)
can be shared with other architectures.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: Paul Fox <pgf@laptop.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Switch over to using olpc-ec.h in multiple steps, so as not to break builds.
This covers every driver that calls olpc_ec_cmd().
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: Paul Fox <pgf@laptop.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
The OLPC EC driver has outgrown arch/x86/platform/. It's time to both
share common code amongst different architectures, as well as move it out
of arch/x86/. The XO-1.75 is ARM-based, and the EC driver shares a lot of
code with the x86 code.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: Paul Fox <pgf@laptop.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Pull perf updates from Ingo Molnar:
"The biggest changes are Intel Nehalem-EX PMU uncore support, uprobes
updates/cleanups/fixes from Oleg and diverse tooling updates (mostly
fixes) now that Arnaldo is back from vacation."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
uprobes: __replace_page() needs munlock_vma_page()
uprobes: Rename vma_address() and make it return "unsigned long"
uprobes: Fix register_for_each_vma()->vma_address() check
uprobes: Introduce vaddr_to_offset(vma, vaddr)
uprobes: Teach build_probe_list() to consider the range
uprobes: Remove insert_vm_struct()->uprobe_mmap()
uprobes: Remove copy_vma()->uprobe_mmap()
uprobes: Fix overflow in vma_address()/find_active_uprobe()
uprobes: Suppress uprobe_munmap() from mmput()
uprobes: Uprobe_mmap/munmap needs list_for_each_entry_safe()
uprobes: Clean up and document write_opcode()->lock_page(old_page)
uprobes: Kill write_opcode()->lock_page(new_page)
uprobes: __replace_page() should not use page_address_in_vma()
uprobes: Don't recheck vma/f_mapping in write_opcode()
perf/x86: Fix missing struct before structure name
perf/x86: Fix format definition of SNB-EP uncore QPI box
perf/x86: Make bitfield unsigned
perf/x86: Fix LLC-* and node-* events on Intel SandyBridge
perf/x86: Add Intel Nehalem-EX uncore support
perf/x86: Fix typo in format definition of uncore PCU filter
...
Some PMUs don't provide a full register set for their sample,
specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide
'better' than regular interrupt accuracy.
In this case we use the interrupt regs as basis and over-write some
fields (typically IP) with different information.
The perf core however uses user_mode() to distinguish user/kernel
samples, user_mode() relies on regs->cs. If the interrupt skid pushed
us over a boundary the new IP might not be in the same domain as the
interrupt.
Commit ce5c1fe9a9e ("perf/x86: Fix USER/KERNEL tagging of samples")
tried to fix this by making the perf core use kernel_ip(). This
however is wrong (TM), as pointed out by Linus, since it doesn't allow
for VM86 and non-zero based segments in IA32 mode.
Therefore, provide a new helper to set the regs->ip field,
set_linear_ip(), which massages the regs into a suitable state
assuming the provided IP is in fact a linear address.
Also modify perf_instruction_pointer() and perf_callchain_user() to
deal with segments base offsets.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
i386 allmodconfig:
arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_hrtimer':
arch/x86/kernel/cpu/perf_event_intel_uncore.c:728: warning: integer overflow in expression
arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_start_hrtimer':
arch/x86/kernel/cpu/perf_event_intel_uncore.c:735: warning: integer overflow in expression
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-h84qlqj02zrojmxxybzmy9hi@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add function tracer based kprobe optimization support
handlers on x86. This allows kprobes to use function
tracer for probing on mcount call.
Link: http://lkml.kernel.org/r/20120605102838.27845.26317.stgit@localhost.localdomain
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ Updated to new port of ftrace save regs functions ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The graph caller is called by the mcount callers, which already does
the check against the function_trace_stop variable. No reason to
check it again.
Link: http://lkml.kernel.org/r/20120711195745.588538769@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The final position of the stack after saving regs and setting up
the parameters for ftrace_regs_call, is the position of the pt_regs
needed for the 4th parameter. Instead of saving it into a temporary
reg and pushing the reg, simply push the stack pointer.
Link: http://lkml.kernel.org/r/1342702344.12353.16.camel@gandalf.stny.rr.com
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
cd74257b974d6d26442c97891c4d05772748b177
patched up GTS/BFS -- a feature we want to remove.
So revert it (by hand, due to conflict in sleep.h)
to prepare for GTS/BFS removal.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Rather than #define the options manually in the architecture code, add
Kconfig options for them and select them there instead. This also allows
us to select the compat IPC version parsing automatically for platforms
using the old compat IPC interface.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two ways to create /sys/firmware/memmap/X sysfs:
- firmware_map_add_early
When the system starts, it is calledd from e820_reserve_resources()
- firmware_map_add_hotplug
When the memory is hot plugged, it is called from add_memory()
But these functions are called without unifying value of end argument as
below:
- end argument of firmware_map_add_early() : start + size - 1
- end argument of firmware_map_add_hogplug() : start + size
The patch unifies them to "start + size". Even if applying the patch,
/sys/firmware/memmap/X/end file content does not change.
[akpm@linux-foundation.org: clarify comments]
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE and use this instead
of the multitude of #if defined() checks in atomic64_test.c
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86/mm changes from Peter Anvin:
"The big change here is the patchset by Alex Shi to use INVLPG to flush
only the affected pages when we only need to flush a small page range.
It also removes the special INVALIDATE_TLB_VECTOR interrupts (32
vectors!) and replace it with an ordinary IPI function call."
Fix up trivial conflicts in arch/x86/include/asm/apic.h (added code next
to changed line)
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tlb: Fix build warning and crash when building for !SMP
x86/tlb: do flush_tlb_kernel_range by 'invlpg'
x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR
x86/tlb: enable tlb flush range support for x86
mm/mmu_gather: enable tlb flush range in generic mmu_gather
x86/tlb: add tlb_flushall_shift knob into debugfs
x86/tlb: add tlb_flushall_shift for specific CPU
x86/tlb: fall back to flush all when meet a THP large page
x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range
x86/tlb_info: get last level TLB entry number of CPU
x86: Add read_mostly declaration/definition to variables from smp.h
x86: Define early read-mostly per-cpu macros
Pul x86/efi changes from Ingo Molnar:
"This tree adds an EFI bootloader handover protocol, which, once
supported on the bootloader side, will make bootup faster and might
result in simpler bootloaders.
The other change activates the EFI wall clock time accessors on x86-64
as well, instead of the legacy RTC readout."
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, efi: Handover Protocol
x86-64/efi: Use EFI to deal with platform wall clock
Pull x86 cleanup and cpufeature from Ingo Molnar:
"Just a single cleanup and and a commit that adds new CPU feature
names"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, boot: Remove ancient, unconditionally #ifdef'd out dead code
* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, cpufeature: Add the RDSEED and ADX features