804085 Commits

Author SHA1 Message Date
Chao Yu
b9a020d6d4
staging: erofs: introduce error injection infrastructure
This patch introduces error injection infrastructure, with it, we can
inject error in any kernel exported common functions which erofs used,
so that it can force erofs running into error paths, it turns out that
tests can cover real rare paths more easily to find bugs.

Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:24 +00:00
Chao Yu
1bd856d7de
staging: erofs: support special inode
This patch adds to support special inode, such as block dev, char,
socket, pipe inode.

Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:24 +00:00
Gao Xiang
77a341b38d
staging: erofs: introduce xattr & acl support
This implements xattr and acl functionalities.

Inline and shared xattrs are introduced for flexibility.
Specifically, if the same xattr occurs for many times
in a large number of inodes or the value of a xattr is so large
that it isn't suitable to be inlined, a shared xattr
kept in the xattr meta will be used instead.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:24 +00:00
Gao Xiang
bb1b0c3994
staging: erofs: update Kconfig and Makefile
This commit adds Makefile and Kconfig for erofs, and
updates Makefile and Kconfig files in the fs directory.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
a0459bb15f
staging: erofs: add namei functions
This commit adds functions that transfer names to inodes.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
d42f1abfaf
staging: erofs: add directory operations
This adds functions for directory, mainly readdir.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
f7314d121c
staging: erofs: add inode operations
This adds core functions to get, read an inode.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
f1476fc717
staging: erofs: add raw address_space operations
This commit adds functions for meta and raw data, and also
provides address_space_operations for raw data access.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
6313bb1853
staging: erofs: add super block operations
This commit adds erofs super block operations, including (u)mount,
remount_fs, show_options, statfs, in addition to some private
icache management functions.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:23 +00:00
Gao Xiang
2248d18fd5
staging: erofs: add erofs in-memory stuffs
- erofs_sb_info:
   contains erofs-specific in-memory information.

 - erofs_vnode:
   contains vfs_inode and other fs-specific information.
   same as super block, the only one in-memory definition exists.

 - erofs_map_blocks
   plays a role in the file L2P mapping

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:22 +00:00
Gao Xiang
35daf89258
staging: erofs: add on-disk layout
This commit adds the on-disk layout header file of erofs.

Note that the on-disk layout is still WIP, and some fields are
reserved for the future use by design.

Any comments are welcome.

Thanks-to: Li Guifu <liguifu2@huawei.com>
Thanks-to: Sun Qiuyang <sunqiuyang@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:22 +00:00
John Galt
f76c7bd799
lz4: allow LZ4_decompress_safe_partial to be used
- erofs is currently the only user.

Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:22 +00:00
John Galt
607eff5412
lz4: Revert "lz4: remove unused functions"
- This reverts commit 9f4c0c41c6d2 as EROFS uses LZ4_decompress_safe_partial().

Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:15:22 +00:00
Surabhi Vishnoi
90e7e976f1
qcacld-3.0: Fix null pointer dereference in bus suspend related code
Fix hdd_ctx null pointer derefernce issue in hdd bus suspend
related code by addition of hdd_ctx null check before its use.

Change-Id: Iea92dcdf8591e61853af233f8e28fe7e0f9ec4bf
CRs-Fixed: 3144126
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:59 +00:00
Sachin Ahuja
dfbd39a30f
qcacld-3.0: Flush the idle shutdown before the dsc op start
Currently idle shutdown is flushed after the suspend dsc op
start. This can lead to the deadlock when the idle shutdown
work is running as it waits for the dsc ops to complete.

To mitigate this issue, flush the idle shutdown work before
starting the dsc ops in suspend sequence.

CRs-Fixed: 2941314
Change-Id: I023b9eb3543d83a30e0587cec3977e08ac7ab752
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:59 +00:00
Jyoti Kumari
60037a686a
qcacld-3.0: Avoid 0 bytes mem alloc in csr_get_cfg_max_tx_power
In some cases, max_2_4_g_power.len/max_5_g_power.len value
could be 0 and driver tries to allocate 0 bytes through
qdf_mem_alloc. qdf_mem_alloc has a check for 0 size which
logs the failure and returns an error code. But this error
log could cause some delay in the roaming process
unnecessarily  as this is not an error case from roaming
perspective. It's better to avoid calling qdf_mem_alloc if
the length is 0.

Add validation for cfg_length in csr_get_cfg_max_tx_power,
if cfg_length is 0 return default maxTxPwr.

Change-Id: Ifd5d90186605e141ed2c107b4170a1d2c82bee0e
CRs-Fixed: 2768190
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:59 +00:00
azrim
904295f18a
[SQUASH] drop boeffla wakelock blocker
Revert "configs: surya: Regenerate full defconfig"

This reverts commit 0fbd555de64a68e43d50c1d55f9ceb93ff88c307.

Revert "boeffla_wl_blocker: increase wakelock list length"

This reverts commit c5df629a4c5203558c134487be73ed9bef5c1cc4.

Revert "power: Block all them crazy wakelocks just to make sultan wet his own knickers"

This reverts commit 98753b29fcfe1ef8a43dd5c54b1e65b3cbce8167.

Revert "boeffla_wl_blocker: update to wakelock blocker driver v1.1.0"

This reverts commit 0d5e86244ea3f9a1f6c75436c463854d9e167d71.

Revert "boeffla_wl_blocker: update to wakelock blocker driver v1.0.1"

This reverts commit 825d10a4410dba33aaef430ac9531bb685e45dae.

Revert "boeffla_wl_blocker: add generic wakelock blocker driver v1.0.0"

This reverts commit ba7aec6082b445c4d41bc5cd72a6cc51107fcf72.

Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:59 +00:00
Panchajanya1999
77281044f8
power/wakelock: Add a timeout to wakelocks globally
Few wakelocks tends to get stuck for no reason. Blocking them
isn't necessary and sometimes blocking them breaks basic
functionality.
Wakelocks like "tx_swr_ctrl" tends to get stuck if we keep earphones
connected and drops battery massively.

Test: Keep earphones plugged in and leave device for few hours
Expected result: No "tx_swr_ctrl" is being stuck.
Actual result: Patch is working as expected.

Change-Id: I5296990a84ab44cf6e449d6535b8b99408c415c8
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:59 +00:00
Cyber Knight
69425a234b
kernel/power: Reduce verbosity of logging
Silences:
[41385.565295] Restarting tasks ...
[41385.579844] Resume caused by IRQ 178, WLAN_CE_2
[41385.579855] PM: PM: suspend exit 2022-01-10 10:54:26.808561566 UTC
[41385.579857] PM: suspend exit
[41385.680504] PM: PM: suspend entry 2022-01-10 10:54:26.909195420 UTC
[41385.680519] PM: suspend entry (deep)
[41385.704201] Freezing user space processes ...
[41385.716293] Freezing remaining freezable tasks ...

Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
Park Ju Hyung
174d642f68
msm_geni_serial: ensure proper ioctl_count range
This can't and shouldn't be negative

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
David Dai
dfe97a5c20
msm: msm_bus: remove wait on wake/sleep request for disp RSC
In order to improve the performance for display usecases,
remove the wait_for_completion bits in the TCS commands generated
when sending the batch requests for WAKE and SLEEP sets. This
will also result in a Fire-and-Forget batch request when sent
to RPMH.

Change-Id: I8b8bedf51bf086fcb83ae7fa9e21e70f036b4012
Signed-off-by: David Dai <daidavid1@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
David Dai
9ee0b8d825
msm: msm_bus: create blacklist for source nodes
Instead of checking for blacklisted nodes for each node as
we traverse through the graph and only consider directly connected
nodes, create a list of blacklisted nodes at the source node that
presists throughout the path finding. The previous implemmentation
wasn't very useful in preventing circular paths for some masters
as it would block out every request from an upstream master and is
equvalent to not listing the connection to start with. Listing black
listed nodes at the master would allow certain connections to traverse
downstream through the gateway while blocking illegal paths from certain
masters.

Change-Id: I6ae4660b05b08a66835f7936a948dc1e17ba218d
Signed-off-by: David Dai <daidavid1@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
Amir Levy
f8210cf554
msm: ipa: added likely/unlikely branch prediction to the lan dp
Added branch prediction in an effort to make
the data path more efficent.

Acked-by: Tal Gelbard <tgelbard@qti.qualcomm.com>
Change-Id: I3bd2157ee6c263d89de9425c7a0249370ab918fc
Signed-off-by: Amir Levy <alevy@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
Jebaitedneko
79f874bbe6
include: trace: events: sched: macro out walt tracers
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:58 +00:00
Jebaitedneko
cee82cdce6
include: trace: events: sched: macro out some more schedstat tracers
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
Sultan Alsawaf
2cea0e127d
RELAND: Revert "kernel: time: Add delay after cpu_relax() in tight loops"
We have queued spin locks, so spin lock starvation isn't a problem.

Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
Jebaitedneko
022cea9f7a
futex: use hrtimer_init_sleeper_on_stack in futex_setup_timer
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
Frederic Weisbecker
6d25fee8e8
sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
Use lockdep to check that IRQs are enabled or disabled as expected. This
way the sanity check only shows overhead when concurrency correctness
debug code is enabled.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David S . Miller <davem@davemloft.net>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1509980490-4285-12-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
celtare21
3bdb9758bd
kernel: time: Fix compilation
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
celtare21
231031dd22
kernel: time: Fix compilation
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
Sebastian Andrzej Siewior
c90e39de38
block/ioprio: Use a helper to check for RT prio
A side-effect to the old code is that now SCHED_DEADLINE is also
recognized.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171004154901.26904-2-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:57 +00:00
Sebastian Andrzej Siewior
81abaa3850
sched/rt: Add a helper to test for a RT task
This helper returns true if a task has elevated priority which is true
for RT tasks (SCHED_RR and SCHED_FIFO) and also for SCHED_DEADLINE.
A task which runs at RT priority due to PI-boosting is not considered as
one with elevated priority.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171004154901.26904-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
celtare21
d0fd7c1b67
kernel: time: Remove base->nohz_active leftovers
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
celtare21
0b56fc1c33
Revert "timer: Initialize global deferrable timer"
This reverts commit 4b7081aa32eaa854e13d12331c3302f1cc2ecaaa.

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
Frederic Weisbecker
0275231ff3
hrtimer: Improve comments on handling priority inversion against softirq kthread
The handling of a priority inversion between timer cancelling and a a not
well defined possible preemption of softirq kthread is not very clear.

Especially in the posix timers side it's unclear why there is a specific RT
wait callback.

All the nice explanations can be found in the initial changelog of
f61eff83cec9 (hrtimer: Prepare support for PREEMPT_RT").

Extract the detailed informations from there and put it into comments.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190820132656.GC2093@lenoir
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
Anna-Maria Behnsen
78c2250842
hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
hrtimer_force_reprogram() and hrtimer_interrupt() invokes
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That
needs to be done at the callsites.

hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() does never update
cpu_base::softirq_expires_next which is wrong too.

That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.

cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse it that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.

Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases seperately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.

Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius <mikael.beckius@windriver.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikael Beckius <mikael.beckius@windriver.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
Sebastian Andrzej Siewior
1607023333
hrtimer: Don't dereference the hrtimer pointer after the callback
A hrtimer can be released in its callback, but lockdep_hrtimer_exit()
dereferences the pointer after the callback returns, i.e. a potential use
after free.

Retrieve the context in which the hrtimer expires before the callback is
invoked and use it in lockdep_hrtimer_exit().

Fixes: 40db173965c0 ("lockdep: Add hrtimer context tracing bits")
Reported-by: syzbot+62c155c276e580cfb606@syzkaller.appspotmail.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200331201849.fkp2siy3vcdqvqlz@linutronix.de
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:56 +00:00
Sebastian Andrzej Siewior
51af9ff14e
lockdep: Add hrtimer context tracing bits
Set current->irq_config = 1 for hrtimers which are not marked to expire in
hard interrupt context during hrtimer_init(). These timers will expire in
softirq context on PREEMPT_RT.

Setting this allows lockdep to differentiate these timers. If a timer is
marked to expire in hard interrupt context then the timer callback is not
supposed to acquire a regular spinlock instead of a raw_spinlock in the
expiry callback.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.534508206@linutronix.de
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Wen Yang
f36f50f27c
hrtimer: Cast explicitely to u32t in __ktime_divns()
do_div() does a 64-by-32 division at least on 32bit platforms, while the
divisor 'div' is explicitly casted to unsigned long, thus 64-bit on 64-bit
platforms.

The code already ensures that the divisor is less than 2^32. Hence the
proper cast type is u32.

Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200130130851.29204-1-wenyang@linux.alibaba.com

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Jules Irenge
a4a20acd98
hrtimer: Add missing sparse annotation for __run_timer()
Sparse reports a warning at __run_hrtimer()
|warning: context imbalance in __run_hrtimer() - unexpected unlock

Add the missing must_hold() annotation.

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200120224347.51843-1-jbi.octave@gmail.com

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Eric Dumazet
d96478bc24
hrtimer: Annotate lockless access to timer->base
Followup to commit dd2261ed45aa ("hrtimer: Protect lockless access
to timer->base")

lock_hrtimer_base() fetches timer->base without lock exclusion.

Compiler is allowed to read timer->base twice (even if considered dumb)
which could end up trying to lock migration_base and return
&migration_base.

  base = timer->base;
  if (likely(base != &migration_base)) {

       /* compiler reads timer->base again, and now (base == &migration_base)

       raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);
       if (likely(base == timer->base))
            return base; /* == &migration_base ! */

Similarly the write sides must use WRITE_ONCE() to avoid store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191008173204.180879-1-edumazet@google.com

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Sebastian Andrzej Siewior
162f69ddca
hrtimer: Add a missing bracket and hide `migration_base' on !SMP
The recent change to avoid taking the expiry lock when a timer is currently
migrated missed to add a bracket at the end of the if statement leading to
compile errors.  Since that commit the variable `migration_base' is always
used but it is only available on SMP configuration thus leading to another
compile error.  The changelog says "The timer base and base->cpu_base
cannot be NULL in the code path", so it is safe to limit this check to SMP
configurations only.

Add the missing bracket to the if statement and hide `migration_base'
behind CONFIG_SMP bars.

[ tglx: Mark the functions inline ... ]

Fixes: 68b2c8c1e4210 ("hrtimer: Don't take expiry_lock when timer is currently migrated")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190904145527.eah7z56ntwobqm6j@linutronix.de

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Julien Grall
b52cba7da3
hrtimer: Don't take expiry_lock when timer is currently migrated
migration_base is used as a placeholder when an hrtimer is migrated to a
different CPU. In the case that hrtimer_cancel_wait_running() hits a timer
which is currently migrated it would pointlessly acquire the expiry lock of
the migration base, which is even not initialized.

Surely it could be initialized, but there is absolutely no point in
acquiring this lock because the timer is guaranteed not to run it's
callback for which the caller waits to finish on that base. So it would
just do the inc/lock/dec/unlock dance for nothing.

As the base switch is short and non-preemptible, there is no issue when the
wait function returns immediately.

The timer base and base->cpu_base cannot be NULL in the code path which is
invoking that, so just replace those checks with a check whether base is
migration base.

[ tglx: Updated from RT patch. Massaged changelog. Added comment. ]

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190821092409.13225-4-julien.grall@arm.com

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Julien Grall
4e3f887bea
hrtimer: Protect lockless access to timer->base
The update to timer->base is protected by the base->cpu_base->lock().
However, hrtimer_cancel_wait_running() does access it lockless.  So the
compiler is allowed to refetch timer->base which can cause havoc when the
timer base is changed concurrently.

Use READ_ONCE() to prevent this.

[ tglx: Adapted from a RT patch ]

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190821092409.13225-2-julien.grall@arm.com
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:55 +00:00
Thomas Gleixner
3149419478
posix-timers: Use a callback for cancel synchronization on PREEMPT_RT
Posix timer delete retry loops are affected by the same priority inversion
and live lock issues as the other timers.

Provide a RT specific synchronization function which keeps a reference to
the timer by holding rcu read lock to prevent the timer from being freed,
dropping the timer lock and invoking the timer specific wait function via a
new callback.

This does not yet cover posix CPU timers because they need more special
treatment on PREEMPT_RT.

[ This is folded into the original attempt which did not use a callback. ]

Originally-by: Anna-Maria Gleixenr <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20190819143801.656864506@linutronix.de
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00
Thomas Gleixner
f60aa648bd
posix-timers: Rework cancel retry loops
As a preparatory step for adding the PREEMPT RT specific synchronization
mechanism to wait for a running timer callback, rework the timer cancel
retry loops so they call a common function. This allows trivial
substitution in one place.

Originally-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190730223828.874901027@linutronix.de

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00
Anna-Maria Gleixner
ab44492262
hrtimer: Prepare support for PREEMPT_RT
When PREEMPT_RT is enabled, the soft interrupt thread can be preempted.  If
the soft interrupt thread is preempted in the middle of a timer callback,
then calling hrtimer_cancel() can lead to two issues:

  - If the caller is on a remote CPU then it has to spin wait for the timer
    handler to complete. This can result in unbound priority inversion.

  - If the caller originates from the task which preempted the timer
    handler on the same CPU, then spin waiting for the timer handler to
    complete is never going to end.

To avoid these issues, add a new lock to the timer base which is held
around the execution of the timer callbacks. If hrtimer_cancel() detects
that the timer callback is currently running, it blocks on the expiry
lock. When the callback is finished, the expiry lock is dropped by the
softirq thread which wakes up the waiter and the system makes progress.

This addresses both the priority inversion and the life lock issues.

The same issue can happen in virtual machines when the vCPU which runs a
timer callback is scheduled out. If a second vCPU of the same guest calls
hrtimer_cancel() it will spin wait for the other vCPU to be scheduled back
in. The expiry lock mechanism would avoid that. It'd be trivial to enable
this when paravirt spinlocks are enabled in a guest, but it's not clear
whether this is an actual problem in the wild, so for now it's an RT only
mechanism.

[ tglx: Refactored it for mainline ]

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.737767218@linutronix.de

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00
Sebastian Andrzej Siewior
1c4ee5a066
hrtimer: Determine hard/soft expiry mode for hrtimer sleepers on RT
On PREEMPT_RT enabled kernels hrtimers which are not explicitely marked for
hard interrupt expiry mode are moved into soft interrupt context either for
latency reasons or because the hrtimer callback takes regular spinlocks or
invokes other functions which are not suitable for hard interrupt context
on PREEMPT_RT.

The hrtimer_sleeper callback is RT compatible in hard interrupt context,
but there is a latency concern: Untrusted userspace can spawn many threads
which arm timers for the same expiry time on the same CPU. On expiry that
causes a latency spike due to the wakeup of a gazillion threads.

OTOH, priviledged real-time user space applications rely on the low latency
of hard interrupt wakeups. These syscall related wakeups are all based on
hrtimer sleepers.

If the current task is in a real-time scheduling class, mark the mode for
hard interrupt expiry.

[ tglx: Split out of a larger combo patch. Added changelog ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.645792403@linutronix.de

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00
Sebastian Andrzej Siewior
a106cdc4be
hrtimer: Move unmarked hrtimers to soft interrupt expiry on RT
On PREEMPT_RT not all hrtimers can be expired in hard interrupt context
even if that is perfectly fine on a PREEMPT_RT=n kernel, e.g. because they
take regular spinlocks. Also for latency reasons PREEMPT_RT tries to defer
most hrtimers' expiry into softirq context.

hrtimers marked with HRTIMER_MODE_HARD must be kept in hard interrupt
context expiry mode. Add the required logic.

No functional change for PREEMPT_RT=n kernels.

[ tglx: Split out of a larger combo patch. Added changelog ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.551967692@linutronix.de

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00
Thomas Gleixner
c9cf4e5cff
hrtimer: Make enqueue mode check work on RT
hrtimer_start_range_ns() has a WARN_ONCE() which verifies that a timer
which is marker for softirq expiry is not queued in the hard interrupt base
and vice versa.

When PREEMPT_RT is enabled, timers which are not explicitely marked to
expire in hard interrupt context are deferrred to the soft interrupt. So
the regular check would trigger.

Change the check, so when PREEMPT_RT is enabled, it is verified that the
timers marked for hard interrupt expiry are not tried to be queued for soft
interrupt expiry or any of the unmarked and softirq marked is tried to be
expired in hard interrupt context.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-30 14:12:54 +00:00