msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
Tashfin Shakeer Rhythm	a9566ccc56	msm-4.14: Make macros no-op using `((void)0)` Do not solely rely on compiler optimizations to get the workaround of having macros do nothing using an empty do-while loop. It's inefficient. Use ((void)0) to which the standard assert macro expands when NDEBUG is defined. No functional change intended. [mcdofrenchfreis]: Implement this patch to tree using the command: git grep -l "do {} while (0)" \| xargs sed -i "s/do {} while (0)/((void)0)/g" Change-Id: I9615c62c46670e31ed8d0d89d195144541baa3e6 Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com> Signed-off-by: mcdofrenchfreis <xyzevan@androidist.net> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2025-01-17 00:51:10 -03:00
Andrey Ignatov	ae13d58598	BACKPORT: bpf: Hooks for sys_connect == The problem == See description of the problem in the initial patch of this patch set. == The solution == The patch provides much more reliable in-kernel solution for the 2nd part of the problem: making outgoing connecttion from desired IP. It adds new attach types `BPF_CGROUP_INET4_CONNECT` and `BPF_CGROUP_INET6_CONNECT` for program type `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both source and destination of a connection at connect(2) time. Local end of connection can be bound to desired IP using newly introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though, and doesn't support binding to port, i.e. leverages `IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this: * looking for a free port is expensive and can affect performance significantly; * there is no use-case for port. As for remote end (`struct sockaddr *` passed by user), both parts of it can be overridden, remote IP and remote port. It's useful if an application inside cgroup wants to connect to another application inside same cgroup or to itself, but knows nothing about IP assigned to the cgroup. Support is added for IPv4 and IPv6, for TCP and UDP. IPv4 and IPv6 have separate attach types for same reason as sys_bind hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields when user passes sockaddr_in since it'd be out-of-bound. == Implementation notes == The patch introduces new field in `struct proto`: `pre_connect` that is a pointer to a function with same signature as `connect` but is called before it. The reason is in some cases BPF hooks should be called way before control is passed to `sk->sk_prot->connect`. Specifically `inet_dgram_connect` autobinds socket before calling `sk->sk_prot->connect` and there is no way to call `bpf_bind()` from hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since it'd cause double-bind. On the other hand `proto.pre_connect` provides a flexible way to add BPF hooks for connect only for necessary `proto` and call them at desired time before `connect`. Since `bpf_bind()` is allowed to bind only to IP and autobind in `inet_dgram_connect` binds only port there is no chance of double-bind. bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite of value of `bind_address_no_port` socket field. bpf_bind() sets `with_lock` to `false` when calling to __inet_bind() and __inet6_bind() since all call-sites, where bpf_bind() is called, already hold socket lock. Change-Id: I03eb513369c630b203466621d1fbdb9b29c8333c Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cyber Knight <cyberknight755@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2025-01-13 14:37:40 -03:00
Andrey Ignatov	989ae8b398	BACKPORT: net: Introduce __inet_bind() and __inet6_bind Refactor `bind()` code to make it ready to be called from BPF helper function `bpf_bind()` (will be added soon). Implementation of `inet_bind()` and `inet6_bind()` is separated into `__inet_bind()` and `__inet6_bind()` correspondingly. These function can be used from both `sk_prot->bind` and `bpf_bind()` contexts. New functions have two additional arguments. `force_bind_address_no_port` forces binding to IP only w/o checking `inet_sock.bind_address_no_port` field. It'll allow to bind local end of a connection to desired IP in `bpf_bind()` w/o changing `bind_address_no_port` field of a socket. It's useful since `bpf_bind()` can return an error and we'd need to restore original value of `bind_address_no_port` in that case if we changed this before calling to the helper. `with_lock` specifies whether to lock socket when working with `struct sk` or not. The argument is set to `true` for `sk_prot->bind`, i.e. old behavior is preserved. But it will be set to `false` for `bpf_bind()` use-case. The reason is all call-sites, where `bpf_bind()` will be called, already hold that socket lock. Change-Id: I3cd102acdb2b3c14946ef8452fd7afb763e8215f Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cyber Knight <cyberknight755@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2025-01-13 14:37:40 -03:00
Richard Raya	00fa45c0a6	This is the 4.14.355 OpenELA-Extended LTS stable release -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmciUfMZHHZlZ2FyZC5u b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WLuWD/9Ygfcw1geLS8AUG78WElHy l9L6JBbhAMCd7UTIn6ulSxNalJSMUg9P5xbtBVwz64qLCeJYEpJZ23UOj1fNK9Sm +zBI3R3LURhRr/qwRVN8xJvKfGNODx8LsI61H8bcowKOjC1zdtd11vtZfu+KmumM G26MeAabNszHOaFzoAoD/0VR6xxbHbvWv9oiyaCvyyB1iPU2wpGfO26dQ6YOO87K 914tsdY+I70tpAgCks3DyZaN+h0kXGww9k9YCG8awxHHgAvxsVQ5+cZtC0QO04bN uEwOHapieFtoFGG/c6cUq9ARiVkWdgXV0+xeGzefDbygcfZjfC4b6a+iSf9j42oq X1hM7mGxJMvzoiweXOF9XhfrmBKpnXcLgZqpFTk3Iy/EALstd9AnvY7SJCJLa+u8 Dp2NOZ/9tKfvrfiI/AA+uwjlFLeA1dbfvf5HcUtfKuJ7Y1Qq4q9QxRr9svk+7+ZG nbrYxNfpxmrgffrm5W9gt/02M8v+ymC9fIFK82V6EEPbnikPE8SGJ2JAuirjiG1i u1bvIzTCBtu6074tuqCjHKAyTUFNZMKS8xxT+prEolXguBiuEmlJodK7kkJKFurc uAiLlYpMZNvFJox5E0vCLD8HInTHNR3yd8D/kNg733ukg6o2mKvLkR83lHpvhe67 0bxx5K76GV7p4I/wvIJ6zw== =JpII -----END PGP SIGNATURE----- Merge tag 'v4.14.355-openela' of https://github.com/openela/kernel-lts This is the 4.14.355 OpenELA-Extended LTS stable release * tag 'v4.14.355-openela' of https://github.com/openela/kernel-lts: (79 commits) LTS: Update to 4.14.355 Revert "parisc: Use irq_enter_rcu() to fix warning at kernel/context_tracking.c:367" netns: restore ops before calling ops_exit_list cx82310_eth: fix error return code in cx82310_bind() rtmutex: Drop rt_mutex::wait_lock before scheduling locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() drm/i915/fence: Mark debug_fence_free() with __maybe_unused ACPI: processor: Fix memory leaks in error paths of processor_add() ACPI: processor: Return an error if acpi_processor_get_info() fails in processor_add() netns: add pre_exit method to struct pernet_operations net: Add comment about pernet_operations methods and synchronization nilfs2: protect references to superblock parameters exposed in sysfs nilfs2: replace snprintf in show functions with sysfs_emit nilfs2: use time64_t internally tracing: Avoid possible softlockup in tracing_iter_reset() ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance() uprobes: Use kzalloc to allocate xol area clocksource/drivers/imx-tpm: Fix next event not taking effect sometime clocksource/drivers/imx-tpm: Fix return -ETIME when delta exceeds INT_MAX VMCI: Fix use-after-free when removing resource in vmci_resource_remove() ... Change-Id: I237799395c31c147d9e602b34bff999c65fe9ef0 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-10-31 20:49:41 -03:00
Eric Dumazet	9541fb132b	netns: add pre_exit method to struct pernet_operations commit d7d99872c144a2c2f5d9c9d83627fa833836cba5 upstream. Current struct pernet_operations exit() handlers are highly discouraged to call synchronize_rcu(). There are cases where we need them, and exit_batch() does not help the common case where a single netns is dismantled. This patch leverages the existing synchronize_rcu() call in cleanup_net() Calling optional ->pre_exit() method before ->exit() or ->exit_batch() allows to benefit from a single synchronize_rcu() call. Note that the synchronize_rcu() calls added in this patch are only in error paths or slow paths. Tested: $ time for i in {1..1000}; do unshare -n /bin/false;done real 0m2.612s user 0m0.171s sys 0m2.216s Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bc596c2026c7f52dc12a784403ccfc3b152844e6) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-10-30 14:08:25 +00:00
Kirill Tkhai	9e0063de11	net: Add comment about pernet_operations methods and synchronization Make locking scheme be visible for users, and provide a comment what for we are need exit_batch() methods, and when it should be used. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 6056415d3a513846f774e7bbee0de0460b1c15df) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-10-30 14:08:25 +00:00
Richard Raya	2cd059fb56	This is the 4.14.354 OpenELA-Extended LTS stable release -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmcgko0ZHHZlZ2FyZC5u b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WL/GD/0em+uP/O8QiPYqeGrEECpW bgRsBiN3XnyEsghAjplWX12G/zjxA0PY0u2zh9K9sdPw60n8nVZ1OxvPHINwuSC9 kE9N60SCpJ88ju9OtU+4xz/nxtEmlel8fWy5elagB5wqbWbvsjT52ceZXqSxqhy7 pQdIDHSiUUwx9JL6vDuJSL+Z/Y216qvBETZLnDSo90raFp/MDa5JmQsh81lLeUt8 wGKwC/Olnbd21QTStNK34aQGyX5b+3YeACFVPud66Zs9airz9EE6Yq78gwL29L2k 4jxzihXxSkkfa66eR63ap53+/mEqOZX72m2qEMVOvAcAwU0XsNDTdkXN7z8YQ5T3 E1rJwr4Ox0hmM+hHBA20w9xRDXZoZmdrcjsU1aNKuK2zTJ0h9DBIvMM2XY5n5sWK I4F8E15KyKmu4nXBETreXZixqVLZMgjNFncRLf8XBIL1kxXm65LYCHypp3AgdVgo Ccdq5PbC6LAyNPrIOaftIaS9VlU15cqcalu7A+gSoWq55LGWAa3G9vX0ZtYQB9QX 0R18fbzyjqG6Wa5J5KRDJ+HyS4IvdnEWS8hMR3jfosjMNgJhfDlDeev8NARBiDpX d26xogNA7xOOvtdpuwEbnxD5kR0zUdnC73pC4wxdMptYSK6ULKNPmTkA0dKE9qvl TDgw4DML8vXQqJ4P+w3Njw== =gX2R -----END PGP SIGNATURE----- Merge tag 'v4.14.354-openela' of https://github.com/openela/kernel-lts This is the 4.14.354 OpenELA-Extended LTS stable release * tag 'v4.14.354-openela' of https://github.com/openela/kernel-lts: (90 commits) LTS: Update to 4.14.354 drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var ipc: remove memcg accounting for sops objects in do_semtimedop() scsi: aacraid: Fix double-free on probe failure usb: core: sysfs: Unmerge @usb3_hardware_lpm_attr_group in remove_power_attributes() usb: dwc3: st: fix probed platform device ref count on probe error path usb: dwc3: core: Prevent USB core invalid event buffer address access usb: dwc3: omap: add missing depopulate in probe error path USB: serial: option: add MeiG Smart SRM825L cdc-acm: Add DISABLE_ECHO quirk for GE HealthCare UI Controller net: busy-poll: use ktime_get_ns() instead of local_clock() gtp: fix a potential NULL pointer dereference net: prevent mss overflow in skb_segment() ida: Fix crash in ida_free when the bitmap is empty net:rds: Fix possible deadlock in rds_message_put fbmem: Check virtual screen sizes in fb_set_var() fbcon: Prevent that screen size is smaller than font size printk: Export is_console_locked memcg: enable accounting of ipc resources cgroup/cpuset: Prevent UAF in proc_cpuset_show() ... Change-Id: I7da4d8d188dec9d2833216e5d6580dbd72b99240 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-10-29 20:17:04 -03:00
Eric Dumazet	843fa098a5	net: busy-poll: use ktime_get_ns() instead of local_clock() [ Upstream commit 0870b0d8b393dde53106678a1e2cec9dfa52f9b7 ] Typically, busy-polling durations are below 100 usec. When/if the busy-poller thread migrates to another cpu, local_clock() can be off by +/-2msec or more for small values of HZ, depending on the platform. Use ktimer_get_ns() to ensure deterministic behavior, which is the whole point of busy-polling. Fixes: 060212928670 ("net: add low latency socket poll") Fixes: 9a3c71aa8024 ("net: convert low latency sockets to sched_clock()") Fixes: 37089834528b ("sched, net: Fixup busy_loop_us_clock()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20240827114916.223377-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 1b1f0890fb51fc50bf990a800106a133f9036f32) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-10-24 10:07:41 +00:00
Kuniyuki Iwashima	6bff278ca0	kcm: Serialise kcm_sendmsg() for the same socket. [ Upstream commit 807067bf014d4a3ae2cc55bd3de16f22a01eb580 ] syzkaller reported UAF in kcm_release(). [0] The scenario is 1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb. 2. Thread A resumes building skb from kcm->seq_skb but is blocked by sk_stream_wait_memory() 3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb and puts the skb to the write queue 4. Thread A faces an error and finally frees skb that is already in the write queue 5. kcm_release() does double-free the skb in the write queue When a thread is building a MSG_MORE skb, another thread must not touch it. Let's add a per-sk mutex and serialise kcm_sendmsg(). [0]: BUG: KASAN: slab-use-after-free in __skb_unlink include/linux/skbuff.h:2366 [inline] BUG: KASAN: slab-use-after-free in __skb_dequeue include/linux/skbuff.h:2385 [inline] BUG: KASAN: slab-use-after-free in __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline] BUG: KASAN: slab-use-after-free in __skb_queue_purge include/linux/skbuff.h:3181 [inline] BUG: KASAN: slab-use-after-free in kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691 Read of size 8 at addr ffff0000ced0fc80 by task syz-executor329/6167 CPU: 1 PID: 6167 Comm: syz-executor329 Tainted: G B 6.8.0-rc5-syzkaller-g9abbc24128bc #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Call trace: dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:291 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:298 __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:377 [inline] print_report+0x178/0x518 mm/kasan/report.c:488 kasan_report+0xd8/0x138 mm/kasan/report.c:601 __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381 __skb_unlink include/linux/skbuff.h:2366 [inline] __skb_dequeue include/linux/skbuff.h:2385 [inline] __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline] __skb_queue_purge include/linux/skbuff.h:3181 [inline] kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691 __sock_release net/socket.c:659 [inline] sock_close+0xa4/0x1e8 net/socket.c:1421 __fput+0x30c/0x738 fs/file_table.c:376 ____fput+0x20/0x30 fs/file_table.c:404 task_work_run+0x230/0x2e0 kernel/task_work.c:180 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x618/0x1f64 kernel/exit.c:871 do_group_exit+0x194/0x22c kernel/exit.c:1020 get_signal+0x1500/0x15ec kernel/signal.c:2893 do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249 do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148 exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline] exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline] el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 Allocated by task 6166: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x40/0x78 mm/kasan/common.c:68 kasan_save_alloc_info+0x70/0x84 mm/kasan/generic.c:626 unpoison_slab_object mm/kasan/common.c:314 [inline] __kasan_slab_alloc+0x74/0x8c mm/kasan/common.c:340 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3813 [inline] slab_alloc_node mm/slub.c:3860 [inline] kmem_cache_alloc_node+0x204/0x4c0 mm/slub.c:3903 __alloc_skb+0x19c/0x3d8 net/core/skbuff.c:641 alloc_skb include/linux/skbuff.h:1296 [inline] kcm_sendmsg+0x1d3c/0x2124 net/kcm/kcmsock.c:783 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] sock_sendmsg+0x220/0x2c0 net/socket.c:768 splice_to_socket+0x7cc/0xd58 fs/splice.c:889 do_splice_from fs/splice.c:941 [inline] direct_splice_actor+0xec/0x1d8 fs/splice.c:1164 splice_direct_to_actor+0x438/0xa0c fs/splice.c:1108 do_splice_direct_actor fs/splice.c:1207 [inline] do_splice_direct+0x1e4/0x304 fs/splice.c:1233 do_sendfile+0x460/0xb3c fs/read_write.c:1295 __do_sys_sendfile64 fs/read_write.c:1362 [inline] __se_sys_sendfile64 fs/read_write.c:1348 [inline] __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1348 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 Freed by task 6167: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x40/0x78 mm/kasan/common.c:68 kasan_save_free_info+0x5c/0x74 mm/kasan/generic.c:640 poison_slab_object+0x124/0x18c mm/kasan/common.c:241 __kasan_slab_free+0x3c/0x78 mm/kasan/common.c:257 kasan_slab_free include/linux/kasan.h:184 [inline] slab_free_hook mm/slub.c:2121 [inline] slab_free mm/slub.c:4299 [inline] kmem_cache_free+0x15c/0x3d4 mm/slub.c:4363 kfree_skbmem+0x10c/0x19c __kfree_skb net/core/skbuff.c:1109 [inline] kfree_skb_reason+0x240/0x6f4 net/core/skbuff.c:1144 kfree_skb include/linux/skbuff.h:1244 [inline] kcm_release+0x104/0x4c8 net/kcm/kcmsock.c:1685 __sock_release net/socket.c:659 [inline] sock_close+0xa4/0x1e8 net/socket.c:1421 __fput+0x30c/0x738 fs/file_table.c:376 ____fput+0x20/0x30 fs/file_table.c:404 task_work_run+0x230/0x2e0 kernel/task_work.c:180 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x618/0x1f64 kernel/exit.c:871 do_group_exit+0x194/0x22c kernel/exit.c:1020 get_signal+0x1500/0x15ec kernel/signal.c:2893 do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249 do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148 exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline] exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline] el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 The buggy address belongs to the object at ffff0000ced0fc80 which belongs to the cache skbuff_head_cache of size 240 The buggy address is located 0 bytes inside of freed 240-byte region [ffff0000ced0fc80, ffff0000ced0fd70) The buggy address belongs to the physical page: page:00000000d35f4ae4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ed0f flags: 0x5ffc00000000800(slab\|node=0\|zone=2\|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 05ffc00000000800 ffff0000c1cbf640 fffffdffc3423100 dead000000000004 raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff0000ced0fb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff0000ced0fc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc >ffff0000ced0fc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff0000ced0fd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc ffff0000ced0fd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60 Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240815220437.69511-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 8c9cdbf600143bd6835c8b8351e5ac956da79aec) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-10-24 10:07:39 +00:00
Eric Dumazet	795faf9727	net: fix __dst_negative_advice() race commit 92f1655aa2b2294d0b49925f3b875a634bd3b59e upstream. __dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e48ee8 ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [mheyne: contextual conflict in ip6_negative_advice due to missing commit c3c14da0288d ("net/ipv6: add rcu locking to ip6_negative_advice") and commit 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")] Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-10-10 10:27:57 +00:00
Park Ju Hyung	7f9792e95c	sock: Inline SELinux's sk_security to struct sock As we don't have to care about any other security mechanism than SELinux, inline its struct sk_security_struct into struct sock. This way, the kernel no longer has to allocate and de-allocate struct sk_security_struct, which happens extremely frequently under Android. Change-Id: Ic1e584a7d31bbb225cbd3079a5002542a7683818 Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Yaroslav Furman <yaro330@gmail.com> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-08-27 11:40:02 -03:00
Eric Dumazet	9de03bfcc2	net: fix __dst_negative_advice() race commit 92f1655aa2b2294d0b49925f3b875a634bd3b59e upstream. __dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e48ee8 ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [mheyne: contextual conflict in ip6_negative_advice due to missing commit c3c14da0288d ("net/ipv6: add rcu locking to ip6_negative_advice") and commit 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")] Signed-off-by: Maximilian Heyne <mheyne@amazon.de>	2024-08-25 23:41:58 -03:00
Richard Raya	15a5216c8c	Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts * 'linux-4.14.y' of https://github.com/openela/kernel-lts: (133 commits) LTS: Update to 4.14.350 SUNRPC: Fix RPC client cleaned up the freed pipefs dentries arm64: dts: rockchip: Add sound-dai-cells for RK3368 tcp: Fix data races around icsk->icsk_af_ops. ipv6: Fix data races around sk->sk_prot. ipv6: annotate some data-races around sk->sk_prot pwm: stm32: Refuse too small period requests ftruncate: pass a signed offset batman-adv: Don't accept TT entries for out-of-spec VIDs batman-adv: include gfp.h for GFP_* defines drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes hexagon: fix fadvise64_64 calling conventions tty: mcf: MCF54418 has 10 UARTS usb: atm: cxacru: fix endpoint checking in cxacru_bind() usb: musb: da8xx: fix a resource leak in probe() usb: gadget: printer: SS+ support net: usb: ax88179_178a: improve link status logs iio: adc: ad7266: Fix variable checking bug mmc: sdhci-pci: Convert PCIBIOS_* return codes to errnos ... Change-Id: I0ab862e0932a48f13c64657e5456f3034b766445 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-08-16 22:07:21 -03:00
Pablo Neira Ayuso	c01fc5238f	netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers [ Upstream commit 7931d32955e09d0a11b1fe0b6aac1bfa061c005c ] register store validation for NFT_DATA_VALUE is conditional, however, the datatype is always either NFT_DATA_VALUE or NFT_DATA_VERDICT. This only requires a new helper function to infer the register type from the set datatype so this conditional check can be removed. Otherwise, pointer to chain object can be leaked through the registers. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 40188a25a9847dbeb7ec67517174a835a677752f) [Vegard: fixed unrelated conflict due to missing commit 9c22bd1ab442c552e9481f1157589362887a7f47 ("netfilter: nf_tables: defer gc run if previous batch is still pending").] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-08-08 15:52:20 +00:00
Luiz Augusto von Dentz	9ae3694d96	Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ [ Upstream commit 806a5198c05987b748b50f3d0c0cfb3d417381a4 ] This removes the bogus check for max > hcon->le_conn_max_interval since the later is just the initial maximum conn interval not the maximum the stack could support which is really 3200=4000ms. In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values of the following fields in IXIT that would cause hci_check_conn_params to fail: TSPX_conn_update_int_min TSPX_conn_update_int_max TSPX_conn_update_peripheral_latency TSPX_conn_update_supervision_timeout Link: https://github.com/bluez/bluez/issues/847 Fixes: e4b019515f95 ("Bluetooth: Enforce validation on max value of connection interval") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit a1f9c8328219b9bb2c29846d6c74ea981e60b5a7) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-08-08 15:52:16 +00:00
Richard Raya	fad3b5236d	This is the 4.14.349 OpenELA-Extended LTS stable release -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEERFwmR4yFob14UDOYC8702P6YulgFAmaYMNYZHHZlZ2FyZC5u b3NzdW1Ab3JhY2xlLmNvbQAKCRALzvTY/pi6WI+2EACbJP/GYZL4iZezt3yp9J6y ObeobshL3ODENH9J4Rpjo7EJNdRbiJmqK07C6g3gxfEBqYhMDxYCBbhwTTvvHmu7 ezr1rmQmUlyzf2qW905a+rTawUrKztZpvZ0ycRXgfQHjX8w64salq/G5X9kJ1CZQ 0TYwhDXXYRc1yuhJkVH0+ZUP+FvSBYXY42QZQ8tRzviBKgHUqyQ2JiLN7yGXStSp PEOCeXuEsQxkzbFU1rG7J9KXfUYndih+fiGSvuUUZF6WTHNobfkh+nrGzsdadtUp UW9nEdHjjEhTpTr125uOGc3H2Y1rWVPrcZ9kvJBhzf4WKNBFu2v7Bc5i2/Yz/jKU 5cz7bjqpSnFOAmNe1f+pOO2oIsBk/xhAbMrPHS1eTJfUJmVL21HgDS3nXfV3yYcR 0cHH10HGf7DEx2PRh3DM53XzaiumOXY3e/eFt+syYFWtsPY0XKHjsfwLeoujCVgh Sb6yiV1HTNg2hkGck+CQKTvHKZhSs1uE+vGSHiSTpryrsXYCTRJySSXEdiU0QpeL c9xzRE0PrUaUKNucdimGr6EqvXL11M1I59Z3ygk8vyLGI13vSmkRZ9Sl7m0tbirA 0K1Ws2PkwuYQEOut8Esp6DJ2n38Uz3j0lnb2lreC0KbfXMvPWQfP81M1Lc+Pkpn6 Zgbbs68F6jYs0KV/iRty2A== =RvUO -----END PGP SIGNATURE----- Merge tag 'v4.14.349-openela' of https://github.com/openela/kernel-lts This is the 4.14.349 OpenELA-Extended LTS stable release * tag 'v4.14.349-openela' of https://github.com/openela/kernel-lts: (160 commits) LTS: Update to 4.14.349 x86/kvm: Disable all PV features on crash x86/kvm: Disable kvmclock on all CPUs on shutdown x86/kvm: Teardown PV features on boot CPU as well crypto: algif_aead - fix uninitialized ctx->init nfs: fix undefined behavior in nfs_block_bits() ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find() sparc: move struct termio to asm/termios.h kdb: Use format-specifiers rather than memset() for padding in kdb_read() kdb: Merge identical case statements in kdb_read() kdb: Fix console handling when editing and tab-completing commands kdb: Use format-strings rather than '\0' injection in kdb_read() kdb: Fix buffer overflow during tab-complete sparc64: Fix number of online CPUs intel_th: pci: Add Meteor Lake-S CPU support net/9p: fix uninit-value in p9_client_rpc() crypto: qat - Fix ADF_DEV_RESET_SYNC memory leak KVM: arm64: Allow AArch32 PSTATE.M to be restored as System mode netfilter: nft_dynset: relax superfluous check on set updates netfilter: nft_dynset: report EOPNOTSUPP on missing set feature ... Change-Id: Idb0053e6b2186ef17f31e15fdb601ae451c81283 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-18 01:54:59 -03:00
Pablo Neira Ayuso	50bfcb0af9	netfilter: nf_tables: drop map element references from preparation phase [ Upstream commit 628bd3e49cba1c066228e23d71a852c23e26da73 ] set .destroy callback releases the references to other objects in maps. This is very late and it results in spurious EBUSY errors. Drop refcount from the preparation phase instead, update set backend not to drop reference counter from set .destroy path. Exceptions: NFT_TRANS_PREPARE_ERROR does not require to drop the reference counter because the transaction abort path releases the map references for each element since the set is unbound. The abort path also deals with releasing reference counter for new elements added to unbound sets. Fixes: 591054469b3e ("netfilter: nf_tables: revisit chain/object refcounting from elements") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bc9f791d2593f17e39f87c6e2b3a36549a3705b1) [Harshit: Fixed 3 different conflicts due to missing commit: 71cc0873e0e0 ("netfilter: nf_tables: Simplify set backend selection") and commit: 79b174ade16d ("netfilter: nf_tables: garbage collection for stateful expressions") and commit: 88bae77d6606 ("netfilter: nf_tables: use net_generic infra for transaction data") and these can't be backported to 4.14.y] Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-07-15 18:30:23 +00:00
Pablo Neira Ayuso	e4382ad0d8	netfilter: nf_tables: pass ctx to nf_tables_expr_destroy() nft_set_elem_destroy() can be called from call_rcu context. Annotate netns and table in set object so we can populate the context object. Moreover, pass context object to nf_tables_set_elem_destroy() from the commit phase, since it is already available from there. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit 3453c92731884bad7c4c3a0667228b964747f3d5) [Harshit: 4.14.y had backport commit: 4e0dbab570de ("netfilter: nf_tables: do not allow SET_ID to refer to another table") which does add couple of things which this commit is supposed to add] Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> [Vegard: removed .family = set->table->family assignment in nft_set_elem_destroy() as we're missing commit 36596dadf54a920d26286cf9f421fb4ef648b51f ("netfilter: nf_tables: add single table list for all families").] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-07-15 18:30:23 +00:00
Pablo Neira Ayuso	05084047e5	netfilter: nf_tables: fix set double-free in abort path [ Upstream commit 40ba1d9b4d19796afc9b7ece872f5f3e8f5e2c13 ] The abort path can cause a double-free of an anonymous set. Added-and-to-be-aborted rule looks like this: udp dport { 137, 138 } drop The to-be-aborted transaction list looks like this: newset newsetelem newsetelem rule This gets walked in reverse order, so first pass disables the rule, the set elements, then the set. After synchronize_rcu(), we then destroy those in same order: rule, set element, set element, newset. Problem is that the anonymous set has already been bound to the rule, so the rule (lookup expression destructor) already frees the set, when then cause use-after-free when trying to delete the elements from this set, then try to free the set again when handling the newset expression. Rule releases the bound set in first place from the abort path, this causes the use-after-free on set element removal when undoing the new element transactions. To handle this, skip new element transaction if set is bound from the abort path. This is still causes the use-after-free on set element removal. To handle this, remove transaction from the list when the set is already bound. Joint work with Florian Westphal. Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path") Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325 Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 25ddad73070c8fd15528fe804db90f0bda0dc6ad) [Vegard: fix conflicts due to commit 59c38c80769a5528a617ea8857049b94a4b67041 ("netfilter: nf_tables: use-after-free in failing rule with bound set") which included extra code for the NFT_MSG_NEWSETELEM case that was added upstream in commit 40ba1d9b4d19796afc9b7ece872f5f3e8f5e2c13 ("netfilter: nf_tables: fix set double-free in abort path") and conflicts due to commit 63d921c3e52a14125f293efea5f78508c36668c1 ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain") for the bind/true differences.] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-07-15 18:28:11 +00:00
Pablo Neira Ayuso	832e81e147	netfilter: nf_tables: add nft_set_is_anonymous() helper Add helper function to test for the NFT_SET_ANONYMOUS flag. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit 408070d6ee3490da63430bc8ce13348cf2eb47ea) [Harshit: Adding this helper commit as it would help for future commit backports, fix conlfict reoslution due to missing commit: af26f3e2903b ("netfilter: nf_tables: unbind set in rule from commit path") which is a later commit but applied to 4.14.y already and I have considered using nft_is_anonymous() helper at couple of other places where the previous backports avoided to use it as the helper wasn't present] Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-07-15 17:44:33 +00:00
Qingfang DENG	34358bf261	neighbour: fix unaligned access to pneigh_entry commit ed779fe4c9b5a20b4ab4fd6f3e19807445bb78c7 upstream. After the blamed commit, the member key is longer 4-byte aligned. On platforms that do not support unaligned access, e.g., MIPS32R2 with unaligned_action set to 1, this will trigger a crash when accessing an IPv6 pneigh_entry, as the key is cast to an in6_addr pointer. Change the type of the key to u32 to make it aligned. Fixes: 62dd93181aaa ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Qingfang DENG <qingfang.deng@siflower.com.cn> Link: https://lore.kernel.org/r/20230601015432.159066-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f451d1a013fd585cbf70a65ca6b9cf3548bb039f) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-07-15 17:44:33 +00:00
Richard Raya	1316bc860c	net: Revert recent patchsets Revert "ipv4/tcp: Force applications to use TCP_NODELAY to improve network latency" This reverts commit b693890811c6bd93748c2ca6cb52ccdecd136d04. Revert "ipv4/tcp_output: Disable tcp_slow_start_after_idle" This reverts commit 59b2d297d1792b9b8334cfcd58271cdbd57c56c2. Revert "netdev: Increase the size of the receive queue" This reverts commit ec01380ffe59eacb23df414451b7c12ef2268994. Revert "tcp: Allow TCP Fast Open for both incoming & outgoing connections" This reverts commit bc31532231af219645813ac1442e4222f57d9769. Change-Id: I7ecf328d8375bbe796fc1592b5b6b85bfc0a6302 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-13 20:47:49 -03:00
Panchajanya1999	bc31532231	tcp: Allow TCP Fast Open for both incoming & outgoing connections The proc(RO) "tcp_fastopen" reads the value of "sysctl_tcp_fastopen", which is set to "TFO_CLIENT_ENABLE" in tcp_fastopen.c Change-Id: Ie30448d69e148cad5ba12cec800026b20563eb2e Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-07-03 22:33:22 -03:00
Richard Raya	3143685e95	Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts * 'linux-4.14.y' of https://github.com/openela/kernel-lts: (278 commits) LTS: Update to 4.14.348 docs: kernel_include.py: Cope with docutils 0.21 serial: kgdboc: Fix NMI-safety problems from keyboard reset code btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks() dm: limit the number of targets and parameter size area Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems" LTS: Update to 4.14.347 rds: Fix build regression. RDS: IB: Use DEFINE_PER_CPU_SHARED_ALIGNED for rds_ib_stats af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). net: fix out-of-bounds access in ops_init drm/vmwgfx: Fix invalid reads in fence signaled events dyndbg: fix old BUG_ON in >control parser tipc: fix UAF in error path usb: gadget: f_fs: Fix a race condition when processing setup packets. usb: gadget: composite: fix OS descriptors w_value logic firewire: nosy: ensure user_length is taken into account when fetching packet contents af_unix: Fix garbage collector racing against connect() af_unix: Do not use atomic ops for unix_sk(sk)->inflight. ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() ... Change-Id: If329d39dd4e95e14045bb7c58494c197d1352d60 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-06-04 16:33:29 -03:00
Kuniyuki Iwashima	bade56293a	af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). commit 1971d13ffa84a551d29a81fdf5b5ec5be166ac83 upstream. syzbot reported a lockdep splat regarding unix_gc_lock and unix_state_lock(). One is called from recvmsg() for a connected socket, and another is called from GC for TCP_LISTEN socket. So, the splat is false-positive. Let's add a dedicated lock class for the latter to suppress the splat. Note that this change is not necessary for net-next.git as the issue is only applied to the old GC impl. [0]: WARNING: possible circular locking dependency detected 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Not tainted ----------------------------------------------------- kworker/u8:1/11 is trying to acquire lock: ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 but task is already holding lock: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (unix_gc_lock){+.+.}-{2:2}: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] unix_notinflight+0x13d/0x390 net/unix/garbage.c:140 unix_detach_fds net/unix/af_unix.c:1819 [inline] unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876 skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188 skb_release_all net/core/skbuff.c:1200 [inline] __kfree_skb net/core/skbuff.c:1216 [inline] kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252 kfree_skb include/linux/skbuff.h:1262 [inline] manage_oob net/unix/af_unix.c:2672 [inline] unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749 unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981 do_splice_read fs/splice.c:985 [inline] splice_file_to_pipe+0x299/0x500 fs/splice.c:1295 do_splice+0xf2d/0x1880 fs/splice.c:1379 __do_splice fs/splice.c:1436 [inline] __do_sys_splice fs/splice.c:1652 [inline] __se_sys_splice+0x331/0x4a0 fs/splice.c:1634 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&u->lock){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(unix_gc_lock); lock(&u->lock); lock(unix_gc_lock); lock(&u->lock); * DEADLOCK * 3 locks held by kworker/u8:1/11: #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 stack backtrace: CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: events_unbound __unix_gc Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()") Reported-and-tested-by: syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b29dcdd0582c00cd6ee0bd7c958d3639aa9db27f) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-06-03 12:47:31 +00:00
Kuniyuki Iwashima	40d8d26e71	af_unix: Do not use atomic ops for unix_sk(sk)->inflight. [ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ] When touching unix_sk(sk)->inflight, we are always under spin_lock(&unix_gc_lock). Let's convert unix_sk(sk)->inflight to the normal unsigned long. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit c8a2b1f7208b0ea0a4ad4355e0510d84f508a9ff) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-06-03 12:47:30 +00:00
Jiri Benc	fee87d3871	ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr [ Upstream commit 7633c4da919ad51164acbf1aa322cc1a3ead6129 ] Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it still means hlist_for_each_entry_rcu can return an item that got removed from the list. The memory itself of such item is not freed thanks to RCU but nothing guarantees the actual content of the memory is sane. In particular, the reference count can be zero. This can happen if ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough timing, this can happen: 1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry. 2. Then, the whole ipv6_del_addr is executed for the given entry. The reference count drops to zero and kfree_rcu is scheduled. 3. ipv6_get_ifaddr continues and tries to increments the reference count (in6_ifa_hold). 4. The rcu is unlocked and the entry is freed. 5. The freed entry is returned. Prevent increasing of the reference count in such case. The name in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe. [ 41.506330] refcount_t: addition on 0; use-after-free. [ 41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130 [ 41.507413] Modules linked in: veth bridge stp llc [ 41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14 [ 41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [ 41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130 [ 41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff [ 41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282 [ 41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000 [ 41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900 [ 41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff [ 41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000 [ 41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48 [ 41.514086] FS: 00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000 [ 41.514726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0 [ 41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 41.516799] Call Trace: [ 41.517037] <TASK> [ 41.517249] ? __warn+0x7b/0x120 [ 41.517535] ? refcount_warn_saturate+0xa5/0x130 [ 41.517923] ? report_bug+0x164/0x190 [ 41.518240] ? handle_bug+0x3d/0x70 [ 41.518541] ? exc_invalid_op+0x17/0x70 [ 41.520972] ? asm_exc_invalid_op+0x1a/0x20 [ 41.521325] ? refcount_warn_saturate+0xa5/0x130 [ 41.521708] ipv6_get_ifaddr+0xda/0xe0 [ 41.522035] inet6_rtm_getaddr+0x342/0x3f0 [ 41.522376] ? __pfx_inet6_rtm_getaddr+0x10/0x10 [ 41.522758] rtnetlink_rcv_msg+0x334/0x3d0 [ 41.523102] ? netlink_unicast+0x30f/0x390 [ 41.523445] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 41.523832] netlink_rcv_skb+0x53/0x100 [ 41.524157] netlink_unicast+0x23b/0x390 [ 41.524484] netlink_sendmsg+0x1f2/0x440 [ 41.524826] __sys_sendto+0x1d8/0x1f0 [ 41.525145] __x64_sys_sendto+0x1f/0x30 [ 41.525467] do_syscall_64+0xa5/0x1b0 [ 41.525794] entry_SYSCALL_64_after_hwframe+0x72/0x7a [ 41.526213] RIP: 0033:0x7fbc4cfcea9a [ 41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 [ 41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a [ 41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005 [ 41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c [ 41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b [ 41.531573] </TASK> Fixes: 5c578aedcb21d ("IPv6: convert addrconf hash list to RCU") Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jiri Benc <jbenc@redhat.com> Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit b4b3b69a19016d4e7fbdbd1dbcc184915eb862e1) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-05-31 12:30:38 +00:00
Eric Dumazet	5f11455435	geneve: fix header validation in geneve[6]_xmit_skb [ Upstream commit d8a6213d70accb403b82924a1c229e733433a5ef ] syzbot is able to trigger an uninit-value in geneve_xmit() [1] Problem : While most ip tunnel helpers (like ip_tunnel_get_dsfield()) uses skb_protocol(skb, true), pskb_inet_may_pull() is only using skb->protocol. If anything else than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol, pskb_inet_may_pull() does nothing at all. If a vlan tag was provided by the caller (af_packet in the syzbot case), the network header might not point to the correct location, and skb linear part could be smaller than expected. Add skb_vlan_inet_prepare() to perform a complete mac validation. Use this in geneve for the moment, I suspect we need to adopt this more broadly. v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest - Only call __vlan_get_protocol() for vlan types. Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/ v2,v3 - Addressed Sabrina comments on v1 and v2 Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/ [1] BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline] BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 geneve_xmit_skb drivers/net/geneve.c:910 [inline] geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 __netdev_start_xmit include/linux/netdevice.h:4903 [inline] netdev_start_xmit include/linux/netdevice.h:4917 [inline] xmit_one net/core/dev.c:3531 [inline] dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547 __dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335 dev_queue_xmit include/linux/netdevice.h:3091 [inline] packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 __sys_sendto+0x685/0x830 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: slab_post_alloc_hook mm/slub.c:3804 [inline] slab_alloc_node mm/slub.c:3845 [inline] kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577 __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1318 [inline] alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504 sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 __sys_sendto+0x685/0x830 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb") Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Phillip Potter <phil@philpotter.co.uk> Cc: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 43be590456e1f3566054ce78ae2dbb68cbe1a536) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-05-31 10:46:18 +00:00
Eric Dumazet	a15af438bc	tcp: properly terminate timers for kernel sockets [ Upstream commit 151c9c724d05d5b0dd8acd3e11cb69ef1f2dbada ] We had various syzbot reports about tcp timers firing after the corresponding netns has been dismantled. Fortunately Josef Bacik could trigger the issue more often, and could test a patch I wrote two years ago. When TCP sockets are closed, we call inet_csk_clear_xmit_timers() to 'stop' the timers. inet_csk_clear_xmit_timers() can be called from any context, including when socket lock is held. This is the reason it uses sk_stop_timer(), aka del_timer(). This means that ongoing timers might finish much later. For user sockets, this is fine because each running timer holds a reference on the socket, and the user socket holds a reference on the netns. For kernel sockets, we risk that the netns is freed before timer can complete, because kernel sockets do not hold reference on the netns. This patch adds inet_csk_clear_xmit_timers_sync() function that using sk_stop_timer_sync() to make sure all timers are terminated before the kernel socket is released. Modules using kernel sockets close them in their netns exit() handler. Also add sock_not_owned_by_me() helper to get LOCKDEP support : inet_csk_clear_xmit_timers_sync() must not be called while socket lock is held. It is very possible we can revert in the future commit 3a58f13a881e ("net: rds: acquire refcount on TCP sockets") which attempted to solve the issue in rds only. (net/smc/af_smc.c and net/mptcp/subflow.c have similar code) We probably can remove the check_net() tests from tcp_out_of_resources() and __tcp_close() in the future. Reported-by: Josef Bacik <josef@toxicpanda.com> Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/ Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Fixes: 8a68173691f0 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket") Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Josef Bacik <josef@toxicpanda.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 93f0133b9d589cc6e865f254ad9be3e9d8133f50) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-05-30 09:00:42 +00:00
Geliang Tang	f113056cc5	mptcp: add sk_stop_timer_sync helper [ Upstream commit 08b81d873126b413cda511b1ea1cbb0e99938bbd ] This patch added a new helper sk_stop_timer_sync, it deactivates a timer like sk_stop_timer, but waits for the handler to finish. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 151c9c724d05 ("tcp: properly terminate timers for kernel sockets") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 9c382bc16fa8f7499b0663398437e125cf4f763b) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-05-30 09:00:42 +00:00
Richard Raya	59c72f3544	Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts * 'linux-4.14.y' of https://github.com/openela/kernel-lts: (186 commits) LTS: Update to 4.14.344 binder: signal epoll threads of self-work ANDROID: binder: Add thread->process_todo flag. scsi: bnx2fc: Fix skb double free in bnx2fc_rcv() scsi: bnx2fc: Remove set but not used variable 'oxid' net: check dev->gso_max_size in gso_features_check() driver: staging: count ashmem_range into SLAB_RECLAIMBLE net: warn if gso_type isn't set for a GSO SKB staging: android: ashmem: Remove use of unlikely() ALSA: hda/realtek: Enable headset on Lenovo M90 Gen5 ALSA: hda/realtek: Enable headset onLenovo M70/M90 ALSA: hda/realtek: Add quirk for Lenovo TianYi510Pro-14IOB ALSA: hda/realtek - ALC897 headset MIC no sound ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform ALSA: hda/realtek: Fix the mic type detection issue for ASUS G551JW ALSA: hda/realtek - The front Mic on a HP machine doesn't work ALSA: hda/realtek - Enable the headset of Acer N50-600 with ALC662 ALSA: hda/realtek - Enable headset mic of Acer X2660G with ALC662 ALSA: hda/realtek - Add Headset Mic supported for HP cPC ALSA: hda/realtek - More constifications ... Change-Id: I3d093c0e457ab7e7e7b98b46eb44e82b6f4636f9 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-05-08 19:24:35 -03:00
Neal Cardwell	5069afc313	tcp: fix excessive TLP and RACK timeouts from HZ rounding [ Upstream commit 1c2709cfff1dedbb9591e989e2f001484208d914 ] We discovered from packet traces of slow loss recovery on kernels with the default HZ=250 setting (and min_rtt < 1ms) that after reordering, when receiving a SACKed sequence range, the RACK reordering timer was firing after about 16ms rather than the desired value of roughly min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the exact same issue. This commit fixes the TLP transmit timer and RACK reordering timer floor calculation to more closely match the intended 2ms floor even on kernels with HZ=250. It does this by adding in a new TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies, instead of the current approach of converting to jiffies and then adding th TCP_TIMEOUT_MIN value of 2 jiffies. Our testing has verified that on kernels with HZ=1000, as expected, this does not produce significant changes in behavior, but on kernels with the default HZ=250 the latency improvement can be large. For example, our tests show that for HZ=250 kernels at low RTTs this fix roughly halves the latency for the RACK reorder timer: instead of mostly firing at 16ms it mostly fires at 8ms. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: bb4d991a28cc ("tcp: adjust tail loss probe timeout") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231015174700.2206872-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:40 +00:00
Eric Dumazet	759b99e274	tcp: Namespace-ify sysctl_tcp_early_retrans [ Upstream commit 2ae21cf527da0e5cf9d7ee14bd5b0909bb9d1a75 ] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 1c2709cfff1d ("tcp: fix excessive TLP and RACK timeouts from HZ rounding") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:40 +00:00
Hangbin Liu	7fb02a7f7d	bonding: fix macvlan over alb bond support [ Upstream commit e74216b8def3803e98ae536de78733e9d7f3b109 ] The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") aims to enable the use of macvlans on top of rlb bond mode. However, the current rlb bond mode only handles ARP packets to update remote neighbor entries. This causes an issue when a macvlan is on top of the bond, and remote devices send packets to the macvlan using the bond's MAC address as the destination. After delivering the packets to the macvlan, the macvlan will rejects them as the MAC address is incorrect. Consequently, this commit makes macvlan over bond non-functional. To address this problem, one potential solution is to check for the presence of a macvlan port on the bond device using netif_is_macvlan_port(bond->dev) and return NULL in the rlb_arp_xmit() function. However, this approach doesn't fully resolve the situation when a VLAN exists between the bond and macvlan. So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit(). As the comment said, Don't modify or load balance ARPs that do not originate locally. Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") Reported-by: susan.zheng@veritas.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:38 +00:00
Jakub Kicinski	a975f7d7c8	net: remove bond_slave_has_mac_rcu() [ Upstream commit 8b0fdcdc3a7d44aff907f0103f5ffb86b12bfe71 ] No caller since v3.16. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: e74216b8def3 ("bonding: fix macvlan over alb bond support") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:38 +00:00
Cambda Zhu	c3b84f9003	net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX [ Upstream commit f0628c524fd188c3f9418e12478dfdfadacba815 ] This patch changes the behavior of TCP_LINGER2 about its limit. The sysctl_tcp_fin_timeout used to be the limit of TCP_LINGER2 but now it's only the default value. A new macro named TCP_FIN_TIMEOUT_MAX is added as the limit of TCP_LINGER2, which is 2 minutes. Since TCP_LINGER2 used sysctl_tcp_fin_timeout as the default value and the limit in the past, the system administrator cannot set the default value for most of sockets and let some sockets have a greater timeout. It might be a mistake that let the sysctl to be the limit of the TCP_LINGER2. Maybe we can add a new sysctl to set the max of TCP_LINGER2, but FIN-WAIT-2 timeout is usually no need to be too long and 2 minutes are legal considering TCP specs. Changes in v3: - Remove the new socket option and change the TCP_LINGER2 behavior so that the timeout can be set to value between sysctl_tcp_fin_timeout and 2 minutes. Changes in v2: - Add int overflow check for the new socket option. Changes in v1: - Add a new socket option to set timeout greater than sysctl_tcp_fin_timeout. Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 9df5335ca974 ("tcp: annotate data-races around tp->linger2") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:36 +00:00
Krzysztof Kozlowski	b6e41b5c23	nfc: constify several pointers to u8, char and sk_buff [ Upstream commit 3df40eb3a2ea58bf404a38f15a7a2768e4762cb0 ] Several functions receive pointers to u8, char or sk_buff but do not modify the contents so make them const. This allows doing the same for local variables and in total makes the code a little bit safer. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 0d9b41daa590 ("nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-05-06 14:36:35 +00:00
Richard Raya	d5b9f06c4e	Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts * 'linux-4.14.y' of https://github.com/openela/kernel-lts: (176 commits) LTS: Update to 4.14.343 crypto: af_alg - Work around empty control messages without MSG_MORE crypto: af_alg - Fix regression on empty requests spi: spi-mt65xx: Fix NULL pointer access in interrupt handler net/bnx2x: Prevent access to a freed page in page_pool hsr: Handle failures in module init rds: introduce acquire/release ordering in acquire/release_in_xmit() hsr: Fix uninit-value access in hsr_get_node() net: hsr: fix placement of logical operator in a multi-line statement usb: gadget: net2272: Use irqflags in the call to net2272_probe_fin staging: greybus: fix get_channel_from_mode() failure path serial: 8250_exar: Don't remove GPIO device on suspend rtc: mt6397: select IRQ_DOMAIN instead of depending on it rtc: mediatek: enhance the description for MediaTek PMIC based RTC tty: serial: samsung: fix tx_empty() to return TIOCSER_TEMT serial: max310x: fix syntax error in IRQ error message clk: qcom: gdsc: Add support to update GDSC transition delay NFS: Fix an off by one in root_nfs_cat() net: sunrpc: Fix an off by one in rpc_sockaddr2uaddr() scsi: bfa: Fix function pointer type mismatch for hcb_qe->cbfn ... Change-Id: Ib9b7d4f4fbb66b54b4fc2d35e945418da4c02331 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-04-18 09:46:38 -03:00
Xin Long	c4d2af79e4	sock_diag: request _diag module only when the family or proto has been registered Now when using 'ss' in iproute, kernel would try to load all _diag modules, which also causes corresponding family and proto modules to be loaded as well due to module dependencies. Like after running 'ss', sctp, dccp, af_packet (if it works as a module) would be loaded. For example: $ lsmod\|grep sctp $ ss $ lsmod\|grep sctp sctp_diag 16384 0 sctp 323584 5 sctp_diag inet_diag 24576 4 raw_diag,tcp_diag,sctp_diag,udp_diag libcrc32c 16384 3 nf_conntrack,nf_nat,sctp As these family and proto modules are loaded unintentionally, it could cause some problems, like: - Some debug tools use 'ss' to collect the socket info, which loads all those diag and family and protocol modules. It's noisy for identifying issues. - Users usually expect to drop sctp init packet silently when they have no sense of sctp protocol instead of sending abort back. - It wastes resources (especially with multiple netns), and SCTP module can't be unloaded once it's loaded. ... In short, it's really inappropriate to have these family and proto modules loaded unexpectedly when just doing debugging with inet_diag. This patch is to introduce sock_load_diag_module() where it loads the _diag module only when it's corresponding family or proto has been already registered. Note that we can't just load _diag module without the family or proto loaded, as some symbols used in _diag module are from the family or proto module. v1->v2: - move inet proto check to inet_diag to avoid a compiling err. v2->v3: - define sock_load_diag_module in sock.c and export one symbol only. - improve the changelog. Reported-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit bf2ae2e4bf9360e07c0cdfa166bcdc0afd92f4ce) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>	2024-04-16 10:30:28 +00:00
Willem de Bruijn	683e08ff6e	ip: validate header length on virtual device xmit [ Upstream commit cb9f1b783850b14cbd7f87d061d784a666dfba1f ] KMSAN detected read beyond end of buffer in vti and sit devices when passing truncated packets with PF_PACKET. The issue affects additional ip tunnel devices. Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when accessing the inner header"). Move the check to a separate helper and call at the start of each ndo_start_xmit function in net/ipv4 and net/ipv6. Minor changes: - convert dev_kfree_skb to kfree_skb on error path, as dev_kfree_skb calls consume_skb which is not for error paths. - use pskb_network_may_pull even though that is pedantic here, as the same as pskb_may_pull for devices without llheaders. - do not cache ipv6 hdrs if used only once (unsafe across pskb_may_pull, was more relevant to earlier patch) Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit cde81154f86e36df0fad21c23976cc8b80c01600) [vegard: drop changes to ip6erspan_tunnel_xmit(), which doesn't exist in 4.14 due to missing commit 5a963eb61b7c39e6c422b6e48619d19d04719358 ("ip6_gre: Add ERSPAN native tunnel support") from 4.16] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-04-13 08:29:03 +00:00
Richard Raya	a9e2d194be	Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts * 'linux-4.14.y' of https://github.com/openela/kernel-lts: (350 commits) LTS: Update to 4.14.340 fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table() PCI/MSI: Prevent MSI hardware interrupt number truncation s390: use the correct count for __iowrite64_copy() packet: move from strlcpy with unused retval to strscpy ipv6: sr: fix possible use-after-free and null-ptr-deref nouveau: fix function cast warnings scsi: jazz_esp: Only build if SCSI core is builtin RDMA/srpt: fix function pointer cast warnings RDMA/srpt: Support specifying the srpt_service_guid parameter IB/hfi1: Fix a memleak in init_credit_return usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs l2tp: pass correct message length to ip6_append_data gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp() dm-crypt: don't modify the data when using authenticated encryption mm: memcontrol: switch to rcu protection in drain_all_stock() s390/qeth: Fix potential loss of L3-IP@ in case of network issues virtio-blk: Ensure no requests in virtqueues before deleting vqs. firewire: core: send bus reset promptly on gap count error ... Change-Id: Ieafdd459ee41343bf15ed781b3e45adc2be29cc1 Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-03-26 00:15:05 -03:00
Richard Raya	669eb74484	Merge branch 'deprecated/android-4.14-stable' of https://android.googlesource.com/kernel/common into HEAD * 'deprecated/android-4.14-stable' of https://android.googlesource.com/kernel/common: (101 commits) Linux 4.14.336 mmc: core: Cancel delayed work before releasing host mmc: rpmb: fixes pause retune on all RPMB partitions. firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards i40e: fix use-after-free in i40e_aqc_add_filters() net: bcmgenet: Fix FCS generation for fragmented skbuffs net: sched: em_text: fix possible memory leak in em_text_destroy() nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local UPSTREAM: drm: Fix doc warning in drm_connector_attach_edid_property() BACKPORT: lib/vsprintf: Hash legacy clock addresses UPSTREAM: xfrm: fix gro_cells leak when remove virtual xfrm interfaces UPSTREAM: xfrm: Make function xfrmi_get_link_net() static UPSTREAM: cpuidle: menu: Retain tick when shallow state is selected UPSTREAM: bpf: fix rcu annotations in compute_effective_progs() UPSTREAM: bpf: bpf_prog_array_alloc() should return a generic non-rcu pointer UPSTREAM: sched/util_est: Fix util_est_dequeue() for throttled cfs_rq UPSTREAM: softirq: Reorder trace_softirqs_on to prevent lockdep splat UPSTREAM: l2tp: fix refcount leakage on PPPoL2TP sockets UPSTREAM: HID: steam: select CONFIG_POWER_SUPPLY BACKPORT: mac80211_hwsim: fix a possible memory leak in hwsim_new_radio_nl() ... Change-Id: I1c98fbb0918986a06bee16b0c11fe8bee003fd3f Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>	2024-03-25 23:54:05 -03:00
Eric Dumazet	fc4e079263	af_unix: fix lockdep positive in sk_diag_dump_icons() [ Upstream commit 4d322dce82a1d44f8c83f0f54f95dd1b8dcf46c9 ] syzbot reported a lockdep splat [1]. Blamed commit hinted about the possible lockdep violation, and code used unix_state_lock_nested() in an attempt to silence lockdep. It is not sufficient, because unix_state_lock_nested() is already used from unix_state_double_lock(). We need to use a separate subclass. This patch adds a distinct enumeration to make things more explicit. Also use swap() in unix_state_double_lock() as a clean up. v2: add a missing inline keyword to unix_state_lock_nested() [1] WARNING: possible circular locking dependency detected 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Not tainted syz-executor.1/2542 is trying to acquire lock: ffff88808b5df9e8 (rlock-AF_UNIX){+.+.}-{2:2}, at: skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 but task is already holding lock: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&u->lock/1){+.+.}-{2:2}: lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378 sk_diag_dump_icons net/unix/diag.c:87 [inline] sk_diag_fill+0x6ea/0xfe0 net/unix/diag.c:157 sk_diag_dump net/unix/diag.c:196 [inline] unix_diag_dump+0x3e9/0x630 net/unix/diag.c:220 netlink_dump+0x5c1/0xcd0 net/netlink/af_netlink.c:2264 __netlink_dump_start+0x5d7/0x780 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:338 [inline] unix_diag_handler_dump+0x1c3/0x8f0 net/unix/diag.c:319 sock_diag_rcv_msg+0xe3/0x400 netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2543 sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline] netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1367 netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] sock_write_iter+0x39a/0x520 net/socket.c:1160 call_write_iter include/linux/fs.h:2085 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xa74/0xca0 fs/read_write.c:590 ksys_write+0x1a0/0x2c0 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b -> #0 (rlock-AF_UNIX){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x592/0x890 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724 __do_sys_sendmmsg net/socket.c:2753 [inline] __se_sys_sendmmsg net/socket.c:2750 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&u->lock/1); lock(rlock-AF_UNIX); lock(&u->lock/1); lock(rlock-AF_UNIX); * DEADLOCK * 1 lock held by syz-executor.1/2542: #0: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089 stack backtrace: CPU: 1 PID: 2542 Comm: syz-executor.1 Not tainted 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 check_noncircular+0x366/0x490 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x592/0x890 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724 __do_sys_sendmmsg net/socket.c:2753 [inline] __se_sys_sendmmsg net/socket.c:2750 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f26d887cda9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f26d95a60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00007f26d89abf80 RCX: 00007f26d887cda9 RDX: 000000000000003e RSI: 00000000200bd000 RDI: 0000000000000004 RBP: 00007f26d88c947a R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000008c0 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007f26d89abf80 R15: 00007ffcfe081a68 Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240130184235.1620738-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 875f31aaa67e306098befa5e798a049075910fa7) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-03-08 08:21:35 +00:00
Kuniyuki Iwashima	dc5870e21f	llc: Drop support for ETH_P_TR_802_2. [ Upstream commit e3f9bed9bee261e3347131764e42aeedf1ffea61 ] syzbot reported an uninit-value bug below. [0] llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2 (0x0011), and syzbot abused the latter to trigger the bug. write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16) llc_conn_handler() initialises local variables {saddr,daddr}.mac based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes them to __llc_lookup(). However, the initialisation is done only when skb->protocol is htons(ETH_P_802_2), otherwise, __llc_lookup_established() and __llc_lookup_listener() will read garbage. The missing initialisation existed prior to commit 211ed865108e ("net: delete all instances of special processing for token ring"). It removed the part to kick out the token ring stuff but forgot to close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv(). Let's remove llc_tr_packet_type and complete the deprecation. [0]: BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90 __llc_lookup_established+0xe9d/0xf90 __llc_lookup net/llc/llc_conn.c:611 [inline] llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206 __netif_receive_skb_one_core net/core/dev.c:5527 [inline] __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641 netif_receive_skb_internal net/core/dev.c:5727 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5786 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555 tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x8ef/0x1490 fs/read_write.c:584 ksys_write+0x20f/0x4c0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x93/0xd0 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Local variable daddr created at: llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206 CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Fixes: 211ed865108e ("net: delete all instances of special processing for token ring") Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 165ad1e22779685c3ed3dd349c6c4c632309cc62) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-03-08 08:21:28 +00:00
Luiz Augusto von Dentz	6ba5a14ad6	Bluetooth: Fix bogus check for re-auth no supported with non-ssp [ Upstream commit d03376c185926098cb4d668d6458801eb785c0a5 ] This reverts 19f8def031bfa50c579149b200bfeeb919727b27 "Bluetooth: Fix auth_complete_evt for legacy units" which seems to be working around a bug on a broken controller rather then any limitation imposed by the Bluetooth spec, in fact if there ws not possible to re-auth the command shall fail not succeed. Fixes: 19f8def031bf ("Bluetooth: Fix auth_complete_evt for legacy units") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit f7f627ac761b2fb0c487e5aaff1585f1014ab9a6) [fix trivial conflict due to missing commit 2064ee332e4c1b7495cf68b84355c213d8fe71fd ("Bluetooth: Use bt_dev_err and bt_dev_info when possible")] Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-02-02 11:33:42 +00:00
Jon Maxwell	12cda1d577	ipv6: remove max_size check inline with ipv4 commit af6d10345ca76670c1b7c37799f0d5576ccef277 upstream. In ip6_dst_gc() replace: if (entries > gc_thresh) With: if (entries > ops->gc_thresh) Sending Ipv6 packets in a loop via a raw socket triggers an issue where a route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly consumes the Ipv6 max_size threshold which defaults to 4096 resulting in these warnings: [1] 99.187805] dst_alloc: 7728 callbacks suppressed [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size. . . [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size. When this happens the packet is dropped and sendto() gets a network is unreachable error: remaining pkt 200557 errno 101 remaining pkt 196462 errno 101 . . remaining pkt 126821 errno 101 Implement David Aherns suggestion to remove max_size check seeing that Ipv6 has a GC to manage memory usage. Ipv4 already does not check max_size. Here are some memory comparisons for Ipv4 vs Ipv6 with the patch: Test by running 5 instances of a program that sends UDP packets to a raw socket 5000000 times. Compare Ipv4 and Ipv6 performance with a similar program. Ipv4: Before test: MemFree: 29427108 kB Slab: 237612 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 2881 3990 192 42 2 : tunables 0 0 0 During test: MemFree: 29417608 kB Slab: 247712 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 44394 44394 192 42 2 : tunables 0 0 0 After test: MemFree: 29422308 kB Slab: 238104 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 Ipv6 with patch: Errno 101 errors are not observed anymore with the patch. Before test: MemFree: 29422308 kB Slab: 238104 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 During Test: MemFree: 29431516 kB Slab: 240940 kB ip6_dst_cache 11980 12064 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 After Test: MemFree: 29441816 kB Slab: 238132 kB ip6_dst_cache 1902 2432 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 Tested-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230112012532.311021-1-jmaxwell37@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 4.19.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 95372b040ae689293c6863b90049f1af68410c8b) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-01-18 11:27:48 +00:00
Eric Dumazet	e0411760af	ipv6: make ip6_rt_gc_expire an atomic_t commit 9cb7c013420f98fa6fd12fc6a5dc055170c108db upstream. Reads and Writes to ip6_rt_gc_expire always have been racy, as syzbot reported lately [1] There is a possible risk of under-flow, leading to unexpected high value passed to fib6_run_gc(), although I have not observed this in the field. Hosts hitting ip6_dst_gc() very hard are under pretty bad state anyway. [1] BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 value changed: 0x00000bb3 -> 0x00000ba9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: mld mld_ifc_work Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220413181333.649424-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ 4.19: context adjustment in include/net/netns/ipv6.h ] Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 4.19.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b4cfbeaebeb355dbaefb218470055de2e8a73020) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-01-18 11:27:47 +00:00
Eric Dumazet	2ee1663e55	net/dst: use a smaller percpu_counter batch for dst entries accounting commit cf86a086a18095e33e0637cb78cda1fcf5280852 upstream. percpu_counter_add() uses a default batch size which is quite big on platforms with 256 cpus. (2256 -> 512) This means dst_entries_get_fast() can be off by +/- 2(nr_cpus^2) (131072 on servers with 256 cpus) Reduce the batch size to something more reasonable, and add logic to ip6_dst_gc() to call dst_entries_get_slow() before calling the _very_ expensive fib6_run_gc() function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 4.19.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 9635bd0a5296e2e725c6b33e530d0ef582e2f64e) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>	2024-01-18 11:27:47 +00:00
Greg Kroah-Hartman	8382692884	This is the 4.14.333 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmV50c4ACgkQONu9yGCS aT7NBxAAtKBLAoVN5J4r4/i/H6ac1RonF4Lq6Y2S0CgV81IYPH602dV6OsC6bGh6 UrSBiA0p2jBDLSFcDJOyrneMxJHveA0KSczAPNscE+ml7bVmiT47ySw6KxdM8wEU 3fnSdmUZ96Sa0CQoJSU50ot7lhzAiiMG8JWCLRPDRofIN0+qpTw5oCSnwKGsqyO8 LJkRujzKfWAykYQIrUYXqeIzzxww0JbE/8MRbeNT+2OfjG/jZamZwBQPFUWih67Q qAGFxV4n1MUdo4+kd5rpaYw5/5boPoVo8KIaxnrCWbauXn2MUT0ZWLDKnGu5hptL 6PHy66FFTYQjFJpuTc4+X7vzqptSJta8SSDqpcJ9FX9bVUdTuH07QDkA5yGmttb6 2W1fJKR9rTyt1+J526xBWgNdyilv08IUP4R6g4RUe2aRuqDMFrPAegcCyeQ7g99f cpg5z/knynn1qvJ4CznM83z1ZxwgG861G94ZJPPd2hKTPRltQpt9fF35ekeaHzcF f8vZfnYzD228R0FgtDcA8d9VIU/K3gICbhr1SCASy8uUyt+8RRtxxjGX4QUOwbZW PQwdX500xLzV5Lg7fOzyuaM/6+oYk+vU5iYJnkeglg5ReYtfEerHNtVRh4PAtgEu 04G81HY9cHjSqne6I5MTE/sx9L3JZj5dED8ZqDMqag5pB+G/1EQ= =ey14 -----END PGP SIGNATURE----- Merge 4.14.333 into android-4.14-stable Changes in 4.14.333 tg3: Move the [rt]x_dropped counters to tg3_napi tg3: Increment tx_dropped in tg3_tso_bug() drm/amdgpu: correct chunk_ptr to a pointer to chunk. net: hns: fix fake link up on xge port tcp: do not accept ACK of bytes we never sent RDMA/bnxt_re: Correct module description string hwmon: (acpi_power_meter) Fix 4.29 MW bug tracing: Fix a warning when allocating buffered events fails scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() ALSA: pcm: fix out-of-bounds in snd_pcm_state_names nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() tracing: Always update snapshot buffer size tracing: Fix incomplete locking when disabling buffered events tracing: Fix a possible race when disabling buffered events packet: Move reference count in packet_sock to atomic_long_t parport: Add support for Brainboxes IX/UC/PX parallel cards serial: sc16is7xx: address RX timeout interrupt errata serial: 8250_omap: Add earlycon support for the AM654 UART controller KVM: s390/mm: Properly reset no-dat nilfs2: fix missing error check for sb_set_blocksize call netlink: don't call ->netlink_bind with table lock held genetlink: add CAP_NET_ADMIN test for multicast bind psample: Require 'CAP_NET_ADMIN' when joining "packets" group drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group Linux 4.14.333 Change-Id: Iebcaaf9d6c5e2ef71dd23c3c6246f6cef45d296a Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-12-14 08:59:48 +00:00
Ido Schimmel	d9478fe0a8	drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group commit e03781879a0d524ce3126678d50a80484a513c4b upstream. The "NET_DM" generic netlink family notifies drop locations over the "events" multicast group. This is problematic since by default generic netlink allows non-root users to listen to these notifications. Fix by adding a new field to the generic netlink multicast group structure that when set prevents non-root users or root without the 'CAP_SYS_ADMIN' capability (in the user namespace owning the network namespace) from joining the group. Set this field for the "events" group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the nature of the information that is shared over this group. Note that the capability check in this case will always be performed against the initial user namespace since the family is not netns aware and only operates in the initial network namespace. A new field is added to the structure rather than using the "flags" field because the existing field uses uAPI flags and it is inappropriate to add a new uAPI flag for an internal kernel check. In net-next we can rework the "flags" field to use internal flags and fold the new field into it. But for now, in order to reduce the amount of changes, add a new field. Since the information can only be consumed by root, mark the control plane operations that start and stop the tracing as root-only using the 'GENL_ADMIN_PERM' flag. Tested using [1]. Before: # capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo After: # capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo Failed to join "events" multicast group [1] $ cat dm.c #include <stdio.h> #include <netlink/genl/ctrl.h> #include <netlink/genl/genl.h> #include <netlink/socket.h> int main(int argc, char *argv) { struct nl_sock sk; int grp, err; sk = nl_socket_alloc(); if (!sk) { fprintf(stderr, "Failed to allocate socket\n"); return -1; } err = genl_connect(sk); if (err) { fprintf(stderr, "Failed to connect socket\n"); return err; } grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events"); if (grp < 0) { fprintf(stderr, "Failed to resolve \"events\" multicast group\n"); return grp; } err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE); if (err) { fprintf(stderr, "Failed to join \"events\" multicast group\n"); return err; } return 0; } $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 16:46:18 +01:00

1 2 3 4 5 ...

11654 Commits