358 Commits

Author SHA1 Message Date
Neeraj Soni
00537bfa05 Revert "Reverting crypto patches"
This reverts commit b73e822d12ecbea7cad3742c46fd1be17aa141c8.

This is reverted to integrate new file encryption framework support changes
to ensure all fixes are present to use new encryption policies.

Change-Id: I455ec66664064069ac34e6fe410bd28dc3a53d07
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-08-18 05:01:30 -07:00
Srinivasarao P
b73e822d12 Reverting crypto patches
c57952b UPSTREAM: ubifs: wire up FS_IOC_GET_ENCRYPTION_NONCE
379237b UPSTREAM: f2fs: wire up FS_IOC_GET_ENCRYPTION_NONCE
10e5acf UPSTREAM: ext4: wire up FS_IOC_GET_ENCRYPTION_NONCE
63bf273 ANDROID: scsi: ufs: add ->map_sg_crypto() variant op
10d4512 FROMLIST: f2fs: Handle casefolding with Encryption
4efb7e2 ANDROID: fscrypt: fall back to filesystem-layer crypto when needed
a14fa7b ANDROID: block: require drivers to declare supported crypto key type(s)
5578bea ANDROID: block: make blk_crypto_start_using_mode() properly check for support
e9c80bd UPSTREAM: fscrypt: add FS_IOC_GET_ENCRYPTION_NONCE ioctl
9e469e7 UPSTREAM: fscrypt: don't evict dirty inodes after removing key
53f2446 fscrypt: don't evict dirty inodes after removing key
207be96 FROMLIST: fscrypt: Have filesystems handle their d_ops
06ab740 ANDROID: dm: Add wrapped key support in dm-default-key
23e670a ANDROID: dm: add support for passing through derive_raw_secret
166fda7 ANDROID: block: Prevent crypto fallback for wrapped keys
fe6e855 fscrypt: improve format of no-key names
216d8ca fscrypt: clarify what is meant by a per-file key
7e25032 fscrypt: derive dirhash key for casefolded directories
e16d849 fscrypt: don't allow v1 policies with casefolding
0bc68c1 fscrypt: add "fscrypt_" prefix to fname_encrypt()
85b9c3e fscrypt: don't print name of busy file when removing key
9c5c8c5 fscrypt: document gfp_flags for bounce page allocation
bee5bd5 fscrypt: optimize fscrypt_zeroout_range()
1c88eea fscrypt: remove redundant bi_status check
04f5184 fscrypt: Allow modular crypto algorithms
737ae90 fscrypt: include <linux/ioctl.h> in UAPI header
8842133 fscrypt: don't check for ENOKEY from fscrypt_get_encryption_info()
b21b79d fscrypt: remove fscrypt_is_direct_key_policy()
19b132b fscrypt: move fscrypt_valid_enc_modes() to policy.c
add6ac4 fscrypt: check for appropriate use of DIRECT_KEY flag earlier
2454b5b fscrypt: split up fscrypt_supported_policy() by policy version
bfa4ca6 fscrypt: introduce fscrypt_needs_contents_encryption()
3871977 fscrypt: move fscrypt_d_revalidate() to fname.c
39a0acc fscrypt: constify inode parameter to filename encryption functions
3942229 fscrypt: constify struct fscrypt_hkdf parameter to fscrypt_hkdf_expand()
a7b6398 fscrypt: verify that the crypto_skcipher has the correct ivsize
9c1b3af fscrypt: use crypto_skcipher_driver_name()
3529026 fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY

Change-Id: Ib1abe832e16d5f40bfcc9e34bdccbb063b37dbbc
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2020-08-05 10:53:22 +05:30
Greg Kroah-Hartman
fae4e1d295 This is the 4.14.175 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6F9/8ACgkQONu9yGCS
 aT5DJQ//aAbpT3q2hDbthg9szl/SsAlJr6UX90k7ZjxlI/wnXTepNIWTZqvSnvV4
 sb8HeOz2KUuTUh/PvM2vS37kdtzKqefku77tGl3JOE+pIIlKQ1au82U7vuSmo/FH
 Ua+/MEN55f8baiZvYTGGQtwc36Bnj64iO8RUT5iSK2GU7KVVyIgFwKKqRQIzJ+Ds
 dPACfMErty/+gvC9t0nx5u4BkC9ilIj5DH0OXiQvxZr9PQfg3lg7FFF/a6M0gaRF
 qhBZFX2xKzQRKVKnbob5kSpir6gsW/cu8S43YIcNzx72Ce4ROFi910J7P1Jzlb5j
 KEQGL7IuP+k8fwCpMZ7B9Goh9ian9VSUXKjrlr+UGotOGLzQ+dk4c/NJvCjxQvqx
 m8FtHNjo3WUl72Ul1p6zJc4JMC3LD3ZSkIQGhVny4Z52n4D4CnWI7+b5ppQe9RZD
 Iu8XjS0pTGfUUiomtci9ZcpWcTiWvW/VY0sRQbKj94h1nETWblXzXef5vJygZbMm
 hL950oGkWeh2MoBM3FYyBSP0YYkruTtUSQ1GRs7tsboUsiMM9cNSkwzsFU9xeEvh
 ZPIN5IdAIRilauOiI3YLEfO7JPz4OG0AlzodgnjbFchLqSIVzme8Wr84tFOYBhp1
 868Am3/E3p8qqmnMvtS8/TTETeehhbrPVUp1D+7zHnkv/mRC1CU=
 =uswL
 -----END PGP SIGNATURE-----

Merge 4.14.175 into android-4.14

Changes in 4.14.175
	spi: qup: call spi_qup_pm_resume_runtime before suspending
	powerpc: Include .BTF section
	ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes
	spi: pxa2xx: Add CS control clock quirk
	spi/zynqmp: remove entry that causes a cs glitch
	drm/exynos: dsi: propagate error value and silence meaningless warning
	drm/exynos: dsi: fix workaround for the legacy clock name
	drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer
	altera-stapl: altera_get_note: prevent write beyond end of 'key'
	dm bio record: save/restore bi_end_io and bi_integrity
	xenbus: req->body should be updated before req->state
	xenbus: req->err should be updated before req->state
	block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group()
	parse-maintainers: Mark as executable
	USB: Disable LPM on WD19's Realtek Hub
	usb: quirks: add NO_LPM quirk for RTL8153 based ethernet adapters
	USB: serial: option: add ME910G1 ECM composition 0x110b
	usb: host: xhci-plat: add a shutdown
	USB: serial: pl2303: add device-id for HP LD381
	usb: xhci: apply XHCI_SUSPEND_DELAY to AMD XHCI controller 1022:145c
	ALSA: line6: Fix endless MIDI read loop
	ALSA: seq: virmidi: Fix running status after receiving sysex
	ALSA: seq: oss: Fix running status after receiving sysex
	ALSA: pcm: oss: Avoid plugin buffer overflow
	ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks
	iio: trigger: stm32-timer: disable master mode when stopping
	iio: magnetometer: ak8974: Fix negative raw values in sysfs
	mmc: sdhci-of-at91: fix cd-gpios for SAMA5D2
	staging: rtl8188eu: Add device id for MERCUSYS MW150US v2
	staging/speakup: fix get_word non-space look-ahead
	intel_th: Fix user-visible error codes
	intel_th: pci: Add Elkhart Lake CPU support
	rtc: max8907: add missing select REGMAP_IRQ
	xhci: Do not open code __print_symbolic() in xhci trace events
	memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
	mm: slub: be more careful about the double cmpxchg of freelist
	mm, slub: prevent kmalloc_node crashes and memory leaks
	page-flags: fix a crash at SetPageError(THP_SWAP)
	x86/mm: split vmalloc_sync_all()
	USB: cdc-acm: fix close_delay and closing_wait units in TIOCSSERIAL
	USB: cdc-acm: fix rounding error in TIOCSSERIAL
	iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels
	iio: adc: at91-sama5d2_adc: fix differential channels in triggered mode
	kbuild: Disable -Wpointer-to-enum-cast
	futex: Fix inode life-time issue
	futex: Unbreak futex hashing
	Revert "vrf: mark skb for multicast or link-local as enslaved to VRF"
	Revert "ipv6: Fix handling of LLA with VRF and sockets bound to VRF"
	ALSA: hda/realtek: Fix pop noise on ALC225
	arm64: smp: fix smp_send_stop() behaviour
	arm64: smp: fix crash_smp_send_stop() behaviour
	drm/bridge: dw-hdmi: fix AVI frame colorimetry
	staging: greybus: loopback_test: fix potential path truncation
	staging: greybus: loopback_test: fix potential path truncations
	Revert "drm/dp_mst: Skip validating ports during destruction, just ref"
	hsr: fix general protection fault in hsr_addr_is_self()
	macsec: restrict to ethernet devices
	net: dsa: Fix duplicate frames flooded by learning
	net: mvneta: Fix the case where the last poll did not process all rx
	net/packet: tpacket_rcv: avoid a producer race condition
	net: qmi_wwan: add support for ASKEY WWHC050
	net_sched: cls_route: remove the right filter from hashtable
	net_sched: keep alloc_hash updated after hash allocation
	net: stmmac: dwmac-rk: fix error path in rk_gmac_probe
	NFC: fdp: Fix a signedness bug in fdp_nci_send_patch()
	slcan: not call free_netdev before rtnl_unlock in slcan_open
	bnxt_en: fix memory leaks in bnxt_dcbnl_ieee_getets()
	net: dsa: mt7530: Change the LINK bit to reflect the link status
	vxlan: check return value of gro_cells_init()
	hsr: use rcu_read_lock() in hsr_get_node_{list/status}()
	hsr: add restart routine into hsr_get_node_list()
	hsr: set .netnsok flag
	net: ipv4: don't let PMTU updates increase route MTU
	cgroup-v1: cgroup_pidlist_next should update position index
	cpupower: avoid multiple definition with gcc -fno-common
	drivers/of/of_mdio.c:fix of_mdiobus_register()
	cgroup1: don't call release_agent when it is ""
	dt-bindings: net: FMan erratum A050385
	arm64: dts: ls1043a: FMan erratum A050385
	fsl/fman: detect FMan erratum A050385
	scsi: ipr: Fix softlockup when rescanning devices in petitboot
	mac80211: Do not send mesh HWMP PREQ if HWMP is disabled
	dpaa_eth: Remove unnecessary boolean expression in dpaa_get_headroom
	sxgbe: Fix off by one in samsung driver strncpy size arg
	arm64: ptrace: map SPSR_ELx<->PSR for compat tasks
	arm64: compat: map SPSR_ELx<->PSR for signals
	ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
	i2c: hix5hd2: add missed clk_disable_unprepare in remove
	Input: synaptics - enable RMI on HP Envy 13-ad105ng
	Input: avoid BIT() macro usage in the serio.h UAPI header
	ARM: dts: dra7: Add bus_dma_limit for L3 bus
	ARM: dts: omap5: Add bus_dma_limit for L3 bus
	perf probe: Do not depend on dwfl_module_addrsym()
	tools: Let O= makes handle a relative path with -C option
	scripts/dtc: Remove redundant YYLOC global declaration
	scsi: sd: Fix optimal I/O size for devices that change reported values
	mac80211: mark station unauthorized before key removal
	gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk
	gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option
	gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model
	RDMA/core: Ensure security pkey modify is not lost
	genirq: Fix reference leaks on irq affinity notifiers
	xfrm: handle NETDEV_UNREGISTER for xfrm device
	vti[6]: fix packet tx through bpf_redirect() in XinY cases
	RDMA/mlx5: Block delay drop to unprivileged users
	xfrm: fix uctx len check in verify_sec_ctx_len
	xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire
	xfrm: policy: Fix doulbe free in xfrm_policy_timer
	netfilter: nft_fwd_netdev: validate family and chain type
	vti6: Fix memory leak of skb if input policy check fails
	Input: raydium_i2c_ts - use true and false for boolean values
	Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger()
	afs: Fix some tracing details
	USB: serial: option: add support for ASKEY WWHC050
	USB: serial: option: add BroadMobi BM806U
	USB: serial: option: add Wistron Neweb D19Q1
	USB: cdc-acm: restore capability check order
	USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback
	usb: musb: fix crash with highmen PIO and usbmon
	media: flexcop-usb: fix endpoint sanity check
	media: usbtv: fix control-message timeouts
	staging: rtl8188eu: Add ASUS USB-N10 Nano B1 to device table
	staging: wlan-ng: fix ODEBUG bug in prism2sta_disconnect_usb
	staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback
	libfs: fix infoleak in simple_attr_read()
	media: ov519: add missing endpoint sanity checks
	media: dib0700: fix rc endpoint lookup
	media: stv06xx: add missing descriptor sanity checks
	media: xirlink_cit: add missing descriptor sanity checks
	mac80211: Check port authorization in the ieee80211_tx_dequeue() case
	mac80211: fix authentication with iwlwifi/mvm
	vt: selection, introduce vc_is_sel
	vt: ioctl, switch VT_IS_IN_USE and VT_BUSY to inlines
	vt: switch vt_dont_switch to bool
	vt: vt_ioctl: remove unnecessary console allocation checks
	vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use virtual console
	vt: vt_ioctl: fix use-after-free in vt_in_use()
	platform/x86: pmc_atom: Add Lex 2I385SW to critclk_systems DMI table
	bpf: Explicitly memset the bpf_attr structure
	bpf: Explicitly memset some bpf info structures declared on the stack
	gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model
	net: ks8851-ml: Fix IO operations, again
	arm64: alternative: fix build with clang integrated assembler
	perf map: Fix off by one in strncpy() size argument
	ARM: dts: oxnas: Fix clear-mask property
	ARM: bcm2835-rpi-zero-w: Add missing pinctrl name
	arm64: dts: ls1043a-rdb: correct RGMII delay mode to rgmii-id
	arm64: dts: ls1046ardb: set RGMII interfaces to RGMII_ID mode
	Linux 4.14.175

Change-Id: If2c2cb5b3745ed6fbc5cb77737cfb1758fea4cb9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2020-04-03 08:18:27 +02:00
Peter Zijlstra
e52694b56e futex: Fix inode life-time issue
commit 8019ad13ef7f64be44d4f892af9c840179009254 upstream.

As reported by Jann, ihold() does not in fact guarantee inode
persistence. And instead of making it so, replace the usage of inode
pointers with a per boot, machine wide, unique inode identifier.

This sequence number is global, but shared (file backed) futexes are
rare enough that this should not become a performance issue.

Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02 16:34:21 +02:00
Eric Biggers
7c80641242 Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-4.14.y' into android-4.14
* aosp/upstream-f2fs-stable-linux-4.14.y:
  fs-verity: use u64_to_user_ptr()
  fs-verity: use mempool for hash requests
  fs-verity: implement readahead of Merkle tree pages
  ext4: readpages() should submit IO as read-ahead
  fs-verity: implement readahead for FS_IOC_ENABLE_VERITY
  fscrypt: improve format of no-key names
  ubifs: allow both hash and disk name to be provided in no-key names
  ubifs: don't trigger assertion on invalid no-key filename
  fscrypt: clarify what is meant by a per-file key
  fscrypt: derive dirhash key for casefolded directories
  fscrypt: don't allow v1 policies with casefolding
  fscrypt: add "fscrypt_" prefix to fname_encrypt()
  fscrypt: don't print name of busy file when removing key
  fscrypt: document gfp_flags for bounce page allocation
  fscrypt: optimize fscrypt_zeroout_range()
  fscrypt: remove redundant bi_status check
  fscrypt: Allow modular crypto algorithms
  fscrypt: include <linux/ioctl.h> in UAPI header
  fscrypt: don't check for ENOKEY from fscrypt_get_encryption_info()
  fscrypt: remove fscrypt_is_direct_key_policy()
  fscrypt: move fscrypt_valid_enc_modes() to policy.c
  fscrypt: check for appropriate use of DIRECT_KEY flag earlier
  fscrypt: split up fscrypt_supported_policy() by policy version
  fscrypt: introduce fscrypt_needs_contents_encryption()
  fscrypt: move fscrypt_d_revalidate() to fname.c
  fscrypt: constify inode parameter to filename encryption functions
  fscrypt: constify struct fscrypt_hkdf parameter to fscrypt_hkdf_expand()
  fscrypt: verify that the crypto_skcipher has the correct ivsize
  fscrypt: use crypto_skcipher_driver_name()
  fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY
  keys: Export lookup_user_key to external users
  f2fs: fix build error on PAGE_KERNEL_RO

Conflicts:
        fs/crypto/Kconfig
        fs/crypto/bio.c
        fs/crypto/fname.c
        fs/crypto/fscrypt_private.h
        fs/crypto/keyring.c
        fs/crypto/keysetup.c
        fs/ubifs/dir.c
        include/uapi/linux/fscrypt.h

Resolved the conflicts as per the corresponding android-mainline change,
Ib1e6b9eda8fb5dcfc6bdc8fa89d93f72b088c5f6.

Bug: 148667616
Change-Id: I5f8b846f0cd4d5403d8c61b9e12acb4581fac6f7
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-02-21 13:04:40 -08:00
Daniel Rosenberg
e16d8494ec fscrypt: don't allow v1 policies with casefolding
Casefolded encrypted directories will use a new dirhash method that
requires a secret key.  If the directory uses a v2 encryption policy,
it's easy to derive this key from the master key using HKDF.  However,
v1 encryption policies don't provide a way to derive additional keys.

Therefore, don't allow casefolding on directories that use a v1 policy.
Specifically, make it so that trying to enable casefolding on a
directory that has a v1 policy fails, trying to set a v1 policy on a
casefolded directory fails, and trying to open a casefolded directory
that has a v1 policy (if one somehow exists on-disk) fails.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved commit message, updated fscrypt.rst, and other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-02-13 15:12:08 -08:00
Greg Kroah-Hartman
d2905c6a0e This is the 4.14.164 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4a/2MACgkQONu9yGCS
 aT4BwA//diCficMfLINrc/9bMq3VS2Y+/lnuURMXEM9MJibjQCUS1spc6YhhNFrE
 8m3aavAYywjjD3zGHj8KEaKQFDrPQxYQDzPOPK9rxjpxlUFpnYWUGlI2krpwBV6c
 8xAekM62sMEIq09EHqqhKVls+WmYi47/pdfGAAt3PUR8c2eTOlxiFsiwq4nuZDdv
 rcMkQm87V8Wn1Nq+Dfp6R3U+X9f4DcU5n5cKiGq6ujoalT7h5/jj36JIFxBwMapF
 WjpqXMUUeylXxXnNFMUbEMg+lEqJlWfvj1sxdxyMdgS+L9rc9bXk/NTub4TZPaXu
 odwMl9RKWjJvFsvn26Pc4s31K2raEhCDYdkVoFTXWsc7vbE4A/h/yAw4Wq+cuBI4
 H4fBXYYZ3D0Il9kxYYbfSaki5z1YbI54tkWcrs8f8jli5C0M3Wkkux1TA4HPj2Ja
 8zJFH0++cyfpuKRiYXro+H2Tq4KxBwsWEtync8230MEywlTxkz4IIue+SCgVV+WD
 jmg/enRjbnkpYBSH1pKOdAAga0kHSxtwWlfLFrjhcgGse8y6sCJhUOPPcQMnf/k0
 Jrmc3InHg+mtLiSsJXAp4iGABJlW+W/ouaxaxYoA9wucwQlcgxXpkigl5rOgFTma
 153RYc1TSZJAe+cjx42qZxRxcD8/Vg5d6D2tL1otbMSIsD3e7Gk=
 =sq63
 -----END PGP SIGNATURE-----

Merge 4.14.164 into android-4.14

Changes in 4.14.164
	USB: dummy-hcd: use usb_urb_dir_in instead of usb_pipein
	USB: dummy-hcd: increase max number of devices to 32
	locking/spinlock/debug: Fix various data races
	netfilter: ctnetlink: netns exit must wait for callbacks
	mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()
	libtraceevent: Fix lib installation with O=
	x86/efi: Update e820 with reserved EFI boot services data to fix kexec breakage
	efi/gop: Return EFI_NOT_FOUND if there are no usable GOPs
	efi/gop: Return EFI_SUCCESS if a usable GOP was found
	efi/gop: Fix memory leak in __gop_query32/64()
	ARM: vexpress: Set-up shared OPP table instead of individual for each CPU
	netfilter: uapi: Avoid undefined left-shift in xt_sctp.h
	netfilter: nf_tables: validate NFT_SET_ELEM_INTERVAL_END
	ARM: dts: Cygnus: Fix MDIO node address/size cells
	spi: spi-cavium-thunderx: Add missing pci_release_regions()
	ASoC: topology: Check return value for soc_tplg_pcm_create()
	ARM: dts: bcm283x: Fix critical trip point
	bpf, mips: Limit to 33 tail calls
	ARM: dts: am437x-gp/epos-evm: fix panel compatible
	samples: bpf: Replace symbol compare of trace_event
	samples: bpf: fix syscall_tp due to unused syscall
	powerpc: Ensure that swiotlb buffer is allocated from low memory
	bnx2x: Do not handle requests from VFs after parity
	bnx2x: Fix logic to get total no. of PFs per engine
	net: usb: lan78xx: Fix error message format specifier
	rfkill: Fix incorrect check to avoid NULL pointer dereference
	ASoC: wm8962: fix lambda value
	regulator: rn5t618: fix module aliases
	kconfig: don't crash on NULL expressions in expr_eq()
	perf/x86/intel: Fix PT PMI handling
	fs: avoid softlockups in s_inodes iterators
	net: stmmac: Do not accept invalid MTU values
	net: stmmac: RX buffer size must be 16 byte aligned
	s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly
	s390/dasd: fix memleak in path handling error case
	block: fix memleak when __blk_rq_map_user_iov() is failed
	parisc: Fix compiler warnings in debug_core.c
	llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
	hv_netvsc: Fix unwanted rx_table reset
	bpf: reject passing modified ctx to helper functions
	bpf: Fix passing modified ctx to ld/abs/ind instruction
	PCI/switchtec: Read all 64 bits of part_event_bitmap
	mmc: block: Convert RPMB to a character device
	mmc: block: Delete mmc_access_rpmb()
	mmc: block: Fix bug when removing RPMB chardev
	mmc: core: Prevent bus reference leak in mmc_blk_init()
	mmc: block: propagate correct returned value in mmc_rpmb_ioctl
	gtp: fix bad unlock balance in gtp_encap_enable_socket
	macvlan: do not assume mac_header is set in macvlan_broadcast()
	net: dsa: mv88e6xxx: Preserve priority when setting CPU port.
	net: stmmac: dwmac-sun8i: Allow all RGMII modes
	net: stmmac: dwmac-sunxi: Allow all RGMII modes
	net: usb: lan78xx: fix possible skb leak
	pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM
	USB: core: fix check for duplicate endpoints
	USB: serial: option: add Telit ME910G1 0x110a composition
	sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY
	tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK
	vxlan: fix tos value before xmit
	vlan: vlan_changelink() should propagate errors
	net: sch_prio: When ungrafting, replace with FIFO
	vlan: fix memory leak in vlan_dev_set_egress_priority
	Linux 4.14.164

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifbce6635b5a3df896c29e23dd15098e80ecddeba
2020-01-12 12:24:05 +01:00
Eric Sandeen
61855d6805 fs: avoid softlockups in s_inodes iterators
[ Upstream commit 04646aebd30b99f2cfa0182435a2ec252fcb16d0 ]

Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.

Previous efforts were made in 2 functions, see:

c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes

but there hasn't been an audit of all walkers, so do that now.  This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.

One loop remains: remove_dquot_ref(), because I'm not quite sure how
to deal with that one w/o taking the i_lock.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:11:59 +01:00
Jaegeuk Kim
2ea8815046 Merge remote-tracking branch 'origin/upstream-f2fs-stable-linux-4.14.y' into android-4.14
* origin/upstream-f2fs-stable-linux-4.14.y:
  f2fs: use EINVAL for superblock with invalid magic
  f2fs: fix to read source block before invalidating it
  f2fs: remove redundant check from f2fs_setflags_common()
  f2fs: use generic checking function for FS_IOC_FSSETXATTR
  f2fs: use generic checking and prep function for FS_IOC_SETFLAGS
  ubifs, fscrypt: cache decrypted symlink target in ->i_link
  vfs: use READ_ONCE() to access ->i_link
  fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory
  fscrypt: cache decrypted symlink target in ->i_link
  fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
  fscrypt: only set dentry_operations on ciphertext dentries
  fscrypt: fix race allowing rename() and link() of ciphertext dentries
  fscrypt: clean up and improve dentry revalidation
  fscrypt: use READ_ONCE() to access ->i_crypt_info
  fscrypt: remove WARN_ON_ONCE() when decryption fails
  fscrypt: drop inode argument from fscrypt_get_ctx()
  f2fs: improve print log in f2fs_sanity_check_ckpt()
  f2fs: avoid out-of-range memory access
  f2fs: fix to avoid long latency during umount
  f2fs: allow all the users to pin a file
  f2fs: support swap file w/ DIO
  f2fs: allocate blocks for pinned file
  f2fs: fix is_idle() check for discard type
  f2fs: add a rw_sem to cover quota flag changes
  f2fs: set SBI_NEED_FSCK for xattr corruption case
  f2fs: use generic EFSBADCRC/EFSCORRUPTED
  f2fs: Use DIV_ROUND_UP() instead of open-coding
  f2fs: print kernel message if filesystem is inconsistent
  f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
  f2fs: avoid get_valid_blocks() for cleanup
  f2fs: ioctl for removing a range from F2FS
  f2fs: only set project inherit bit for directory
  f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
  f2fs: Add option to limit required GC for checkpoint=disable
  f2fs: Fix accounting for unusable blocks
  f2fs: Fix root reserved on remount
  f2fs: Lower threshold for disable_cp_again
  f2fs: fix sparse warning
  f2fs: fix f2fs_show_options to show nodiscard mount option
  f2fs: add error prints for debugging mount failure
  f2fs: fix to do sanity check on segment bitmap of LFS curseg
  f2fs: add missing sysfs entries in documentation
  f2fs: fix to avoid deadloop if data_flush is on
  f2fs: always assume that the device is idle under gc_urgent
  f2fs: add bio cache for IPU
  f2fs: allow ssr block allocation during checkpoint=disable period
  f2fs: fix to check layout on last valid checkpoint park

Change-Id: I765f6ed215533097c63d1207a7d60ce7fc4a7269
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
2019-08-02 10:12:21 -07:00
Eric Biggers
0dca705f7c f2fs: use generic checking function for FS_IOC_FSSETXATTR
Make the f2fs implementation of FS_IOC_FSSETXATTR use the new VFS helper
function vfs_ioc_fssetxattr_check(), and remove the project quota check
since it's now done by the helper function.

This is based on a patch from Darrick Wong, but reworked to apply after
commit 360985573b55 ("f2fs: separate f2fs i_flags from fs_flags and ext4
i_flags").

Originally-from: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-07-30 13:45:46 -07:00
Eric Biggers
a506a578ca f2fs: use generic checking and prep function for FS_IOC_SETFLAGS
Make the f2fs implementation of FS_IOC_SETFLAGS use the new VFS helper
function vfs_ioc_setflags_prepare().

This is based on a patch from Darrick Wong, but reworked to apply after
commit 360985573b55 ("f2fs: separate f2fs i_flags from fs_flags and ext4
i_flags").

Originally-from: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-07-30 13:39:59 -07:00
Greg Kroah-Hartman
93c338c2e7 This is the 4.14.129 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0Nx7MACgkQONu9yGCS
 aT6rcxAA0os++XT8OzzsnJamD3O+oqDxCm3Pd3mNj8uck2XCHtzSEvq0z8a3/Xo2
 RXWBNDbYRHDGCNwK+2OxSbmyBO6ZqH7lp+KkipGTwWmc6spmm0gnDMSQ1312g6v5
 uVL8Tyirc9xEvkkHHNbxxuQ5e8RoxU0P+dBktDg5ipUCNLdO397/ldrg2cpxSsT4
 SeUL6kQmXyH9X/btZVq+xUdIqgn5JDuzrDbGSmFa7SV/D+IVIvxiR5tt7ZLuWVVl
 n9AqjTERAg0bLcOq7j5/eyephR0ooumd1J8z6PgMnmj66UTYtgkdDBVBYEFRUlEB
 9tmmX5KWlpdbVfhvTRrbKo0kfRrVCy1h3CDf0hiJpYJesf6f+CBOorUORdvXKHEp
 rQ2o6nshVWGsGv0fD3j4FzURZxbWFDOvveGApRj2p5626gLnRwz9kBvsy3FK/Gb9
 d9b2fZRDf1Iz6QYKTybexhPfDxA2Gy3MvZ1Yj7EXIrf4rUrRSf+WSjSE69g8i9Q0
 /1lWVQ9aW1UQ9Ya/r6xS5q2VNzWUDCpYDRDqyiiND0E1MrgsvL56YsmWbzQOI/hV
 tm7j5NEqCPHVY0UQqjkAUspQbkLKbkVoj4TiQKgppkp1a2uZjo48AGbIqq0tJ/h8
 aHkb/PgRyxLpD/+NFhLbwJGN7kATuMlegduhwJkxfv0kCmyEwrw=
 =flvv
 -----END PGP SIGNATURE-----

Merge 4.14.129 into android-4.14

Changes in 4.14.129
	perf machine: Guard against NULL in machine__exit()
	ax25: fix inconsistent lock state in ax25_destroy_timer
	be2net: Fix number of Rx queues used for flow hashing
	ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
	lapb: fixed leak of control-blocks.
	neigh: fix use-after-free read in pneigh_get_next
	net: openvswitch: do not free vport if register_netdevice() is failed.
	sctp: Free cookie before we memdup a new one
	sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
	Staging: vc04_services: Fix a couple error codes
	perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
	netfilter: nf_queue: fix reinject verdict handling
	ipvs: Fix use-after-free in ip_vs_in
	selftests: netfilter: missing error check when setting up veth interface
	clk: ti: clkctrl: Fix clkdm_clk handling
	powerpc/powernv: Return for invalid IMC domain
	mISDN: make sure device name is NUL terminated
	x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
	perf/ring_buffer: Fix exposing a temporarily decreased data_head
	perf/ring_buffer: Add ordering to rb->nest increment
	perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data
	gpio: fix gpio-adp5588 build errors
	net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE()
	net: aquantia: fix LRO with FCS error
	i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
	ALSA: hda - Force polling mode on CNL for fixing codec communication
	configfs: Fix use-after-free when accessing sd->s_dentry
	perf data: Fix 'strncat may truncate' build failure with recent gcc
	perf record: Fix s390 missing module symbol and warning for non-root users
	ia64: fix build errors by exporting paddr_to_nid()
	KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
	KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
	net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
	net: phy: dp83867: Set up RGMII TX delay
	scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
	scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask
	scsi: scsi_dh_alua: Fix possible null-ptr-deref
	scsi: libsas: delete sas port if expander discover failed
	mlxsw: spectrum: Prevent force of 56G
	HID: wacom: Don't set tool type until we're in range
	HID: wacom: Don't report anything prior to the tool entering range
	HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact
	coredump: fix race condition between collapse_huge_page() and core dumping
	infiniband: fix race condition between infiniband mlx4, mlx5 driver and core dumping
	Abort file_remove_privs() for non-reg. files
	Linux 4.14.129

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-22 08:51:41 +02:00
Alexander Lochmann
2c54624255 Abort file_remove_privs() for non-reg. files
commit f69e749a49353d96af1a293f56b5b56de59c668a upstream.

file_remove_privs() might be called for non-regular files, e.g.
blkdev inode. There is no reason to do its job on things
like blkdev inodes, pipes, or cdevs. Hence, abort if
file does not refer to a regular inode.

AV: more to the point, for devices there might be any number of
inodes refering to given device.  Which one to strip the permissions
from, even if that made any sense in the first place?  All of them
will be observed with contents modified, after all.

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Alexander Lochmann <alexander.lochmann@tu-dortmund.de>
Signed-off-by: Horst Schirmeier <horst.schirmeier@tu-dortmund.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-22 08:16:19 +02:00
Greg Kroah-Hartman
818299f6bd This is the 4.14.56 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltNuVYACgkQONu9yGCS
 aT7kTA/+MRHC5oFvdnhSsF6jAHsY9rgJNQXPtZCFhZnHhhYHtubQ2OJOmSZ7IfM0
 9yhz/7vijC9+tLufXQxQnu2UUL3ojNu1+l+q9s0U1GUzNiONlJ9q/CyB4xjXFRCS
 1RdiDZaQbIqUCYs38UCTsEJF65uKjzQ6dpF21XdIXp5FPxgiZawo4HpjQRJswbAl
 Du97ybMEPN3XnAn207GjZwy58ubRLF5HDG1sqNGfjVWJ7oMTi+QJOCvY3PJtU3j2
 unS0qjxLU432rOyDfaJK7Yj9s61zu0PurbJrHo+dw3O3hd/Og7soqoqohUEjZWXd
 z7jjrntXZOZ/0st2yHmygfAPUJm/8jsh7Pd39Jgyfeu/3Clo51gO494rwATQsyE5
 mwIdllyzyMNBEJI2F2fxE60WlFsbTjeBOX3BaOwnF8pGRJWsCAfbFknRbuKh1fO5
 czFbUSOi00POw4WHT1rxV9u0yDBXmP47fy9zHquOim+PfK8pFvWuf6GSFjvqRTv8
 20w1w7eixMi09ZXOkgTJ3S00MKHSpxoaenI3n2NcEVVRgDEVfh3C/zelvvfCDMHD
 i36DN39Sj41PNA/R4n0TIA4W+ab9qBVzQl16yaj9JURR2rA92GyMVC1+Xjqo1Py3
 GRFOf2Gprlm0/vfkiRsMu9coAJuKV6+8fHXQU4mzHulKUaDWuJ0=
 =/wBU
 -----END PGP SIGNATURE-----

Merge 4.14.56 into android-4.14

Changes in 4.14.56
	media: rc: mce_kbd decoder: fix stuck keys
	ASoC: mediatek: preallocate pages use platform device
	MIPS: Call dump_stack() from show_regs()
	MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
	MIPS: Fix ioremap() RAM check
	mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
	mmc: dw_mmc: fix card threshold control configuration
	ibmasm: don't write out of bounds in read handler
	staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
	staging: r8822be: Fix RTL8822be can't find any wireless AP
	ata: Fix ZBC_OUT command block check
	ata: Fix ZBC_OUT all bit handling
	vmw_balloon: fix inflation with batching
	ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
	USB: serial: ch341: fix type promotion bug in ch341_control_in()
	USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
	USB: serial: keyspan_pda: fix modem-status error handling
	USB: yurex: fix out-of-bounds uaccess in read handler
	USB: serial: mos7840: fix status-register error handling
	usb: quirks: add delay quirks for Corsair Strafe
	xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
	devpts: hoist out check for DEVPTS_SUPER_MAGIC
	devpts: resolve devpts bind-mounts
	Fix up non-directory creation in SGID directories
	genirq/affinity: assign vectors to all possible CPUs
	scsi: megaraid_sas: use adapter_type for all gen controllers
	scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type
	scsi: megaraid_sas: replace is_ventura with adapter_type checks
	scsi: megaraid_sas: Create separate functions to allocate ctrl memory
	scsi: megaraid_sas: fix selection of reply queue
	ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
	ALSA: hda - Handle pm failure during hotplug
	mm: do not drop unused pages when userfaultd is running
	fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
	fs, elf: make sure to page align bss in load_elf_library
	mm: do not bug_on on incorrect length in __mm_populate()
	tracing: Reorder display of TGID to be after PID
	kbuild: delete INSTALL_FW_PATH from kbuild documentation
	arm64: neon: Fix function may_use_simd() return error status
	tools build: fix # escaping in .cmd files for future Make
	IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
	i2c: tegra: Fix NACK error handling
	iw_cxgb4: correctly enforce the max reg_mr depth
	xen: setup pv irq ops vector earlier
	nvme-pci: Remap CMB SQ entries on every controller reset
	crypto: x86/salsa20 - remove x86 salsa20 implementations
	uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
	netfilter: nf_queue: augment nfqa_cfg_policy
	netfilter: x_tables: initialise match/target check parameter struct
	loop: add recursion validation to LOOP_CHANGE_FD
	PM / hibernate: Fix oops at snapshot_write()
	RDMA/ucm: Mark UCM interface as BROKEN
	loop: remember whether sysfs_create_group() was done
	f2fs: give message and set need_fsck given broken node id
	Linux 4.14.56

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-17 12:29:15 +02:00
Linus Torvalds
298243a5fb Fix up non-directory creation in SGID directories
commit 0fa3ecd87848c9c93c2c828ef4c3a8ca36ce46c7 upstream.

sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid.  This is historically used for
group-shared directories.

But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).

Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17 11:39:27 +02:00
Greg Kroah-Hartman
a6d6913801 This is the 4.14.54 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltCEg0ACgkQONu9yGCS
 aT6p4hAAnnf0LGGJAg4dtOU/xwaTosd/gtqSi7qsy6h7SzK/GC+2zeQ2iqINs4Gy
 EPJRRV7CgbhiRUzLl8hfn0jMHOZd4v8BeVxngLsbkFyHPHfph5e99b70i6SLx0BV
 3Evo3KjLvnijWIai2JduaN3F92iwRanLUUoqYKIBs3a5vRrRE9tTpNdz7j7273sq
 QvjWoE1d1oytRQZ4I493QaDhHuWi/dFGjHrHMczs1G5uRB3klMFV/MueQmsXLs6V
 pi35VX6UjpGY0y6ZZwjExCZZPFdkk9DV9qkCC3CAYPEemymJZSHPtnLIUG345Nso
 B0nExOFKSIa4RA1USzg/0OMPI3tpdP5AknSzpRXrNp10SGiXHQqev7chAx9YkBiI
 f5ZWE9DmT5bTY8tnx+SLwpvObXXwKkqjaRT7BkhmYmgx8gLRxO766uzKX3ucjV2a
 8YPuFcrx61T5zTjHlKEc3p4HkVJIEigF2EOrnRj0z80RAgsTyGS6ZF6T1cXSPJ9h
 ZAX0m76bX2lhRo5RHOteYttpZQoHb26E2+I16tc6wX7ueeaOjQ1gzdsUVZxMnGhA
 +ewyAP7GrJ+tVoe26g0Jmf5k4r3wtNnfSKnm9Tykwaps2LhtP4/LxmCTzVXafSlb
 8HoUB42QclzgwoKRTyMVozTZleyeu7jAk+4Q/AVc1GkGF4AeWYA=
 =c3A6
 -----END PGP SIGNATURE-----

Merge 4.14.54 into android-4.14

Changes in 4.14.54
	usb: cdc_acm: Add quirk for Uniden UBC125 scanner
	USB: serial: cp210x: add CESINEL device ids
	USB: serial: cp210x: add Silicon Labs IDs for Windows Update
	usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hub
	acpi: Add helper for deactivating memory region
	usb: typec: ucsi: acpi: Workaround for cache mode issue
	usb: typec: ucsi: Fix for incorrect status data issue
	xhci: Fix kernel oops in trace_xhci_free_virt_device
	n_tty: Fix stall at n_tty_receive_char_special().
	n_tty: Access echo_* variables carefully.
	staging: android: ion: Return an ERR_PTR in ion_map_kernel
	serial: 8250_pci: Remove stalled entries in blacklist
	serdev: fix memleak on module unload
	vt: prevent leaking uninitialized data to userspace via /dev/vcs*
	drm/amdgpu: Add APU support in vi_set_uvd_clocks
	drm/amdgpu: Add APU support in vi_set_vce_clocks
	drm/amdgpu: fix the missed vcn fw version report
	drm/qxl: Call qxl_bo_unref outside atomic context
	drm/atmel-hlcdc: check stride values in the first plane
	drm/amdgpu: Use kvmalloc_array for allocating VRAM manager nodes array
	drm/amdgpu: Refactor amdgpu_vram_mgr_bo_invisible_size helper
	drm/i915: Enable provoking vertex fix on Gen9 systems.
	netfilter: nf_tables: nft_compat: fix refcount leak on xt module
	netfilter: nft_compat: prepare for indirect info storage
	netfilter: nft_compat: fix handling of large matchinfo size
	netfilter: nf_tables: don't assume chain stats are set when jumplabel is set
	netfilter: nf_tables: bogus EBUSY in chain deletions
	netfilter: nft_meta: fix wrong value dereference in nft_meta_set_eval
	netfilter: nf_tables: disable preemption in nft_update_chain_stats()
	netfilter: nf_tables: increase nft_counters_enabled in nft_chain_stats_replace()
	netfilter: nf_tables: fix memory leak on error exit return
	netfilter: nf_tables: add missing netlink attrs to policies
	netfilter: nf_tables: fix NULL-ptr in nf_tables_dump_obj()
	md: always hold reconfig_mutex when calling mddev_suspend()
	md: don't call bitmap_create() while array is quiesced.
	md: move suspend_hi/lo handling into core md code
	md: use mddev_suspend/resume instead of ->quiesce()
	md: allow metadata update while suspending.
	md: remove special meaning of ->quiesce(.., 2)
	netfilter: don't set F_IFACE on ipv6 fib lookups
	netfilter: ip6t_rpfilter: provide input interface for route lookup
	netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain()
	ARM: dts: imx6q: Use correct SDMA script for SPI5 core
	mtd: rawnand: fix return value check for bad block status
	xfrm6: avoid potential infinite loop in _decode_session6()
	afs: Fix directory permissions check
	netfilter: ebtables: handle string from userspace with care
	s390/dasd: use blk_mq_rq_from_pdu for per request data
	netfilter: nft_limit: fix packet ratelimiting
	ipvs: fix buffer overflow with sync daemon and service
	iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs
	atm: zatm: fix memcmp casting
	net: qmi_wwan: Add Netgear Aircard 779S
	perf test: "Session topology" dumps core on s390
	perf bpf: Fix NULL return handling in bpf__prepare_load()
	fs: clear writeback errors in inode_init_always
	sched/core: Fix rules for running on online && !active CPUs
	sched/core: Require cpu_active() in select_task_rq(), for user tasks
	platform/x86: asus-wmi: Fix NULL pointer dereference
	net/sonic: Use dma_mapping_error()
	net: dsa: b53: Add BCM5389 support
	Linux 4.14.54

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-08 16:14:26 +02:00
Darrick J. Wong
93b84462ea fs: clear writeback errors in inode_init_always
[ Upstream commit 829bc787c1a0403e4d886296dd4d90c5f9c1744a ]

In inode_init_always(), we clear the inode mapping flags, which clears
any retained error (AS_EIO, AS_ENOSPC) bits.  Unfortunately, we do not
also clear wb_err, which means that old mapping errors can leak through
to new inodes.

This is crucial for the XFS inode allocation path because we recycle old
in-core inodes and we do not want error state from an old file to leak
into the new file.  This bug was discovered by running generic/036 and
generic/047 in a loop and noticing that the EIOs generated by the
collision of direct and buffered writes in generic/036 would survive the
remount between 036 and 047, and get reported to the fsyncs (on
different files!) in generic/047.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-08 15:30:53 +02:00
Daniel Rosenberg
06b4c45094 ANDROID: vfs: Add setattr2 for filesystems with per mount permissions
This allows filesystems to use their mount private data to
influence the permssions they use in setattr2. It has
been separated into a new call to avoid disrupting current
setattr users.

Change-Id: I19959038309284448f1b7f232d579674ef546385
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2018-01-29 19:39:58 -08:00
Jaegeuk Kim
9d468a2b52 Revert "locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()"
Change fscrypt only.

This reverts commit 4bb665c7e388a1ea9770d8b40fc7f48a6e9c4438.
2018-01-09 17:01:36 -08:00
Mark Rutland
4bb665c7e3 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-04 18:14:32 -08:00
Linus Torvalds
c353f88f3d Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
 "This fixes d_ino correctness in readdir, which brings overlayfs on par
  with normal filesystems regarding inode number semantics, as long as
  all layers are on the same filesystem.

  There are also some bug fixes, one in particular (random ioctl's
  shouldn't be able to modify lower layers) that touches some vfs code,
  but of course no-op for non-overlay fs"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix false positive ESTALE on lookup
  ovl: don't allow writing ioctl on lower layer
  ovl: fix relatime for directories
  vfs: add flags to d_real()
  ovl: cleanup d_real for negative
  ovl: constant d_ino for non-merge dirs
  ovl: constant d_ino across copy up
  ovl: fix readdir error value
  ovl: check snprintf return
2017-09-13 09:11:44 -07:00
Davidlohr Bueso
f808c13fd3 lib/interval_tree: fast overlap detection
Allow interval trees to quickly check for overlaps to avoid unnecesary
tree lookups in interval_tree_iter_first().

As of this patch, all interval tree flavors will require using a
'rb_root_cached' such that we can have the leftmost node easily
available.  While most users will make use of this feature, those with
special functions (in addition to the generic insert, delete, search
calls) will avoid using the cached option as they can do funky things
with insertions -- for example, vma_interval_tree_insert_after().

[jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()]
  Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:49 -07:00
Miklos Szeredi
cd91304e71 ovl: fix relatime for directories
Need to treat non-regular overlayfs files the same as regular files when
checking for an atime update.

Add a d_real() flag to make it return the upper dentry for all file types.

Reported-by: "zhangyi (F)" <yi.zhang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-05 12:53:11 +02:00
Darrick J. Wong
799ea9e9c5 xfs: evict all inodes involved with log redo item
When we introduced the bmap redo log items, we set MS_ACTIVE on the
mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
from being truncated prematurely during log recovery.  This also had the
effect of putting linked inodes on the lru instead of evicting them.

Unfortunately, we neglected to find all those unreferenced lru inodes
and evict them after finishing log recovery, which means that we leak
them if anything goes wrong in the rest of xfs_mountfs, because the lru
is only cleaned out on unmount.

Therefore, evict unreferenced inodes in the lru list immediately
after clearing MS_ACTIVE.

Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: viro@ZenIV.linux.org.uk
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-09-01 10:55:30 -07:00
Linus Torvalds
b8d4c1f9f4 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc filesystem updates from Al Viro:
 "Assorted normal VFS / filesystems stuff..."

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  dentry name snapshots
  Make statfs properly return read-only state after emergency remount
  fs/dcache: init in_lookup_hashtable
  minix: Deinline get_block, save 2691 bytes
  fs: Reorder inode_owner_or_capable() to avoid needless
  fs: warn in case userspace lied about modprobe return
2017-07-08 10:50:54 -07:00
Pavel Tatashin
3d375d7859 mm: update callers to use HASH_ZERO flag
Update dcache, inode, pid, mountpoint, and mount hash tables to use
HASH_ZERO, and remove initialization after allocations.  In case of
places where HASH_EARLY was used such as in __pv_init_lock_hash the
zeroed hash table was already assumed, because memblock zeroes the
memory.

CPU: SPARC M6, Memory: 7T
Before fix:
  Dentry cache hash table entries: 1073741824
  Inode-cache hash table entries: 536870912
  Mount-cache hash table entries: 16777216
  Mountpoint-cache hash table entries: 16777216
  ftrace: allocating 20414 entries in 40 pages
  Total time: 11.798s

After fix:
  Dentry cache hash table entries: 1073741824
  Inode-cache hash table entries: 536870912
  Mount-cache hash table entries: 16777216
  Mountpoint-cache hash table entries: 16777216
  ftrace: allocating 20414 entries in 40 pages
  Total time: 3.198s

CPU: Intel Xeon E5-2630, Memory: 2.2T:
Before fix:
  Dentry cache hash table entries: 536870912
  Inode-cache hash table entries: 268435456
  Mount-cache hash table entries: 8388608
  Mountpoint-cache hash table entries: 8388608
  CPU: Physical Processor ID: 0
  Total time: 3.245s

After fix:
  Dentry cache hash table entries: 536870912
  Inode-cache hash table entries: 268435456
  Mount-cache hash table entries: 8388608
  Mountpoint-cache hash table entries: 8388608
  CPU: Physical Processor ID: 0
  Total time: 3.244s

Link: http://lkml.kernel.org/r/1488432825-92126-4-git-send-email-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Cc: David Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:33 -07:00
Linus Torvalds
9bd42183b9 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Add the SYSTEM_SCHEDULING bootup state to move various scheduler
     debug checks earlier into the bootup. This turns silent and
     sporadically deadly bugs into nice, deterministic splats. Fix some
     of the splats that triggered. (Thomas Gleixner)

   - A round of restructuring and refactoring of the load-balancing and
     topology code (Peter Zijlstra)

   - Another round of consolidating ~20 of incremental scheduler code
     history: this time in terms of wait-queue nomenclature. (I didn't
     get much feedback on these renaming patches, and we can still
     easily change any names I might have misplaced, so if anyone hates
     a new name, please holler and I'll fix it.) (Ingo Molnar)

   - sched/numa improvements, fixes and updates (Rik van Riel)

   - Another round of x86/tsc scheduler clock code improvements, in hope
     of making it more robust (Peter Zijlstra)

   - Improve NOHZ behavior (Frederic Weisbecker)

   - Deadline scheduler improvements and fixes (Luca Abeni, Daniel
     Bristot de Oliveira)

   - Simplify and optimize the topology setup code (Lauro Ramos
     Venancio)

   - Debloat and decouple scheduler code some more (Nicolas Pitre)

   - Simplify code by making better use of llist primitives (Byungchul
     Park)

   - ... plus other fixes and improvements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
  sched/cputime: Refactor the cputime_adjust() code
  sched/debug: Expose the number of RT/DL tasks that can migrate
  sched/numa: Hide numa_wake_affine() from UP build
  sched/fair: Remove effective_load()
  sched/numa: Implement NUMA node level wake_affine()
  sched/fair: Simplify wake_affine() for the single socket case
  sched/numa: Override part of migrate_degrades_locality() when idle balancing
  sched/rt: Move RT related code from sched/core.c to sched/rt.c
  sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
  sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
  sched/fair: Spare idle load balancing on nohz_full CPUs
  nohz: Move idle balancer registration to the idle path
  sched/loadavg: Generalize "_idle" naming to "_nohz"
  sched/core: Drop the unused try_get_task_struct() helper function
  sched/fair: WARN() and refuse to set buddy when !se->on_rq
  sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well
  sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
  sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c
  sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h>
  sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>
  ...
2017-07-03 13:08:04 -07:00
Kees Cook
cc658db47d fs: Reorder inode_owner_or_capable() to avoid needless
Checking for capabilities should be the last operation when performing
access control tests so that PF_SUPERPRIV is set only when it was required
for success (implying that the capability was needed for the operation).

Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 20:08:32 -04:00
Jens Axboe
c75b1d9421 fs: add fcntl() interface for setting/getting write life time hints
Define a set of write life time hints:

RWH_WRITE_LIFE_NOT_SET	No hint information set
RWH_WRITE_LIFE_NONE	No hints about write life time
RWH_WRITE_LIFE_SHORT	Data written has a short life time
RWH_WRITE_LIFE_MEDIUM	Data written has a medium life time
RWH_WRITE_LIFE_LONG	Data written has a long life time
RWH_WRITE_LIFE_EXTREME	Data written has an extremely long life time

The intent is for these values to be relative to each other, no
absolute meaning should be attached to these flag names.

Add an fcntl interface for querying these flags, and also for
setting them as well:

F_GET_RW_HINT		Returns the read/write hint set on the
			underlying inode.

F_SET_RW_HINT		Set one of the above write hints on the
			underlying inode.

F_GET_FILE_RW_HINT	Returns the read/write hint set on the
			file descriptor.

F_SET_FILE_RW_HINT	Set one of the above write hints on the
			file descriptor.

The user passes in a 64-bit pointer to get/set these values, and
the interface returns 0/-1 on success/error.

Sample program testing/implementing basic setting/getting of write
hints is below.

Add support for storing the write life time hint in the inode flags
and in struct file as well, and pass them to the kiocb flags. If
both a file and its corresponding inode has a write hint, then we
use the one in the file, if available. The file hint can be used
for sync/direct IO, for buffered writeback only the inode hint
is available.

This is in preparation for utilizing these hints in the block layer,
to guide on-media data placement.

/*
 * writehint.c: get or set an inode write hint
 */
 #include <stdio.h>
 #include <fcntl.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <stdbool.h>
 #include <inttypes.h>

 #ifndef F_GET_RW_HINT
 #define F_LINUX_SPECIFIC_BASE	1024
 #define F_GET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 11)
 #define F_SET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 12)
 #endif

static char *str[] = { "RWF_WRITE_LIFE_NOT_SET", "RWH_WRITE_LIFE_NONE",
			"RWH_WRITE_LIFE_SHORT", "RWH_WRITE_LIFE_MEDIUM",
			"RWH_WRITE_LIFE_LONG", "RWH_WRITE_LIFE_EXTREME" };

int main(int argc, char *argv[])
{
	uint64_t hint;
	int fd, ret;

	if (argc < 2) {
		fprintf(stderr, "%s: file <hint>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 2;
	}

	if (argc > 2) {
		hint = atoi(argv[2]);
		ret = fcntl(fd, F_SET_RW_HINT, &hint);
		if (ret < 0) {
			perror("fcntl: F_SET_RW_HINT");
			return 4;
		}
	}

	ret = fcntl(fd, F_GET_RW_HINT, &hint);
	if (ret < 0) {
		perror("fcntl: F_GET_RW_HINT");
		return 3;
	}

	printf("%s: hint %s\n", argv[1], str[hint]);
	close(fd);
	return 0;
}

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-27 12:05:22 -06:00
Ingo Molnar
2141713616 sched/wait: Standardize 'struct wait_bit_queue' wait-queue entry field name
Rename 'struct wait_bit_queue::wait' to ::wq_entry, to more clearly
name it as a wait-queue entry.

Propagate it to a couple of usage sites where the wait-bit-queue internals
are exposed.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:18:28 +02:00
Linus Torvalds
11fbf53d66 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted bits and pieces from various people. No common topic in this
  pile, sorry"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/affs: add rename exchange
  fs/affs: add rename2 to prepare multiple methods
  Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
  fs: don't set *REFERENCED on single use objects
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  remove pointless extern of atime_need_update_rcu()
  fs: completely ignore unknown open flags
  fs: add a VALID_OPEN_FLAGS
  fs: remove _submit_bh()
  fs: constify tree_descr arrays passed to simple_fill_super()
  fs: drop duplicate header percpu-rwsem.h
  fs/affs: bugfix: Write files greater than page size on OFS
  fs/affs: bugfix: enable writes on OFS disks
  fs/affs: remove node generation check
  fs/affs: import amigaffs.h
  fs/affs: bugfix: make symbolic links work again
2017-05-09 09:12:53 -07:00
Masahiro Yamada
6e7c2b4dd3 scripts/spelling.txt: add "intialise(d)" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  intialisation||initialisation
  intialised||initialised
  intialise||initialise

This commit does not intend to change the British spelling itself.

Link: http://lkml.kernel.org/r/1481573103-11329-18-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:13 -07:00
Josef Bacik
563f40019d fs: don't set *REFERENCED on single use objects
By default we set DCACHE_REFERENCED and I_REFERENCED on any dentry or
inode we create.  This is problematic as this means that it takes two
trips through the LRU for any of these objects to be reclaimed,
regardless of their actual lifetime.  With enough pressure from these
caches we can easily evict our working set from page cache with single
use objects.  So instead only set *REFERENCED if we've already been
added to the LRU list.  This means that we've been touched since the
first time we were accessed, and so more likely to need to hang out in
cache.

To illustrate this issue I wrote the following scripts

https://github.com/josefbacik/debug-scripts/tree/master/cache-pressure

on my test box.  It is a single socket 4 core CPU with 16gib of RAM and
I tested on an Intel 2tib NVME drive.  The cache-pressure.sh script
creates a new file system and creates 2 6.5gib files in order to take up
13gib of the 16gib of ram with pagecache.  Then it runs a test program
that reads these 2 files in a loop, and keeps track of how often it has
to read bytes for each loop.  On an ideal system with no pressure we
should have to read 0 bytes indefinitely.  The second thing this script
does is start a fs_mark job that creates a ton of 0 length files,
putting pressure on the system with slab only allocations.  On exit the
script prints out how many bytes were read by the read-file program.
The results are as follows

Without patch:
/mnt/btrfs-test/reads/file1: total read during loops 27262988288
/mnt/btrfs-test/reads/file2: total read during loops 27262976000

With patch:
/mnt/btrfs-test/reads/file2: total read during loops 18640457728
/mnt/btrfs-test/reads/file1: total read during loops 9565376512

This patch results in a 50% reduction of the amount of pages evicted
from our working set.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-03 11:47:05 -04:00
Jan Kara
08991e83b7 fsnotify: Free fsnotify_mark_connector when there is no mark attached
Currently we free fsnotify_mark_connector structure only when inode /
vfsmount is getting freed. This can however impose noticeable memory
overhead when marks get attached to inodes only temporarily. So free the
connector structure once the last mark is detached from the object.
Since notification infrastructure can be working with the connector
under the protection of fsnotify_mark_srcu, we have to be careful and
free the fsnotify_mark_connector only after SRCU period passes.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-10 17:37:35 +02:00
Jan Kara
9dd813c15b fsnotify: Move mark list head from object into dedicated structure
Currently notification marks are attached to object (inode or vfsmnt) by
a hlist_head in the object. The list is also protected by a spinlock in
the object. So while there is any mark attached to the list of marks,
the object must be pinned in memory (and thus e.g. last iput() deleting
inode cannot happen). Also for list iteration in fsnotify() to work, we
must hold fsnotify_mark_srcu lock so that mark itself and
mark->obj_list.next cannot get freed. Thus we are required to wait for
response to fanotify events from userspace process with
fsnotify_mark_srcu lock held. That causes issues when userspace process
is buggy and does not reply to some event - basically the whole
notification subsystem gets eventually stuck.

So to be able to drop fsnotify_mark_srcu lock while waiting for
response, we have to pin the mark in memory and make sure it stays in
the object list (as removing the mark waiting for response could lead to
lost notification events for groups later in the list). However we don't
want inode reclaim to block on such mark as that would lead to system
just locking up elsewhere.

This commit is the first in the series that paves way towards solving
these conflicting lifetime needs. Instead of anchoring the list of marks
directly in the object, we anchor it in a dedicated structure
(fsnotify_mark_connector) and just point to that structure from the
object. The following commits will also add spinlock protecting the list
and object pointer to the structure.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-10 17:37:34 +02:00
Linus Torvalds
101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Linus Torvalds
97d2116708 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "xattr stuff from Andreas

  This completes the switch to xattr_handler ->get()/->set() from
  ->getxattr/->setxattr/->removexattr"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Remove {get,set,remove}xattr inode operations
  xattr: Stop calling {get,set,remove}xattr inode operations
  vfs: Check for the IOP_XATTR flag in listxattr
  xattr: Add __vfs_{get,set,remove}xattr helpers
  libfs: Use IOP_XATTR flag for empty directory handling
  vfs: Use IOP_XATTR flag for bad-inode handling
  vfs: Add IOP_XATTR inode operations flag
  vfs: Move xattr_resolve_name to the front of fs/xattr.c
  ecryptfs: Switch to generic xattr handlers
  sockfs: Get rid of getxattr iop
  sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
  kernfs: Switch to generic xattr handlers
  hfs: Switch to generic xattr handlers
  jffs2: Remove jffs2_{get,set,remove}xattr macros
  xattr: Remove unnecessary NULL attribute name check
2016-10-10 17:11:50 -07:00
Al Viro
f334bcd94b Merge remote-tracking branch 'ovl/misc' into work.misc 2016-10-08 11:00:01 -04:00
Al Viro
33e09f0ee7 Merge branch 'work.iget' into work.misc 2016-10-08 10:44:37 -04:00
Andreas Gruenbacher
d0a5b995a3 vfs: Add IOP_XATTR inode operations flag
The IOP_XATTR inode operations flag in inode->i_opflags indicates that
the inode has xattr support.  The flag is automatically set by
new_inode() on filesystems with xattr support (where sb->s_xattr is
defined), and cleared otherwise.  Filesystems can explicitly clear it
for inodes that should not have xattr support.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07 20:10:42 -04:00
Deepa Dinamani
c2050a454c fs: Replace current_fs_time() with current_time()
current_fs_time() uses struct super_block* as an argument.
As per Linus's suggestion, this is changed to take struct
inode* as a parameter instead. This is because the function
is primarily meant for vfs inode timestamps.
Also the function was renamed as per Arnd's suggestion.

Change all calls to current_fs_time() to use the new
current_time() function instead. current_fs_time() will be
deleted.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:06:22 -04:00
Deepa Dinamani
3cd886666f vfs: Add current_time() api
current_fs_time() is used for inode timestamps.

Change the signature of the function to take inode pointer
instead of superblock as per Linus's suggestion.

Also, move the api under vfs as per the discussion on the
thread: https://lkml.org/lkml/2016/6/9/36 . As per Arnd's
suggestion on the thread, changing the function name.

current_fs_time() will be deleted after all the references
to it are replaced by current_time().

There was a bug reported by kbuild test bot with the change
as some of the calls to current_time() were made before the
super_block was initialized. Catch these accidental assignments
as timespec_trunc() does for wrong granularities. This allows
for the function to work right even in these circumstances.
But, adds a warning to make the user aware of the bug.

A coccinelle script was used to identify all the current
.alloc_inode super_block callbacks that updated inode timestamps.
proc filesystem was the only one that was modifying inode times
as part of this callback. The series includes a patch to fix that.

Note that timespec_trunc() will also be moved to fs/inode.c
in a separate patch when this will need to be revamped for
bounds checking purposes.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:06:20 -04:00
Miklos Szeredi
598e3c8f72 vfs: update ovl inode before relatime check
On overlayfs relatime_need_update() needs inode times to be correct on
overlay inode.  But i_mtime and i_ctime are updated by filesystem code on
underlying inode only, so they will be out-of-date on the overlay inode.

This patch copies the times from the underlying inode if needed.  This
can't be done if called from RCU lookup (link following) but link m/ctime
are not updated by fs, so this is all right.

This patch doesn't change functionality for anything but overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-16 12:44:20 +02:00
Linus Torvalds
fe64f3283f Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  In the "trivial API change" department - ->d_compare() losing 'parent'
  argument"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cachefiles: Fix race between inactivating and culling a cache object
  9p: use clone_fid()
  9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
  vfs: make dentry_needs_remove_privs() internal
  vfs: remove file_needs_remove_privs()
  vfs: fix deadlock in file_remove_privs() on overlayfs
  get rid of 'parent' argument of ->d_compare()
  cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
  affs ->d_compare(): don't bother with ->d_inode
  fold _d_rehash() and __d_rehash() together
  fold dentry_rcuwalk_invalidate() into its only remaining caller
2016-08-07 10:01:14 -04:00
Al Viro
8ecfb75216 Merge branch 'for-viro' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into for-linus 2016-08-03 13:31:51 -04:00
Miklos Szeredi
f0fce87c36 vfs: make dentry_needs_remove_privs() internal
Only used by the vfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-08-03 13:57:57 +02:00
Miklos Szeredi
c1892c3776 vfs: fix deadlock in file_remove_privs() on overlayfs
file_remove_privs() is called with inode lock on file_inode(), which
proceeds to calling notify_change() on file->f_path.dentry.  Which triggers
the WARN_ON_ONCE(!inode_is_locked(inode)) in addition to deadlocking later
when ovl_setattr tries to lock the underlying inode again.

Fix this mess by not mixing the layers, but doing everything on underlying
dentry/inode.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 07a2daab49c5 ("ovl: Copy up underlying inode's ->i_mode to overlay inode")
Cc: <stable@vger.kernel.org>
2016-08-03 13:57:56 +02:00
Vladimir Davydov
05eb6e7263 radix-tree: account nodes to memcg only if explicitly requested
Radix trees may be used not only for storing page cache pages, so
unconditionally accounting radix tree nodes to the current memory cgroup
is bad: if a radix tree node is used for storing data shared among
different cgroups we risk pinning dead memory cgroups forever.

So let's only account radix tree nodes if it was explicitly requested by
passing __GFP_ACCOUNT to INIT_RADIX_TREE.  Currently, we only want to
account page cache entries, so mark mapping->page_tree so.

Fixes: 58e698af4c63 ("radix-tree: account radix_tree_node to memory cgroup")
Link: http://lkml.kernel.org/r/1470057188-7864-1-git-send-email-vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>	[4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Linus Torvalds
a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Dave Chinner
6c60d2b574 fs/fs-writeback.c: add a new writeback list for sync
wait_sb_inodes() currently does a walk of all inodes in the filesystem
to find dirty one to wait on during sync.  This is highly inefficient
and wastes a lot of CPU when there are lots of clean cached inodes that
we don't need to wait on.

To avoid this "all inode" walk, we need to track inodes that are
currently under writeback that we need to wait for.  We do this by
adding inodes to a writeback list on the sb when the mapping is first
tagged as having pages under writeback.  wait_sb_inodes() can then walk
this list of "inodes under IO" and wait specifically just for the inodes
that the current sync(2) needs to wait for.

Define a couple helpers to add/remove an inode from the writeback list
and call them when the overall mapping is tagged for or cleared from
writeback.  Update wait_sb_inodes() to walk only the inodes under
writeback due to the sync.

With this change, filesystem sync times are significantly reduced for
fs' with largely populated inode caches and otherwise no other work to
do.  For example, on a 16xcpu 2GHz x86-64 server, 10TB XFS filesystem
with a ~10m entry inode cache, sync times are reduced from ~7.3s to less
than 0.1s when the filesystem is fully clean.

Link: http://lkml.kernel.org/r/1466594593-6757-2-git-send-email-bfoster@redhat.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Holger Hoffstätte <holger.hoffstaette@applied-asynchrony.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00