msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-03-27 14:44:21 +08:00

Author	SHA1	Message	Date
Scott Feldman	8e05fd7166	fib: hook IPv4 fib for hardware offload Call into the switchdev driver any time an IPv4 fib entry is added/modified/deleted from the kernel's FIB. The switchdev driver may or may not install the route to the offload device. In the case where the driver tries to install the route and something goes wrong (device's routing table is full, etc), then all of the offloaded routes will be flushed from the device, route forwarding falls back to the kernel, and no more routes are offloading. We can refine this logic later. For now, use the simplist model of offloading routes up to the point of failure, and then on failure, undo everything and mark IPv4 offloading disabled. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	448b128a14	ipv4: add net bool fib_offload_disabled If something goes wrong with IPv4 FIB offload, mark entire net offload disabled. This is brute force policy to basically shut down IPv4 FIB offload permanently if there is a problem offloading any route to an external device. We can refine the policy in the future, to handle failures on a per-device or per-route basis, but for now, this policy is per-net. What we're trying to avoid is an inconsistent split between the kernel's FIB and the offload device's FIB. We don't want the device to fwd a pkt inconsitent with what the kernel would do. An example of a split is if device has 10.0.0.0/16 and kernel has 10.0.0.0/16 and 10.0.0.0/24, the device wouldn't see the longest prefix 10.0.0.0/24 and potentially forward pkts incorrectly. Limited capacity or limited capability are two ways a route may fail to install to the offload device. We'll not differentiate between failures at this time, and treat any failure as fatal and mark the net as fib_offload_disabled. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	b5d6fbdeed	switchdev: implement IPv4 fib ndo wrappers Flesh out ndo wrappers to call into device driver. To call into device driver, the wrapper must interate over route's nexthops to ensure all nexthop devs belong to the same switch device. Currently, there is no support for route's nexthops spanning offloaded and non-offloaded devices, or spanning ports of multiple offload devices. Since switch device ports may be stacked under virtual interfaces (bonds and/or bridges), and the route's nexthop may be on the virtual interface, the wrapper will traverse the nexthop dev down to the base dev. It's the base dev that's passed to the switchdev driver's ndo ops. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	104616e74e	switchdev: don't support custom ip rules, for now Keep switchdev FIB offload model simple for now and don't allow custom ip rules. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	5e8d90497d	switchdev: add IPv4 fib ndo ops wrappers Add IPv4 fib ndo wrapper funcs and stub them out for now. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	4586f1bb91	netdevice: add IPv4 fib add/del ops Add two new ndo ops for IPv4 fib offload support, add and del. Add uses modifiy semantics if fib entry already offloaded. Drivers implementing the new ndo ops will return err<0 if programming device fails, for example if device's tables are full. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	37ed949369	rtnetlink: add RTNH_F_EXTERNAL flag for fib offload Add new RTNH_F_EXTERNAL flag to mark fib entries offloaded externally, for example to a switchdev switch device. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:57 -05:00
Eric Dumazet	24d2e4a507	tg3: use napi_complete_done() Using napi_complete_done() instead of napi_complete() allows us to use /sys/class/net/ethX/gro_flush_timeout GRO layer can aggregate more packets if the flush is delayed a bit, without having to set too big coalescing parameters that impact latencies. Tested: lpx:~# echo 0 >/sys/class/net/eth1/gro_flush_timeout lpx:~# sar -n DEV 1 10 \| grep eth1 10:36:25 AM eth1 81290.00 40617.00 120479.67 2777.01 0.00 0.00 0.00 10:36:26 AM eth1 81283.00 40608.00 120481.81 2778.13 0.00 0.00 1.00 10:36:27 AM eth1 81304.00 40639.00 120518.42 2778.28 0.00 0.00 0.00 10:36:28 AM eth1 81255.00 40605.00 120437.34 2775.95 0.00 0.00 1.00 10:36:29 AM eth1 81306.00 40630.00 120521.44 2777.70 0.00 0.00 0.00 10:36:30 AM eth1 81286.00 40564.00 120480.20 2773.31 0.00 0.00 0.00 10:36:31 AM eth1 81256.00 40599.00 120438.81 2776.27 0.00 0.00 0.00 10:36:32 AM eth1 81287.00 40594.00 120480.69 2776.69 0.00 0.00 0.00 10:36:33 AM eth1 81279.00 40601.00 120478.53 2775.84 0.00 0.00 0.00 10:36:34 AM eth1 81277.00 40610.00 120476.94 2776.25 0.00 0.00 0.00 Average: eth1 81282.30 40606.70 120479.39 2776.54 0.00 0.00 0.20 lpx:~# echo 13000 >/sys/class/net/eth1/gro_flush_timeout lpx:~# sar -n DEV 1 10 \| grep eth1 10:36:43 AM eth1 81257.00 7747.00 120437.44 530.00 0.00 0.00 0.00 10:36:44 AM eth1 81278.00 7748.00 120480.00 529.85 0.00 0.00 0.00 10:36:45 AM eth1 81282.00 7752.00 120479.09 531.09 0.00 0.00 0.00 10:36:46 AM eth1 81282.00 7751.00 120478.80 530.90 0.00 0.00 0.00 10:36:47 AM eth1 81276.00 7745.00 120478.31 529.64 0.00 0.00 0.00 10:36:48 AM eth1 81278.00 7747.00 120478.50 529.81 0.00 0.00 0.00 10:36:49 AM eth1 81282.00 7749.00 120478.88 530.01 0.00 0.00 0.00 10:36:50 AM eth1 81284.00 7751.00 120481.52 530.20 0.00 0.00 0.00 10:36:51 AM eth1 81299.00 7769.00 120481.74 533.81 0.00 0.00 0.00 10:36:52 AM eth1 81281.00 7748.00 120478.62 529.96 0.00 0.00 0.00 Average: eth1 81279.90 7750.70 120475.29 530.53 0.00 0.00 0.00 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:20:09 -05:00
David S. Miller	050f7dcd27	Merge branch 'dsa-next' Florian Fainelli says: ==================== net: dsa: code re-organization This pull request contains the first part of the patches required to implement the grand plan detailed here: http://www.spinics.net/lists/netdev/msg295942.html These are mostly code re-organization and function bodies re-arrangement to allow different callers of lower-level initialization functions for 'struct dsa_switch' and 'struct dsa_switch_tree' to be later introduced. There is no functional code change at this point. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:29 -05:00
Florian Fainelli	c86e59b9e6	net: dsa: extract dsa switch tree setup and removal Extract the core logic that setups a 'struct dsa_switch_tree' and removes it, update dsa_probe() and dsa_remove() to use the two helper functions. This will be useful to allow for other callers to setup this structure differently. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	5929903103	net: dsa: let switches specify their tagging protocol In order to support the new DSA device driver model, a dsa_switch should be able to advertise the type of tagging protocol supported by the underlying switch device. This also removes constraints on how tagging can be stacked to each other. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	df197195a5	net: dsa: split dsa_switch_setup into two functions Split the part of dsa_switch_setup() which is responsible for allocating and initializing a 'struct dsa_switch' and the part which is doing a given switch device setup and slave network device creation. This is a preliminary change to allow a separate caller of dsa_switch_setup_one() which may have externally initialized the dsa_switch structure, outside of dsa_switch_setup(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	b324c07ac4	net: dsa: allow deferred probing In preparation for allowing a different model to register DSA switches, update dsa_of_probe() and dsa_probe() to return -EPROBE_DEFER where appropriate. Failure to find a phandle or Device Tree property is still fatal, but looking up the internal device structure associated with a Device Tree node is something that might need to be delayed based on driver probe ordering. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	f1a26a062f	net: dsa: update dsa_of_{probe, remove} to use a device pointer In preparation for allowing a different mechanism to register DSA switch devices and driver, update dsa_of_probe and dsa_of_remove to take a struct device pointer since neither of these two functions uses the struct platform_device pointer. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Eric Dumazet	496127290f	inet_diag: remove duplicate code from inet_twsk_diag_dump() timewait sockets now share a common base with established sockets. inet_twsk_diag_dump() can use inet_diag_bc_sk() instead of duplicating code, granted that inet_diag_bc_sk() does proper userlocks initialization. twsk_build_assert() will catch any future changes that could break the assumptions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:55:44 -05:00
Eric Dumazet	6c09fa09d4	tcp: align tcp_xmit_size_goal() on tcp_tso_autosize() With some mss values, it is possible tcp_xmit_size_goal() puts one segment more in TSO packet than tcp_tso_autosize(). We send then one TSO packet followed by one single MSS. It is not a serious bug, but we can do slightly better, especially for drivers using netif_set_gso_max_size() to lower gso_max_size. Using same formula avoids these corner cases and makes tcp_xmit_size_goal() a bit faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 605ad7f184b6 ("tcp: refine TSO autosizing") Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:31:12 -05:00
Stefan Agner	e9647d1e74	net: fec: fix unbalanced clk disable on driver unbind When the driver is removed (e.g. using unbind through sysfs), the clocks get disabled twice, once on fec_enet_close and once on fec_drv_remove. Since the clocks are enabled only once, this leads to a warning: WARNING: CPU: 0 PID: 402 at drivers/clk/clk.c:992 clk_core_disable+0x64/0x68() Remove the call to fec_enet_clk_enable in fec_drv_remove to balance the clock enable/disable calls again. This has been introduce by e8fcfcd5684a ("net: fec: optimize the clock management to save power"). Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:23:33 -05:00
Punnaiah Choudary Kalluri	d941bebf5e	net: macb: Correct the MID field length value The latest spec "I-IPA01-0266-USR Rev 10" limit the MID field length to 12 bit value. For previous versions it is 16 bit value. This change will not break the backward compatibility as the latest ID value is 7 and with in the 12 bit value limit. Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:19:24 -05:00
Tobias Waldekranz	f50724cdfe	net: gianfar: correctly determine the number of queue groups eTSEC of-nodes may have children which are not queue-group nodes. For example new-style fixed-phy declarations. These where incorrectly assumed to be additional queue-groups. Change the search to filter out any nodes which are not queue-groups, or have been disabled. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:18:32 -05:00
Jeff Kirsher	e815665e1a	i40e: Fix mismatching type for ioremap_len As pointed out by Ben Hutchings, ioremap uses unsigned long as its parameter type, so we should be using that instead of u32 or int. Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:11:38 -05:00
Erik Hugne	d0f91938be	tipc: add ip/udp media type The ip/udp bearer can be configured in a point-to-point mode by specifying both local and remote ip/hostname, or it can be enabled in multicast mode, where links are established to all tipc nodes that have joined the same multicast group. The multicast IP address is generated based on the TIPC network ID, but can be overridden by using another multicast address as remote ip. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:08:42 -05:00
Erik Hugne	948fa2d115	tipc: increase size of tipc discovery messages The payload area following the TIPC discovery message header is an opaque area defined by the media. INT_H_SIZE was enough for Ethernet/IB/IPv4 but needs to be expanded to carry IPv6 addressing information. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:08:42 -05:00
David S. Miller	9d73b42bbf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Don't truncate ethernet protocol type to u8 in nft_compat, from Arturo Borrero. 2) Fix several problems in the addition/deletion of elements in nf_tables. 3) Fix module refcount leak in ip_vs_sync, from Julian Anastasov. 4) Fix a race condition in the abort path in the nf_tables transaction infrastructure. Basically aborted rules can show up as active rules until changes are unrolled, oneliner from Patrick McHardy. 5) Check for overflows in the data area of the rule, also from Patrick. 6) Fix off-by-one in the per-rule user data size field. This introduces a new nft_userdata structure that is placed at the beginning of the user data area that contains the length to save some bits from the rule and we only need one bit to indicate its presence, from Patrick. 7) Fix rule replacement error path, the replaced rule is deleted on error instead of leaving it in place. This has been fixed by relying on the abort path to undo the incomplete replacement. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:51:07 -05:00
Alexander Drozdov	3e32e733d1	ipv4: ip_check_defrag should not assume that skb_network_offset is zero ip_check_defrag() may be used by af_packet to defragment outgoing packets. skb_network_offset() of af_packet's outgoing packets is not zero. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:43:48 -05:00
WANG Cong	33f8b9ecdb	net_sched: move tp->root allocation into fw_init() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:30:44 -05:00
WANG Cong	a05c2d112c	net_sched: move tp->root allocation into route4_init() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:30:44 -05:00
Florian Fainelli	386668a61f	net: bcmgenet: properly disable password matching bcmgenet_set_wol() correctly sets MPD_PW_EN when a password is specified to match magic packets against, however, when we switch from a password-matching to a matching without password we would leave this bit turned on, and GENET would only match magic packets with passwords. This can be reproduced using the following sequence: ethtool -s eth0 wol g ethtool -s eth0 wol s sopass 00:11:22:33:44:55 ethtool -s eth0 wol g The simple fix is to clear the MPD_PWD_EN bit when WAKE_MAGICSECURE is not set. Fixes: c51de7f3976b ("net: bcmgenet: add Wake-on-LAN support code") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:15:21 -05:00
David S. Miller	2490c65fe8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to i40e only. Greg provides fixes for the NPAR transmit scheduler where the driver initialization caused the BW configurations to not take effect, so use a BW configuration read and write back to "kick" the transmit scheduler into action. Fixes the ethtool offline test, where we were not actually taking the device offline before doing the testing. Matt modifies the get and set LED functions so they ignore activity LEDs since we are required to blink the link LEDs only. Neerav provides a workaround for whenever a DCBX configuration is changed, where the firmware doe not set the operational status bit of the application TLV status as returned from the "Get CEE DCBX Oper Cfg" admin queue command. So remove the check for the operational and sync bits of the application TLV status until a firmware fix is provided. Shannon changes the driver to grab the NVM devstarter version and not the image version, since it is the more useful version and is what should be displayed. Moves the IRQ tracking setup and tear down into the same routines that do the IRQ setup and tear down. This keeps like activities together and allows us to track exactly the number of vectors reserved from the OS, which may be fewer than are available from the hardware. Jesse provides a fix to use a more portable sign extension by replacing 0xffff.... with ~(u64)0 or ~(u32)0. Also fixes XPS mask when resetting, where the driver would accidentally clear the XPS mask for all queues back to 0. This caused higher CPU utilization and had some other performance impacts for transmit tests. Cleans up some whitespace formatting. Catherine provides a fix where some firmware versions are incorrectly reporting a breakout cable as PHY type 0x3 when it should be 0x16 (I40E_PHY_TYPE_10GBASE_SFPP_CU). Adds the 10G and 40G AOC PHY types to the case statement in get_media_type and ethtool get_settings so that the correct information gets reported back to the user. Anjali provides IOREMAP changes for future device support, where we do not want to map the whole CSR space since some of it is mapped by other drivers with different mapping methods. Mitch changes the i40e driver to not "spam" the system log with messages about VF VSI when VFs are created and when they are reset to reduce user annoyance. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:07:15 -05:00
Stephen Rothwell	4b5edb2f4a	mpls: using vzalloc requires including vmalloc.h Fixes this build error: net/mpls/af_mpls.c: In function 'resize_platform_label_table': net/mpls/af_mpls.c:767:4: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration] labels = vzalloc(size); ^ Fixes: 7720c01f3f59 ("mpls: Add a sysctl to control the size of the mpls label table") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:01:33 -05:00
Quentin Casasnovas	dd9ef135e3	Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref. Improper arithmetics when calculting the address of the extended ref could lead to an out of bounds memory read and kernel panic. Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> cc: stable@vger.kernel.org # v3.7+ Signed-off-by: Chris Mason <clm@fb.com>	2015-03-05 17:28:33 -08:00
Filipe Manana	3a8b36f378	Btrfs: fix data loss in the fast fsync path When using the fast file fsync code path we can miss the fact that new writes happened since the last file fsync and therefore return without waiting for the IO to finish and write the new extents to the fsync log. Here's an example scenario where the fsync will miss the fact that new file data exists that wasn't yet durably persisted: 1. fs_info->last_trans_committed == N - 1 and current transaction is transaction N (fs_info->generation == N); 2. do a buffered write; 3. fsync our inode, this clears our inode's full sync flag, starts an ordered extent and waits for it to complete - when it completes at btrfs_finish_ordered_io(), the inode's last_trans is set to the value N (via btrfs_update_inode_fallback -> btrfs_update_inode -> btrfs_set_inode_last_trans); 4. transaction N is committed, so fs_info->last_trans_committed is now set to the value N and fs_info->generation remains with the value N; 5. do another buffered write, when this happens btrfs_file_write_iter sets our inode's last_trans to the value N + 1 (that is fs_info->generation + 1 == N + 1); 6. transaction N + 1 is started and fs_info->generation now has the value N + 1; 7. transaction N + 1 is committed, so fs_info->last_trans_committed is set to the value N + 1; 8. fsync our inode - because it doesn't have the full sync flag set, we only start the ordered extent, we don't wait for it to complete (only in a later phase) therefore its last_trans field has the value N + 1 set previously by btrfs_file_write_iter(), and so we have: inode->last_trans <= fs_info->last_trans_committed (N + 1) (N + 1) Which made us not log the last buffered write and exit the fsync handler immediately, returning success (0) to user space and resulting in data loss after a crash. This can actually be triggered deterministically and the following excerpt from a testcase I made for xfstests triggers the issue. It moves a dummy file across directories and then fsyncs the old parent directory - this is just to trigger a transaction commit, so moving files around isn't directly related to the issue but it was chosen because running 'sync' for example does more than just committing the current transaction, as it flushes/waits for all file data to be persisted. The issue can also happen at random periods, since the transaction kthread periodicaly commits the current transaction (about every 30 seconds by default). The body of the test is: _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create our main test file 'foo', the one we check for data loss. # By doing an fsync against our file, it makes btrfs clear the 'needs_full_sync' # bit from its flags (btrfs inode specific flags). $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 8K" \ -c "fsync" $SCRATCH_MNT/foo \| _filter_xfs_io # Now create one other file and 2 directories. We will move this second file # from one directory to the other later because it forces btrfs to commit its # currently open transaction if we fsync the old parent directory. This is # necessary to trigger the data loss bug that affected btrfs. mkdir $SCRATCH_MNT/testdir_1 touch $SCRATCH_MNT/testdir_1/bar mkdir $SCRATCH_MNT/testdir_2 # Make sure everything is durably persisted. sync # Write more 8Kb of data to our file. $XFS_IO_PROG -c "pwrite -S 0xbb 8K 8K" $SCRATCH_MNT/foo \| _filter_xfs_io # Move our 'bar' file into a new directory. mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar # Fsync our first directory. Because it had a file moved into some other # directory, this made btrfs commit the currently open transaction. This is # a condition necessary to trigger the data loss bug. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1 # Now fsync our main test file. If the fsync succeeds, we expect the 8Kb of # data we wrote previously to be persisted and available if a crash happens. # This did not happen with btrfs, because of the transaction commit that # happened when we fsynced the parent directory. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Simulate a crash/power loss. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Now check that all data we wrote before are available. echo "File content after log replay:" od -t x1 $SCRATCH_MNT/foo status=0 exit The expected golden output for the test, which is what we get with this fix applied (or when running against ext3/4 and xfs), is: wrote 8192/8192 bytes at offset 0 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) wrote 8192/8192 bytes at offset 8192 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) File content after log replay: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0020000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb * 0040000 Without this fix applied, the output shows the test file does not have the second 8Kb extent that we successfully fsynced: wrote 8192/8192 bytes at offset 0 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) wrote 8192/8192 bytes at offset 8192 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) File content after log replay: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0020000 So fix this by skipping the fsync only if we're doing a full sync and if the inode's last_trans is <= fs_info->last_trans_committed, or if the inode is already in the log. Also remove setting the inode's last_trans in btrfs_file_write_iter since it's useless/unreliable. Also because btrfs_file_write_iter no longer sets inode->last_trans to fs_info->generation + 1, don't set last_trans to 0 if we bail out and don't bail out if last_trans is 0, otherwise something as simple as the following example wouldn't log the second write on the last fsync: 1. write to file 2. fsync file 3. fsync file \|--> btrfs_inode_in_log() returns true and it set last_trans to 0 4. write to file \|--> btrfs_file_write_iter() no longers sets last_trans, so it remained with a value of 0 5. fsync \|--> inode->last_trans == 0, so it bails out without logging the second write A test case for xfstests will be sent soon. CC: <stable@vger.kernel.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>	2015-03-05 17:28:32 -08:00
Josef Bacik	f5c0a12280	Btrfs: remove extra run_delayed_refs in update_cowonly_root This got added with my dirty_bgs patch, it's not needed. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>	2015-03-05 17:28:30 -08:00
Rafael J. Wysocki	e178e7d6df	Merge branches 'pm-domains' and 'pm-cpufreq' * pm-domains: PM / Domains: cleanup: rename gpd -> genpd in debugfs interface * pm-cpufreq: cpufreq: ppc: Add missing #include <asm/smp.h>	2015-03-06 01:29:31 +01:00
Rafael J. Wysocki	1e3e770cfb	Merge branch 'acpi-video' * acpi-video: ACPI / video: Propagate the error code for acpi_video_register ACPI / video: Load the module even if ACPI is disabled	2015-03-06 01:29:16 +01:00
Rafael J. Wysocki	79d223646b	Merge branch 'irq-pm' * irq-pm: genirq / PM: describe IRQF_COND_SUSPEND tty: serial: atmel: rework interrupt and wakeup handling watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND clk: at91: implement suspend/resume for the PMC irqchip rtc: at91rm9200: rework wakeup and interrupt handling rtc: at91sam9: rework wakeup and interrupt handling PM / wakeup: export pm_system_wakeup symbol genirq / PM: Add flag for shared NO_SUSPEND interrupt lines genirq / PM: better describe IRQF_NO_SUSPEND semantics	2015-03-06 01:29:05 +01:00
Mark Rutland	7438b633a6	genirq / PM: describe IRQF_COND_SUSPEND With certain restrictions it is possible for a wakeup device to share an IRQ with an IRQF_NO_SUSPEND user, and the warnings introduced by commit cab303be91dc47942bc25de33dc1140123540800 are spurious. The new IRQF_COND_SUSPEND flag allows drivers to tell the core when these restrictions are met, allowing spurious warnings to be silenced. This patch documents how IRQF_COND_SUSPEND is expected to be used, updating some of the text now made invalid by its addition. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-03-06 01:28:14 +01:00
Pablo Neira Ayuso	1cae565e8b	netfilter: nf_tables: limit maximum table name length to 32 bytes Set the same as we use for chain names, it should be enough. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:21 +01:00
Pablo Neira Ayuso	f04e599e20	netfilter: nf_tables: consolidate Kconfig options Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:15 +01:00
Patrick McHardy	1a1e1a1219	netfilter: nf_tables: cleanup nf_tables.h The transaction related definitions are squeezed in between the rule and expression definitions, which are closely related and should be next to each other. The transaction definitions actually don't belong into that file at all since it defines the global objects and API and transactions are internal to nf_tables_api, but for now simply move them to a seperate section. Similar, the chain types are in between a set of registration functions, they belong to the chain section. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:13 +01:00
Patrick McHardy	354bf5a0d7	netfilter: nf_tables: consolidate tracing invocations * JUMP and GOTO are equivalent except for JUMP pushing the current context to the stack * RETURN and implicit RETURN (CONTINUE) are equivalent except that the logged rule number differs Result: nft_do_chain \| -112 1 function changed, 112 bytes removed, diff: -112 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:12 +01:00
Patrick McHardy	01ef16c2dd	netfilter: nf_tables: minor tracing cleanups The tracing code is squeezed between multiple related parts of the evaluation code, move it out. Also add an inline wrapper for the reoccuring test for skb->nf_trace. Small code savings in nft_do_chain(): nft_trace_packet \| -137 nft_do_chain \| -8 2 functions changed, 145 bytes removed, diff: -145 net/netfilter/nf_tables_core.c: __nft_trace_packet \| +137 1 function changed, 137 bytes added, diff: +137 net/netfilter/nf_tables_core.o: 3 functions changed, 137 bytes added, 145 bytes removed, diff: -8 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:11 +01:00
Pablo Neira Ayuso	43270b1bc5	netfilter: ipt_CLUSTERIP: deprecate it in favour of xt_cluster xt_cluster supersedes ipt_CLUSTERIP since it can be also used in gateway configurations (not only from the backend side). ipt_CLUSTER is also known to leak the netdev that it uses on device removal, which requires a rather large fix to workaround the problem: http://patchwork.ozlabs.org/patch/358629/ So let's deprecate this so we can probably kill code this in the future. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:05 +01:00
Boris BREZILLON	2c7af5ba65	tty: serial: atmel: rework interrupt and wakeup handling The IRQ line connected to the DBGU UART is often shared with a timer device which request the IRQ with IRQF_NO_SUSPEND. Since the UART driver is correctly disabling IRQs when entering suspend we can safely request the IRQ with IRQF_COND_SUSPEND so that irq core will not complain about mixing IRQF_NO_SUSPEND and !IRQF_NO_SUSPEND. Rework the interrupt handler to wake the system up when an interrupt happens on the DEBUG_UART while the system is suspended. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-03-06 00:46:44 +01:00
Boris BREZILLON	d677772e13	watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND The watchdog interrupt (only used when activating software watchdog) shouldn't be suspended when entering suspend mode, because it is shared with a timer device (which request the line with IRQF_NO_SUSPEND) and once the watchdog "Mode Register" has been written, it cannot be changed (which means we cannot disable the watchdog interrupt when entering suspend). Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-03-06 00:46:31 +01:00
Rafael J. Wysocki	eef16e4362	Merge branch 'suspend-to-idle' * suspend-to-idle: cpuidle / sleep: Use broadcast timer for states that stop local timer cpuidle: Clean up fallback handling in cpuidle_idle_call() cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too idle / sleep: Avoid excessive disabling and enabling interrupts	2015-03-05 23:14:51 +01:00
Rafael J. Wysocki	8204680c7b	Merge branch 'acpi-resources' * acpi-resources: x86/PCI/ACPI: Relax ACPI resource descriptor checks to work around BIOS bugs x86/PCI/ACPI: Ignore resources consumed by host bridge itself PCI: versatile: Update for list_for_each_entry() API change	2015-03-05 23:14:40 +01:00
Rafael J. Wysocki	ef2b22ac54	cpuidle / sleep: Use broadcast timer for states that stop local timer Commit 381063133246 (PM / sleep: Re-implement suspend-to-idle handling) overlooked the fact that entering some sufficiently deep idle states by CPUs may cause their local timers to stop and in those cases it is necessary to switch over to a broadcast timer prior to entering the idle state. If the cpuidle driver in use does not provide the new ->enter_freeze callback for any of the idle states, that problem affects suspend-to-idle too, but it is not taken into account after the changes made by commit 381063133246. Fix that by changing the definition of cpuidle_enter_freeze() and re-arranging of the code in cpuidle_idle_call(), so the former does not call cpuidle_enter() any more and the fallback case is handled by cpuidle_idle_call() directly. Fixes: 381063133246 (PM / sleep: Re-implement suspend-to-idle handling) Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2015-03-05 23:13:19 +01:00
Mark Salter	b0ab0afaeb	net: eth: xgene: fix booting with devicetree Commit de7b5b3d790a ("net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI") breaks booting with devicetree with UEFI firmware. In that case, I get: Unhandled fault: synchronous external abort (0x96000010) at 0xfffffc0000620010 Internal error: : 96000010 [#1] SMP Modules linked in: vfat fat xfs libcrc32c ahci_xgene libahci_platform libahci CPU: 7 PID: 634 Comm: NetworkManager Not tainted 4.0.0-rc1+ #4 Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0-rh-0.14 Mar 1 2015 task: fffffe03d4c7e100 ti: fffffe03d4e24000 task.ti: fffffe03d4e24000 PC is at xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4 LR is at xgene_gmac_tx_enable+0x2c/0x50 pc : [<fffffe000069d6fc>] lr : [<fffffe000069dcc4>] pstate: 80000145 sp : fffffe03d4e27590 x29: fffffe03d4e27590 x28: 0000000000000000 x27: fffffe03d4e277c0 x26: fffffe03da8fda10 x25: fffffe03d4e2760c x24: fffffe03d49e28c0 x23: fffffc0000620004 x22: 0000000000000000 x21: fffffc0000620000 x20: fffffc0000620010 x19: 000000000000000b x18: 000003ffd4a96020 x17: 000003ff7fc1f7a0 x16: fffffe000079b9cc x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: fffffe03d4e24000 x11: fffffe03d4e27da0 x10: 0000000000000001 x9 : 0000000000000000 x8 : fffffe03d4e27a20 x7 : 0000000000000000 x6 : 00000000ffffffef x5 : fffffe000105f7d0 x4 : fffffe00007ca8c8 x3 : fffffe03d4e2760c x2 : 0000000000000000 x1 : fffffc0000620000 x0 : 0000000040000000 Process NetworkManager (pid: 634, stack limit = 0xfffffe03d4e24028) Stack: (0xfffffe03d4e27590 to 0xfffffe03d4e28000) ... Call trace: [<fffffe000069d6fc>] xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4 [<fffffe000069dcc0>] xgene_gmac_tx_enable+0x28/0x50 [<fffffe00006a112c>] xgene_enet_open+0x2c/0x130 [<fffffe00007b9254>] __dev_open+0xc8/0x148 [<fffffe00007b956c>] __dev_change_flags+0x90/0x158 [<fffffe00007b9664>] dev_change_flags+0x30/0x70 [<fffffe00007c8ab8>] do_setlink+0x278/0x870 [<fffffe00007c95bc>] rtnl_newlink+0x404/0x6a8 [<fffffe00007c8040>] rtnetlink_rcv_msg+0x98/0x218 [<fffffe00007e78e4>] netlink_rcv_skb+0xe0/0xf8 [<fffffe00007c7f94>] rtnetlink_rcv+0x30/0x44 [<fffffe00007e6f2c>] netlink_unicast+0xfc/0x210 [<fffffe00007e75b8>] netlink_sendmsg+0x498/0x5ac [<fffffe00007990b8>] do_sock_sendmsg+0xa4/0xcc [<fffffe000079a958>] ___sys_sendmsg+0x1fc/0x208 [<fffffe000079b984>] __sys_sendmsg+0x4c/0x94 [<fffffe000079b9f8>] SyS_sendmsg+0x2c/0x3c The problem here is that the enet hw clocks are not getting initialized because of a test to avoid the initialization if UEFI is used to boot. This is an incorrect test. When booting with UEFI and devicetree, the kernel must still initialize the enet hw clocks. If booting with ACPI, the clock hw is not exposed to the kernel and it is that case where we want to avoid initializing clocks. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Feng Kan <fkan@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 15:40:10 -05:00
Brian King	da29370056	bnx2x: Force fundamental reset for EEH recovery EEH recovery for bnx2x based adapters is not reliable on all Power systems using the default hot reset, which can result in an unrecoverable EEH error. Forcing the use of fundamental reset during EEH recovery fixes this. Cc: stable<stable@vger.kernel.org> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 15:38:24 -05:00
David S. Miller	0e9974f727	Merge branch 'cxgb4-next' Hariprasad Shenai says: ==================== cxgb4: RX Queue related cleanup and fixes This patch series adds a common function to allocate RX queues and queue allocation changes to RDMA CIQ The patches series is created against 'net-next' tree. And includes patches on cxgb4 driver. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 15:11:57 -05:00

... 3 4 5 6 7 ...

506894 Commits