TCP-MD5 sessions have intermittent failures, when route cache is
invalidated. ip_queue_xmit() has to find a new route, calls
sk_setup_caps(sk, &rt->u.dst), destroying the
sk->sk_route_caps &= ~NETIF_F_GSO_MASK
that MD5 desperately try to make all over its way (from
tcp_transmit_skb() for example)
So we send few bad packets, and everything is fine when
tcp_transmit_skb() is called again for this socket.
Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.
We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.
Fix is to disable preemption and BH.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With RPS inclusion, skb timestamping is not consistent in RX path.
If netif_receive_skb() is used, its deferred after RPS dispatch.
If netif_rx() is used, its done before RPS dispatch.
This can give strange tcpdump timestamps results.
I think timestamping should be done as soon as possible in the receive
path, to get meaningful values (ie timestamps taken at the time packet
was delivered by NIC driver to our stack), even if NAPI already can
defer timestamping a bit (RPS can help to reduce the gap)
Tom Herbert prefer to sample timestamps after RPS dispatch. In case
sampling is expensive (HPET/acpi_pm on x86), this makes sense.
Let admins switch from one mode to another, using a new
sysctl, /proc/sys/net/core/netdev_tstamp_prequeue
Its default value (1), means timestamps are taken as soon as possible,
before backlog queueing, giving accurate timestamps.
Setting a 0 value permits to sample timestamps when processing backlog,
after RPS dispatch, to lower the load of the pre-RPS cpu.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now there's null check here and also again in the hook. Looking at bridge bits
which are simmilar, port structure is rcu_dereferenced right away in
handle_bridge and passed to hook. Looks nicer.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
(Dropped the infiniband part, because Tetsuo modified the related code,
I will send a separate patch for it once this is accepted.)
This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
allows users to reserve ports for third-party applications.
The reserved ports will not be used by automatic port assignments
(e.g. when calling connect() or bind() with port number 0). Explicit
port allocation behavior is unchanged.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new function can be used to read/write large bitmaps via /proc. A
comma separated range format is used for compact output and input
(e.g. 1,3-4,10-10).
Writing into the file will first reset the bitmap then update it
based on the given input.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on. To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question. Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD. Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.
2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint. Fixed.
3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one. Noticed and fixed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now, with the introduction of the sff_set_devctl() method, we can
use it in sff_irq_on() method too -- that way its implementations
in 'pata_bf54x' and 'pata_scc' become virtually identical to
ata_sff_irq_on(). The sff_irq_on() method now becomes quite
superfluous, and the only reason not to remove it completely is
the existence of the 'pata_octeon_cf' driver which implements it
as an empty function. Just make the method optional then, with
ata_sff_irq_on() becoming generic taskfile-bound function, still
global for the 'pata_bf54x' driver to be able to call it from its
thaw() and postreset() methods.
While at it, make the sff_irq_on() method and ata_sff_irq_on() return
'void' as the result is always ignored anyway.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The set of libata's taskfile access methods is clearly incomplete as
it lacks a method to write to the device control register -- which
forces drivers like 'pata_bf54x' and 'pata_scc' to implement more
"high level" (and more weighty) methods like freeze() and postreset().
So, introduce the optional sff_set_devctl() method which the drivers
only have to implement if the standard iowrite8() can't be used (just
like the existing sff_check_altstatus() method) and make use of it
in the freeze() and postreset() method implementations (I could also
have used it in softreset() method but it also reads other taskfile
registers without using tf_read() making that quite pointless);
this makes freeze() method implementations in the 'pata_bf54x' and
'pata_scc' methods virtually identical to ata_sff_freeze(), so we
can get rid of them completely.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
There are some SATA devices which take relatively long to get out of
0xff status after reset. In libata, this timeout is determined by
ATA_TMOUT_FF_WAIT. Quantum GoVault is the worst requring about 2s for
reliable detection. However, because 2s 0xff timeout can introduce
rather long spurious delay during boot, libata has been compromising
at the next longest timeout of 800ms for HHD424020F7SV00 iVDR drive.
Now that parallel scan is in place for common drivers, libata can
afford 2s 0xff timeout. Use 2s 0xff timeout if parallel scan is
enabled.
Please note that the chance of spurious wait is pretty slim w/ working
SCR access so this will only affect SATA controllers w/o SCR access
which isn't too common these days.
Please read the following thread for more information on the GoVault
drive.
http://thread.gmane.org/gmane.linux.ide/14545/focus=14663
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This can be used for AHCI-compatible interfaces implemented inside
System-On-Chip solutions, or AHCI devices connected via localbus.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Now that the rpc.gssd daemon can explicitly tell us that the key expired,
we should cache that information to avoid spamming gssd.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This improves the packing of the rpc_task, and ensures that on 64-bit
platforms the size reduces to 216 bytes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
It seems strange to maintain stats for bytes_sent in one structure, and
bytes received in another. Try to assemble all the RPC request-related
stats in struct rpc_rqst
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Again, we can deadlock if the memory reclaim triggers a writeback that
requires a rpcsec_gss credential lookup.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently RPC performance metrics that tabulate elapsed time use
jiffies time values. This is problematic on systems that use slow
jiffies (for instance 100HZ systems built for paravirtualized
environments). It is also a problem for computing precise latency
statistics for advanced network transports, such as InfiniBand,
that can have round-trip latencies significanly faster than a single
clock tick.
For the RPC client, adopt the high resolution time stamp mechanism
already used by the network layer and blktrace: ktime.
We use ktime format time stamps for all internal computations, and
convert to milliseconds for presentation. As a result, we need only
addition operations in the performance critical paths; multiply/divide
is required only for presentation.
We could report RTT metrics in microseconds. In fact the mountstats
format is versioned to accomodate exactly this kind of interface
improvement.
For now, however, we'll stay with millisecond precision for
presentation to maintain backwards compatibility with the handful of
currently deployed user space tools. At a later point, we'll move to
an API such as BDI_STATS where a finer timestamp precision can be
reported.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To report ktime statistics to user space in milliseconds, a new helper
is required.
When considering how to do this conversion, I didn't immediately see
why the extra step of converting ktime to a timeval was needed. To
make that more clear, introduce a couple of large comments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Compute an RPC request's RTT once, and use that value both for reporting
RPC metrics, and for adjusting the RTT context used by the RPC client's RTT
estimator algorithm.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Update the documenting comment, and fix some minor white
space issues.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We should not allow soft tasks to wait for longer than the major timeout
period when waiting for a reconnect to occur.
Remove the field xprt->connect_timeout since it has been obsoleted by
xprt->reestablish_timeout.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFS Filehandles and struct fattr are really too large to be allocated on
the stack. This patch adds in a couple of helper functions to allocate them
dynamically instead.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add necessary changes to add kernel support for the rc4-hmac Kerberos
encryption type used by Microsoft and described in rfc4757.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All encryption types use a confounder at the beginning of the
wrap token. In all encryption types except arcfour-hmac, the
confounder is the same as the blocksize. arcfour-hmac has a
blocksize of one, but uses an eight byte confounder.
Add an entry to the crypto framework definitions for the
confounder length and change the wrap/unwrap code to use
the confounder length rather than assuming it is always
the blocksize.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For the arcfour-hmac support, the make_seq_num and get_seq_num
functions need access to the kerberos context structure.
This will be used in a later patch.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is needed for deriving arcfour-hmac keys "on the fly"
using the sequence number or checksu
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For arcfour-hmac support, the make_checksum function needs a usage
field to correctly calculate the checksum differently for MIC and
WRAP tokens.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the remaining pieces to enable support for Kerberos AES
encryption types.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is a step toward support for AES encryption types which are
required to use the new token formats defined in rfc4121.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
[SteveD: Fixed a typo in gss_verify_mic_v2()]
Signed-off-by: Steve Dickson <steved@redhat.com>
[Trond: Got rid of the TEST_ROTATE/TEST_EXTRA_COUNT crap]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the final pieces to support the triple-des encryption type.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The text based upcall now indicates which Kerberos encryption types are
supported by the kernel rpcsecgss code. This is used by gssd to
determine which encryption types it should attempt to negotiate
when creating a context with a server.
The server principal's database and keytab encryption types are
what limits what it should negotiate. Therefore, its keytab
should be created with only the enctypes listed by this file.
Currently we support des-cbc-crc, des-cbc-md4 and des-cbc-md5
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For encryption types other than DES, gssd sends down context information
in a new format. This new format includes the information needed to
support the new Kerberos GSS-API tokens defined in rfc4121.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Import the code to derive Kerberos keys from a base key into the
kernel. This will allow us to change the format of the context
information sent down from gssd to include only a single key.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Encryption types besides DES may use a keyed checksum (hmac).
Modify the make_checksum() function to allow for a key
and take care of enctype-specific processing such as truncating
the resulting hash.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add enctype framework and change functions to use the generic
values from it rather than the values hard-coded for des.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add encryption type to the krb5 context structure and use it to switch
to the correct functions depending on the encryption type.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Make the client and server code consistent regarding the extra buffer
space made available for the auth code when wrapping data.
Add some comments/documentation about the available buffer space
in the xdr_buf head and tail when gss_wrap is called.
Add a compile-time check to make sure we are not exceeding the available
buffer space.
Add a central function to shift head data.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that the definitions have been consolidated in an alternate header,
update the template accordingly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is V2 of the SH clock framework move from
arch/sh/kernel/cpu/clock.c to drivers/sh/clk.c. All
code except the following functions are moved:
clk_init(), clk_get() and clk_put().
The init function is still kept in clock.c since it
depends on the SH-specific machvec implementation.
The symbols clk_get() and clk_put() already exist in
the common ARM clkdev code, those symbols are left in
the SH tree to avoid duplicating them for SH-Mobile ARM.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is V2 of the clock framework move from
arch/sh/include/asm/clock.h to include/linux/sh_clk.h
and updates the include paths for files that will be
shared between SH and SH-Mobile ARM.
The file asm/clock.h is still kept in this version,
this to depend on as few files as possible at this
point. We keep SH specific stuff in there.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Eliminate comments in TIPC's main API files that are either obsolete,
incorrect, misleading, or unhelpful. It also adds in one new comment.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide initial support for displaying overall TIPC status/statistics
information at runtime. Currently, only version info for the TIPC
kernel module is displayed.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for offloading the channel switch
operation to devices that support such, typically
by having specific firmware API for it. The reasons
for this could be that the firmware provides better
timing or that regulatory enforcement done by the
device requires special handling of CSAs.
In order to allow drivers to specify the timing to
the device, the new channel_switch callback will
pass through the received frame's mactime, where
available.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>