deliverable/linux.git
9 years ago net: dsa: Centralize setting up ports
Andrew Lunn [Tue, 5 May 2015 23:09:48 +0000 (01:09 +0200)] 
net: dsa: Centralize setting up ports

Now that setting up a port is identical for all switches, centralise
the code looping over all the ports to set them up.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: dsa: Centralise global and port setup code into mv88e6xxx.
Andrew Lunn [Tue, 5 May 2015 23:09:47 +0000 (01:09 +0200)] 
net: dsa: Centralise global and port setup code into mv88e6xxx.

The port setup code in the individual drivers is identical for 6123,
6171, and 6352, and very similar in 6131. Move it all into mv88e6xxx,
using the chip families to differentiate on features.

Similarly, the global setup is also very similar. Move the majority
into mv88e6xxx.

The chips themselves fall into families. Add helpers which use the
device IDs to determine if a device is a member of a family or not.
Add some additional device IDs to the existing list, to make these
helper functions more complete. However these IDs are not yet added to
the probe functions.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago can: flexcan: replace open coded "mailbox code" by proper define
Marc Kleine-Budde [Tue, 23 Sep 2014 09:18:11 +0000 (11:18 +0200)] 
can: flexcan: replace open coded "mailbox code" by proper define

This patch replaces an open-coded variant of a "mailbox code" definition with an
existing define, improving code readability.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago can: flexcan: rename struct flexcan_regs::crl2 -> ctrl2
Marc Kleine-Budde [Tue, 23 Sep 2014 09:03:01 +0000 (11:03 +0200)] 
can: flexcan: rename struct flexcan_regs::crl2 -> ctrl2

This is done to match the abbreviation of the register in the datasheets.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago can: flexcan: add documentation about mailbox organization
Marc Kleine-Budde [Wed, 17 Sep 2014 10:50:48 +0000 (12:50 +0200)] 
can: flexcan: add documentation about mailbox organization

This patch adds a short documentation snippet about the mailbox organization, as
it is regularly incorrect in Freescale's datasheets.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago can: flexcan: add MB/FIFO specific column to comment table of IP versions
David Jander [Fri, 10 Oct 2014 13:04:03 +0000 (15:04 +0200)] 
can: flexcan: add MB/FIFO specific column to comment table of IP versions

Flexcan V10 and newer are able to receive RTR frames in a MB. Older versions
are not. Those should use flexcan in FIFO mode.

Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago mac80211: add missing documentation for rate_ctrl_lock
Johannes Berg [Wed, 6 May 2015 14:00:32 +0000 (16:00 +0200)] 
mac80211: add missing documentation for rate_ctrl_lock

This was missed in the previous patch, add some documentation
for rate_ctrl_lock to avoid docbook warnings.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago cfg80211: change GO_CONCURRENT to IR_CONCURRENT for STA
Arik Nemtsov [Wed, 6 May 2015 13:28:31 +0000 (16:28 +0300)] 
cfg80211: change GO_CONCURRENT to IR_CONCURRENT for STA

The GO_CONCURRENT regulatory definition can be extended to station
interfaces requesting to IR as part of TDLS off-channel operations.
Rename the GO_CONCURRENT flag to IR_CONCURRENT and allow the added
use-case.

Change internal users of GO_CONCURRENT to use the new definition.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211_hwsim: Fix the supported VHT mcs rates
Ilan Peer [Tue, 7 Apr 2015 16:05:22 +0000 (19:05 +0300)] 
mac80211_hwsim: Fix the supported VHT mcs rates

Declare that MCS 0-9 are supported for all Rx chains.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211_hwsim: Set VHT capabilities only for the 5.2 GHz band
Ilan Peer [Tue, 7 Apr 2015 16:05:21 +0000 (19:05 +0300)] 
mac80211_hwsim: Set VHT capabilities only for the 5.2 GHz band

Previously, VHT capabilities and supported MCSs were set for all
bands, although VHT is only allowed on the 5.2 GHz band. Fix it.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago cfg80211: Allow GO concurrent relaxation after BSS disconnection
Avraham Stern [Mon, 27 Apr 2015 13:52:16 +0000 (16:52 +0300)] 
cfg80211: Allow GO concurrent relaxation after BSS disconnection

If a P2P GO was allowed on a channel because of the GO concurrent
relaxation, i.e., another station interface was associated to an AP on
the same channel or the same UNII band, and the station interface
disconnected from the AP, allow the following use cases unless the
channel is marked as indoor only and the device is not operating in an
indoor environment:

1. Allow the P2P GO to stay on its current channel. The rationale behind
   this is that if the channel or UNII band were allowed by the AP they
   could still be used to continue the P2P GO operation, and avoid connection
   breakage.
2. Allow another P2P GO to start on the same channel or another channel
   that is in the same UNII band as the previous instantiated P2P GO.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: validate cipher scheme PN length better
Johannes Berg [Tue, 5 May 2015 14:32:29 +0000 (16:32 +0200)] 
mac80211: validate cipher scheme PN length better

Currently, a cipher scheme can advertise an arbitrarily long
sequence counter, but mac80211 only supports up to 16 bytes
and the initial value from userspace will be truncated.

Fix two things:
 * don't allow the driver to register anything longer than
   the 16 bytes that mac80211 reserves space for
 * require userspace to specify a starting value with the
   correct length (or none at all)
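
The two checks listed above can be sketched as a pair of predicates. This is
only an illustration under stated assumptions: MAX_PN_LEN_SKETCH,
cs_pn_len_ok() and key_seq_ok() are hypothetical names, not the mac80211
symbols, and "no starting value" is modelled as a length of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes reserved for the sequence counter, per the text above (hypothetical name). */
#define MAX_PN_LEN_SKETCH 16

/* Check 1: may a driver register a cipher scheme with this PN length? */
static bool cs_pn_len_ok(size_t pn_len)
{
	return pn_len <= MAX_PN_LEN_SKETCH;
}

/*
 * Check 2: is the starting value supplied by userspace acceptable?
 * seq_len == 0 models "no starting value given".
 */
static bool key_seq_ok(size_t seq_len, size_t pn_len)
{
	assert(cs_pn_len_ok(pn_len)); /* scheme must have passed check 1 */
	return seq_len == 0 || seq_len == pn_len;
}
```

With these predicates a value that is too short or too long is rejected
outright instead of being silently truncated.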

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: extend get_key() to return PN for all ciphers
Johannes Berg [Mon, 20 Apr 2015 16:21:58 +0000 (18:21 +0200)] 
mac80211: extend get_key() to return PN for all ciphers

For ciphers not supported by mac80211, the function currently
doesn't return any PN data. Fix this by extending the driver's
get_key_seq() a little more to allow moving arbitrary PN data.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: extend get_tkip_seq to all keys
Johannes Berg [Mon, 20 Apr 2015 16:12:41 +0000 (18:12 +0200)] 
mac80211: extend get_tkip_seq to all keys

Extend the function to read the TKIP IV32/IV16 to read the IV/PN for
all ciphers in order to allow drivers with full hardware crypto to
properly support this.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago can: janz-ican3: add support for CAL/CANopen firmware
Andreas Gröger [Tue, 5 May 2015 18:08:34 +0000 (20:08 +0200)] 
can: janz-ican3: add support for CAL/CANopen firmware

In our department we are using some older Janz ICAN3-modules in our desktop
PCs. There we have slightly different carrier boards than the janz-cmodio
supported in the kernel sources, called CAN-PCI2 with two submodules. But the
pci configuration regions are identical. So extending the supported pci devices
to the corresponding device ids is sufficient to get the drivers working.

* The old ICAN3-modules with firmware 1.28 need more than 250ms for the restart
  after reset. I've increased the timeout to 500ms.
* The janz_ican3 module uses the raw can services of the Janz-firmware, this
  means firmware must be ICANOS/2. Our ICAN3-modules are equipped with
  CAL/CANopen-firmware, so I must use the appropriate commands for the layer
  management services.

The driver detects the firmware after module reset and selects the commands
matching the firmware. This affects the bus on/off-command
(ican3_set_bus_state) and the configuration of the bittiming
(ican3_set_bittiming). For better diagnostics the detected firmware string is
presented as sysfs attribute (fwinfo).

Signed-off-by: Andreas Gröger <andreas24groeger@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago can: janz-ican3: add documentation for existing sysfs entries
Andreas Gröger [Tue, 5 May 2015 18:08:34 +0000 (20:08 +0200)] 
can: janz-ican3: add documentation for existing sysfs entries

This patch adds documentation for the existing sysfs entries for the janz PCI
module.

Signed-off-by: Andreas Gröger <andreas24groeger@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago can.h: make padding given by gcc explicit
Shawn Landden [Tue, 5 May 2015 16:07:16 +0000 (09:07 -0700)] 
can.h: make padding given by gcc explicit

The current definition of struct can_frame has a 16-byte size, with 8-byte
alignment, but the 3 bytes of padding are not explicit like the similar 2 bytes
of padding of struct canfd_frame. Make it explicit so it is easier to read.
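
As a rough illustration of the layout described above, the sketch below
spells out the three padding bytes and checks the resulting size and offsets
at compile time. It is not the uapi header: the struct name and the pad[]
field name are assumptions made for this example, and the aligned attribute
is the GCC/Clang spelling.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>

/* Sketch only: mirrors the layout described in the text, names are hypothetical. */
struct can_frame_sketch {
	uint32_t can_id;   /* 32-bit CAN identifier plus flag bits */
	uint8_t  can_dlc;  /* payload length in bytes (0..8) */
	uint8_t  pad[3];   /* the 3 bytes the compiler previously inserted silently */
	uint8_t  data[8] __attribute__((aligned(8))); /* payload, 8-byte aligned */
};

/* The explicit padding must not change the layout. */
static_assert(sizeof(struct can_frame_sketch) == 16, "frame must stay 16 bytes");
static_assert(offsetof(struct can_frame_sketch, data) == 8, "data must stay at offset 8");
```

Because the padding is now a named member, the compile-time checks pass with
or without it; the only difference is that a reader can see the layout.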

Signed-off-by: Shawn Landden <shawn@churchofgit.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago vxlan: Correctly set flow*i_mark and flow4i_proto in route lookups
Thomas Graf [Tue, 5 May 2015 13:09:21 +0000 (15:09 +0200)] 
vxlan: Correctly set flow*i_mark and flow4i_proto in route lookups

VXLAN must provide the skb mark and specify IPPROTO_UDP when doing
the FIB lookup for the remote ip. Otherwise an invalid route might
be returned.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Fix kernel-doc warnings
Michal Simek [Tue, 5 May 2015 09:26:05 +0000 (11:26 +0200)] 
net: axienet: Fix kernel-doc warnings

This patch removes kernel-doc warnings.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Removed _of_ prefix in probe and remove functions
Srikanth Thokala [Tue, 5 May 2015 09:26:04 +0000 (11:26 +0200)] 
net: axienet: Removed _of_ prefix in probe and remove functions

Synchronize names with other drivers.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Use of_property_* calls
Srikanth Thokala [Tue, 5 May 2015 09:26:03 +0000 (11:26 +0200)] 
net: axienet: Use of_property_* calls

Use of_property_* calls

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Use devm_* calls
Srikanth Thokala [Tue, 5 May 2015 09:26:02 +0000 (11:26 +0200)] 
net: axienet: Use devm_* calls

use devm_* calls

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Use pdev instead of op
Srikanth Thokala [Tue, 5 May 2015 09:26:01 +0000 (11:26 +0200)] 
net: axienet: Use pdev instead of op

Synchronize names with other drivers

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Fix comments blocks
Michal Simek [Tue, 5 May 2015 09:26:00 +0000 (11:26 +0200)] 
net: axienet: Fix comments blocks

There is a rule for comment blocks in network drivers
which is newly checked by the checkpatch.pl script.
Let's fix it.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Removed coding style errors and warnings
Srikanth Thokala [Tue, 5 May 2015 09:25:59 +0000 (11:25 +0200)] 
net: axienet: Removed coding style errors and warnings

Removed checkpatch.pl errors and warnings.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Support phy-less mode of operation
Srikanth Thokala [Tue, 5 May 2015 09:25:58 +0000 (11:25 +0200)] 
net: axienet: Support phy-less mode of operation

This patch adds proper checks to handle the PHY-less case.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Handle jumbo frames for lesser frame sizes
Srikanth Thokala [Tue, 5 May 2015 09:25:57 +0000 (11:25 +0200)] 
net: axienet: Handle jumbo frames for lesser frame sizes

In the current implementation, jumbo frames are supported only
for the frame sizes > 16K. This patch corrects this logic to
handle jumbo frames for lesser frame sizes (< 16K) ensuring jumbo frame
MTU is within the limit of max frame size configured in the h/w
design.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Service completion interrupts ASAP
Peter Crosthwaite [Tue, 5 May 2015 09:25:56 +0000 (11:25 +0200)] 
net: axienet: Service completion interrupts ASAP

The packet completion interrupts for TX and RX should be serviced before
the packets are consumed. This ensures against the degenerate case when a
new completion interrupt is raised after the handler has exited but before
the interrupts are cleared. In this case it's possible for the ISR to clear
an unhandled interrupt (leading to potential deadlock).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Handle 0 packet receive gracefully
Peter Crosthwaite [Tue, 5 May 2015 09:25:55 +0000 (11:25 +0200)] 
net: axienet: Handle 0 packet receive gracefully

The AXI-DMA rx-delay interrupt can sometimes be triggered
when there are 0 outstanding packets received. This is due
to the fact that the receive function will greedily consume
as many packets as possible on interrupt. So if two packets
(with a very particular timing) arrive in succession they
will each cause the rx-delay interrupt, but the first interrupt
will consume both packets.
This means the second interrupt is a 0 packet receive.

This is mostly OK, except that the tail pointer register is
updated unconditionally on receive. Currently the tail pointer
is always set to the current bd-ring descriptor under
the assumption that the hardware has moved onto the next
descriptor. What this means for length 0 recv is the current
descriptor that the hardware is potentially yet to use will
be marked as the tail. This causes the hardware to think
it's run out of descriptors, deadlocking the whole rx path.

Fixed by updating the tail pointer to the most recent
successfully consumed descriptor.
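
The idea of the fix can be modelled in a few lines of plain C. This is a toy
model, not the driver code: struct rx_ring, rx_poll() and RING_SIZE are
hypothetical, and the "tail" field merely stands in for the DMA tail pointer
register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical ring length for the model */

struct rx_ring {
	bool   complete[RING_SIZE]; /* set by "hardware" once a packet has landed */
	size_t cur;                 /* next descriptor the driver will examine */
	size_t tail;                /* models the DMA tail pointer register */
};

/* Greedily consume every completed descriptor; returns how many were consumed. */
static size_t rx_poll(struct rx_ring *r)
{
	size_t consumed = 0;
	size_t last = 0;

	assert(r != NULL);
	while (consumed < RING_SIZE && r->complete[r->cur]) {
		r->complete[r->cur] = false; /* hand the descriptor back */
		last = r->cur;
		r->cur = (r->cur + 1) % RING_SIZE;
		consumed++;
	}

	/* Only move the tail when something was consumed, and only up to it. */
	if (consumed > 0)
		r->tail = last;

	return consumed;
}
```

A poll that finds nothing to do leaves the tail where it was, so a
descriptor the hardware has not used yet is never marked as the tail.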

Reported-by: Wendy Liang <wendy.liang@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net: axienet: Support for RGMII
Srikanth Thokala [Tue, 5 May 2015 09:25:54 +0000 (11:25 +0200)] 
net: axienet: Support for RGMII

This patch adds support for the RGMII. The h/w configuration
parameter C_PHY_TYPE, which represents the interface configured in
the design, is used to differentiate various interfaces supported
by AXI Ethernet.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'cxgb4-next'
David S. Miller [Tue, 5 May 2015 23:31:50 +0000 (19:31 -0400)] 
Merge branch 'cxgb4-next'

Hariprasad Shenai says:

====================
Trivial fixes and changes for SGE

This patch series adds the following.
Discard packet if length is greater than MTU, move sge monitor code to a
new routine, add device node to ULD info, add congestion notification from
SGE for ingress queue and freelists and for T5, setting up the Congestion
Manager values of the new RX Ethernet Queue is done by firmware now.

This patch series has been created against net-next tree and includes
patches on cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.

Thanks

V2: Align parenthesis for PATCH 2/6 and PATCH 5/6
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Discard the packet if the length is greater than mtu
Hariprasad Shenai [Tue, 5 May 2015 09:29:56 +0000 (14:59 +0530)] 
cxgb4: Discard the packet if the length is greater than mtu

pktgen sends raw udp packets and bypasses most of the
linux networking stack. The user can specify different packet sizes.
Hence we need to discard the packet if the length is greater than mtu.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Move SGE Ingress DMA state monitor code to a new routine
Hariprasad Shenai [Tue, 5 May 2015 09:29:55 +0000 (14:59 +0530)] 
cxgb4: Move SGE Ingress DMA state monitor code to a new routine

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Add device node to ULD info
Hariprasad Shenai [Tue, 5 May 2015 09:29:54 +0000 (14:59 +0530)] 
cxgb4: Add device node to ULD info

Adds device node to ULD info. Use the node info to alloc_ring() for ctrl
TX queues

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Pass in a Congestion Channel Map to t4_sge_alloc_rxq()
Hariprasad Shenai [Tue, 5 May 2015 09:29:53 +0000 (14:59 +0530)] 
cxgb4: Pass in a Congestion Channel Map to t4_sge_alloc_rxq()

Passes a Congestion Channel Map to t4_sge_alloc_rxq()
for the Ethernet RX Queues based on the MPS Buffer Group Map
of the TX Channel rather than just the TX Channel Map.
Also, in t4_sge_alloc_rxq() for T5, setting up the
Congestion Manager values of the new RX Ethernet Queue is
done by firmware now.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Enable congestion notification from SGE for IQs and FLs.
Hariprasad Shenai [Tue, 5 May 2015 09:29:52 +0000 (14:59 +0530)] 
cxgb4: Enable congestion notification from SGE for IQs and FLs.

Also changed the name of t4_hw.c:get_mps_bg_map() to t4_get_mps_bg_map()
and make it an exported routine with a definition in cxgb4.h.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago cxgb4: Make sure that Freelist size is larger than Egress Congestion Threshold
Hariprasad Shenai [Tue, 5 May 2015 09:29:51 +0000 (14:59 +0530)] 
cxgb4: Make sure that Freelist size is larger than Egress Congestion Threshold

We need to make sure that the Free List Size, in pointers, is at
least 2 Egress Queue Units (8 pointers/each) larger than the SGE's Egress
Congestion Threshold (in pointers).
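
The arithmetic above works out as follows. This is a sketch with
hypothetical helper names (fl_min_size(), fl_fixup_size()); the driver's
actual expression may differ in detail.

```c
#include <assert.h>
#include <limits.h>

#define PTRS_PER_EQ_UNIT 8u /* pointers per Egress Queue Unit, per the text */
#define EQ_UNITS_MARGIN  2u /* required headroom, in Egress Queue Units */

/* Smallest acceptable Free List size (in pointers) for a given threshold. */
static unsigned int fl_min_size(unsigned int cong_thres_ptrs)
{
	const unsigned int margin = EQ_UNITS_MARGIN * PTRS_PER_EQ_UNIT;

	assert(cong_thres_ptrs <= UINT_MAX - margin); /* no wrap-around */
	return cong_thres_ptrs + margin;
}

/* Raise a requested Free List size to the minimum if it is too small. */
static unsigned int fl_fixup_size(unsigned int requested, unsigned int cong_thres_ptrs)
{
	unsigned int min = fl_min_size(cong_thres_ptrs);

	return requested < min ? min : requested;
}
```

For example, with a threshold of 32 pointers the Free List needs at least
32 + 2 * 8 = 48 pointers.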

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago rhashtable-test: Fix 64bit division
Thomas Graf [Tue, 5 May 2015 00:27:02 +0000 (02:27 +0200)] 
rhashtable-test: Fix 64bit division

A 64bit division went in unnoticed. Use do_div() to accommodate
non 64bit architectures.

Reported-by: kbuild test robot
Fixes: 1aa661f5c3df ("rhashtable-test: Measure time to insert, remove & traverse entries")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago rhashtable: Simplify iterator code
Thomas Graf [Tue, 5 May 2015 00:22:53 +0000 (02:22 +0200)] 
rhashtable: Simplify iterator code

Remove useless obj variable and goto logic.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'ipvlan-mcast'
David S. Miller [Tue, 5 May 2015 23:29:50 +0000 (19:29 -0400)] 
Merge branch 'ipvlan-mcast'

Mahesh Bandewar says:

====================
Multicast processing in IPvlan

Dan Willems pointed out that autoconf in IPvlan is broken because of the
way the broadcast bit gets set. Since broadcast processing is a real performance
drain, the broadcast bit in the multicast filter was only set when the interface
was configured with an IPv4 address. In the autoconf scenario, when there are
no addresses configured, this logic did not work and it wouldn't allow
DHCPv4 to work. The only way was to add protocol-specific hacks to avoid
processing the unnecessary broadcast burden.

This jugglery could be avoided if these multicast / broadcast packets are taken
out of fast-path and are processed in a work-queue. This will enable us to add
broadcast bit in all multicast filters without any impact on performance of
the virtual device. This patch series does just that.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ipvlan: Always set broadcast bit in multicast filter
Mahesh Bandewar [Tue, 5 May 2015 00:06:11 +0000 (17:06 -0700)] 
ipvlan: Always set broadcast bit in multicast filter

Earlier tricks of setting the broadcast bit only when an IPv4 address is added
to the interface are not good enough, especially when autoconf comes into play.
Setting it always is a performance drag, but now that multicast /
broadcast is not processed in the fast path, enabling broadcast will let
autoconf work correctly without affecting performance characteristics of
the device.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ipvlan: Defer multicast / broadcast processing to a work-queue
Mahesh Bandewar [Tue, 5 May 2015 00:06:03 +0000 (17:06 -0700)] 
ipvlan: Defer multicast / broadcast processing to a work-queue

Processing multicast / broadcast in the fast path is performance draining,
and having more links means more cloning and bringing performance
down further.

Broadcast, in particular, needs to be given to all the virtual links.
Earlier tricks of enabling the broadcast bit for IPv4-only interfaces are not
really working since they break autoconf. This means enabling broadcast
for all the links if protocol-specific hacks are not to be added to
the driver.

This patch defers all (incoming as well as outgoing) multicast traffic to
a work-queue leaving only the unicast traffic in the fast-path. Now if we
need to apply any additional tricks to further reduce the impact of this
(multicast / broadcast) type of traffic, it can be implemented while
processing this work without affecting the fast-path.
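
The split between fast path and deferred work can be pictured with a small
model. Everything here is hypothetical (struct port_model, rx(),
process_backlog(), BACKLOG_MAX); the only hard fact used is that multicast
and broadcast destination MAC addresses have the least significant bit of
the first octet set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BACKLOG_MAX 8 /* hypothetical bound on deferred frames */

struct frame {
	uint8_t dst[6]; /* destination MAC address */
};

struct port_model {
	const struct frame *backlog[BACKLOG_MAX]; /* frames waiting for the worker */
	size_t backlog_len;
	size_t fast_path_handled; /* unicast frames handled inline */
	size_t deferred_handled;  /* multicast/broadcast frames handled by the worker */
	size_t dropped;           /* frames dropped because the backlog was full */
};

/* Group bit: set for multicast and for broadcast (ff:ff:ff:ff:ff:ff). */
static bool is_multicast(const uint8_t *dst)
{
	return (dst[0] & 0x01) != 0;
}

/* Receive hook: unicast stays in the fast path, the rest is queued. */
static void rx(struct port_model *p, const struct frame *f)
{
	assert(p != NULL && f != NULL);
	if (!is_multicast(f->dst)) {
		p->fast_path_handled++;
		return;
	}
	if (p->backlog_len == BACKLOG_MAX) {
		p->dropped++;
		return;
	}
	p->backlog[p->backlog_len++] = f;
}

/* Stand-in for the work item that later drains the queue. */
static void process_backlog(struct port_model *p)
{
	assert(p != NULL);
	for (size_t i = 0; i < p->backlog_len; i++)
		p->deferred_handled++; /* cloning to every virtual link would go here */
	p->backlog_len = 0;
}
```

Any extra filtering of multicast / broadcast traffic would go into the
worker, leaving the unicast path untouched.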

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'eth_proto_is_802_3'
David S. Miller [Tue, 5 May 2015 23:24:43 +0000 (19:24 -0400)] 
Merge branch 'eth_proto_is_802_3'

Alexander Duyck says:

====================
Add eth_proto_is_802_3 to provide improved means of checking Ethertype

This patch series implements and makes use of eth_proto_is_802_3().  The
idea behind the function is to provide an optimized means of testing to
determine if a given Ethertype value is a length or 802.3 protocol number.
The standard path for this was to use ntohs(proto) and then perform a
comparison.  This adds a slight cost as it usually requires either a 16b
rotate or byte swap which can cost 1 cycle or more depending on the
processor.

I had previously addressed this for eth_type_trans, however in doing so I had
overlooked checking with sparse and had introduced a couple sparse warnings.
The first patch in this series fixes those sparse warnings as well as does
some additional optimization for big endian systems.  In addition it pushes
the code out into a separate function which can then be used in the other
patches to reduce the instruction count/processing time in those functions
as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago vlan: Use eth_proto_is_802_3
Alexander Duyck [Mon, 4 May 2015 21:34:10 +0000 (14:34 -0700)] 
vlan: Use eth_proto_is_802_3

Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago openvswitch: Use eth_proto_is_802_3
Alexander Duyck [Mon, 4 May 2015 21:34:05 +0000 (14:34 -0700)] 
openvswitch: Use eth_proto_is_802_3

Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ipv4/ip_tunnel_core: Use eth_proto_is_802_3
Alexander Duyck [Mon, 4 May 2015 21:33:59 +0000 (14:33 -0700)] 
ipv4/ip_tunnel_core: Use eth_proto_is_802_3

Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ebtables: Use eth_proto_is_802_3
Alexander Duyck [Mon, 4 May 2015 21:33:54 +0000 (14:33 -0700)] 
ebtables: Use eth_proto_is_802_3

Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago etherdev: Fix sparse error, make test usable by other functions
Alexander Duyck [Mon, 4 May 2015 21:33:48 +0000 (14:33 -0700)] 
etherdev: Fix sparse error, make test usable by other functions

This change does two things.  First it fixes a sparse error for the fact
that the __be16 degrades to an integer.  Since that is actually what I am
kind of doing I am simply working around that by forcing both sides of the
comparison to u16.

Also I realized on some compilers I was generating another instruction for
big endian systems such as PowerPC since it was masking the value before
doing the comparison.  So to resolve that I have simply pulled the mask out
and wrapped it in an #ifndef __BIG_ENDIAN.

Lastly I pulled this all out into its own function.  I noticed there are
similar checks in a number of other places so this function can be reused
there to help reduce overhead in these paths as well.
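
A userspace approximation of the test described above may help. It is a
sketch, not the kernel function: the names is_802_3_ref(), is_802_3_fast()
and self_check() are made up here, the value 0x0600 is ETH_P_802_3_MIN, and
endianness is detected with the GCC/Clang predefined __BYTE_ORDER__ macro
instead of the kernel's __BIG_ENDIAN.

```c
#include <arpa/inet.h> /* htons, ntohs */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_802_3_MIN_SKETCH 0x0600 /* same value as ETH_P_802_3_MIN */

/* Reference form: byte swap first, then compare in host order. */
static bool is_802_3_ref(uint16_t proto_be)
{
	return ntohs(proto_be) >= ETH_P_802_3_MIN_SKETCH;
}

/*
 * Swap-free form: 0x0600 has a zero low byte, so only the most significant
 * byte of the network-order value decides the comparison.
 */
static bool is_802_3_fast(uint16_t proto_be)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	/* Keep only the bits that hold the network-order most significant byte. */
	proto_be &= htons(0xFF00);
#endif
	/* Compare both sides as raw 16-bit values. */
	return proto_be >= htons(ETH_P_802_3_MIN_SKETCH);
}

/* Exhaustively confirm that both forms agree for every 16-bit value. */
static void self_check(void)
{
	for (uint32_t v = 0; v <= 0xFFFF; v++) {
		uint16_t be = htons((uint16_t)v);

		assert(is_802_3_fast(be) == is_802_3_ref(be));
	}
}
```

On a little endian machine the masked value is simply the high byte of the
Ethertype, compared against 0x06; on a big endian machine the mask is not
needed and the comparison is against 0x0600 directly.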

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago bridge: change BR_GROUPFWD_RESTRICTED to allow forwarding of LLDP frames
Bernhard Thaler [Mon, 4 May 2015 20:47:13 +0000 (22:47 +0200)] 
bridge: change BR_GROUPFWD_RESTRICTED to allow forwarding of LLDP frames

BR_GROUPFWD_RESTRICTED bitmask restricts users from setting values to
/sys/class/net/brX/bridge/group_fwd_mask that allow forwarding of
some IEEE 802.1D Table 7-10 Reserved addresses:

(MAC Control) 802.3 01-80-C2-00-00-01
(Link Aggregation) 802.3 01-80-C2-00-00-02
802.1AB LLDP 01-80-C2-00-00-0E

Change BR_GROUPFWD_RESTRICTED to allow forwarding of LLDP frames and document
group_fwd_mask.

e.g.
   echo 16384 > /sys/class/net/brX/bridge/group_fwd_mask
allows LLDP frames to be forwarded.

This may be needed for bridge setups used for network troubleshooting or
any other scenario where forwarding of LLDP frames is desired (e.g. bridge
connecting a virtual machine to real switch transmitting LLDP frames that
virtual machine needs to receive).

Tested on a simple bridge setup with two interfaces and host transmitting
LLDP frames on one side of this bridge (used lldpd). Setting group_fwd_mask
as described above lets LLDP frames traverse the bridge.
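
The value 16384 in the example is 1 << 14, where 14 (0x0E) is the last octet
of the LLDP address 01-80-C2-00-00-0E. Assuming, as that example suggests,
that bit N of group_fwd_mask corresponds to address 01-80-C2-00-00-0N, the
mask for any of the reserved addresses can be computed like this
(group_fwd_bit() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask bit for the reserved address 01-80-C2-00-00-0N, where N is the
 * last octet (0x00..0x0F). The bit-per-address mapping is inferred from
 * the "echo 16384" example above.
 */
static uint16_t group_fwd_bit(uint8_t last_octet)
{
	assert(last_octet <= 0x0F);
	return (uint16_t)(1u << last_octet);
}
```

group_fwd_bit(0x0E) yields the 16384 used for LLDP; the MAC Control and Link
Aggregation addresses listed above map to bits 1 and 2.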

Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago tcp: provide SYN headers for passive connections
Eric Dumazet [Mon, 4 May 2015 04:34:46 +0000 (21:34 -0700)] 
tcp: provide SYN headers for passive connections

This patch allows a server application to get the TCP SYN headers for
its passive connections.  This is useful if the server is doing
fingerprinting of clients based on SYN packet contents.

Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.

The first is used on a socket to enable saving the SYN headers
for child connections. This can be set before or after the listen()
call.

The latter is used to retrieve the SYN headers for passive connections,
if the parent listener has enabled TCP_SAVE_SYN.

TCP_SAVED_SYN is read once; reading it frees the saved SYN headers.

The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
headers.

Original patch was written by Tom Herbert, I changed it to not hold
a full skb (and associated dst and conntracking reference).

We have used such a patch for about 3 years at Google.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago mac80211: remove useless skb->encapsulation check
Johannes Berg [Tue, 5 May 2015 13:25:33 +0000 (15:25 +0200)] 
mac80211: remove useless skb->encapsulation check

No current (and planned, as far as I know) wifi devices support
encapsulation checksum offload, so remove the useless test here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: make LED triggering depend on activation
Johannes Berg [Thu, 23 Apr 2015 10:19:22 +0000 (12:19 +0200)] 
mac80211: make LED triggering depend on activation

When LED triggers are compiled in, but not used, mac80211 will still
call them to update the status. This isn't really a problem for the
assoc and radio ones, but the TX/RX (and to a certain extent TPT)
ones can be called very frequently (for every packet.)

In order to avoid that when they're not used, track their activation
and call the corresponding trigger (and in the TPT case, account for
throughput) only when the trigger is actually used by an LED.
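
The activation tracking can be pictured as a counter that the per-packet
path checks before doing any work. The model below is hypothetical
(struct led_model and its helpers are not mac80211 code) and uses C11
atomics in place of the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>

struct led_model {
	atomic_int    tx_active;     /* number of LEDs currently using the TX trigger */
	unsigned long trigger_calls; /* how often the trigger was really fired */
};

/* Called when an LED starts using the trigger. */
static void tx_led_activate(struct led_model *l)
{
	atomic_fetch_add(&l->tx_active, 1);
}

/* Called when an LED stops using the trigger. */
static void tx_led_deactivate(struct led_model *l)
{
	int prev = atomic_fetch_sub(&l->tx_active, 1);

	assert(prev > 0); /* must pair with an earlier activate */
}

/* Per-packet hook: cheap early exit when no LED uses the trigger. */
static inline void led_tx(struct led_model *l)
{
	if (atomic_load(&l->tx_active) == 0)
		return;
	l->trigger_calls++; /* stands in for firing the LED trigger */
}
```

With no LED bound to the trigger, the per-packet cost reduces to one load
and one branch.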

Additionally, make those trigger functions inlines since they're only
used once in the remaining code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: make LED trigger names const
Johannes Berg [Thu, 23 Apr 2015 10:09:01 +0000 (12:09 +0200)] 
mac80211: make LED trigger names const

This is just a code cleanup, make the LED trigger names const
as they're not expected to be modified by drivers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: clean up station debugfs
Johannes Berg [Wed, 22 Apr 2015 19:07:39 +0000 (21:07 +0200)] 
mac80211: clean up station debugfs

Remove items that can be retrieved through nl80211. This also
removes two items (tx_packets and tx_bytes) where only the VO
counter was exposed since they are split up per AC but in the
debugfs file only the first AC was shown.

Also remove the useless "dev" file - the stations have long
been in a sub-directory of the netdev so there's no need for
that any more.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years ago mac80211: remove sta->tx_fragments counter
Johannes Berg [Wed, 22 Apr 2015 18:55:55 +0000 (20:55 +0200)] 
mac80211: remove sta->tx_fragments counter

This counter is unsafe with concurrent TX and is only exposed
through debugfs and ethtool. Instead of trying to fix it, just
remove it for now. If it's really needed, it should be
exposed through nl80211 and in a way that drivers that do the
fragmentation in the device could support it as well.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years agomac80211: move dot11 counters under MAC80211_DEBUG_COUNTERS
Johannes Berg [Wed, 22 Apr 2015 18:47:28 +0000 (20:47 +0200)] 
mac80211: move dot11 counters under MAC80211_DEBUG_COUNTERS

Since these counters can only be read through debugfs, there's
very little point in maintaining them all the time. However,
even just making them depend on debugfs is pointless - they're
not normally used. Additionally a number of them aren't even
concurrency safe.

Move them under MAC80211_DEBUG_COUNTERS so they're normally
not even compiled in.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years agomac80211: clean up global debugfs statistics
Johannes Berg [Wed, 22 Apr 2015 18:25:20 +0000 (20:25 +0200)] 
mac80211: clean up global debugfs statistics

The debugfs statistics macros are pointlessly verbose, so change
that macro to just have a single argument. While at it, remove
the unused counters and rename rx_expand_skb_head2 to the better
rx_expand_skb_head_defrag.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
9 years agonet: fix two sparse warnings introduced by IGMP/MLD parsing exports
Linus Lüssing [Mon, 4 May 2015 22:19:35 +0000 (00:19 +0200)] 
net: fix two sparse warnings introduced by IGMP/MLD parsing exports

> net/core/skbuff.c:4108:13: sparse: incorrect type in assignment (different base types)
> net/ipv6/mcast_snoop.c:63 ipv6_mc_check_exthdrs() warn: unsigned 'offset' is never less than zero.

Introduced by 9afd85c9e4552b276e2f4cfefd622bdeeffbbf26
("net: Export IGMP/MLD message validation code")

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 4 May 2015 19:37:08 +0000 (15:37 -0400)] 
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-05-04

This series contains updates to igb, e100, e1000e and ixgbe.

Todd cleans up igb_enable_mas() since it should only be called for the
82575 silicon and has no clear return, so modify the function to void.

Jean Sacren found upon inspection that 'err' did not need to be
initialized, since it is immediately overwritten.

Alex Duyck provides two patches for e1000e. The first cleans up the
handling of VLAN_HLEN as a part of max frame size.  Fixes the issue:
c751a3d58cf2d ("e1000e: Correctly include VLAN_HLEN when changing
interface MTU").  The second fixes an issue where the driver was not
allowing jumbo frames to be enabled when CRC stripping was disabled,
however it was allowing CRC stripping to be disabled while jumbo frames
were enabled.

Jeff (me) fixes a warning found on PPC where the use of do_div() needed
to use u64 arg and not s64.

Mark provides three ixgbe patches, first to fix the Intel On-chip System
Fabric (IOSF) Sideband message interfaces, to serialize access using both
PHY bits in the SWFW_SEMAPHORE register.  Then fixes how semaphore bits
were released, since they should be released in reverse of the order that
they were taken.  Lastly updates ixgbe to use a signed type to hold
error codes, since error codes are negative, so consistently use signed
types when handling them.

v2: dropped the previous #6-#8 patches by Hiroshi Shimanoto based on
    feedback from Or Gerlitz (and David Miller) that it appears there
    needs to be further discussion on how this gets implemented.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'tipc-topology-cleanup'
David S. Miller [Mon, 4 May 2015 19:04:02 +0000 (15:04 -0400)] 
Merge branch 'tipc-topology-cleanup'

Ying Xue says:

====================
tipc: cleanup topology server

Not only are the function names declared in subscr.c very confusing,
but the topology server's locking policy is also poorly designed; for
instance, it can lead to a panic in some special corner cases.

In this series, we attempt to eliminate the confusion of function names
and simplify topology server's locking policy to solve above mentioned
issues. More importantly, the change will make relevant code easily
understandable and maintainable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: deal with return value of tipc_conn_new callback
Ying Xue [Mon, 4 May 2015 02:36:48 +0000 (10:36 +0800)] 
tipc: deal with return value of tipc_conn_new callback

Once tipc_conn_new() returns NULL, the connection should be shut
down immediately, otherwise, oops may happen due to the NULL pointer.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: adjust locking policy of subscription
Ying Xue [Mon, 4 May 2015 02:36:47 +0000 (10:36 +0800)] 
tipc: adjust locking policy of subscription

Currently subscriber's lock protects not only subscriber's subscription
list but also all subscriptions linked into the list. However, as all
members of subscription are never changed after they are initialized,
it's unnecessary for subscription to be protected under subscriber's
lock. If the lock is used to only protect subscriber's subscription
list, the adjustment not only makes the locking policy simpler, but
also helps to avoid a deadlock which may happen if creating a
subscription fails.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: involve reference counter for subscriber
Ying Xue [Mon, 4 May 2015 02:36:46 +0000 (10:36 +0800)] 
tipc: involve reference counter for subscriber

At present subscriber's lock is used to protect the subscription list
of subscriber as well as subscriptions linked into the list. While one
or all subscriptions are deleted through iterating the list, the
subscriber's lock must be held. Meanwhile, as deletion of subscription
may happen in subscription timer's handler, the lock must be grabbed
in the function as well. When subscription's timer is terminated with
del_timer_sync() during above iteration, subscriber's lock has to be
temporarily released, otherwise, deadlock may occur. However, the
temporary release may cause the double free of a subscription as the
subscription is not disconnected from the subscription list.

Now if a reference counter is introduced to subscriber, subscription's
timer can be asynchronously stopped with del_timer(). As a result, the
issue is fixed, and the relevant code also becomes more readable and
understandable.
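
The reference-counting idea can be sketched as below. This is a
simplified user-space illustration under stated assumptions: the names
subscriber_sketch, sub_create(), sub_get() and sub_put() are
hypothetical, the counter is a plain int rather than an atomic kernel
refcount, and freed_flag exists only so the example can observe the
release.

```c
#include <stdlib.h>

/* Hypothetical subscriber object kept alive by a reference count. */
struct subscriber_sketch {
    int  refcnt;
    int *freed_flag; /* set to 1 when the object is released */
};

/* The creator starts out holding one reference. */
static struct subscriber_sketch *sub_create(int *freed_flag)
{
    struct subscriber_sketch *s = malloc(sizeof(*s));

    if (!s)
        return NULL;
    s->refcnt = 1;
    s->freed_flag = freed_flag;
    return s;
}

/* Taken by anyone who may touch the subscriber later, e.g. a timer. */
static void sub_get(struct subscriber_sketch *s)
{
    s->refcnt++;
}

/* Whoever drops the last reference frees the object. */
static void sub_put(struct subscriber_sketch *s)
{
    if (--s->refcnt == 0) {
        *s->freed_flag = 1;
        free(s);
    }
}
```

Because a pending timer holds its own reference, the owner no longer
has to wait for the handler to finish before dropping its reference.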

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: introduce tipc_subscrb_create routine
Ying Xue [Mon, 4 May 2015 02:36:45 +0000 (10:36 +0800)] 
tipc: introduce tipc_subscrb_create routine

Introducing a new function makes the purpose of the
tipc_subscrb_connect_cb callback routine clearer.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: rename functions defined in subscr.c
Ying Xue [Mon, 4 May 2015 02:36:44 +0000 (10:36 +0800)] 
tipc: rename functions defined in subscr.c

When a topology server accepts a connection request from its client,
it allocates a connection instance and a tipc_subscriber structure
object. The former is used to communicate with client, and the latter
is often treated as a subscriber which manages all subscription events
requested from a same client. When a topology server receives a request
of subscribing name services from a client through the connection, it
creates a tipc_subscription structure instance which is seen as a
subscription recording what name services are subscribed. In order to
manage all subscriptions from a same client, topology server links
them into the subscrp_list of the subscriber. So subscriber and
subscription have completely different meanings, but the function
names associated with them are so confusing that we are unable to
easily tell which function operates on a subscriber and which on a
subscription. So we want to eliminate the confusion by renaming them.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'igmp_mld_export'
David S. Miller [Mon, 4 May 2015 18:49:23 +0000 (14:49 -0400)] 
Merge branch 'igmp_mld_export'

Linus Lüssing says:

====================
Exporting IGMP/MLD checking from bridge code

The multicast optimizations in batman-adv are so far only usable and
enabled in non-bridged scenarios. To be able to support bridged setups
batman-adv needs to be able to detect IGMP/MLD queriers and reports on
mesh nodes without bridges, too. See the following link for details:

http://www.open-mesh.org/projects/batman-adv/wiki/Multicast-optimizations-listener-reports

To avoid duplicate code between the bridge and batman-adv, the IGMP/MLD
message validation code is moved from the bridge to the IPv4/IPv6 stack.

On the way, some refactoring to increase readability and to iron out
some subtle differences between the IGMP and MLD parsing code is done.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: Export IGMP/MLD message validation code
Linus Lüssing [Sat, 2 May 2015 12:01:07 +0000 (14:01 +0200)] 
net: Export IGMP/MLD message validation code

With this patch, the IGMP and MLD message validation functions are moved
from the bridge code to IPv4/IPv6 multicast files. Some small
refactoring was done to enhance readability and to iron out some
differences in behaviour between the IGMP and MLD parsing code (e.g. the
skb-cloning of MLD messages is now only done if necessary, just like the
IGMP part always did).

Finally, these IGMP and MLD message validation functions are exported so
that not only the bridge can use it but batman-adv later, too.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobridge: multicast: call skb_checksum_{simple_, }validate
Linus Lüssing [Sat, 2 May 2015 12:01:06 +0000 (14:01 +0200)] 
bridge: multicast: call skb_checksum_{simple_, }validate

Let's use these new, neat helpers.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotc: remove unused redirect ttl
Jamal Hadi Salim [Sat, 2 May 2015 05:19:43 +0000 (22:19 -0700)] 
tc: remove unused redirect ttl

improves ingress+u32 performance from 22.4 Mpps to 22.9 Mpps

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoixgbe: Use a signed type to hold error codes
Mark Rustad [Fri, 10 Apr 2015 17:36:36 +0000 (10:36 -0700)] 
ixgbe: Use a signed type to hold error codes

Because error codes are negative, it only makes sense to
consistently use signed types when handling them. Also remove
some explicit comparisons with 0 on these variables.
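
A small sketch of why the type matters, assuming a hypothetical
read_phy() helper and error value ERR_PHY (neither is taken from
ixgbe): once a negative error code is stored in an unsigned variable,
its sign is lost and it turns into a large positive number.

```c
#include <stdint.h>

#define ERR_PHY (-3) /* hypothetical negative error code */

/* Hypothetical operation that fails with a negative error code. */
static int32_t read_phy(int fail)
{
    return fail ? ERR_PHY : 0;
}

/* Unsigned holder: the error code wraps to a large positive value. */
static int64_t status_via_unsigned(int fail)
{
    uint32_t status = (uint32_t)read_phy(fail);

    return status;
}

/* Signed holder: the error code keeps its sign. */
static int64_t status_via_signed(int fail)
{
    int32_t status = read_phy(fail);

    return status;
}
```

With the unsigned holder, a check such as "status < 0" can never be
true, so the failure goes unnoticed.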

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoixgbe: Release semaphore bits in the right order
Mark Rustad [Fri, 10 Apr 2015 17:36:31 +0000 (10:36 -0700)] 
ixgbe: Release semaphore bits in the right order

The global semaphore bits should be released in the reverse of the
order that they were taken, so correct that.
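
The ordering rule can be illustrated with a toy model. The bit values,
struct sem_log and the sem_take()/sem_release() helpers below are
assumptions made for this sketch and do not reflect the real ixgbe
register layout; the log only records the order of operations.

```c
#include <stdint.h>

#define SEM_PHY0 0x1u /* hypothetical semaphore bits */
#define SEM_PHY1 0x2u

/* Records which bits are held and the order of take/release calls. */
struct sem_log {
    uint32_t held;
    int order[4]; /* +bit for take, -bit for release */
    int n;
};

static void sem_take(struct sem_log *l, uint32_t bit)
{
    l->held |= bit;
    l->order[l->n++] = (int)bit;
}

static void sem_release(struct sem_log *l, uint32_t bit)
{
    l->held &= ~bit;
    l->order[l->n++] = -(int)bit;
}

static void locked_access(struct sem_log *l)
{
    sem_take(l, SEM_PHY0);
    sem_take(l, SEM_PHY1);
    /* ... protected register access would go here ... */
    sem_release(l, SEM_PHY1); /* last taken, first released */
    sem_release(l, SEM_PHY0);
}
```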

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoixgbe: Fix IOSF SB access issues
Mark Rustad [Fri, 10 Apr 2015 17:36:26 +0000 (10:36 -0700)] 
ixgbe: Fix IOSF SB access issues

IOSF is the Intel On-chip System Fabric used in SOCs. IOSF SB is
the IOSF SideBand message interface. This patch serializes IOSF SB
access using both phy bits in the SWFW_SEMAPHORE register. It also
adds a helper function to wait for IOSF SB accesses to complete.
Use the new function to perform this wait before each access, as
specified in the datasheet, in addition to using it to wait for
IOSF SB read/write completion.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoe1000e: fix call to do_div() to use u64 arg
Jeff Kirsher [Sat, 2 May 2015 08:20:04 +0000 (01:20 -0700)] 
e1000e: fix call to do_div() to use u64 arg

We were using s64 for lat_ns (latency nano-second value) since in
our calculations a negative value could be a resultant.  For negative
values, we then assign lat_ns to be zero, so the value passed to
do_div() was never negative, but do_div() expects the argument type
to be u64, so do a cast to resolve a compile warning seen on
PowerPC.
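
The clamp-then-cast pattern described above looks roughly like this in
user-space C. The helper name latency_units() and its parameters are
hypothetical, and a plain division stands in for do_div(), which is a
kernel macro and not available here.

```c
#include <stdint.h>

/* Hypothetical helper: scale a latency that may have gone negative. */
static uint64_t latency_units(int64_t lat_ns, uint32_t unit_ns)
{
    uint64_t n;

    if (lat_ns < 0)
        lat_ns = 0;       /* negative results are treated as zero */
    n = (uint64_t)lat_ns; /* safe: the value is known to be >= 0 here */
    return n / unit_ns;   /* kernel code would use do_div(n, unit_ns);
                           * unit_ns must be nonzero */
}
```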

CC: Yanjiang Jin <yanjiang.jin@windriver.com>
CC: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Reported-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
9 years agoe1000e: Do not allow CRC stripping to be disabled on 82579 w/ jumbo frames
Alexander Duyck [Sat, 2 May 2015 08:09:59 +0000 (01:09 -0700)] 
e1000e: Do not allow CRC stripping to be disabled on 82579 w/ jumbo frames

The driver wasn't allowing jumbo frames to be enabled when CRC stripping
was disabled, however it was allowing CRC stripping to be disabled while
jumbo frames were enabled.  This fixes that by making it so that the
NETIF_F_RXFCS flag cannot be set when jumbo frames are enabled on 82579
and newer parts.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoe1000e: Cleanup handling of VLAN_HLEN as a part of max frame size
Alexander Duyck [Sat, 2 May 2015 07:52:00 +0000 (00:52 -0700)] 
e1000e: Cleanup handling of VLAN_HLEN as a part of max frame size

When the VLAN_HLEN was added to the calculation for the maximum frame size
there seems to have been a number of issues added to the driver.

The first issue is that in some cases the maximum frame size for a device
never really reached the actual maximum frame size as the VLAN header
length was not included in the calculation for that value.  As a result some
parts only supported a maximum frame size of either 1496 in the case of
parts that didn't support jumbo frames, and 8996 in the case of the parts
that do.

The second issue is the fact that there were several checks that weren't
updated so as a result setting an MTU of 1500 was treated as enabling jumbo
frames as the calculated value was 1522 instead of 1518.  I have addressed
those by replacing ETH_FRAME_LEN with VLAN_ETH_FRAME_LEN where appropriate.

The final issue was the fact that lowering the MTU below 1500 would cause
the driver to allocate 2K buffers for the rings.  This is an old issue that
was fixed several years ago in igb/ixgbe and I am addressing now by just
replacing == with a <= so that we always just round up to 1522 for anything
that isn't a jumbo frame.
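
The arithmetic behind the second issue can be checked with a short
sketch. The constants carry the values the kernel headers define for
them; max_frame(), is_jumbo_old() and is_jumbo_new() are hypothetical
helpers written for this illustration and are not the driver's code.

```c
/* Values as defined in the kernel's if_ether.h and if_vlan.h. */
#define ETH_FCS_LEN           4
#define ETH_FRAME_LEN      1514
#define VLAN_ETH_HLEN        18
#define VLAN_ETH_FRAME_LEN 1518

/* Frame size for a given MTU, including VLAN header and FCS. */
static int max_frame(int mtu)
{
    return mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
}

/* Old check: VLAN-inclusive size compared against a non-VLAN limit. */
static int is_jumbo_old(int mtu)
{
    return max_frame(mtu) > ETH_FRAME_LEN + ETH_FCS_LEN;
}

/* Fixed check: both sides account for the VLAN header. */
static int is_jumbo_new(int mtu)
{
    return max_frame(mtu) > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
}
```

For an MTU of 1500 the frame size is 1522, which exceeds the old limit
of 1518 but not the corrected limit of 1522.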

Fixes: c751a3d58cf2d ("e1000e: Correctly include VLAN_HLEN when changing interface MTU")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoe100: don't initialize int object to zero
Jean Sacren [Sat, 2 May 2015 07:49:26 +0000 (00:49 -0700)] 
e100: don't initialize int object to zero

'err' will be overwritten so no need to initialize it to zero.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoigb: simplify and clean up igb_enable_mas()
Todd Fujinaka [Sat, 2 May 2015 07:39:03 +0000 (00:39 -0700)] 
igb: simplify and clean up igb_enable_mas()

igb_enable_mas() should only be called for the 82575 and has no clear
return so changing it to void. Also simplify the odd conditional
expression.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoMerge branch 'via-rhine-rework'
David S. Miller [Mon, 4 May 2015 04:18:27 +0000 (00:18 -0400)] 
Merge branch 'via-rhine-rework'

Francois Romieu says:

====================
via-rhine rework

The series applies against davem-next as of
9dd3c797496affd699805c8a9d8429ad318c892f ("drivers: net: xgene: fix kbuild
warnings").

Patches #1..#4 avoid holes in the receive ring.

Patch #5 is a small leftover cleanup for #1..#4.

Patches #6 and #7 are fairly simple barrier stuff.

Patch #8 closes some SMP transmit races - not that anyone really
complained about these but it's a bit hard to handwave that they
can be safely ignored. Some testing, especially SMP testing of
course, would be welcome.

. Changes since #2:
  - added dma_rmb barrier in vlan related patch 6.
  - s/wmb/dma_wmb/ in (*new*) patch 7 of 8.
  - added explicit SMP barriers in (*new*) patch 8 of 8.

. Changes since #1:
  - turned wmb() into dma_wmb() as suggested by davem and Alexander Duyck
    in patch 1 of 6.
  - forgot to reset rx_head_desc in rhine_reset_rbufs in patch 4 of 6.
  - removed rx_head_desc altogether in (*new*) patch 5 of 6
  - removed some vlan receive ugliness in (*new*) patch 6 of 6.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: close SMP transmit races.
françois romieu [Fri, 1 May 2015 20:14:45 +0000 (22:14 +0200)] 
via-rhine: close SMP transmit races.

7ab87ff4c770eed71e3777936299292739fcd0fe ("via-rhine: move work from
irq handler to softirq and beyond") forgot to explicitly control the
lifespan of the tx_dirty and tx_cur pointers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: dma_wmb transmit barrier.
françois romieu [Fri, 1 May 2015 20:14:44 +0000 (22:14 +0200)] 
via-rhine: dma_wmb transmit barrier.

Follow the now usual transmit descriptor update path:
1. content change
2. dma_wmb
3. ownership change
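
The three steps above can be sketched in portable C11. This is only an
analogy under loud assumptions: struct tx_desc, DESC_OWN and tx_queue()
are hypothetical, and a C11 release fence stands in for dma_wmb(),
which is a kernel primitive for ordering writes seen by a device and
has no direct user-space equivalent.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN 0x80000000u /* hypothetical "owned by device" bit */

/* Hypothetical transmit descriptor shared with the device. */
struct tx_desc {
    uint32_t addr;
    uint32_t len;
    _Atomic uint32_t status;
};

static void tx_queue(struct tx_desc *d, uint32_t addr, uint32_t len)
{
    d->addr = addr;                            /* 1. content change */
    d->len  = len;
    atomic_thread_fence(memory_order_release); /* 2. stand-in for dma_wmb() */
    atomic_store_explicit(&d->status, DESC_OWN,
                          memory_order_relaxed); /* 3. ownership change */
}
```

The barrier keeps the ownership write from becoming visible before the
descriptor contents it refers to.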

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: add consistent memory barrier in vlan receive code.
françois romieu [Fri, 1 May 2015 20:14:43 +0000 (22:14 +0200)] 
via-rhine: add consistent memory barrier in vlan receive code.

The NAPI receive path depends on desc->rx_status but it does not
enforce any explicit receive barrier.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: kiss rx_head_desc goodbye.
françois romieu [Fri, 1 May 2015 20:14:42 +0000 (22:14 +0200)] 
via-rhine: kiss rx_head_desc goodbye.

The driver no longer produces holes in its receive ring so rx_head_desc
only duplicates cur_rx.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: forbid holes in the receive descriptor ring.
françois romieu [Fri, 1 May 2015 20:14:41 +0000 (22:14 +0200)] 
via-rhine: forbid holes in the receive descriptor ring.

Rationales:
- throttle work under memory pressure
- lower receive descriptor recycling latency for the network adapter
- lower the maintenance burden of uncommon paths

The patch is twofold:
- it fails early if the receive ring can't be completely initialized
  at dev->open() time
- it drops packets on the floor in the napi receive handler so as to
  keep the received ring full
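
Both halves of the patch can be sketched with a toy ring in user
space. Everything here is an assumption made for illustration: struct
rx_ring, the alloc_fn hook, ring_init(), ring_receive() and the sizes
are hypothetical and much simpler than the driver's DMA-mapped
buffers.

```c
#include <stdlib.h>

#define RX_RING_SIZE 4
#define RX_BUF_SIZE  64

struct rx_ring {
    void *buf[RX_RING_SIZE];
    unsigned cur;
    unsigned dropped;
};

/* Hypothetical allocator hook so that failure can be simulated. */
typedef void *(*alloc_fn)(size_t);

static void *alloc_fail(size_t n) { (void)n; return NULL; }

/* Fail early: returns 0 on success, -1 if any slot cannot be filled. */
static int ring_init(struct rx_ring *r, alloc_fn alloc)
{
    unsigned i;

    r->cur = 0;
    r->dropped = 0;
    for (i = 0; i < RX_RING_SIZE; i++) {
        r->buf[i] = alloc(RX_BUF_SIZE);
        if (!r->buf[i]) {
            while (i--)
                free(r->buf[i]);
            return -1;
        }
    }
    return 0;
}

/* Returns the received buffer, or NULL if the packet was dropped. */
static void *ring_receive(struct rx_ring *r, alloc_fn alloc)
{
    unsigned slot = r->cur;
    void *fresh = alloc(RX_BUF_SIZE);
    void *pkt;

    r->cur = (r->cur + 1) % RX_RING_SIZE;
    if (!fresh) {
        r->dropped++; /* reuse the old buffer, drop the packet */
        return NULL;
    }
    pkt = r->buf[slot];
    r->buf[slot] = fresh; /* slot is refilled before the packet leaves */
    return pkt;
}
```

Since a slot only gives up its buffer once a replacement exists, the
ring never contains an empty entry.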

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: gotoize rhine_open error path.
françois romieu [Fri, 1 May 2015 20:14:40 +0000 (22:14 +0200)] 
via-rhine: gotoize rhine_open error path.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: allocate and map receive buffer in a single transaction
françois romieu [Fri, 1 May 2015 20:14:39 +0000 (22:14 +0200)] 
via-rhine: allocate and map receive buffer in a single transaction

It's used to initialize the receive ring but it will actually shine when
the receive poll code is reworked.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovia-rhine: commit receive buffer address before descriptor status update.
françois romieu [Fri, 1 May 2015 20:14:38 +0000 (22:14 +0200)] 
via-rhine: commit receive buffer address before descriptor status update.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'flow_keys_digest'
David S. Miller [Mon, 4 May 2015 04:09:09 +0000 (00:09 -0400)] 
Merge branch 'flow_keys_digest'

Tom Herbert says:

====================
net: Eliminate calls to flow_dissector and introduce flow_keys_digest

In this patch set we add skb_get_hash_perturb which gets the skbuff
hash for a packet and perturbs it using a provided key and jhash1.
This function is used in several qdiscs and eliminates many calls
to flow_dissector and jhash3 to get a perturbed hash for a packet.

To handle the sch_choke issue (passes flow_keys in skbuff cb) we
add flow_keys_digest which is a digest of a flow constructed
from a flow_keys structure.

This is the second version of these patches I posted a while ago,
and is prerequisite work to increasing the size of the flow_keys
structure and hashing over it (full IPv6 address, flow label, VLAN ID,
etc.).

Version 2:

- Add keyval parameter to __flow_hash_from_keys which allows caller to
  set the initval for jhash
- Perturb always does flow dissection and creates hash based on
  input perturb value which acts as the keyval to __flow_hash_from_keys
- Added a _flow_keys_digest_data which is used in make_flow_keys_digest.
  This fills out the digest by populating individual fields instead
  of copying the whole structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosch_choke: Use flow_keys_digest
Tom Herbert [Fri, 1 May 2015 18:30:18 +0000 (11:30 -0700)] 
sch_choke: Use flow_keys_digest

Call make_flow_keys_digest to get a digest from flow keys and
use that to pass skbuff cb and for comparing flows.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: Add flow_keys digest
Tom Herbert [Fri, 1 May 2015 18:30:17 +0000 (11:30 -0700)] 
net: Add flow_keys digest

Some users of flow keys (well just sch_choke now) need to pass
flow_keys in skbuff cb, and use them for exact comparisons of flows
so that skb->hash is not sufficient. In order to increase size of
the flow_keys structure, we introduce another structure for
the purpose of passing flow keys in skbuff cb. We limit this structure
to sixteen bytes, and we will technically treat this as a digest of
flow_keys struct hence its name flow_keys_digest. In the first
incarnation we just copy the flow_keys structure up to 16 bytes--
this is the same information previously passed in the cb. In the
future, we'll adapt this for larger flow_keys and could use something
like SHA-1 over the whole flow_keys to improve the quality of the
digest.
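
A fixed-size digest of this kind might look like the sketch below. It
is an illustration only: struct flow_keys_sketch, struct digest_sketch,
make_digest() and same_flow() are hypothetical, and the sketch fills
the digest field by field rather than copying the leading bytes of the
structure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flow keys; the real structure may outgrow 16 bytes. */
struct flow_keys_sketch {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint16_t thoff;
    uint8_t  ip_proto;
    uint8_t  extra[32]; /* fields that do not fit into the digest */
};

/* Fixed 16-byte digest, small enough for a control block. */
struct digest_sketch {
    uint8_t data[16];
};

static void make_digest(struct digest_sketch *d,
                        const struct flow_keys_sketch *k)
{
    memset(d, 0, sizeof(*d)); /* zero unused bytes so memcmp is reliable */
    memcpy(d->data + 0,  &k->src,   4);
    memcpy(d->data + 4,  &k->dst,   4);
    memcpy(d->data + 8,  &k->sport, 2);
    memcpy(d->data + 10, &k->dport, 2);
    memcpy(d->data + 12, &k->thoff, 2);
    d->data[14] = k->ip_proto;
}

/* Exact comparison of two flows through their digests. */
static int same_flow(const struct digest_sketch *a,
                     const struct digest_sketch *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```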

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosched: Call skb_get_hash_perturb in sch_sfq
Tom Herbert [Fri, 1 May 2015 18:30:16 +0000 (11:30 -0700)] 
sched: Call skb_get_hash_perturb in sch_sfq

Call skb_get_hash_perturb instead of doing skb_flow_dissect and then
jhash by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosched: Call skb_get_hash_perturb in sch_sfb
Tom Herbert [Fri, 1 May 2015 18:30:15 +0000 (11:30 -0700)] 
sched: Call skb_get_hash_perturb in sch_sfb

Call skb_get_hash_perturb instead of doing skb_flow_dissect and then
jhash by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosched: Call skb_get_hash_perturb in sch_hhf
Tom Herbert [Fri, 1 May 2015 18:30:14 +0000 (11:30 -0700)] 
sched: Call skb_get_hash_perturb in sch_hhf

Call skb_get_hash_perturb instead of doing skb_flow_dissect and then
jhash by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosched: Call skb_get_hash_perturb in sch_fq_codel
Tom Herbert [Fri, 1 May 2015 18:30:13 +0000 (11:30 -0700)] 
sched: Call skb_get_hash_perturb in sch_fq_codel

Call skb_get_hash_perturb instead of doing skb_flow_dissect and then
jhash by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: Add skb_get_hash_perturb
Tom Herbert [Fri, 1 May 2015 18:30:12 +0000 (11:30 -0700)] 
net: Add skb_get_hash_perturb

This calls flow_dissect and __skb_get_hash to procure a hash for a
packet. Input includes a key to initialize jhash. This function
does not set skb->hash.
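
The idea of a keyed ("perturbed") flow hash can be sketched as
follows. Note the assumptions: struct flow and flow_hash_perturb() are
hypothetical, and a simple FNV-1a style mixer is swapped in for the
kernel's jhash, so the values produced here are unrelated to what the
kernel computes.

```c
#include <stdint.h>

/* Hypothetical, reduced set of flow keys. */
struct flow {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* FNV-1a style byte mixer; a stand-in for jhash, not jhash itself. */
static uint32_t mix(uint32_t h, uint32_t v)
{
    int i;

    for (i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* The perturbation key seeds the hash; the result is not cached. */
static uint32_t flow_hash_perturb(const struct flow *f, uint32_t perturb)
{
    uint32_t h = 2166136261u ^ perturb;

    h = mix(h, f->src);
    h = mix(h, f->dst);
    h = mix(h, ((uint32_t)f->sport << 16) | f->dport);
    h = mix(h, f->proto);
    return h;
}
```

Changing the key changes the hash of the same flow, which is what lets
a qdisc periodically reshuffle its flow-to-bucket mapping.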

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: ipv4: route: Fix sending IGMP messages with link address
Andrew Lunn [Fri, 1 May 2015 14:39:54 +0000 (16:39 +0200)] 
net: ipv4: route: Fix sending IGMP messages with link address

In setups with a global scope address on an interface, and a lesser
scope address on an interface sending IGMP reports, the reports can be
sent using the other interface's global scope address rather than the
local interface address. RFC 2236 suggests:

     Ignore the Report if you cannot identify the source address of
     the packet as belonging to a subnet assigned to the interface on
     which the packet was received.

since such reports could be forged.

Look at the protocol when deciding if a RT_SCOPE_LINK address should
be used for the packet.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: sched: run ingress qdisc without locks
Alexei Starovoitov [Fri, 1 May 2015 03:14:07 +0000 (20:14 -0700)] 
net: sched: run ingress qdisc without locks

TC classifiers/actions were converted to RCU by John in the series:
http://thread.gmane.org/gmane.linux.network/329739/focus=329739
and many follow on patches.
This is the last patch from that series that finally drops
ingress spin_lock.

Single cpu ingress+u32 performance goes from 22.9 Mpps to 24.5 Mpps.

In two cpu case when both cores are receiving traffic on the same
device and go into the same ingress+u32 the performance jumps
from 4.5 + 4.5 Mpps to 23.5 + 23.5 Mpps

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'tcp_sack_rttm'
David S. Miller [Mon, 4 May 2015 03:18:02 +0000 (23:18 -0400)] 
Merge branch 'tcp_sack_rttm'

Kenneth Klette Jonassen says:

====================
tcp: SACK RTTM changes for congestion control

This patch series improves SACK RTT measurements for congestion control:
  o Picks the latest sequence SACKed for RTT, i.e. most accurate delay
    signal.
  o Calls the congestion control's pkts_acked hook with SACK RTTMs
    even when not sequentially ACKing new data.

V2: amend misleading comment
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: invoke pkts_acked hook on every ACK
Kenneth Klette Jonassen [Thu, 30 Apr 2015 23:10:59 +0000 (01:10 +0200)] 
tcp: invoke pkts_acked hook on every ACK

Invoking pkts_acked is currently conditioned on FLAG_ACKED:
receiving a cumulative ACK of new data, or ACK with SYN flag set.

Remove this condition so that CC may get RTT measurements from all SACKs.

Cc: Yuchung Cheng <ycheng@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: improve RTT from SACK for CC
Kenneth Klette Jonassen [Thu, 30 Apr 2015 23:10:58 +0000 (01:10 +0200)] 
tcp: improve RTT from SACK for CC

tcp_sacktag_one() always picks the earliest sequence SACKed for RTT.
This might not make sense for congestion control in cases where:

  1. ACKs are lost, i.e. a SACK following a lost SACK covers both
     new and old segments at the receiver.
  2. The receiver disregards the RFC 5681 recommendation to immediately
     ACK out-of-order segments.

Give congestion control a RTT for the latest segment SACKed, which is the
most accurate RTT estimate, but preserve the conservative RTT for RTO.
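
The two RTT samples can be illustrated with a small calculation. The
function sack_rtts(), struct rtt_pair and the microsecond timestamps
are hypothetical and only model the choice between the earliest and
the latest SACKed segment, not the kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

struct rtt_pair {
    int64_t rto_us; /* conservative sample for the retransmit timer */
    int64_t cc_us;  /* freshest sample for congestion control */
};

/* sent_us[]: send times of the segments newly SACKed by one ACK. */
static struct rtt_pair sack_rtts(const int64_t *sent_us, size_t n,
                                 int64_t now_us)
{
    struct rtt_pair r = { -1, -1 }; /* -1 means "no sample" */
    int64_t first, last;
    size_t i;

    if (n == 0)
        return r;
    first = last = sent_us[0];
    for (i = 1; i < n; i++) {
        if (sent_us[i] < first)
            first = sent_us[i];
        if (sent_us[i] > last)
            last = sent_us[i];
    }
    r.rto_us = now_us - first; /* earliest segment: largest RTT */
    r.cc_us  = now_us - last;  /* latest segment: most recent delay */
    return r;
}
```

When a single SACK covers both old and new segments, the two samples
diverge, and only the latest one reflects the current path delay.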

Removes the call to skb_mstamp_get() in tcp_sacktag_one().

Cc: Yuchung Cheng <ycheng@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: move struct tcp_sacktag_state to tcp_ack()
Kenneth Klette Jonassen [Thu, 30 Apr 2015 23:10:57 +0000 (01:10 +0200)] 
tcp: move struct tcp_sacktag_state to tcp_ack()

A later patch passes two values set in tcp_sacktag_one() to
tcp_clean_rtx_queue(). Prepare passing them via struct tcp_sacktag_state.

Acked-by: Yuchung Cheng <ycheng@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: David S. Miller <davem@davemloft.net>