aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2015-10-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGravatar David S. Miller 5-1/+63
Conflicts: net/ipv6/xfrm6_output.c net/openvswitch/flow_netlink.c net/openvswitch/vport-gre.c net/openvswitch/vport-vxlan.c net/openvswitch/vport.c net/openvswitch/vport.h The openvswitch conflicts were overlapping changes. One was the egress tunnel info fix in 'net' and the other was the vport ->send() op simplification in 'net-next'. The xfrm6_output.c conflicts was also a simplification overlapping a bug fix. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24Merge branch 'for-upstream' of ↵Gravatar David S. Miller 4-163/+110
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-10-22 Here's probably the last bluetooth-next pull request for 4.4. Among several other changes it contains the rest of the fixes & cleanups from the Bluetooth UnplugFest (that didn't need to be hurried to 4.3). - Refactoring & cleanups to 6lowpan code - New USB ids for two Atheros controllers and BCM43142A0 from Broadcom - Fix (quirk) for broken Broadcom BCM2045 controllers - Support for latest Apple controllers - Improvements to the vendor diagnostic message support Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23Merge branch 'master' of ↵Gravatar David S. Miller 3-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-10-23 This series contains updates to i40e, i40evf, if_link, ixgbe and ixgbevf. Anjali adds a workaround to drop any flow control frames from being transmitted from any VSI, so that a malicious VF cannot send flow control or PFC packets out on the wire. Also fixed a bug in debugfs by grabbing the filter list lock before adding or deleting a filter. Akeem fixes an issue where we were unconditionally returning VEB bridge mode before allowing LB in the add VSI routine, resolve by checking if the bridge is actually in VEB mode first. Mitch fixed an issue where the incorrect structure was being used for VLAN filter list, which meant the VLAN filter list did not get processed correctly and VLAN filters would not be re-enabled after any kind of reset. Helin fixed a problem of possibly getting inconsistent flow control status after a PF reset. The issue was requested_mode was being set with a default value during probe, but the hardware state could be a different value from this mode. Carolyn fixed a problem where the driver output of the OEM version string varied from the other tools. Jean Sacren fixes up kernel documentation by fixing function header comments to match actual variables used in the functions. Also cleaned up variable initialization, when the variable would be over-written immediately. Hiroshi Shimanoto provides three patches to add "trusted" VF by adding netlink directives and an NDO entry. Then implement these new controls in ixgbe and ixgbevf. This series has gone through several iterations to address all the suggested community changes and concerns. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23mpls: multipath route supportGravatar Roopa Prabhu 1-1/+1
This patch adds support for MPLS multipath routes. Includes following changes to support multipath: - splits struct mpls_route into 'struct mpls_route + struct mpls_nh' - 'struct mpls_nh' represents a mpls nexthop label forwarding entry - moves mpls route and nexthop structures into internal.h - A mpls_route can point to multiple mpls_nh structs - the nexthops are maintained as a array (similar to ipv4 fib) - In the process of restructuring, this patch also consistently changes all labels to u8 - Adds support to parse/fill RTA_MULTIPATH netlink attribute for multipath routes similar to ipv4/v6 fib - In this patch, the multipath route nexthop selection algorithm simply returns the first nexthop. It is replaced by a hash based algorithm from Robert Shearman in the next patch - mpls_route_update cleanup: remove 'dev' handling in mpls_route_update. mpls_route_update though implemented to update based on dev, it was never used that way. And the dev handling gets tricky with multiple nexthops. Cannot match against any single nexthops dev. So, this patch removes the unused 'dev' handling in mpls_route_update. - dead route/path handling will be implemented in a subsequent patch Example: $ip -f mpls route add 100 nexthop as 200 via inet 10.1.1.2 dev swp1 \ nexthop as 700 via inet 10.1.1.6 dev swp2 \ nexthop as 800 via inet 40.1.1.2 dev swp3 $ip -f mpls route show 100 nexthop as to 200 via inet 10.1.1.2 dev swp1 nexthop as to 700 via inet 10.1.1.6 dev swp2 nexthop as to 800 via inet 40.1.1.2 dev swp3 Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23net: phy: Add nested variants of mdiobus read/writeGravatar Neil Armstrong 1-0/+2
Since nested variants of mdiobus_read/write are used in multiple drivers, add nested variants in the mdiobus core. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23if_link: Add control trust VFGravatar Hiroshi Shimamoto 3-0/+10
Add netlink directives and ndo entry to trust VF user. This controls the special permission of VF user. The administrator will dedicatedly trust VF user to use some features which impacts security and/or performance. The administrator never turn it on unless VF user is fully trusted. CC: Sy Jong Choi <sy.jong.choi@intel.com> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-23tcp/dccp: fix hashdance race for passive sessionsGravatar Eric Dumazet 3-4/+11
Multiple cpus can process duplicates of incoming ACK messages matching a SYN_RECV request socket. This is a rare event under normal operations, but definitely can happen. Only one must win the race, otherwise corruption would occur. To fix this without adding new atomic ops, we use logic in inet_ehash_nolisten() to detect the request was present in the same ehash bucket where we try to insert the new child. If request socket was not found, we have to undo the child creation. This actually removes a spin_lock()/spin_unlock() pair in reqsk_queue_unlink() for the fast path. Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets") Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23overflow-arith: begin to add support for overflow builtin functionsGravatar Hannes Frederic Sowa 2-0/+22
The idea of the overflow-arith.h header is to collect overflow checking functions in one central place. If gcc compiler supports the __builtin_overflow_* builtins we use them because they might give better performance, otherwise the code falls back to normal overflow checking functions. The builtin_overflow functions are supported by gcc-5 and clang. The matter of supporting clang is to just provide a corresponding CC_HAVE_BUILTIN_OVERFLOW, because the specific overflow checking builtins don't differ between gcc and clang. I just provide overflow_usub function here as I intend this to get merged into net, more functions will definitely follow as they are needed. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22openvswitch: Fix egress tunnel info.Gravatar Pravin B Shelar 2-0/+39
While transitioning to netdev based vport we broke OVS feature which allows user to retrieve tunnel packet egress information for lwtunnel devices. Following patch fixes it by introducing ndo operation to get the tunnel egress info. Same ndo operation can be used for lwtunnel devices and compat ovs-tnl-vport devices. So after adding such device operation we can remove similar operation from ovs-vport. Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: dsa: remove port_fdb_getnextGravatar Vivien Didelot 1-3/+0
No driver implements port_fdb_getnext anymore, and port_fdb_dump is preferred anyway, so remove this function from DSA. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: dsa: add port_fdb_dump functionGravatar Vivien Didelot 1-0/+4
Not all switch chips support a Get Next operation to iterate on its FDB. So add a more simple port_fdb_dump function for them. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22Merge tag 'mac80211-next-for-davem-2015-10-21' of ↵Gravatar David S. Miller 4-28/+168
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Here's another set of patches for the current cycle: * I merged net-next back to avoid a conflict with the * cfg80211 scheduled scan API extensions * preparations for better scan result timestamping * regulatory cleanups * mac80211 statistics cleanups * a few other small cleanups and fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22bpf: introduce bpf_perf_event_output() helperGravatar Alexei Starovoitov 2-0/+12
This helper is used to send raw data from eBPF program into special PERF_TYPE_SOFTWARE/PERF_COUNT_SW_BPF_OUTPUT perf_event. User space needs to perf_event_open() it (either for one or all cpus) and store FD into perf_event_array (similar to bpf_perf_event_read() helper) before eBPF program can send data into it. Today the programs triggered by kprobe collect the data and either store it into the maps or print it via bpf_trace_printk() where latter is the debug facility and not suitable to stream the data. This new helper replaces such bpf_trace_printk() usage and allows programs to have dedicated channel into user space for post-processing of the raw data collected. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22tcp: fastopen: limit max_qlenGravatar Eric Dumazet 1-1/+2
Allowing an application to set whatever limit for the list of recently RST fastopen sessions [1] is not wise, as it open ways to deplete kernel memory. Cap the user provided limit by somaxconn sysctl, like listen() backlog. [1] https://tools.ietf.org/html/rfc7413#section-5.1 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21net: mdio-gpio: move platform data headerGravatar Vivien Didelot 1-0/+0
This header file only contains the platform data structure definition, so move it to the include/linux/platform_data/ directory. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21openvswitch: Clarify conntrack COMMIT behaviourGravatar Joe Stringer 1-1/+2
The presence of this attribute does not modify the ct_state for the current packet, only future packets. Make this more clear in the header definition. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21netlink: Rightsize IFLA_AF_SPEC size calculationGravatar Arad, Ronen 1-1/+2
if_nlmsg_size() overestimates the minimum allocation size of netlink dump request (when called from rtnl_calcit()) or the size of the message (when called from rtnl_getlink()). This is because ext_filter_mask is not supported by rtnl_link_get_af_size() and rtnl_link_get_size(). The over-estimation is significant when at least one netdev has many VLANs configured (8 bytes for each configured VLAN). This patch-set "rightsizes" the protocol specific attribute size calculation by propagating ext_filter_mask to rtnl_link_get_af_size() and adding this a argument to get_link_af_size op in rtnl_af_ops. Bridge module already used filtering aware sizing for notifications. br_get_link_af_size_filtered() is consistent with the modified get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops. br_get_link_af_size() becomes unused and thus removed. Signed-off-by: Ronen Arad <ronen.arad@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21Bluetooth: Remove unnecessary hci_explicit_connect_lookup functionGravatar Johan Hedberg 1-3/+0
There's only one user of this helper which can be replaces with a call to hci_pend_le_action_lookup() and a check for params->explicit_connect. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-21Bluetooth: Add hci_conn_hash_lookup_le() helper functionGravatar Johan Hedberg 1-0/+24
Many of the existing LE connection lookups are forced to use hci_conn_hash_lookup_ba() which doesn't take into account the address type. What's worse, most of the users don't bother checking that the returned address type matches what was wanted. This patch adds a new helper API to look up LE connections based on their address and address type, paving the way to have the hci_conn_hash_lookup_ba() users converted to do more precise lookups. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-21tcp: use RACK to detect lossesGravatar Yuchung Cheng 1-0/+9
This patch implements the second half of RACK that uses the the most recent transmit time among all delivered packets to detect losses. tcp_rack_mark_lost() is called upon receiving a dubious ACK. It then checks if an not-yet-sacked packet was sent at least "reo_wnd" prior to the sent time of the most recently delivered. If so the packet is deemed lost. The "reo_wnd" reordering window starts with 1msec for fast loss detection and changes to min-RTT/4 when reordering is observed. We found 1msec accommodates well on tiny degree of reordering (<3 pkts) on faster links. We use min-RTT instead of SRTT because reordering is more of a path property but SRTT can be inflated by self-inflicated congestion. The factor of 4 is borrowed from the delayed early retransmit and seems to work reasonably well. Since RACK is still experimental, it is now used as a supplemental loss detection on top of existing algorithms. It is only effective after the fast recovery starts or after the timeout occurs. The fast recovery is still triggered by FACK and/or dupack threshold instead of RACK. We introduce a new sysctl net.ipv4.tcp_recovery for future experiments of loss recoveries. For now RACK can be disabled by setting it to 0. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: track the packet timings in RACKGravatar Yuchung Cheng 2-0/+11
This patch is the first half of the RACK loss recovery. RACK loss recovery uses the notion of time instead of packet sequence (FACK) or counts (dupthresh). It's inspired by the previous FACK heuristic in tcp_mark_lost_retrans(): when a limited transmit (new data packet) is sacked, then current retransmitted sequence below the newly sacked sequence must been lost, since at least one round trip time has elapsed. But it has several limitations: 1) can't detect tail drops since it depends on limited transmit 2) is disabled upon reordering (assumes no reordering) 3) only enabled in fast recovery ut not timeout recovery RACK (Recently ACK) addresses these limitations with the notion of time instead: a packet P1 is lost if a later packet P2 is s/acked, as at least one round trip has passed. Since RACK cares about the time sequence instead of the data sequence of packets, it can detect tail drops when later retransmission is s/acked while FACK or dupthresh can't. For reordering RACK uses a dynamically adjusted reordering window ("reo_wnd") to reduce false positives on ever (small) degree of reordering. This patch implements tcp_advanced_rack() which tracks the most recent transmission time among the packets that have been delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp is the key to determine which packet has been lost. Consider an example that the sender sends six packets: T1: P1 (lost) T2: P2 T3: P3 T4: P4 T100: sack of P2. rack.mstamp = T2 T101: retransmit P1 T102: sack of P2,P3,P4. rack.mstamp = T4 T205: ACK of P4 since the hole is repaired. rack.mstamp = T101 We need to be careful about spurious retransmission because it may falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK to falsely mark all packets lost, just like a spurious timeout. We identify spurious retransmission by the ACK's TS echo value. If TS option is not applicable but the retransmission is acknowledged less than min-RTT ago, it is likely to be spurious. We refrain from using the transmission time of these spurious retransmissions. The second half is implemented in the next patch that marks packet lost using RACK timestamp. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: skb_mstamp_after helperGravatar Yuchung Cheng 1-0/+9
a helper to prepare the first main RACK patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: remove tcp_mark_lost_retrans()Gravatar Yuchung Cheng 1-2/+0
Remove the existing lost retransmit detection because RACK subsumes it completely. This also stops the overloading the ack_seq field of the skb control block. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: track min RTT using windowed min-filterGravatar Yuchung Cheng 2-0/+10
Kathleen Nichols' algorithm for tracking the minimum RTT of a data stream over some measurement window. It uses constant space and constant time per update. Yet it almost always delivers the same minimum as an implementation that has to keep all the data in the window. The measurement window is tunable via sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes. The algorithm keeps track of the best, 2nd best & 3rd best min values, maintaining an invariant that the measurement time of the n'th best >= n-1'th best. It also makes sure that the three values are widely separated in the time window since that bounds the worse case error when that data is monotonically increasing over the window. Upon getting a new min, we can forget everything earlier because it has no value - the new min is less than everything else in the window by definition and it's the most recent. So we restart fresh on every new min and overwrites the 2nd & 3rd choices. The same property holds for the 2nd & 3rd best. Therefore we have to maintain two invariants to maximize the information in the samples, one on values (1st.v <= 2nd.v <= 3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <= now). These invariants determine the structure of the code The RTT input to the windowed filter is the minimum RTT measured from ACK or SACK, or as the last resort from TCP timestamps. The accessor tcp_min_rtt() returns the minimum RTT seen in the window. ~0U indicates it is not available. The minimum is 1usec even if the true RTT is below that. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21Bluetooth: Fix missing hdev locking for LE scan cleanupGravatar Johan Hedberg 1-0/+1
The hci_conn objects don't have a dedicated lock themselves but rely on the caller to hold the hci_dev lock for most types of access. The hci_conn_timeout() function has so far sent certain HCI commands based on the hci_conn state which has been possible without holding the hci_dev lock. The recent changes to do LE scanning before connect attempts added even more operations to hci_conn and hci_dev from hci_conn_timeout, thereby exposing potential race conditions with the hci_dev and hci_conn states. As an example of such a race, here there's a timeout but an l2cap_sock_connect() call manages to race with the cleanup routine: [Oct21 08:14] l2cap_chan_timeout: chan ee4b12c0 state BT_CONNECT [ +0.000004] l2cap_chan_close: chan ee4b12c0 state BT_CONNECT [ +0.000002] l2cap_chan_del: chan ee4b12c0, conn f3141580, err 111, state BT_CONNECT [ +0.000002] l2cap_sock_teardown_cb: chan ee4b12c0 state BT_CONNECT [ +0.000005] l2cap_chan_put: chan ee4b12c0 orig refcnt 4 [ +0.000010] hci_conn_drop: hcon f53d56e0 orig refcnt 1 [ +0.000013] l2cap_chan_put: chan ee4b12c0 orig refcnt 3 [ +0.000063] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT [ +0.000049] hci_conn_params_del: addr ee:0d:30:09:53:1f (type 1) [ +0.000002] hci_chan_list_flush: hcon f53d56e0 [ +0.000001] hci_chan_del: hci0 hcon f53d56e0 chan f4e7ccc0 [ +0.004528] l2cap_sock_create: sock e708fc00 [ +0.000023] l2cap_chan_create: chan ee4b1770 [ +0.000001] l2cap_chan_hold: chan ee4b1770 orig refcnt 1 [ +0.000002] l2cap_sock_init: sk ee4b3390 [ +0.000029] l2cap_sock_bind: sk ee4b3390 [ +0.000010] l2cap_sock_setsockopt: sk ee4b3390 [ +0.000037] l2cap_sock_connect: sk ee4b3390 [ +0.000002] l2cap_chan_connect: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f (type 2) psm 0x00 [ +0.000002] hci_get_route: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f [ +0.000001] hci_dev_hold: hci0 orig refcnt 8 [ +0.000003] hci_conn_hold: hcon f53d56e0 orig refcnt 0 Above the l2cap_chan_connect() shouldn't have been able to reach the hci_conn f53d56e0 anymore but since hci_conn_timeout didn't do proper locking that's not the case. The end result is a reference to hci_conn that's not in the conn_hash list, resulting in list corruption when trying to remove it later: [Oct21 08:15] l2cap_chan_timeout: chan ee4b1770 state BT_CONNECT [ +0.000004] l2cap_chan_close: chan ee4b1770 state BT_CONNECT [ +0.000003] l2cap_chan_del: chan ee4b1770, conn f3141580, err 111, state BT_CONNECT [ +0.000001] l2cap_sock_teardown_cb: chan ee4b1770 state BT_CONNECT [ +0.000005] l2cap_chan_put: chan ee4b1770 orig refcnt 4 [ +0.000002] hci_conn_drop: hcon f53d56e0 orig refcnt 1 [ +0.000015] l2cap_chan_put: chan ee4b1770 orig refcnt 3 [ +0.000038] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT [ +0.000003] hci_chan_list_flush: hcon f53d56e0 [ +0.000002] hci_conn_hash_del: hci0 hcon f53d56e0 [ +0.000001] ------------[ cut here ]------------ [ +0.000461] WARNING: CPU: 0 PID: 1782 at lib/list_debug.c:56 __list_del_entry+0x3f/0x71() [ +0.000839] list_del corruption, f53d56e0->prev is LIST_POISON2 (00000200) The necessary fix is unfortunately more complicated than just adding hci_dev_lock/unlock calls to the hci_conn_timeout() call path. Particularly, the hci_conn_del() API, which expects the hci_dev lock to be held, performs a cancel_delayed_work_sync(&hcon->disc_work) which would lead to a deadlock if the hci_conn_timeout() call path tries to acquire the same lock. This patch solves the problem by deferring the cleanup work to a separate work callback. To protect against the hci_dev or hci_conn going away meanwhile temporary references are taken with the help of hci_dev_hold() and hci_conn_get(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 4.3
2015-10-21Bluetooth: Introduce driver specific post init callbackGravatar Marcel Holtmann 1-0/+1
Some drivers might have to restore certain settings after the init procedure has been completed. This driver callback allows them to hook into that stage. This callback is run just before the controller is declared as powered up. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-216lowpan: remove lowpan_is_addr_broadcastGravatar Alexander Aring 1-10/+0
This macro is used at 802.15.4 6LoWPAN only and can be replaced by memcmp with the interface broadcast address. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-216lowpan: move IPHC functionality definesGravatar Alexander Aring 1-123/+0
This patch removes the IPHC related defines for doing bit manipulation from global 6lowpan header to the iphc file which should the only one implementation which use these defines. Also move next header compression defines to their nhc implementation. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-216lowpan: remove lowpan_fetch_skb_u8Gravatar Alexander Aring 1-13/+14
This patch removes the lowpan_fetch_skb_u8 function for getting the iphc bytes. Instead we using the generic which has a len parameter to tell the amount of bytes to fetch. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-216lowpan: cleanup lowpan_header_decompressGravatar Alexander Aring 2-6/+28
This patch changes the lowpan_header_decompress function by removing inklayer related information from parameters. This is currently for supporting short and extended address for iphc handling in 802154. We don't support short address handling anyway right now, but there exists already code for handling short addresses in lowpan_header_decompress. The address parameters are also changed to a void pointer, so 6LoWPAN linklayer specific code can put complex structures as these parameters and cast it again inside the generic code by evaluating linklayer type before. The order is also changed by destination address at first and then source address, which is the same like all others functions where destination is always the first, memcpy, dev_hard_header, lowpan_header_compress, etc. This patch also moves the fetching of iphc values from 6LoWPAN linklayer specific code into the generic branch. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-216lowpan: cleanup lowpan_header_compressGravatar Alexander Aring 1-7/+23
This patch changes the lowpan_header_compress function by removing unused parameters like "len" and drop static value parameters of protocol type. Instead we really check the protocol type inside inside the skb structure. Also we drop the use of IEEE802154_ADDR_LEN which is link-layer specific. Instead we using EUI64_ADDR_LEN which should always the default case for now. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-216lowpan: introduce LOWPAN_IPHC_MAX_HC_BUF_LENGravatar Alexander Aring 1-0/+8
This patch introduces the LOWPAN_IPHC_MAX_HC_BUF_LEN define which represent the worst-case supported IPHC buffer length. It's used to allocate the stack buffer space for creating the IPHC header. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-21Bluetooth: Add support setup stage internal notification eventGravatar Marcel Holtmann 1-0/+1
Before the vendor specific setup stage is triggered call back into the core to trigger an internal notification event. That event is used to send an index update to the monitor interface. With that specific event it is possible to update userspace with manufacturer information before any HCI command has been executed. This is useful for early stage debugging of vendor specific initialization sequences. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-21Bluetooth: Add new quirk for non-persistent diagnostic settingsGravatar Marcel Holtmann 1-0/+9
If the diagnostic settings are not persistent over HCI Reset, then this quirk can be used to tell the Bluetoth core about it. This will ensure that the settings are programmed correctly when the controller is powered up. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-21Bluetooth: Don't use remote address type to decide IRK persistencyGravatar Johan Hedberg 1-1/+1
There are LE devices on the market that start off by announcing their public address and then once paired switch to using private address. To be interoperable with such devices we should simply trust the fact that we're receiving an IRK from them to indicate that they may use private addresses in the future. Instead, simply tie the persistency to the bonding/no-bonding information the same way as for LTKs and CSRKs. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGravatar David S. Miller 16-39/+129
Conflicts: drivers/net/usb/asix_common.c net/ipv4/inet_connection_sock.c net/switchdev/switchdev.c In the inet_connection_sock.c case the request socket hashing scheme is completely different in net-next. The other two conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGravatar Linus Torvalds 5-25/+27
Pull networking fixes from David Miller: 1) Account for extra headroom in ath9k driver, from Felix Fietkau. 2) Fix OOPS in pppoe driver due to incorrect socket state transition, from Guillaume Nault. 3) Kill memory leak in amd-xgbe debugfx, from Geliang Tang. 4) Power management fixes for iwlwifi, from Johannes Berg. 5) Fix races in reqsk_queue_unlink(), from Eric Dumazet. 6) Fix dst_entry usage in ARP replies, from Jiri Benc. 7) Cure OOPSes with SO_GET_FILTER, from Daniel Borkmann. 8) Missing allocation failure check in amd-xgbe, from Tom Lendacky. 9) Various resource allocation/freeing cures in DSA< from Neil Armstrong. 10) A series of bug fixes in the openvswitch conntrack support, from Joe Stringer. 11) Fix two cases (BPF and act_mirred) where we have to clean the sender cpu stored in the SKB before transmitting. From WANG Cong and Alexei Starovoitov. 12) Disable VLAN filtering in promiscuous mode in mlx5 driver, from Achiad Shochat. 13) Older bnx2x chips cannot do 4-tuple UDP hashing, so prevent this configuration via ethtool. From Yuval Mintz. 14) Don't call rt6_uncached_list_flush_dev() from rt6_ifdown() when 'dev' is NULL, from Eric Biederman. 15) Prevent stalled link synchronization in tipc, from Jon Paul Maloy. 16) kcalloc() gstrings ethtool buffer before having driver fill it in, in order to prevent kernel memory leaking. From Joe Perches. 17) Fix mixxing rt6_info initialization for blackhole routes, from Martin KaFai Lau. 18) Kill VLAN regression in via-rhine, from Andrej Ota. 19) Missing pfmemalloc check in sk_add_backlog(), from Eric Dumazet. 20) Fix spurious MSG_TRUNC signalling in netlink dumps, from Ronen Arad. 21) Scrube SKBs when pushing them between namespaces in openvswitch, from Joe Stringer. 22) bcmgenet enables link interrupts too early, fix from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits) net: bcmgenet: Fix early link interrupt enabling tunnels: Don't require remote endpoint or ID during creation. openvswitch: Scrub skb between namespaces xen-netback: correctly check failed allocation net: asix: add support for the Billionton GUSB2AM-1G-B USB adapter netlink: Trim skb to alloc size to avoid MSG_TRUNC net: add pfmemalloc check in sk_add_backlog() via-rhine: fix VLAN receive handling regression. ipv6: Initialize rt6_info properly in ip6_blackhole_route() ipv6: Move common init code for rt6_info to a new function rt6_info_init() Bluetooth: Fix initializing conn_params in scan phase Bluetooth: Fix conn_params list update in hci_connect_le_scan_cleanup Bluetooth: Fix remove_device behavior for explicit connects Bluetooth: Fix LE reconnection logic Bluetooth: Fix reference counting for LE-scan based connections Bluetooth: Fix double scan updates mlxsw: core: Fix race condition in __mlxsw_emad_transmit tipc: move fragment importance field to new header position ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings tipc: eliminate risk of stalled link synchronization ...
2015-10-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextGravatar David S. Miller 6-71/+50
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for your net-next tree. Most relevantly, updates for the nfnetlink_log to integrate with conntrack, fixes for cttimeout and improvements for nf_queue core, they are: 1) Remove useless ifdef around static inline function in IPVS, from Eric W. Biederman. 2) Simplify the conntrack support for nfnetlink_queue: Merge nfnetlink_queue_ct.c file into nfnetlink_queue_core.c, then rename it back to nfnetlink_queue.c 3) Use y2038 safe timestamp from nfnetlink_queue. 4) Get rid of dead function definition in nf_conntrack, from Flavio Leitner. 5) Attach conntrack support for nfnetlink_log.c, from Ken-ichirou MATSUZAWA. This adds a new NETFILTER_NETLINK_GLUE_CT Kconfig switch that controls enabling both nfqueue and nflog integration with conntrack. The userspace application can request this via NFULNL_CFG_F_CONNTRACK configuration flag. 6) Remove unused netns variables in IPVS, from Eric W. Biederman and Simon Horman. 7) Don't put back the refcount on the cttimeout object from xt_CT on success. 8) Fix crash on cttimeout policy object removal. We have to flush out the cttimeout extension area of the conntrack not to refer to an unexisting object that was just removed. 9) Make sure rcu_callback completion before removing nfnetlink_cttimeout module removal. 10) Fix compilation warning in br_netfilter when no nf_defrag_ipv4 and nf_defrag_ipv6 are enabled. Patch from Arnd Bergmann. 11) Autoload ctnetlink dependencies when NFULNL_CFG_F_CONNTRACK is requested. Again from Ken-ichirou MATSUZAWA. 12) Don't use pointer to previous hook when reinjecting traffic via nf_queue with NF_REPEAT verdict since it may be already gone. This also avoids a deadloop if the userspace application keeps returning NF_REPEAT. 13) A bunch of cleanups for netfilter IPv4 and IPv6 code from Ian Morris. 14) Consolidate logger instance existence check in nfulnl_recv_config(). 15) Fix broken atomicity when applying configuration updates to logger instances in nfnetlink_log. 16) Get rid of the .owner attribute in our hook object. We don't need this anymore since we're dropping pending packets that have escaped from the kernel when unremoving the hook. Patch from Florian Westphal. 17) Remove unnecessary rcu_read_lock() from nf_reinject code, we always assume RCU read side lock from .call_rcu in nfnetlink. Also from Florian. 18) Use static inline function instead of macros to define NF_HOOK() and NF_HOOK_COND() when no netfilter support in on, from Arnd Bergmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-18uapi: add mpls_iptunnel.hGravatar stephen hemminger 1-0/+1
Add missing rule to export mpls iptunnel header needed by iproute2 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-18tcp: do not set queue_mapping on SYNACKGravatar Eric Dumazet 1-1/+1
At the time of commit fff326990789 ("tcp: reflect SYN queue_mapping into SYNACK packets") we had little ways to cope with SYN floods. We no longer need to reflect incoming skb queue mappings, and instead can pick a TX queue based on cpu cooking the SYNACK, with normal XPS affinities. Note that all SYNACK retransmits were picking TX queue 0, this no longer is a win given that SYNACK rtx are now distributed on all cpus. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-17Merge branch 'master' of ↵Gravatar Pablo Neira Ayuso 61-332/+994
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next This merge resolves conflicts with 75aec9df3a78 ("bridge: Remove br_nf_push_frag_xmit_sk") as part of Eric Biederman's effort to improve netns support in the network stack that reached upstream via David's net-next tree. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Conflicts: net/bridge/br_netfilter_hooks.c
2015-10-17net: add pfmemalloc check in sk_add_backlog()Gravatar Eric Dumazet 1-0/+8
Greg reported crashes hitting the following check in __sk_backlog_rcv() BUG_ON(!sock_flag(sk, SOCK_MEMALLOC)); The pfmemalloc bit is currently checked in sk_filter(). This works correctly for TCP, because sk_filter() is ran in tcp_v[46]_rcv() before hitting the prequeue or backlog checks. For UDP or other protocols, this does not work, because the sk_filter() is ran from sock_queue_rcv_skb(), which might be called _after_ backlog queuing if socket is owned by user by the time packet is processed by softirq handler. Fixes: b4b9e35585089 ("netvm: set PF_MEMALLOC as appropriate during SKB processing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Greg Thelen <gthelen@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxGravatar Linus Torvalds 1-1/+2
Pull drm fixes from Dave Airlie: "Nothing too crazy or exciting: - two MAINTAINERS entries that I didn't see the point in delaying. - one drm mst fix to stop sending uninitialised data to monitors - two amdgpu fixes - one radeon mst tiling fix - one vmwgfx regression fix - one virtio warning fix. I have found one locking problem that needs a bit of reorg to fix, but I'm not sure it's worth putting in -fixes as I don't think we've seen it hit in the real world ever, I just found it using the virtio-gpu driver when working on it. I'll possibly send it next week once I've time to discuss with Daniel" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/virtio: use %llu format string form atomic64_t MAINTAINERS: Add myself as maintainer for the gma500 driver MAINTAINERS: add a maintainer for the atmel-hlcdc DRM driver drm/amdgpu: Keep the pflip interrupts always enabled v7 drm/amdgpu: adjust default dispclk (v2) drm/dp/mst: make mst i2c transfer code more robust. drm/radeon: attach tile property to mst connector drm/vmwgfx: Fix kernel NULL pointer dereference on older hardware
2015-10-16netfilter: turn NF_HOOK into an inline functionGravatar Arnd Bergmann 1-2/+17
A recent change to the dst_output handling caused a new warning when the call to NF_HOOK() is the only used of a local variable passed as 'dev', and CONFIG_NETFILTER is disabled: net/ipv6/ip6_output.c: In function 'ip6_output': net/ipv6/ip6_output.c:135:21: warning: unused variable 'dev' [-Wunused-variable] The reason for this is that the NF_HOOK macro in this case does not reference the variable at all, and the call to dev_net(dev) got removed from the ip6_output function. To avoid that warning now and in the future, this changes the macro into an equivalent inline function, which tells the compiler that the variable is passed correctly but still unused. The dn_forward function apparently had the same problem in the past and added a local workaround that no longer works with the inline function. In order to avoid a regression, we have to also remove the #ifdef from decnet in the same patch. Fixes: ede2059dbaf9 ("dst: Pass net into dst->output") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16netfilter: make nf_queue_entry_get_refs return voidGravatar Florian Westphal 1-1/+1
We don't care if module is being unloaded anymore since hook unregister handling will destroy queue entries using that hook. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16netfilter: remove hook owner refcountingGravatar Florian Westphal 1-1/+0
since commit 8405a8fff3f8 ("netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook") all pending queued entries are discarded. So we can simply remove all of the owner handling -- when module is removed it also needs to unregister all its hooks. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16net: introduce pre-change upper device notifierGravatar Jiri Pirko 1-0/+1
This newly introduced netdevice notifier is called before actual change upper happens. That provides a possibility for notifier handlers to know upper change will happen and react to it, including possibility to forbid the change. That is valuable for drivers which can check if the upper device linkage is supported and forbid that in case it is not. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16tcp/dccp: fix race at listener dismantle phaseGravatar Eric Dumazet 2-26/+2
Under stress, a close() on a listener can trigger the WARN_ON(sk->sk_ack_backlog) in inet_csk_listen_stop() We need to test if listener is still active before queueing a child in inet_csk_reqsk_queue_add() Create a common inet_child_forget() helper, and use it from inet_csk_reqsk_queue_add() and inet_csk_listen_stop() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16tcp/dccp: add inet_csk_reqsk_queue_drop_and_put() helperGravatar Eric Dumazet 1-0/+1
Let's reduce the confusion about inet_csk_reqsk_queue_drop() : In many cases we also need to release reference on request socket, so add a helper to do this, reducing code size and complexity. Fixes: 4bdc3d66147b ("tcp/dccp: fix behavior of stale SYN_RECV request sockets") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15switchdev: introduce possibility to defer obj_add/delGravatar Jiri Pirko 1-0/+1
Similar to the attr usecase, the caller knows if he is holding RTNL and is in atomic section. So let the called to decide the correct call variant. This allows drivers to sleep inside their ops and wait for hw to get the operation status. Then the status is propagated into switchdev core. This avoids silent errors in drivers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>