aboutsummaryrefslogtreecommitdiff
path: root/net/rds
AgeCommit message (Collapse)AuthorFilesLines
2020-05-28tcp: add tcp_sock_set_keepcntGravatar Christoph Hellwig 2-15/+4
Add a helper to directly set the TCP_KEEPCNT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tcp: add tcp_sock_set_keepintvlGravatar Christoph Hellwig 1-3/+1
Add a helper to directly set the TCP_KEEPINTVL sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tcp: add tcp_sock_set_keepidleGravatar Christoph Hellwig 1-4/+1
Add a helper to directly set the TCP_KEEP_IDLE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tcp: add tcp_sock_set_nodelayGravatar Christoph Hellwig 3-12/+2
Add a helper to directly set the TCP_NODELAY sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tcp: add tcp_sock_set_corkGravatar Christoph Hellwig 1-7/+2
Add a helper to directly set the TCP_CORK sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: add sock_set_keepaliveGravatar Christoph Hellwig 1-5/+1
Add a helper to directly set the SO_KEEPALIVE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: add sock_no_lingerGravatar Christoph Hellwig 3-14/+2
Add a helper to directly set the SO_LINGER sockopt from kernel space with onoff set to true and a linger time of 0 without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20rds: fix crash in rds_info_getsockopt()Gravatar John Hubbard 1-1/+2
The conversion to pin_user_pages() had a bug: it overlooked the case of allocation of pages failing. Fix that by restoring an equivalent check. Reported-by: syzbot+118ac0af4ac7f785a45b@syzkaller.appspotmail.com Fixes: dbfe7d74376e ("rds: convert get_user_pages() --> pin_user_pages()") Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Cc: rds-devel@oss.oracle.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-17rds: convert get_user_pages() --> pin_user_pages()Gravatar John Hubbard 1-4/+2
This code was using get_user_pages_fast(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages_fast() + put_page() calls to pin_user_pages_fast() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Cc: rds-devel@oss.oracle.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-28Merge branch 'work.sysctl' of ↵Gravatar Daniel Borkmann 1-4/+2
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull in Christoph Hellwig's series that changes the sysctl's ->proc_handler methods to take kernel pointers instead. It gets rid of the set_fs address space overrides used by BPF. As per discussion, pull in the feature branch into bpf-next as it relates to BPF sysctl progs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200427071508.GV23230@ZenIV.linux.org.uk/T/
2020-04-27sysctl: pass kernel pointers to ->proc_handlerGravatar Christoph Hellwig 1-4/+2
Instead of having all the sysctl handlers deal with user pointers, which is rather hairy in terms of the BPF interaction, copy the input to and from userspace in common code. This also means that the strings are always NUL-terminated by the common code, making the API a little bit safer. As most handler just pass through the data to one of the common handlers a lot of the changes are mechnical. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-15net/rds: Use ERR_PTR for rds_message_alloc_sgs()Gravatar Jason Gunthorpe 4-21/+19
Returning the error code via a 'int *ret' when the function returns a pointer is very un-kernely and causes gcc 10's static analysis to choke: net/rds/message.c: In function ‘rds_message_map_pages’: net/rds/message.c:358:10: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized] 358 | return ERR_PTR(ret); Use a typical ERR_PTR return instead. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-09net/rds: Fix MR reference counting problemGravatar Ka-Cheong Poon 2-21/+12
In rds_free_mr(), it calls rds_destroy_mr(mr) directly. But this defeats the purpose of reference counting and makes MR free handling impossible. It means that holding a reference does not guarantee that it is safe to access some fields. For example, In rds_cmsg_rdma_dest(), it increases the ref count, unlocks and then calls mr->r_trans->sync_mr(). But if rds_free_mr() (and rds_destroy_mr()) is called in between (there is no lock preventing this to happen), r_trans_private is set to NULL, causing a panic. Similar issue is in rds_rdma_unuse(). Reported-by: zerons <sironhide0null@gmail.com> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-09net/rds: Replace struct rds_mr's r_refcount with struct krefGravatar Ka-Cheong Poon 3-23/+20
And removed rds_mr_put(). Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-16net/rds: Track user mapped pages through special APIGravatar Leon Romanovsky 1-12/+12
Convert net/rds to use the newly introduces pin_user_pages() API, which properly sets FOLL_PIN. Setting FOLL_PIN is now required for code that requires tracking of pinned pages. Note that this effectively changes the code's behavior: it now ultimately calls set_page_dirty_lock(), instead of set_page_dirty(). This is probably more accurate. As Christoph Hellwig put it, "set_page_dirty() is only safe if we are dealing with a file backed page where we have reference on the inode it hangs off." [1] [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18net/rds: Use prefetch for On-Demand-Paging MRGravatar Hans Westgaard Ry 1-0/+9
Try prefetching pages when using On-Demand-Paging MR using ib_advise_mr. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-18net/rds: Handle ODP mr registration/unregistrationGravatar Hans Westgaard Ry 7-56/+244
On-Demand-Paging MRs are registered using ib_reg_user_mr and unregistered with ib_dereg_mr. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16net/rds: Detect need of On-Demand-Paging memory registrationGravatar Hans Westgaard Ry 1-2/+4
Add code to check if memory intended for RDMA is FS-DAX-memory. RDS will fail with error code EOPNOTSUPP if FS-DAX-memory is detected. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-11-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 1-8/+15
Lots of overlapping changes and parallel additions, stuff like that. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-16rds: ib: update WR sizes when bringing up connectionGravatar Dag Moxnes 1-8/+15
Currently WR sizes are updated from rds_ib_sysctl_max_send_wr and rds_ib_sysctl_max_recv_wr when a connection is shut down. As a result, a connection being down while rds_ib_sysctl_max_send_wr or rds_ib_sysctl_max_recv_wr are updated, will not update the sizes when it comes back up. Move resizing of WRs to rds_ib_setup_qp so that connections will be setup with the most current WR sizes. Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-17net/rds: Remove unnecessary null checkGravatar YueHaibing 1-2/+1
Null check before dma_pool_destroy is redundant, so remove it. This is detected by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-06net/rds: Add missing include fileGravatar YueHaibing 2-0/+2
Fix build error: net/rds/ib_cm.c: In function rds_dma_hdrs_alloc: net/rds/ib_cm.c:475:13: error: implicit declaration of function dma_pool_zalloc; did you mean mempool_alloc? [-Werror=implicit-function-declaration] hdrs[i] = dma_pool_zalloc(pool, GFP_KERNEL, &hdr_daddrs[i]); ^~~~~~~~~~~~~~~ mempool_alloc net/rds/ib.c: In function rds_ib_dev_free: net/rds/ib.c:111:3: error: implicit declaration of function dma_pool_destroy; did you mean mempool_destroy? [-Werror=implicit-function-declaration] dma_pool_destroy(rds_ibdev->rid_hdrs_pool); ^~~~~~~~~~~~~~~~ mempool_destroy Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 9b17f5884be4 ("net/rds: Use DMA memory pool allocation for rds_header") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 1-3/+3
2019-10-03net/rds: Use DMA memory pool allocation for rds_headerGravatar Ka-Cheong Poon 5-61/+153
Currently, RDS calls ib_dma_alloc_coherent() to allocate a large piece of contiguous DMA coherent memory to store struct rds_header for sending/receiving packets. The memory allocated is then partitioned into struct rds_header. This is not necessary and can be costly at times when memory is fragmented. Instead, RDS should use the DMA memory pool interface to handle this. The DMA addresses of the pre- allocated headers are stored in an array. At send/receive ring initialization and refill time, this arrary is de-referenced to get the DMA addresses. This array is not accessed at send/receive packet processing. Suggested-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02net/rds: Log vendor error if send/recv Work requests failGravatar Sudhakar Dindukurti 2-4/+5
Log vendor error if work requests fail. Vendor error provides more information that is used for debugging the issue. Signed-off-by: Sudhakar Dindukurti <sudhakar.dindukurti@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02net/rds: Fix error handling in rds_ib_add_one()Gravatar Dotan Barak 1-3/+3
rds_ibdev:ipaddr_list and rds_ibdev:conn_list are initialized after allocation some resources such as protection domain. If allocation of such resources fail, then these uninitialized variables are accessed in rds_ib_dev_free() in failure path. This can potentially crash the system. The code has been updated to initialize these variables very early in the function. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Sudhakar Dindukurti <sudhakar.dindukurti@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-27net/rds: Check laddr_check before calling itGravatar Ka-Cheong Poon 1-1/+4
In rds_bind(), laddr_check is called without checking if it is NULL or not. And rs_transport should be reset if rds_add_bound() fails. Fixes: c5c1a030a7db ("net/rds: An rds_sock is added too early to the hash table") Reported-by: syzbot+fae39afd2101a17ec624@syzkaller.appspotmail.com Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-26net: Fix Kconfig indentationGravatar Krzysztof Kozlowski 1-2/+2
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-17Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 1-1/+1
Pull in bug fixes from 'net' tree for the merge window. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names'Gravatar Gerd Rausch 1-1/+1
All entries in 'rds_ib_stat_names' are stringified versions of the corresponding "struct rds_ib_statistics" element without the "s_"-prefix. Fix entry 'ib_evt_handler_call' to do the same. Fixes: f4f943c958a2 ("RDS: IB: ack more receive completions to improve performance") Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 1-22/+18
Minor overlapping changes in the btusb and ixgbe drivers. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net/rds: An rds_sock is added too early to the hash tableGravatar Ka-Cheong Poon 1-22/+18
In rds_bind(), an rds_sock is added to the RDS bind hash table before rs_transport is set. This means that the socket can be found by the receive code path when rs_transport is NULL. And the receive code path de-references rs_transport for congestion update check. This can cause a panic. An rds_sock should not be added to the bind hash table before all the needed fields are set. Reported-by: syzbot+4b4f8163c2e246df3c4c@syzkaller.appspotmail.com Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05Convert usage of IN_MULTICAST to ipv4_is_multicastGravatar Dave Taht 3-6/+6
IN_MULTICAST's primary intent is as a uapi macro. Elsewhere in the kernel we use ipv4_is_multicast consistently. This patch unifies linux's multicast checks to use that function rather than this macro. Signed-off-by: Dave Taht <dave.taht@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 1-1/+4
r8152 conflicts are the NAPI fixes in 'net' overlapping with some tasklet stuff in net-next Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net/rds: Fix info leak in rds6_inc_info_copy()Gravatar Ka-Cheong Poon 1-1/+4
The rds6_inc_info_copy() function has a couple struct members which are leaking stack information. The ->tos field should hold actual information and the ->flags field needs to be zeroed out. Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure") Fixes: b7ff8b1036f0 ("rds: Extend RDS API for IPv6 support") Reported-by: 黄ID蝴蝶 <butterflyhuangxx@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netGravatar David S. Miller 4-8/+22
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24net: rds: add service level support in rds-infoGravatar Zhu Yanjun 4-8/+22
>From IB specific 7.6.5 SERVICE LEVEL, Service Level (SL) is used to identify different flows within an IBA subnet. It is carried in the local route header of the packet. Before this commit, run "rds-info -I". The outputs are as below: " RDS IB Connections: LocalAddr RemoteAddr Tos SL LocalDev RemoteDev 192.2.95.3 192.2.95.1 2 0 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 1 0 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9 " After this commit, the output is as below: " RDS IB Connections: LocalAddr RemoteAddr Tos SL LocalDev RemoteDev 192.2.95.3 192.2.95.1 2 2 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 1 1 fe80::21:28:1a:39 fe80::21:28:10:b9 192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9 " The commit fe3475af3bdf ("net: rds: add per rds connection cache statistics") adds cache_allocs in struct rds_info_rdma_connection as below: struct rds_info_rdma_connection { ... __u32 rdma_mr_max; __u32 rdma_mr_size; __u8 tos; __u32 cache_allocs; }; The peer struct in rds-tools of struct rds_info_rdma_connection is as below: struct rds_info_rdma_connection { ... uint32_t rdma_mr_max; uint32_t rdma_mr_size; uint8_t tos; uint8_t sl; uint32_t cache_allocs; }; The difference between userspace and kernel is the member variable sl. In the kernel struct, the member variable sl is missing. This will introduce risks. So it is necessary to use this commit to avoid this risk. Fixes: fe3475af3bdf ("net: rds: add per rds connection cache statistics") CC: Joe Jin <joe.jin@oracle.com> CC: JUNXIAO_BI <junxiao.bi@oracle.com> Suggested-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23net/rds: Whitelist rdma_cookie and rx_tstamp for usercopyGravatar Dag Moxnes 3-15/+27
Add the RDMA cookie and RX timestamp to the usercopy whitelist. After the introduction of hardened usercopy whitelisting (https://lwn.net/Articles/727322/), a warning is displayed when the RDMA cookie or RX timestamp is copied to userspace: kernel: WARNING: CPU: 3 PID: 5750 at mm/usercopy.c:81 usercopy_warn+0x8e/0xa6 [...] kernel: Call Trace: kernel: __check_heap_object+0xb8/0x11b kernel: __check_object_size+0xe3/0x1bc kernel: put_cmsg+0x95/0x115 kernel: rds_recvmsg+0x43d/0x620 [rds] kernel: sock_recvmsg+0x43/0x4a kernel: ___sys_recvmsg+0xda/0x1e6 kernel: ? __handle_mm_fault+0xcae/0xf79 kernel: __sys_recvmsg+0x51/0x8a kernel: SyS_recvmsg+0x12/0x1c kernel: do_syscall_64+0x79/0x1ae When the whitelisting feature was introduced, the memory for the RDMA cookie and RX timestamp in RDS was not added to the whitelist, causing the warning above. Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com> Tested-by: Jenny <jenny.x.xu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15rds: check for excessive looping in rds_send_xmitGravatar Andy Grover 3-1/+14
Original commit from 2011 updated to include a change by Yuval Shaia <yuval.shaia@oracle.com> that adds a new statistic counter "send_stuck_rm" to capture the messages looping exessively in the send path. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15net/rds: Add a few missing rds_stat_names entriesGravatar Gerd Rausch 1-0/+2
In a previous commit, fields were added to "struct rds_statistics" but array "rds_stat_names" was not updated accordingly. Please note the inconsistent naming of the string representations that is done in the name of compatibility with the Oracle internal code-base. s_recv_bytes_added_to_socket -> "recv_bytes_added_to_sock" s_recv_bytes_removed_from_socket -> "recv_bytes_freed_fromsock" Fixes: 192a798f5299 ("RDS: add stat for socket recv memory usage") Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15RDS: don't use GFP_ATOMIC for sk_alloc in rds_createGravatar Chris Mason 1-1/+1
Signed-off-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Bang Nguyen <bang.nguyen@oracle.com> Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15RDS: limit the number of times we loop in rds_send_xmitGravatar Chris Mason 1-1/+11
This will kick the RDS worker thread if we have been looping too long. Original commit from 2012 updated to include a change by Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> that triggers "must_wake" if "rds_ib_recv_refill_one" fails. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15net/rds: Add RDS6_INFO_SOCKETS and RDS6_INFO_RECV_MESSAGES optionsGravatar Ka-Cheong Poon 1-3/+90
Add support of the socket options RDS6_INFO_SOCKETS and RDS6_INFO_RECV_MESSAGES which update the RDS_INFO_SOCKETS and RDS_INFO_RECV_MESSAGES options respectively. The old options work for IPv4 sockets only. Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-27net: rds: Fix possible null-pointer dereferences in ↵Gravatar Jia-Ju Bai 1-1/+4
rds_rdma_cm_event_handler_cmn() In rds_rdma_cm_event_handler_cmn(), there are some if statements to check whether conn is NULL, such as on lines 65, 96 and 112. But conn is not checked before being used on line 108: trans->cm_connect_complete(conn, event); and on lines 140-143: rdsdebug("DISCONNECT event - dropping connection " "%pI6c->%pI6c\n", &conn->c_laddr, &conn->c_faddr); rds_conn_drop(conn); Thus, possible null-pointer dereferences may occur. To fix these bugs, conn is checked before being used. These bugs are found by a static analysis tool STCheck written by us. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGravatar Linus Torvalds 5-49/+109
Pull networking fixes from David Miller: 1) Fix AF_XDP cq entry leak, from Ilya Maximets. 2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit. 3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika. 4) Fix handling of neigh timers wrt. entries added by userspace, from Lorenzo Bianconi. 5) Various cases of missing of_node_put(), from Nishka Dasgupta. 6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing. 7) Various RDS layer fixes, from Gerd Rausch. 8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from Cong Wang. 9) Fix FIB source validation checks over loopback, also from Cong Wang. 10) Use promisc for unsupported number of filters, from Justin Chen. 11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits) tcp: fix tcp_set_congestion_control() use from bpf hook ag71xx: fix return value check in ag71xx_probe() ag71xx: fix error return code in ag71xx_probe() usb: qmi_wwan: add D-Link DWM-222 A2 device ID bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips. net: dsa: sja1105: Fix missing unlock on error in sk_buff() gve: replace kfree with kvfree selftests/bpf: fix test_xdp_noinline on s390 selftests/bpf: fix "valid read map access into a read-only array 1" on s390 net/mlx5: Replace kfree with kvfree MAINTAINERS: update netsec driver ipv6: Unlink sibling route in case of failure liquidio: Replace vmalloc + memset with vzalloc udp: Fix typo in net/ipv4/udp.c net: bcmgenet: use promisc for unsupported filters ipv6: rt6_check should return NULL if 'from' is NULL tipc: initialize 'validated' field of received packets selftests: add a test case for rp_filter fib: relax source validation check for loopback packets mlxsw: spectrum: Do not process learned records with a dummy FID ...
2019-07-17net/rds: Initialize ic->i_fastreg_wrs upon allocationGravatar Gerd Rausch 1-1/+1
Otherwise, if an IB connection is torn down before "rds_ib_setup_qp" is called, the value of "ic->i_fastreg_wrs" is still at zero (as it wasn't initialized by "rds_ib_setup_qp"). Consequently "rds_ib_conn_path_shutdown" will spin forever, waiting for it to go back to "RDS_IB_DEFAULT_FR_WR", which of course will never happen as there are no outstanding work requests. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17net/rds: Keep track of and wait for FRWR segments in use upon shutdownGravatar Gerd Rausch 3-6/+45
Since "rds_ib_free_frmr" and "rds_ib_free_frmr_list" simply put the FRMR memory segments on the "drop_list" or "free_list", and it is the job of "rds_ib_flush_mr_pool" to reap those entries by ultimately issuing a "IB_WR_LOCAL_INV" work-request, we need to trigger and then wait for all those memory segments attached to a particular connection to be fully released before we can move on to release the QP, CQ, etc. So we make "rds_ib_conn_path_shutdown" wait for one more atomic_t called "i_fastreg_inuse_count" that keeps track of how many FRWR memory segments are out there marked "FRMR_IS_INUSE" (and also wake_up rds_ib_ring_empty_wait, as they go away). Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17net/rds: Set fr_state only to FRMR_IS_FREE if IB_WR_LOCAL_INV had been ↵Gravatar Gerd Rausch 1-1/+2
successful Fix a bug where fr_state first goes to FRMR_IS_STALE, because of a failure of operation IB_WR_LOCAL_INV, but then gets set back to "FRMR_IS_FREE" uncoditionally, even though the operation failed. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17net/rds: Fix NULL/ERR_PTR inconsistencyGravatar Gerd Rausch 1-2/+2
Make function "rds_ib_try_reuse_ibmr" return NULL in case memory region could not be allocated, since callers simply check if the return value is not NULL. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17net/rds: Wait for the FRMR_IS_FREE (or FRMR_IS_STALE) transition after ↵Gravatar Gerd Rausch 2-27/+40
posting IB_WR_LOCAL_INV In order to: 1) avoid a silly bouncing between "clean_list" and "drop_list" triggered by function "rds_ib_reg_frmr" as it is releases frmr regions whose state is not "FRMR_IS_FREE" right away. 2) prevent an invalid access error in a race from a pending "IB_WR_LOCAL_INV" operation with a teardown ("dma_unmap_sg", "put_page") and de-registration ("ib_dereg_mr") of the corresponding memory region. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>