op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message	Author	Age	Files	Lines
*	tipc: allocate user memory with GFP_KERNEL flag	Parthasarathy Bhuvaragan	2017-01-16	1	-1/+1
	Until now, we have always allocated memory with the GFP_ATOMIC flag. When the system is under memory pressure and a user tries to send, the send fails due to low memory. However, the user application can wait for free memory if we allocate it using the GFP_KERNEL flag. In this commit, we allocate memory with GFP_KERNEL for all user allocations. Reported-by: Rune Torgersen <runet@innovsys.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: fix broadcast link synchronization problem	Jon Paul Maloy	2016-10-29	1	-0/+1
	In commit 2d18ac4ba745 ("tipc: extend broadcast link initialization criteria") we tried to fix a problem with the initial synchronization of broadcast link acknowledge values. Unfortunately that solution is not sufficient to solve the issue. We have seen it happen that LINK_PROTOCOL/STATE packets with a valid non-zero unicast acknowledge number may bypass BCAST_PROTOCOL initialization, NAME_DISTRIBUTOR and other STATE packets with invalid broadcast acknowledge numbers, leading to premature opening of the broadcast link. When the bypassed packets finally arrive, they are inadvertently accepted, and the already correctly initialized acknowledge number in the broadcast receive link is overwritten by the invalid (zero) value of the said packets. After this the broadcast link goes stale. We now fix this by marking the packets where we know the acknowledge value is or may be invalid, and then ignoring the acks from those. To this purpose, we claim an unused bit in the header to indicate that the value is invalid. We set the bit to 1 in the initial BCAST_PROTOCOL synchronization packet and all initial ("bulk") NAME_DISTRIBUTOR packets, plus those LINK_PROTOCOL packets sent out before the broadcast links are fully synchronized. This minor protocol update is fully backwards compatible. Reported-by: John Thompson <thompa.atl@gmail.com> Tested-by: John Thompson <thompa.atl@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
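The marking scheme described in this message can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct hdr`, `struct bc_rcv_link`, the bit position `ACK_INVALID_BIT` and all function names are hypothetical stand-ins chosen for illustration. The only idea taken from the commit message is that a receiver ignores the broadcast acknowledge value of any packet whose header carries the "invalid" bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real header layout is not shown in the log. */
#define ACK_INVALID_BIT (1u << 0)

/* Hypothetical, simplified packet header. */
struct hdr {
	uint32_t flags;
	uint16_t bc_ack;   /* broadcast acknowledge number carried by the packet */
};

/* Hypothetical receive-side state of the broadcast link. */
struct bc_rcv_link {
	uint16_t acked;    /* last accepted broadcast acknowledge number */
};

/* Sender side: mark or unmark the acknowledge value as invalid. */
static void hdr_set_ack_invalid(struct hdr *h, bool invalid)
{
	if (invalid)
		h->flags |= ACK_INVALID_BIT;
	else
		h->flags &= ~ACK_INVALID_BIT;
}

static bool hdr_ack_invalid(const struct hdr *h)
{
	return (h->flags & ACK_INVALID_BIT) != 0;
}

/* Receiver side: only packets with a valid ack may update the link state. */
static void bc_ack_rcv(struct bc_rcv_link *l, const struct hdr *h)
{
	if (hdr_ack_invalid(h))
		return;            /* late "bulk" or early STATE packet: ignore its ack */
	l->acked = h->bc_ack;
}
```

With this shape, a late-arriving packet carrying a zero acknowledge value cannot overwrite a value that was already initialized correctly, which is the failure the commit message describes.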
*	tipc: fix random link resets while adding a second bearer	Parthasarathy Bhuvaragan	2016-09-01	1	-3/+5
	In a dual bearer configuration, if the second tipc link becomes active while the first link still has pending nametable "bulk" updates, it randomly leads to a reset of the second link. When a link is established, the function named_distribute() fills the skb based on the node mtu (which allows room for TUNNEL_PROTOCOL) with a NAME_DISTRIBUTOR message for each PUBLICATION. However, the function named_distribute() allocates the buffer by increasing the node mtu by INT_H_SIZE (to insert NAME_DISTRIBUTOR). This consumes the space allocated for TUNNEL_PROTOCOL. When establishing the second link, the link shall tunnel all the messages in the first link queue, including the "bulk" update. As the size of the NAME_DISTRIBUTOR messages exceeds the link mtu while tunnelling, the transmission fails (-EMSGSIZE). Thus, the synch point based on the message count of the tunnel packets is never reached, leading to link timeout. In this commit, we adjust the size of the name distributor messages so that they can be tunnelled. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
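The size arithmetic behind this fix can be sketched as follows. This is a reading of the commit message, not the actual patch: the numeric values of `INT_H_SIZE` and `ITEM_SIZE` are assumptions, and the helper names `payload_before()`, `payload_after()` and `msg_size()` are hypothetical. The node mtu is taken to be the link mtu minus one header reserved for tunnelling, as the message states.

```c
/* Assumed values for illustration only. */
#define INT_H_SIZE 40u   /* size of one internal header */
#define ITEM_SIZE  20u   /* size of one publication item */

/* Payload as described before the fix: fills the whole node mtu with items,
 * so the NAME_DISTRIBUTOR header comes on top of it. */
static unsigned int payload_before(unsigned int node_mtu)
{
	return (node_mtu / ITEM_SIZE) * ITEM_SIZE;
}

/* Payload after the fix: reserve room for the NAME_DISTRIBUTOR header
 * inside the node mtu, then round down to whole items. */
static unsigned int payload_after(unsigned int node_mtu)
{
	if (node_mtu < INT_H_SIZE)
		return 0;   /* avoid unsigned underflow on tiny mtu values */
	return ((node_mtu - INT_H_SIZE) / ITEM_SIZE) * ITEM_SIZE;
}

/* Total message size: NAME_DISTRIBUTOR header plus payload. */
static unsigned int msg_size(unsigned int payload)
{
	return INT_H_SIZE + payload;
}
```

Under these assumptions, a link mtu of 1500 gives a node mtu of 1460. The old sizing yields a 1500-byte message, which grows to 1540 bytes once a tunnel header is added and no longer fits the link. The new sizing yields a 1460-byte message, which is exactly 1500 bytes when tunnelled.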
*	tipc: purge deferred updates from dead nodes	Erik Hugne	2016-04-11	1	-0/+19
	If a peer node becomes unavailable, in addition to removing the nametable entries from this node we also need to purge all deferred updates associated with this node. Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make dist queue pernet	Erik Hugne	2016-04-11	1	-9/+7
	Nametable updates received from the network that cannot be applied immediately are placed on a defer queue. This queue is global to the TIPC module, which might cause problems when using TIPC in containers. To prevent nametable updates from escaping into the wrong namespace, we make the queue pernet instead. Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: reduce code dependency between binding table and node layer	Jon Paul Maloy	2015-11-20	1	-64/+3
	The file name_distr.c currently contains three functions, named_cluster_distribute(), tipc_publ_subscribe() and tipc_publ_unsubscribe(), that all directly access fields in struct tipc_node. We want to eliminate such dependencies, so we move those functions to the file node.c and rename them to tipc_node_broadcast(), tipc_node_subscribe() and tipc_node_unsubscribe() respectively. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: move linearization of buffers to generic code	Jon Paul Maloy	2015-11-20	1	-0/+1
	In commit 5cbb28a4bf65c7e4 ("tipc: linearize arriving NAME_DISTR and LINK_PROTO buffers") we added linearization of NAME_DISTRIBUTOR, LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE to the function tipc_udp_recv(). The location of the change was selected in order to make the commit easily applicable to 'net' and 'stable'. We now move this linearization to where it should be done, in the functions tipc_named_rcv() and tipc_link_proto_rcv() respectively. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: ensure binding table initial distribution is sent via first link	Jon Paul Maloy	2015-10-24	1	-2/+2
	Correct synchronization of the broadcast link at first contact between two nodes is dependent on the assumption that the binding table "bulk" update passes via the same link as the initial broadcast synchronization message, i.e., via the first link that is established. This is not guaranteed in the current implementation. If two links come up very close to each other in time, the "bulk" may quite well pass via the second link, and hence void the guarantee of a correct initial synchronization before the broadcast link is opened. This commit makes two small changes to strengthen this guarantee. 1) We let the second established link occupy slot 1 of the "active_links" array, while the first link will retain slot 0. (This is in reality a cosmetic change; we could just as well keep the current, opposite order.) 2) We let the name distributor always use link selector/slot 0 when it sends its binding table updates. The extra traffic bias on the first link caused by this change should be negligible, since binding table updates constitute a very small fraction of the total traffic. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make media xmit call outside node spinlock context	Jon Paul Maloy	2015-07-20	1	-2/+2
	Currently, message sending is performed through a deep call chain, where the node spinlock is grabbed and held during a significant part of the transmission time. This is clearly detrimental to overall throughput performance; it would be better if we could send the message after the spinlock has been released. In this commit, we do instead let the call revert on the stack after the buffer chain has been added to the transmission queue, whereafter clones of the buffers are transmitted to the device layer outside the spinlock scope. As a further step in our effort to separate the roles of the node and link entities we also move the function tipc_link_xmit() to node.c, and rename it to tipc_node_xmit(). Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: introduce link entry structure to struct tipc_node	Jon Paul Maloy	2015-07-20	1	-1/+1
	struct 'tipc_node' currently contains two arrays for link attributes, one for the link pointers, and one for the usable link MTUs. We now group those into a new struct 'tipc_link_entry', and introduce one single array consisting of such entries. Apart from being a cosmetic improvement, this is a starting point for the strict master-slave relation between node and link that we will introduce in the following commits. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: involve reference counter for node structure	Ying Xue	2015-03-29	1	-0/+2
	The TIPC node hash table is protected with the RCU lock on the read side. tipc_node_find() looks up a node object by node address by iterating over the hash table. As the entire traversal of the table by tipc_node_find() is guarded with the RCU read lock, the lookup itself is safe. However, when callers use the node object returned by tipc_node_find(), no RCU read lock is held. Therefore, using the returned object is unsafe for callers of tipc_node_find(). Now we introduce a reference counter for the node structure. Before tipc_node_find() returns a node object to its caller, it first increases the reference counter. Accordingly, once the caller is done with the node, it decreases the counter again. This prevents a node being used by one thread from being freed by another thread. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
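The get/put discipline described here can be sketched in plain userspace C. This is a simplified illustration, not the kernel implementation: `struct node`, `node_create()`, `node_find()`, `node_get()` and `node_put()` are hypothetical names, a flat array stands in for the hash table, and the RCU read-side protection of the lookup is omitted entirely. The `freed_flag` member exists only so that the moment of release can be observed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified node object; not the kernel's struct tipc_node. */
struct node {
	unsigned int addr;
	atomic_int refcnt;
	bool *freed_flag;   /* demonstration only: set when the node is released */
};

/* Create a node holding one reference, owned by the table. */
static struct node *node_create(unsigned int addr, bool *freed_flag)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->addr = addr;
	atomic_init(&n->refcnt, 1);
	n->freed_flag = freed_flag;
	return n;
}

static void node_get(struct node *n)
{
	atomic_fetch_add(&n->refcnt, 1);
}

/* Drop one reference; the last holder frees the node. */
static void node_put(struct node *n)
{
	if (atomic_fetch_sub(&n->refcnt, 1) == 1) {
		if (n->freed_flag)
			*n->freed_flag = true;
		free(n);
	}
}

/* Look up a node by address and return it with an extra reference,
 * which the caller must release with node_put(). */
static struct node *node_find(struct node **table, size_t len, unsigned int addr)
{
	for (size_t i = 0; i < len; i++) {
		if (table[i] && table[i]->addr == addr) {
			node_get(table[i]);
			return table[i];
		}
	}
	return NULL;
}
```

A caller that obtained a node from `node_find()` keeps it alive even if the table drops its own reference in the meantime; the memory is released only when the last `node_put()` runs. In a concurrent setting the lookup and the increment would additionally have to happen under read-side protection, as the commit message notes.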
*	tipc: only create header copy for name distr messages	Erik Hugne	2015-02-27	1	-1/+1
	The TIPC name distributor pushes topology updates to the cluster neighbors. Currently this is done in a unicast manner, and the skb holding the update is cloned for each cluster member. This is unnecessary, as we only modify the destnode field in the header, so we change it to do pskb_copy instead. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: resolve race problem at unicast message reception	Jon Paul Maloy	2015-02-05	1	-12/+21
	TIPC handles message cardinality and sequencing at the link layer, before passing messages upwards to the destination sockets. During the upcall from link to socket no locks are held. It is therefore possible, and we see it happen occasionally, that messages arriving in different threads and delivered in sequence still bypass each other before they reach the destination socket. This must not happen, since it violates the sequentiality guarantee. We solve this by adding a new input buffer queue to the link structure. Arriving messages are added safely to the tail of that queue by the link, while the head of the queue is consumed, also safely, by the receiving socket. Sequentiality is secured per socket by only allowing buffers to be dequeued inside the socket lock. Since there may be multiple simultaneous readers of the queue, we use a 'filter' parameter to reduce the risk that they peek the same buffer from the queue, hence also reducing the risk of contention on the receiving socket locks. This solves the sequentiality problem, and seems to cause no measurable performance degradation. A nice side effect of this change is that lock handling in the functions tipc_rcv() and tipc_bcast_rcv() now becomes uniform, something that will enable future simplifications of those functions. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: reduce usage of context info in socket and link	Jon Paul Maloy	2015-02-05	1	-2/+3
	The most common usage of namespace information is when we fetch the own node address from the net structure. This leads to a lot of passing around of a parameter of type 'struct net *' between functions just to make them able to obtain this address. However, in many cases this is unnecessary. The own node address is readily available as a member of both struct tipc_sock and tipc_link, and can be fetched from there instead. The fact that the vast majority of functions in socket.c and link.c anyway are maintaining a pointer to their respective base structures makes this option even more compelling. In this commit, we introduce the inline functions tsk_own_node() and link_own_node() to make it easy for functions to fetch the node address from those structs instead of having to pass along and dereference the namespace struct. In particular, we make calls to the msg_xx() functions in msg.{h,c} context independent by directly passing them the own node address as parameter when needed. Those functions should be regarded as leaves in the code dependency tree, and it is hence desirable to keep them namespace unaware. Apart from a potential positive effect on cache behavior, these changes make it easier to introduce the changes that will follow later in this series. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make tipc node address support net namespace	Ying Xue	2015-01-12	1	-9/+12
	If net namespace is supported in tipc, each namespace will be treated as a separate tipc node. Therefore, every namespace must own its private tipc node address. This means the "tipc_own_addr" global variable holding the node address must be moved to the tipc_net structure to satisfy the requirement. As a result, users can also assign a node address to every namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: name tipc name table support net namespace	Ying Xue	2015-01-12	1	-15/+22
	The TIPC name table is used to store the mapping between TIPC service names and socket port IDs. When tipc supports namespaces, it allows users to publish service names owned only by a certain namespace. Therefore, every namespace must have its private name table to prevent service names published to one namespace from being contaminated by service names in another namespace. Consequently, the name table global variable (i.e., nametbl) and its lock must be moved to the tipc_net structure, and a namespace parameter must be added to the relevant functions so that they can obtain the name table variable defined in the tipc_net structure. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make tipc node table aware of net namespace	Ying Xue	2015-01-12	1	-26/+31
	The global variables associated with the node table are: - node table list (tipc_node_list) - node hash table list (node_htable) - node table lock (node_list_lock) - node number counter (tipc_num_nodes) - node link number counter (tipc_num_links). To make the node table support namespaces, the above global variables must be moved to the tipc_net structure so that each namespace keeps its own private copy. As a consequence, these variables are allocated and initialized when a namespace is created, and deallocated when the namespace is destroyed. After the change, functions associated with these variables have to use a namespace pointer to access them. So adding a namespace pointer as a parameter of these functions is the major change made in the commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: convert name table read-write lock to RCU	Ying Xue	2014-12-08	1	-14/+14
	Convert tipc name table read-write lock to RCU. After this change, a new spin lock is used to protect name table on write side while RCU is applied on read side. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make name table allocated dynamically	Ying Xue	2014-12-08	1	-34/+10
	The name table locking policy is going to be changed from read-write lock protection to RCU protection in future commits. An essential precondition for this is to convert the name table from static to dynamic allocation. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove size variable from publ_list struct	Ying Xue	2014-12-08	1	-12/+7
	The size variable was introduced in the publ_list struct to help us calculate exactly the SKB buffer sizes needed by publications when all publications in the name table are delivered in bulk in named_distribute(). But if the publication SKB buffer size is assumed to be the MTU, the size variable in the publ_list struct can be eliminated completely, at the cost of wasting a bit of memory space in the last SKB. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Tero Aho <tero.aho@coriant.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: use generic SKB list APIs to manage TIPC outgoing packet chains	Ying Xue	2014-11-26	1	-24/+22
	Use the standard SKB list APIs associated with struct sk_buff_head to manage the socket outgoing packet chain and the name table outgoing packet chain, making the relevant code simpler and more readable. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove node subscription infrastructure	Ying Xue	2014-11-26	1	-7/+45
	The node subscribe infrastructure represents a virtual base class, so its users, such as struct tipc_port and struct publication, can derive its implemented functionality. However, after the removal of struct tipc_port, struct publication is left as its only user. Defining an abstract infrastructure for one user is no longer reasonable. If the corresponding functions associated with the infrastructure are moved to the name_table.c file, the node subscription infrastructure can be removed as well. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: fix sparse warnings	Erik Hugne	2014-09-10	1	-3/+4
	This fixes the following sparse warnings: sparse: symbol 'tipc_update_nametbl' was not declared. Should it be static? Also, the function is changed to return bool upon success, rather than a potentially freed pointer. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: add name distributor resiliency queue	Erik Hugne	2014-09-01	1	-3/+66
	TIPC name table updates are distributed asynchronously in a cluster, entailing a risk of certain race conditions. E.g., if two nodes simultaneously issue conflicting (overlapping) publications, this may not be detected until both publications have reached a third node, in which case one of the publications will be silently dropped on that node. Hence, we end up with an inconsistent name table. In most cases this conflict is just a temporary race, e.g., one node is issuing a publication under the assumption that a previous, conflicting, publication has already been withdrawn by the other node. However, because of the (rtt related) distributed update delay, this may not yet hold true on all nodes. The symptom of this failure is a syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error". In this commit we add a resiliency queue at the receiving end of the name table distributor. When insertion of an arriving publication fails, we retain it in this queue for a short amount of time, assuming that another update will arrive very soon and clear the conflict. If this happens, we insert the publication; otherwise we drop it. The (configurable) retention value defaults to 2000 ms. Knowing from experience that the situation described above is extremely rare, there is no risk that the queue will accumulate any large number of items. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
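The retain-then-retry behaviour of the resiliency queue can be sketched in userspace C. Everything here is a hypothetical simplification: `struct name_table` models a publication as a single key with one owning node, `struct defer_queue` is a small fixed array rather than a kernel list, and time is passed in explicitly as milliseconds. Only the 2000 ms default retention comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define RETENTION_MS 2000ul   /* default retention named in the commit message */
#define TABLE_SIZE   16       /* arbitrary sizes for this sketch */
#define MAX_DEFERRED 8

/* Hypothetical name table: one owner per key, 0 means free. */
struct name_table {
	unsigned int owner[TABLE_SIZE];
};

/* One parked publication with its expiry time. */
struct deferred {
	unsigned int key;
	unsigned int node;
	unsigned long expires_ms;
	bool used;
};

struct defer_queue {
	struct deferred slot[MAX_DEFERRED];
};

/* Insert fails when the key is out of range or already owned (a conflict). */
static bool table_insert(struct name_table *t, unsigned int key, unsigned int node)
{
	if (key >= TABLE_SIZE || t->owner[key] != 0)
		return false;
	t->owner[key] = node;
	return true;
}

static void table_withdraw(struct name_table *t, unsigned int key)
{
	if (key < TABLE_SIZE)
		t->owner[key] = 0;
}

/* Handle an arriving publication: insert it, or park it on conflict.
 * Returns true only if the publication was inserted immediately. */
static bool publ_receive(struct defer_queue *q, struct name_table *t,
			 unsigned int key, unsigned int node, unsigned long now_ms)
{
	if (table_insert(t, key, node))
		return true;
	for (size_t i = 0; i < MAX_DEFERRED; i++) {
		if (!q->slot[i].used) {
			q->slot[i] = (struct deferred){
				.key = key, .node = node,
				.expires_ms = now_ms + RETENTION_MS, .used = true,
			};
			break;
		}
	}
	return false;   /* parked, or dropped if the queue was full */
}

/* Retry every parked publication; drop those past their retention time.
 * Returns the number of publications inserted by this pass. */
static size_t defer_queue_process(struct defer_queue *q, struct name_table *t,
				  unsigned long now_ms)
{
	size_t inserted = 0;

	for (size_t i = 0; i < MAX_DEFERRED; i++) {
		struct deferred *d = &q->slot[i];

		if (!d->used)
			continue;
		if (table_insert(t, d->key, d->node)) {
			d->used = false;
			inserted++;
		} else if (now_ms >= d->expires_ms) {
			d->used = false;   /* conflict never cleared: drop */
		}
	}
	return inserted;
}

static size_t defer_queue_len(const struct defer_queue *q)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_DEFERRED; i++)
		n += q->slot[i].used ? 1 : 0;
	return n;
}
```

In this sketch `defer_queue_process()` would be called whenever another update arrives, so a withdrawal that clears the conflict lets the parked publication through on the next pass.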
*	tipc: refactor name table updates out of named packet receive routine	Erik Hugne	2014-09-01	1	-36/+38
	We need to perform the same actions when processing deferred name table updates, so this functionality is moved to a separate function. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: rename temporarily named functions	Jon Paul Maloy	2014-07-16	1	-2/+2
	After the previous commit, we can now give the functions with temporary names, such as tipc_link_xmit2(), tipc_msg_build2() etc., their proper names. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make name table distributor use new send function	Jon Paul Maloy	2014-07-16	1	-32/+44
	In a previous commit series ("tipc: new unicast transmission code") we introduced a new message sending function, tipc_link_xmit2(), and moved the unicast data users over to use that function. We now let the internal name table distributor do the same. The interaction between the name distributor and the node/link layer also becomes significantly simpler, so we can eliminate the function tipc_link_names_xmit(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: avoid to asynchronously deliver name tables to peer node	Ying Xue	2014-05-05	1	-50/+2
	Postpone the delivery of name tables until after the node lock is released, to avoid doing it in an asynchronous context. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: always use tipc_node_lock() to hold node lock	Ying Xue	2014-05-05	1	-3/+3
	Although we obtain the node lock with tipc_node_lock() in most places, there are still places where we directly use the native spin lock interface to grab the node lock. But as we will do more work in the future when the node lock is released, we should ensure that tipc_node_lock() is always called when the node lock is taken. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: move the delivery of named messages out of nametbl lock	Ying Xue	2014-04-28	1	-9/+9
	Commit a89778d8baf19cd7e728d81121a294a06cedaad1 ("tipc: add support for link state subscriptions") introduced the following possible deadlock scenario:
	      CPU0                          CPU1
	T0:   tipc_publish()                link_timeout()
	T1:   tipc_nametbl_publish()        [grab node lock]*
	T2:   [grab nametbl write lock]*    link_state_event()
	T3:   named_cluster_distribute()    link_activate()
	T4:   [grab node lock]*             tipc_node_link_up()
	T5:                                 tipc_nametbl_publish()
	T6:                                 [grab nametbl write lock]*
	The opposite order of holding the nametbl write lock and the node lock on the above two paths may result in a deadlock. If we move the delivery of named messages via the link out of the nametbl lock, the reverse order of holding locks is eliminated, and with it the deadlock. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: purge tipc_net_lock lock	Ying Xue	2014-04-22	1	-2/+0
	Now the tipc routing hierarchy comprises the structures 'node', 'link' and 'bearer'. The whole hierarchy is protected by a big read/write lock, tipc_net_lock, to ensure that nothing is added or removed while code is accessing any of these structures. Obviously this locking policy binds the node, link and bearer components closely together, so that their relationship becomes unnecessarily complex. In the worst case, such a locking policy not only has a negative influence on performance, but is also prone to occasional deadlocks. In order to decouple the complex relationship between bearer and node as well as link, the locking policy is adjusted as follows: - Bearer level: RTNL lock is used on the update side, and RCU is used on the read side. Meanwhile, all bearer instances including the broadcast bearer are saved into the bearer_list array. - Node and link level: All node instances are saved into the two lists tipc_node_list and node_htable. The two lists are protected by node_list_lock on the write side, and they are guarded with the RCU lock on the read side. All members in the node structure including link instances are protected by the node spin lock. - The relationship between bearer and node: When a link accesses a bearer, it first needs to find the bearer with its bearer identity from the bearer_list array. When a bearer accesses a node, it can iterate the node_htable hash list with the node address to find the corresponding node. In the new locking policy, every component has its private locking solution and the relationship between bearer and node is very simple, that is, they can find each other with the node address or bearer identity from the node_htable hash list or the bearer_list array. All of the above changes have now been made, so tipc_net_lock can be removed safely. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: tipc: convert node list and node hlist to RCU lists	Ying Xue	2014-03-27	1	-3/+3
	Convert tipc_node_list list and node_htable hash list to RCU lists. On read side, the two lists are protected with RCU read lock, and on update side, node_list_lock is applied to them. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: acquire necessary locks in named_cluster_distribute routine	Ying Xue	2014-03-27	1	-3/+11
	The 'tipc_node_list' is guarded by tipc_net_lock, and the 'links' array defined in the 'tipc_node' structure is protected by the node lock. Without acquiring the two locks in named_cluster_distribute(), a fatal oops may happen if a destroyed link is fetched and then accessed. Therefore, the two locks mentioned above must be held in named_cluster_distribute() to prevent the issue. As the 'links' array in the node struct must be protected by the node lock, we have to move the code selecting an active link from tipc_link_xmit() to named_cluster_distribute() and then call __tipc_link_xmit() with the selected link to deliver name messages. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: align tipc function names with common naming practice in the network	Ying Xue	2014-02-18	1	-4/+4
	Rename the following functions to names that are shorter and more in line with common naming practice in the network subsystem.
	tipc_bclink_send_msg->tipc_bclink_xmit
	tipc_bclink_recv_pkt->tipc_bclink_rcv
	tipc_disc_recv_msg->tipc_disc_rcv
	tipc_link_send_proto_msg->tipc_link_proto_xmit
	link_recv_proto_msg->tipc_link_proto_rcv
	link_send_sections_long->tipc_link_iovec_long_xmit
	tipc_link_send_sections_fast->tipc_link_iovec_xmit_fast
	tipc_link_send_sync->tipc_link_sync_xmit
	tipc_link_recv_sync->tipc_link_sync_rcv
	tipc_link_send_buf->__tipc_link_xmit
	tipc_link_send->tipc_link_xmit
	tipc_link_send_names->tipc_link_names_xmit
	tipc_named_recv->tipc_named_rcv
	tipc_link_recv_bundle->tipc_link_bundle_rcv
	tipc_link_dup_send_queue->tipc_link_dup_queue_xmit
	link_send_long_buf->tipc_link_frag_xmit
	tipc_multicast->tipc_port_mcast_xmit
	tipc_port_recv_mcast->tipc_port_mcast_rcv
	tipc_port_reject_sections->tipc_port_iovec_reject
	tipc_port_recv_proto_msg->tipc_port_proto_rcv
	tipc_connect->tipc_port_connect
	__tipc_connect->__tipc_port_connect
	__tipc_disconnect->__tipc_port_disconnect
	tipc_disconnect->tipc_port_disconnect
	tipc_shutdown->tipc_port_shutdown
	tipc_port_recv_msg->tipc_port_rcv
	tipc_port_recv_sections->tipc_port_iovec_rcv
	release->tipc_release
	accept->tipc_accept
	bind->tipc_bind
	get_name->tipc_getname
	poll->tipc_poll
	send_msg->tipc_sendmsg
	send_packet->tipc_send_packet
	send_stream->tipc_send_stream
	recv_msg->tipc_recvmsg
	recv_stream->tipc_recv_stream
	connect->tipc_connect
	listen->tipc_listen
	shutdown->tipc_shutdown
	setsockopt->tipc_setsockopt
	getsockopt->tipc_getsockopt
	The above changes have no impact on current users of the functions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: eliminate an unnecessary cast of node variable	Ying Xue	2012-11-22	1	-1/+1
	As the variable 'node' is currently defined as type u32, it is unnecessary to cast it to u32 again when using it. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: use standard printk shortcut macros (pr_err etc.)	Erik Hugne	2012-07-13	1	-12/+13
	All messages should go directly to the kernel log. The TIPC specific error, warning, info and debug trace macros are removed and all references replaced with pr_err, pr_warn, pr_info and pr_debug. Commonly used sub-strings are explicitly declared as a const char to reduce .text size. Note that this means the debug messages (changed to pr_debug) are now enabled through dynamic debugging, instead of a TIPC specific Kconfig option (TIPC_DEBUG). The latter will be phased out completely. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> [PG: use pr_fmt as suggested by Joe Perches <joe@perches.com>] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: compress out gratuitous extra carriage returns	Paul Gortmaker	2012-04-30	1	-11/+0
	Some of the comment blocks are floating in limbo between two functions, or between blocks of code. Delete the extra line feeds between any comment and its associated following block of code, to be consistent with the majority of the rest of the kernel. Also delete trailing newlines at EOF and fix a couple trivial typos in existing comments. This is a 100% cosmetic change with no runtime impact. We get rid of over 500 lines of non-code, and being blank line deletes, they won't even show up as noise in git blame. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Update node-scope publications when network address is assigned	Allan Stephens	2012-04-19	1	-4/+10
	Ensures that node-scope name publications that exist prior to the configuration of a node's network address are properly re-initialized with that address when it is assigned. TIPC's node-scope publications are now tracked using a publications list like the lists used for cluster-scope and zone-scope publications so they can be easily updated when required. The inclusion of node scope name publications in a conventional publication list means that they must now also be withdrawn, just like cluster and zone scope publications are currently withdrawn. So some conditional tests on scope ==/!= TIPC_NODE_SCOPE are inserted/removed accordingly. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Separate cluster-scope and zone-scope names into distinct lists	Allan Stephens	2012-04-19	1	-5/+26
	Utilizes distinct lists to track zone-scope and cluster-scope names published by a node. For now, TIPC continues to process the entries in both lists in the same way; however, an upcoming patch will utilize the existence of the lists to prevent the sending of cluster-scope names to nodes that are not part of the local cluster. To achieve this, an array of publication lists is introduced, so that they can be iterated over and accessed via publ->scope as an index where convenient. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
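A minimal sketch of the array-of-lists idea described in the message above. The names (publ_list, publ_lists, publ_add, publ_total) and the scope values are hypothetical, chosen so that the scope can serve directly as an array index; this is not the kernel's actual code.

```c
#include <stddef.h>

/* Hypothetical scope values, usable directly as array indices. */
enum scope { SCOPE_ZONE = 0, SCOPE_CLUSTER = 1, SCOPE_COUNT = 2 };

/* Simplified stand-in for a name publication. */
struct publ {
    enum scope scope;
    struct publ *next;
};

/* Binds the element count to the list it describes. */
struct publ_list {
    struct publ *head;
    size_t size;
};

/* One list per scope; zero-initialized at program start. */
static struct publ_list publ_lists[SCOPE_COUNT];

/* Push a publication onto the list selected by its own scope. */
static void publ_add(struct publ *p)
{
    struct publ_list *pl = &publ_lists[p->scope];

    p->next = pl->head;
    pl->head = p;
    pl->size++;
}

/* Iterate over every scope's list, here just to sum the counts. */
static size_t publ_total(void)
{
    size_t total = 0;

    for (int i = 0; i < SCOPE_COUNT; i++)
        total += publ_lists[i].size;
    return total;
}
```

Keeping the count inside the list struct means code that handles one scope never needs to know about separate free-floating counters, and adding another scope only grows the array.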
*	tipc: Factor out name publication code to a separate function	Allan Stephens	2012-04-18	1	-27/+34
	This is done so that it can be reused with differing publication lists, instead of being hard coded to the cluster publication list. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: introduce publication lists struct	Allan Stephens	2012-04-18	1	-10/+17
	There is currently a single list containing both cluster-scope and zone-scope publications, and the list count is a separate free floating variable. Create a struct to bind the count to the list, and to pave the way for factoring out the publications into zone/cluster/node scope. The current "publ_root" most closely matches what will be the cluster scope list, so it is named accordingly in this commit. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate trivial buffer manipulation helper routines	Allan Stephens	2012-02-24	1	-2/+2
	Gets rid of two inlined routines that simply call existing sk_buff manipulation routines, since there is no longer any extra processing done by the helper routines. Note that these changes are essentially cosmetic in nature, and have no impact on the actual operation of TIPC. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate alteration of publication key during name table purging	Allan Stephens	2012-02-06	1	-4/+0
	Removes code that alters the publication key of a name table entry that is being forcibly purged from TIPC's name table after contact with the publishing node has been lost. Current TIPC ensures that all defunct names are purged before re-establishing contact with a failed node. There used to be a risk that the publication might be accidentally deleted because it might be re-added to the name table before the purge operation was completed. But now there is no longer a need to ensure that the new key is different than the old one. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: rename struct link* to struct tipc_link*	Paul Gortmaker	2011-12-29	1	-1/+1
	This converts the following:
	struct link -> struct tipc_link
	struct link_req -> struct tipc_link_req
	struct link_name -> struct tipc_link_name
	Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate useless check when network address is assigned	Allan Stephens	2011-12-27	1	-7/+5
	Gets rid of an unnecessary check in the routine that updates the port id of a node's name publications when the node is assigned a network address, since the routine is only invoked if the new address is different from the existing one. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Enhance sending of bulk name table messages	Allan Stephens	2011-09-17	1	-2/+8
	Modifies the initial transfer of name table entries to a new neighboring node so that the messages are enqueued as a unit, rather than individually. The revised algorithm now locates the link carrying the message only once, and eliminates unnecessary checks for link congestion, message fragmentation, and message bundling that are not required when sending these messages. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: relocate/coalesce node cast in tipc_named_node_up	Paul Gortmaker	2011-09-17	1	-2/+3
	Functions like this are called using unsigned longs from function pointers. In this case, the function is passed in a node which is normally internally treated as a u32 by TIPC. Rather than add more casts into this function in the future for each added use of node within, move the cast to a single place on a local. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
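The pattern described above can be sketched as follows. The names (node_up_cb, last_node_up) are hypothetical and the body is only an illustration of casting once into a local; it does not reproduce tipc_named_node_up itself.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Records the most recent node address seen by the callback. */
static u32 last_node_up;

/*
 * Callback with the generic "unsigned long argument" signature that
 * function-pointer based dispatch typically requires.
 */
static void node_up_cb(unsigned long nodearg)
{
    /* Cast exactly once; every later use works on the u32 local. */
    u32 node = (u32)nodearg;

    last_node_up = node;
}
```

Any further use of the node address inside the function then refers to the local and needs no additional cast.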
*	tipc: Prevent fragmented messages during initial name table exchange	Allan Stephens	2011-09-17	1	-3/+19
	Reduces the maximum size of messages sent during the initial exchange of name table information between two nodes to be no larger than the MTU of the first link established between the nodes. This ensures that messages will never need to be fragmented, which would add unnecessary overhead to the name table synchronization mechanism. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
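A rough sketch of the sizing arithmetic implied by the message above: cap each bulk message at the link MTU and round the payload down to a whole number of name table items. HDR_SIZE and ITEM_SIZE are illustrative values assumed for this example, and both function names are hypothetical.

```c
#include <stddef.h>

#define HDR_SIZE  40u  /* assumed message header size, in bytes */
#define ITEM_SIZE 20u  /* assumed size of one name table item, in bytes */

/*
 * Largest payload, in bytes, that fits in one MTU after the header,
 * rounded down to a whole number of items.
 */
static size_t max_item_bytes(size_t mtu)
{
    if (mtu <= HDR_SIZE)
        return 0;
    return ((mtu - HDR_SIZE) / ITEM_SIZE) * ITEM_SIZE;
}

/*
 * Number of unfragmented messages needed to carry n_items.
 * Returns 0 when not even a single item fits in the MTU.
 */
static size_t msgs_needed(size_t n_items, size_t mtu)
{
    size_t per_msg = max_item_bytes(mtu) / ITEM_SIZE;

    if (per_msg == 0)
        return 0;
    return (n_items + per_msg - 1) / per_msg;
}
```

With these assumed sizes, a 1500-byte MTU leaves room for 73 items per message, so 100 items would be sent as two messages, neither of which requires fragmentation.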
*	tipc: Cleanup of message header size terminology	Allan Stephens	2011-06-24	1	-3/+3
	Performs cosmetic cleanup of the symbolic names used to specify TIPC payload message header sizes. The revised names now more accurately reflect the payload messages in which they can appear. In addition, several places where these payload message symbol names were being used to create non-payload messages have been updated to use the proper internal message symbolic name. No functional changes are introduced by this rework. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	Fix common misspellings	Lucas De Marchi	2011-03-31	1	-1/+1
	Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>