op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	tipc: add packet sequence number at instant of transmission	Jon Paul Maloy	2015-05-14	1	-23/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the packet sequence number is updated and added to each packet at the moment a packet is added to the link backlog queue. This is wasteful, since it forces the code to traverse the send packet list packet by packet when adding them to the backlog queue. It would be better to just splice the whole packet list into the backlog queue when that is the right action to do. In this commit, we do this change. Also, since the sequence numbers cannot now be assigned to the packets at the moment they are added the backlog queue, we do instead calculate and add them at the moment of transmission, when the backlog queue has to be traversed anyway. We do this in the function tipc_link_push_packet(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: improve link congestion algorithm	Jon Paul Maloy	2015-05-14	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The link congestion algorithm used until now implies two problems. - It is too generous towards lower-level messages in situations of high load by giving "absolute" bandwidth guarantees to the different priority levels. LOW traffic is guaranteed 10%, MEDIUM is guaranted 20%, HIGH is guaranteed 30%, and CRITICAL is guaranteed 40% of the available bandwidth. But, in the absence of higher level traffic, the ratio between two distinct levels becomes unreasonable. E.g. if there is only LOW and MEDIUM traffic on a system, the former is guaranteed 1/3 of the bandwidth, and the latter 2/3. This again means that if there is e.g. one LOW user and 10 MEDIUM users, the former will have 33.3% of the bandwidth, and the others will have to compete for the remainder, i.e. each will end up with 6.7% of the capacity. - Packets of type MSG_BUNDLER are created at SYSTEM importance level, but only after the packets bundled into it have passed the congestion test for their own respective levels. Since bundled packets don't result in incrementing the level counter for their own importance, only occasionally for the SYSTEM level counter, they do in practice obtain SYSTEM level importance. Hence, the current implementation provides a gap in the congestion algorithm that in the worst case may lead to a link reset. We now refine the congestion algorithm as follows: - A message is accepted to the link backlog only if its own level counter, and all superior level counters, permit it. - The importance of a created bundle packet is set according to its contents. A bundle packet created from messges at levels LOW to CRITICAL is given importance level CRITICAL, while a bundle created from a SYSTEM level message is given importance SYSTEM. In the latter case only subsequent SYSTEM level messages are allowed to be bundled into it. This solves the first problem described above, by making the bandwidth guarantee relative to the total number of users at all levels; only the upper limit for each level remains absolute. In the example described above, the single LOW user would use 1/11th of the bandwidth, the same as each of the ten MEDIUM users, but he still has the same guarantee against starvation as the latter ones. The fix also solves the second problem. If the CRITICAL level is filled up by bundle packets of that level, no lower level packets will be accepted any more. Suggested-by: Gergely Kiss <gergely.kiss@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: eliminate delayed link deletion at link failover	Jon Paul Maloy	2015-04-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a bearer is disabled manually, all its links have to be reset and deleted. However, if there is a remaining, parallel link ready to take over a deleted link's traffic, we currently delay the delete of the removed link until the failover procedure is finished. This is because the remaining link needs to access state from the reset link, such as the last received packet number, and any partially reassembled buffer, in order to perform a successful failover. In this commit, we do instead move the state data over to the new link, so that it can fulfill the procedure autonomously, without accessing any data on the old link. This means that we can now proceed and delete all pertaining links immediately when a bearer is disabled. This saves us from some unnecessary complexity in such situations. We also choose to change the confusing definitions CHANGEOVER_PROTOCOL, ORIGINAL_MSG and DUPLICATE_MSG to the more descriptive TUNNEL_PROTOCOL, FAILOVER_MSG and SYNCH_MSG respectively. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: fix two bugs in secondary destination lookup	Jon Paul Maloy	2015-03-29	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A message sent to a node after a successful name table lookup may still find that the destination socket has disappeared, because distribution of name table updates is non-atomic. If so, the message will be rejected back to the sender with error code TIPC_ERR_NO_PORT. If the source socket of the message has disappeared in the meantime, the message should be dropped. However, in the currrent code, the message will instead be subject to an unwanted tertiary lookup, because the function tipc_msg_lookup_dest() doesn't check if there is an error code present in the message before performing the lookup. In the worst case, the message may now find the old destination again, and be redirected once more, instead of being dropped directly as it should be. A second bug in this function is that the "prev_node" field in the message is not updated after successful lookup, something that may have unpredictable consequences. The problems arising from those bugs occur very infrequently. The third change in this function; the test on msg_reroute_msg_cnt() is purely cosmetic, reflecting that the returned value never can be negative. This commit corrects the two bugs described above. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: clean up handling of message priorities	Jon Paul Maloy	2015-03-14	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Messages transferred by TIPC are assigned an "importance priority", -an integer value indicating how to treat the message when there is link or destination socket congestion. There is no separate header field for this value. Instead, the message user values have been chosen in ascending order according to perceived importance, so that the message user field can be used for this. This is not a good solution. First, we have many more users than the needed priority levels, so we end up with treating more priority levels than necessary. Second, the user field cannot always accurately reflect the priority of the message. E.g., a message fragment packet should really have the priority of the enveloped user data message, and not the priority of the MSG_FRAGMENTER user. Until now, we have been working around this problem in different ways, but it is now time to implement a consistent way of handling such priorities, although still within the constraint that we cannot allocate any more bits in the regular data message header for this. In this commit, we define a new priority level, TIPC_SYSTEM_IMPORTANCE, that will be the only one used apart from the four (lower) user data levels. All non-data messages map down to this priority. Furthermore, we take some free bits from the MSG_FRAGMENTER header and allocate them to store the priority of the enveloped message. We then adjust the functions msg_importance()/msg_set_importance() so that they read/set the correct header fields depending on user type. This small protocol change is fully compatible, because the code at the receiving end of a link currently reads the importance level only from user data messages, where there is no change. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: split link outqueue	Jon Paul Maloy	2015-03-14	1	-15/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	struct tipc_link contains one single queue for outgoing packets, where both transmitted and waiting packets are queued. This infrastructure is hard to maintain, because we need to keep a number of fields to keep track of which packets are sent or unsent, and the number of packets in each category. A lot of code becomes simpler if we split this queue into a transmission queue, where sent/unacknowledged packets are kept, and a backlog queue, where we keep the not yet sent packets. In this commit we do this separation. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: extract bundled buffers by cloning instead of copying	Jon Paul Maloy	2015-03-14	1	-14/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we currently extract a bundled buffer from a message bundle in the function tipc_msg_extract(), we allocate a new buffer and explicitly copy the linear data area. This is unnecessary, since we can just clone the buffer and do skb_pull() on the clone to move the data pointer to the correct position. This is what we do in this commit. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: eliminate unnecessary linearization of incoming buffers	Jon Paul Maloy	2015-03-14	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, TIPC linearizes all incoming buffers directly at reception before passing them upwards in the stack. This is clearly a waste of CPU resources, and must be avoided. In this commit, we eliminate this unnecessary linearization. We still ensure that at least the message header is linear, and that the buffer is linearized where this is still needed, i.e. when unbundling and when reversing messages. In addition, we ensure that fragmented messages are validated after reassembly before delivering them upwards in the stack. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: move message validation function to msg.c	Jon Paul Maloy	2015-03-14	1	-1/+43
\| \| \| \| \| \| \| \| \| \| \| \|	The function link_buf_validate() is in reality re-entrant and context independent, and will in later commits be called from several locations. Therefore, we move it to msg.c, make it outline and rename the it to tipc_msg_validate(). We also redesign the function to make proper use of pskb_may_pull() Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: resolve race problem at unicast message reception	Jon Paul Maloy	2015-02-05	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TIPC handles message cardinality and sequencing at the link layer, before passing messages upwards to the destination sockets. During the upcall from link to socket no locks are held. It is therefore possible, and we see it happen occasionally, that messages arriving in different threads and delivered in sequence still bypass each other before they reach the destination socket. This must not happen, since it violates the sequentiality guarantee. We solve this by adding a new input buffer queue to the link structure. Arriving messages are added safely to the tail of that queue by the link, while the head of the queue is consumed, also safely, by the receiving socket. Sequentiality is secured per socket by only allowing buffers to be dequeued inside the socket lock. Since there may be multiple simultaneous readers of the queue, we use a 'filter' parameter to reduce the risk that they peek the same buffer from the queue, hence also reducing the risk of contention on the receiving socket locks. This solves the sequentiality problem, and seems to cause no measurable performance degradation. A nice side effect of this change is that lock handling in the functions tipc_rcv() and tipc_bcast_rcv() now becomes uniform, something that will enable future simplifications of those functions. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: split up function tipc_msg_eval()	Jon Paul Maloy	2015-02-05	1	-21/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function tipc_msg_eval() is in reality doing two related, but different tasks. First it tries to find a new destination for named messages, in case there was no first lookup, or if the first lookup failed. Second, it does what its name suggests, evaluating the validity of the message and its destination, and returning an appropriate error code depending on the result. This is confusing, and in this commit we choose to break it up into two functions. A new function, tipc_msg_lookup_dest(), first attempts to find a new destination, if the message is of the right type. If this lookup fails, or if the message should not be subject to a second lookup, the already existing tipc_msg_reverse() is called. This function performs prepares the message for rejection, if applicable. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: reduce usage of context info in socket and link	Jon Paul Maloy	2015-02-05	1	-18/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The most common usage of namespace information is when we fetch the own node addess from the net structure. This leads to a lot of passing around of a parameter of type 'struct net *' between functions just to make them able to obtain this address. However, in many cases this is unnecessary. The own node address is readily available as a member of both struct tipc_sock and tipc_link, and can be fetched from there instead. The fact that the vast majority of functions in socket.c and link.c anyway are maintaining a pointer to their respective base structures makes this option even more compelling. In this commit, we introduce the inline functions tsk_own_node() and link_own_node() to make it easy for functions to fetch the node address from those structs instead of having to pass along and dereference the namespace struct. In particular, we make calls to the msg_xx() functions in msg.{h,c} context independent by directly passing them the own node address as parameter when needed. Those functions should be regarded as leaves in the code dependency tree, and it is hence desirable to keep them namspace unaware. Apart from a potential positive effect on cache behavior, these changes make it easier to introduce the changes that will follow later in this series. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: tipc ->sendmsg() conversion	Al Viro	2015-02-04	1	-5/+2
\| \| \| \| \| \| \| \|	This one needs to copy the same data from user potentially more than once. Sadly, MTU changes can trigger that ;-/ Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	tipc: make tipc node address support net namespace	Ying Xue	2015-01-12	1	-18/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If net namespace is supported in tipc, each namespace will be treated as a separate tipc node. Therefore, every namespace must own its private tipc node address. This means the "tipc_own_addr" global variable of node address must be moved to tipc_net structure to satisfy the requirement. It's turned out that users also can assign node address for every namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: name tipc name table support net namespace	Ying Xue	2015-01-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TIPC name table is used to store the mapping relationship between TIPC service name and socket port ID. When tipc supports namespace, it allows users to publish service names only owned by a certain namespace. Therefore, every namespace must have its private name table to prevent service names published to one namespace from being contaminated by other service names in another namespace. Therefore, The name table global variable (ie, nametbl) and its lock must be moved to tipc_net structure, and a parameter of namespace must be added for necessary functions so that they can obtain name table variable defined in tipc_net structure. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: involve namespace infrastructure	Ying Xue	2015-01-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Involve namespace infrastructure, make the "tipc_net_id" global variable aware of per namespace, and rename it to "net_id". In order that the conversion can be successfully done, an instance of networking namespace must be passed to relevant functions, allowing them to access the "net_id" variable of per namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: cleanup core.c and core.h files	Ying Xue	2015-01-12	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \|	Only the works of initializing and shutting down tipc module are done in core.h and core.c files, so all stuffs which are not closely associated with the two tasks should be moved to appropriate places. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	put iov_iter into msghdr	Al Viro	2014-12-09	1	-2/+2
\| \| \| \| \| \| \| \|	Note that the code _using_ ->msg_iter at that point will be very unhappy with anything other than unshifted iovec-backed iov_iter. We still need to convert users to proper primitives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	tipc: use generic SKB list APIs to manage TIPC outgoing packet chains	Ying Xue	2014-11-26	1	-38/+36
\| \| \| \| \| \| \| \| \| \|	Use standard SKB list APIs associated with struct sk_buff_head to manage socket outgoing packet chain and name table outgoing packet chain, having relevant code simpler and more readable. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: use generic SKB list APIs to manage link transmission queue	Ying Xue	2014-11-26	1	-25/+25
\| \| \| \| \| \| \| \| \|	Use standard SKB list APIs associated with struct sk_buff_head to manage link transmission queue, having relevant code more clean. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: eliminate two pseudo message types of BUNDLE_OPEN and BUNDLE_CLOSED	Ying Xue	2014-11-26	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pseudo message types of BUNDLE_CLOSED as well as BUNDLE_OPEN are used to flag whether or not more messages can be bundled into a data packet in the outgoing transmission queue. Obviously, no more messages can be appended after the packet has been sent and is waiting to be acknowledged and deleted. These message types do in reality represent a send-side local implementation flag, and are not defined as part of the protocol. It is therefore safe to move it to to where it belongs, that is, the control area (TIPC_SKB_CB) of the buffer. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc_msg_build(): pass msghdr instead of its ->msg_iov	Al Viro	2014-11-24	1	-4/+4
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	tipc: spelling errors	stephen hemminger	2014-10-30	1	-2/+2
\| \| \| \| \|	Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: use pseudo message to wake up sockets after link congestion	Jon Paul Maloy	2014-08-23	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current link implementation keeps a linked list of blocked ports/ sockets that is populated when there is link congestion. The purpose of this is to let the link know which users to wake up when the congestion abates. This adds unnecessary complexity to the data structure and the code, since it forces us to involve the link each time we want to delete a socket. It also forces us to grab the spinlock port_lock within the scope of node_lock. We want to get rid of this direct dependence, as well as the deadlock hazard resulting from the usage of port_lock. In this commit, we instead let the link keep list of a "wakeup" pseudo messages for use in such situations. Those messages are sent to the pending sockets via the ordinary message reception path, and wake up the socket's owner when they are received. This enables us to get rid of the 'waiting_ports' linked lists in struct tipc_port that manifest this direct reference. As a consequence, we can eliminate another BH entry into the socket, and hence the need to grab port_lock. This is a further step in our effort to remove port_lock altogether. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: introduce new function tipc_msg_create()	Jon Paul Maloy	2014-08-23	1	-2/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function tipc_msg_init() has turned out to be of limited value in many cases. It take too few parameters to be usable for creating a complete message, it makes too many assumptions about what the message should be used for, and it does not allocate any buffer to be returned to the caller. Therefore, we now introduce the new function tipc_msg_create(), which takes all the parameters needed to create a full message, and returns a buffer of the requested size. The new function will be very useful for the changes we will be doing in later commits in this series. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: make tipc_buf_append() more robust	Jon Paul Maloy	2014-07-28	1	-8/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As per comment from David Miller, we try to make the buffer reassembly function more resilient to user errors than it is today. - We check that the "buf" parameter always is set, since this is mandatory input. - We ensure that buf->next always is set to NULL before linking in the buffer, instead of relying of the caller to have done this. - We ensure that the "tail" pointer in the head buffer's control block is initialized to NULL when the first fragment arrives. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: rename temporarily named functions	Jon Paul Maloy	2014-07-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	After the previous commit, we can now give the functions with temporary names, such as tipc_link_xmit2(), tipc_msg_build2() etc., their proper names. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove unreferenced functions	Jon Paul Maloy	2014-07-16	1	-35/+0
\| \| \| \| \| \| \| \| \| \| \|	We can now remove a number of functions which have become obsolete and unreferenced through this commit series. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: add new functions for multicast and broadcast distribution	Jon Paul Maloy	2014-07-16	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We add a new broadcast link transmit function in bclink.c and a new receive function in socket.c. The purpose is to move the branching between external and internal destination down to the link layer, just as we have done with unicast in earlier commits. We also make use of the new link-independent fragmentation support that was introduced in an earlier commit series. This gives a shorter and simpler code path, and makes it possible to obtain copy-free buffer delivery to all node local destination sockets. The new transmission code is added in parallel with the existing one, and will be used by the socket multicast send function in the next commit in this series. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2014-07-16	1	-3/+8
\|\ \| \| \| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: fix bug in multicast/broadcast message reassembly	Jon Paul Maloy	2014-07-08	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 37e22164a8a3c39bdad45aa463b1e69a1fdf4110 ("tipc: rename and move message reassembly function") reassembly of long broadcast messages has been broken. This is because we test for a non-NULL return value of the *buf parameter as criteria for succesful reassembly. However, this parameter is left defined even after reception of the first fragment, when reassebly is still incomplete. This leads to a kernel crash as soon as a the first fragment of a long broadcast message is received. We fix this with this commit, by implementing a stricter behavior of the function and its return values. This commit should be applied to both net and net-next. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tipc: clean up connection protocol reception function	Jon Paul Maloy	2014-06-27	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We simplify the code for receiving connection probes, leveraging the recently introduced tipc_msg_reverse() function. We also stick to the principle of sending a possible response message directly from the calling (tipc_sk_rcv or backlog_rcv) functions, hence making the call chain shallower and easier to follow. We make one small protocol change here, allowed according to the spec. If a protocol message arrives from a remote socket that is not the one we are connected to, we are currently generating a connection abort message and send it to the source. This behavior is unnecessary, and might even be a security risk, so instead we now choose to only ignore the message. The consequnce for the sender is that he will need longer time to discover his mistake (until the next timeout), but this is an extreme corner case, and may happen anyway under other circumstances, so we deem this change acceptable. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tipc: introduce message evaluation function	Jon Paul Maloy	2014-06-27	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a message arrives in a node and finds no destination socket, we may need to drop it, reject it, or forward it after a secondary destination lookup. The latter two cases currently results in a code path that is perceived as complex, because it follows a deep call chain via obscure functions such as net_route_named_msg() and net_route_msg(). We now introduce a function, tipc_msg_eval(), that takes the decision about whether such a message should be rejected or forwarded, but leaves it to the caller to actually perform the indicated action. If the decision is 'reject', it is still the task of the recently introduced function tipc_msg_reverse() to take the final decision about whether the message is rejectable or not. In the latter case it drops the message. As a result of this change, we can finally eliminate the function net_route_named_msg(), and hence become independent of net_route_msg(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tipc: separate building and sending of rejected messages	Jon Paul Maloy	2014-06-27	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way we build and send rejected message is currenty perceived as hard to follow, partly because we let the transmission go via deep call chains through functions such as tipc_reject_msg() and net_route_msg(). We want to remove those functions, and make the call sequences shallower and simpler. For this purpose, we separate building and sending of rejected messages. We build the reject message using the new function tipc_msg_reverse(), and let the transmission go via the newly introduced tipc_link_xmit2() function, as all transmission eventually will do. We also ensure that all calls to tipc_link_xmit2() are made outside port_lock/bh_lock_sock. Finally, we replace all calls to tipc_reject_msg() with the two new calls at all locations in the code that we want to keep. The remaining calls are made from code that we are planning to remove, along with tipc_reject_msg() itself. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tipc: introduce direct iovec to buffer chain fragmentation function	Jon Paul Maloy	2014-06-27	1	-0/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fragmentation at message sending is currently performed in two places in link.c, depending on whether data to be transmitted is delivered in the form of an iovec or as a big sk_buff. Those functions are also tightly entangled with the send functions that are using them. We now introduce a re-entrant, standalone function, tipc_msg_build2(), that builds a packet chain directly from an iovec. Each fragment is sized according to the MTU value given by the caller, and is prepended with a correctly built fragment header, when needed. The function is independent from who is calling and where the chain will be delivered, as long as the caller is able to indicate a correct MTU. The function is tested, but not called by anybody yet. Since it is incompatible with the existing tipc_msg_build(), and we cannot yet remove that function, we have given it a temporary name. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tipc: introduce send functions for chained buffers in link	Jon Paul Maloy	2014-06-27	1	-11/+85
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current link implementation provides several different transmit functions, depending on the characteristics of the message to be sent: if it is an iovec or an sk_buff, if it needs fragmentation or not, if the caller holds the node_lock or not. The permutation of these options gives us an unwanted amount of unnecessarily complex code. As a first step towards simplifying the send path for all messages, we introduce two new send functions at link level, tipc_link_xmit2() and __tipc_link_xmit2(). The former looks up a link to the message destination, and if one is found, it grabs the node lock and calls the second function, which works exclusively inside the node lock protection. If no link is found, and the destination is on the same node, it delivers the message directly to the local destination socket. The new functions take a buffer chain where all packet headers are already prepared, and the correct MTU has been used. These two functions will later replace all other link-level transmit functions. The functions are not backwards compatible, so we have added them as new functions with temporary names. They are tested, but have no users yet. Those will be added later in this series. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: rename and move message reassembly function	Jon Paul Maloy	2014-05-14	1	-1/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function tipc_link_frag_rcv() is in reality a re-entrant generic message reassemby function that has nothing in particular to do with the link, where it is defined now. This becomes obvious when we see the need to call the function from other places in the code. In this commit rename it to tipc_buf_append() and move it to the file msg.c. We also simplify its signature by moving the tail pointer to the control block of the head buffer, hence making the head buffer self-contained. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove iovec length parameter from all sending functions	Ying Xue	2013-10-18	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	tipc_msg_build() now copies message data from iovec to skb_buff using memcpy_fromiovecend(), which doesn't need to be passed the iovec length to perform the copying. So we remove the parameter indicating iovec length in all functions where TIPC messages are built and sent. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: don't use memcpy to copy from user space	Ying Xue	2013-10-18	1	-13/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tipc_msg_build() calls skb_copy_to_linear_data_offset() to copy data from user space to kernel space. However, the latter function does in its turn call memcpy() to perform the actual copying. This poses an obvious security and robustness risk, since memcpy() never makes any validity check on the pointer it is copying from. To correct this, we the replace the offending function call with a call to memcpy_fromiovecend(), which uses copy_from_user() to perform the copying. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: cosmetic realignment of function arguments	Paul Gortmaker	2013-06-17	1	-2/+2
\| \| \| \| \| \| \| \| \|	No runtime code changes here. Just a realign of the function arguments to start where the 1st one was, and fit as many args as can be put in an 80 char line. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove user_port instance from tipc_port structure	Ying Xue	2013-06-17	1	-10/+5
\| \| \| \| \| \| \| \| \| \| \| \|	After the native API has been completely removed, the 'user_port' field in struct tipc_port becomes unused, and can be removed. As a consequence, the "usrmem" argument in tipc_msg_build() is no longer needed, and so we remove that one too. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove TIPC packet debugging functions and macros	Erik Hugne	2012-07-13	1	-242/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The link queue traces and packet level debug functions served a purpose during early development, but are now redundant since there are other, more capable tools available for debugging at the packet level. The TIPC_DEBUG Kconfig option is removed since it does not provide any extra debugging features anymore. This gets rid of a lot of tipc_printf usages, which will make the pending cleanup work of that function easier. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: compress out gratuitous extra carriage returns	Paul Gortmaker	2012-04-30	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some of the comment blocks are floating in limbo between two functions, or between blocks of code. Delete the extra line feeds between any comment and its associated following block of code, to be consistent with the majority of the rest of the kernel. Also delete trailing newlines at EOF and fix a couple trivial typos in existing comments. This is a 100% cosmetic change with no runtime impact. We get rid of over 500 lines of non-code, and being blank line deletes, they won't even show up as noise in git blame. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate trivial buffer manipulation helper routines	Allan Stephens	2012-02-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Gets rid of two inlined routines that simply call existing sk_buff manipulation routines, since there is no longer any extra processing done by the helper routines. Note that these changes are essentially cosmetic in nature, and have no impact on the actual operation of TIPC. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Hide media-specific addressing details from generic bearer code	Allan Stephens	2011-12-27	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reworks TIPC's media address data structure and associated processing routines to transfer all media-specific details of address conversion to the associated TIPC media adaptation code. TIPC's generic bearer code now only needs to know which media type an address is associated with and whether or not it is a broadcast address, and totally ignores the "value" field that contains the actual media-specific addressing info. These changes eliminate the need for a number of endianness conversion operations and will make it easier for TIPC to support new media types in the future. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate useless check when creating internal message	Allan Stephens	2011-06-24	1	-4/+2
\| \| \| \| \| \| \| \| \|	Gets rid of code that allows tipc_msg_init() to create a short payload message header. This optimization is possible because there are no longer any callers who require this capability. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Avoid recomputation of outgoing message length	Allan Stephens	2011-05-10	1	-22/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework TIPC's message sending routines to take advantage of the total amount of data value passed to it by the kernel socket infrastructure. This change eliminates the need for TIPC to compute the size of outgoing messages itself, as well as the check for an oversize message in tipc_msg_build(). In addition, this change warrants an explanation: - res = send_packet(NULL, sock, &my_msg, 0); + res = send_packet(NULL, sock, &my_msg, bytes_to_send); Previously, the final argument to send_packet() was ignored (since the amount of data being sent was recalculated by a lower-level routine) and we could just pass in a dummy value (0). Now that the recalculation is being eliminated, the argument value being passed to send_packet() is significant and we have to supply the actual amount of data we want to send. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate obsolete routine for handling routed messages	Allan Stephens	2011-03-13	1	-6/+0
\| \| \| \| \| \| \| \| \|	Eliminates a routine that is used in handling messages arriving from another cluster or zone. Such messages can no longer be received by TIPC now that multi-cluster and multi-zone network support has been eliminated. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate remaining support for routing table messages	Allan Stephens	2011-03-13	1	-27/+0
\| \| \| \| \| \| \| \| \| \|	Gets rid of all remaining code relating to ROUTE_DISTRIBUTOR messages. These messages were only used in multi-cluster and multi-zone networks, which TIPC no longer supports. (For safety, TIPC now treats such messages the same way that it handles other unrecognized messages.) Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	tipc: Eliminate timestamp from link protocol messages	Allan Stephens	2011-03-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Removes support for the timestamp field of TIPC's link protocol messages. This field was previously used to hold an OS-dependent timestamp value that was used to assist in debugging early versions of TIPC. The field has now been deemed unnecessary and has been removed from the latest TIPC specification. This change has no impact on the operation of TIPC since the field was set by TIPC, but never referenced. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>