path: root/sys/ofed
Commit message (Author, Date, Files changed, Lines -removed/+added)
* Fix OFED startup order: All SYSINIT()'s and modules should be loaded
  prior to starting "/sbin/init" which will run all the "/etc/rc.d/xxx"
  scripts. Else there can be a race configuring the interfaces via
  "/etc/rc.conf".
  MFC after: 4 weeks
  Sponsored by: Mellanox Technologies
  (hselasky, 2014-07-06, 4 files, -6/+11)
* Fix compile warning.
  MFC after: 4 weeks
  Sponsored by: Mellanox Technologies
  (hselasky, 2014-07-06, 1 file, -1/+1)
* Fix some compile warnings.
  MFC after: 4 weeks
  Sponsored by: Mellanox Technologies
  (hselasky, 2014-07-06, 1 file, -12/+12)
* Compile fixes:
  - Remove duplicate "debug_ktr.mask" sysctl definition. Remove now
    unused variable from "kern_ktr.c". This fixes build of "ktr" which
    was broken by r267961.
  - Let the default value for "vm_kmem_size_scale" be zero. After the
    sysctl has been initialized from "getenv()", the "kmeminit()"
    function sets it to the "VM_KMEM_SIZE_MAX" value if it is still
    zero. On Sparc64 the "VM_KMEM_SIZE_MAX" macro is not a constant.
    This fixes build of Sparc64 which was broken by r267961.
  - Add a special macro to dynamically create SYSCTL root nodes, because
    root nodes have a special parent. This fixes build of existing OFED
    module and CANBUS module for pc98 which was broken by r267961.
  - Add missing "sysctl.h" includes to get the needed sysctl header file
    declarations. This is needed after r267961.
  MFC after: 2 weeks
  (hselasky, 2014-06-28, 3 files, -1/+3)
* - Fix out of range shifting bug in bitops.h.
  - Make code a bit easier to read by adding parentheses.
  MFC after: 3 days
  Sponsored by: Mellanox Technologies
  (hselasky, 2014-06-12, 1 file, -5/+5)
* Use src.opts.mk in preference to bsd.own.mk except where we need stuff
  from the latter.
  (imp, 2014-05-06, 2 files, -2/+2)
* Rename global cnt to vm_cnt to avoid shadowing.
  To reduce the diff, the struct pcpu.cnt field was not renamed, so
  PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
  kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
  Bump __FreeBSD_version in case some out-of-tree module/code relies on
  the global cnt variable.
  Exp-run revealed no ports using it directly.
  No objection from: arch@
  Sponsored by: EMC / Isilon Storage Division
  (bdrewery, 2014-03-22, 2 files, -2/+2)
* Since 32-bit if_baudrate isn't enough to describe a baud rate of a
  10 Gbit interface, a crutch was provided in r241616. It didn't work
  well, and finally we decided that it is time to break ABI and simply
  make if_baudrate a 64-bit value. Meanwhile, the entire struct if_data
  was reviewed.
  o Remove the if_baudrate_pf crutch.
  o Make all fields of struct if_data a fixed, machine independent size.
    The data (packet counters, etc) is by no means MD. And it is a bug
    that on amd64 we've got 64-bit counters, while on i386 32-bit, which
    at modern speeds overflow within a second. This also removes quite
    a lot of COMPAT_FREEBSD32 code.
  o Give 16 bits to the ifi_datalen field. This field was provided to
    make future changes to if_data less ABI breaking. Unfortunately the
    8 bit size of it had effectively limited sizeof if_data to 256
    bytes.
  o Give 32 bits to ifi_mtu and ifi_metric.
  o Give 64 bits to the rest of the fields, since they are counters.
  __FreeBSD_version bumped.
  Discussed with: emax
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  (glebius, 2014-03-13, 3 files, -13/+7)
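The width problem behind this change can be sketched in plain C. This is only an illustration, not code from the commit: the `sketch_` names are hypothetical, and the two helpers merely model storing a line rate in a 32-bit versus a 64-bit field.

```c
#include <assert.h>
#include <stdint.h>

/* Line rate of a 10 Gbit/s interface in bits per second (illustrative). */
#define SKETCH_TEN_GBIT_BPS UINT64_C(10000000000)

/* Models a 32-bit field: the value is reduced modulo 2^32 on store. */
uint32_t sketch_baudrate_u32(uint64_t bps)
{
	return (uint32_t)bps;
}

/* Models a 64-bit field: the value is stored unchanged. */
uint64_t sketch_baudrate_u64(uint64_t bps)
{
	return bps;
}

/* Hypothetical usage: only the 64-bit field round-trips the rate. */
void sketch_baudrate_example(void)
{
	assert(sketch_baudrate_u32(SKETCH_TEN_GBIT_BPS) != SKETCH_TEN_GBIT_BPS);
	assert(sketch_baudrate_u64(SKETCH_TEN_GBIT_BPS) == SKETCH_TEN_GBIT_BPS);
}
```

Since 10^10 exceeds 2^32 - 1 (4294967295), the 32-bit model keeps only the remainder 1410065408.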
* Simplify filling sockaddr_dl structure for if_resolvemulti()
  callback providers.
  link_init_sdl() function can be used to fill most of the parameters.
  Use caller stack instead of allocating / freeing memory for each
  request. Do not drop support for extra-long (probably non-existing)
  link-layer protocols by introducing link_alloc_sdl() (used by
  if_resolvemulti() callback) and link_free_sdl() (used by caller).
  Since this change breaks KBI, MFC requires slightly different approach
  (link_init_sdl() auto-allocating buffer if necessary to handle cases
  with unmodified if_resolvemulti() callers).
  MFC after: 2 weeks
  (melifaro, 2014-01-18, 1 file, -16/+2)
* Similar to r260020, only use -fms-extensions with gcc, for all other
  modules which require this flag to compile. Use a GCC_MS_EXTENSIONS
  variable, defined in kern.pre.mk, which can be used to easily supply
  the flag (or not), depending on the compiler type.
  MFC after: 3 days
  (dim, 2013-12-30, 2 files, -2/+2)
* Defer start/stop port to workqueues.
  We need to do this because the Linux compat layer uses sx(9) for
  mutexes, however the lagg code uses rmlocks and calls into the
  Mellanox driver. This causes a deadlock due to sleeping while holding
  an rmlock.
  Submitted by: Shahar Klein (shahark mellanox.com)
  MFC after: 3 days
  (alfred, 2013-12-15, 2 files, -25/+75)
* Fix undefined behavior: (1 << 31) is not defined as 1 is an int and
  this shifts into the sign bit. Instead use (1U << 31) which gets the
  expected result.
  This fix is not ideal as it assumes a 32 bit int, but does fix the
  issue for most cases.
  A similar change was made in OpenBSD.
  Discussed with: -arch, rdivacky
  Reviewed by: cperciva
  (eadler, 2013-11-30, 4 files, -13/+13)
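The pattern described above can be written as a small helper. This is a minimal sketch that, like the commit, assumes a 32-bit int; `sketch_bit32` is a hypothetical name and does not appear in the tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask for bit n (0..31) of a 32-bit word. The constant 1U is unsigned,
 * so shifting it by 31 is well defined; with a plain int constant,
 * (1 << 31) would shift into the sign bit.
 */
uint32_t sketch_bit32(unsigned int n)
{
	assert(n < 32);
	return (1U << n);
}

/* Hypothetical usage: the top bit is an ordinary unsigned value. */
void sketch_bit32_example(void)
{
	assert(sketch_bit32(31) == UINT32_C(0x80000000));
}
```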
* Fix creating a vlan over lagg over mlxen crash.
  PR: 181931
  Submitted by: Shahar Klein (shahark mellanox.com)
  (alfred, 2013-11-17, 1 file, -0/+6)
* Do not use a sleep lock when protecting the driver flags.
  This was causing a locking issue with lagg.
  Submitted by: odeds
  (alfred, 2013-11-08, 2 files, -2/+5)
* Fix for bad performance when mtu is increased.
  Update the auto moderation behavior in the mlxen driver to match the
  new LINUX OFED code.
  Submitted by: odeds
  (alfred, 2013-11-08, 3 files, -65/+45)
* Use explicit long cast to avoid overflow in bitops.
  This was causing problems with the buddy allocator inside of ofed.
  Submitted by: odeds
  (alfred, 2013-11-08, 1 file, -3/+3)
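The kind of overflow this entry refers to can be sketched with a bitmap stored in an array of unsigned long words. The commit's actual diff is not shown here, so the helpers below are an assumption about the general technique; all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of bits in one bitmap word. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask for `bit` within its word; 1UL keeps the shift in unsigned long. */
unsigned long sketch_bit_mask(size_t bit)
{
	return (1UL << (bit % SKETCH_BITS_PER_LONG));
}

/* Set `bit` in a bitmap made of unsigned long words. */
void sketch_set_bit(size_t bit, unsigned long *map)
{
	map[bit / SKETCH_BITS_PER_LONG] |= sketch_bit_mask(bit);
}

/* Report whether `bit` is set in the bitmap. */
bool sketch_test_bit(size_t bit, const unsigned long *map)
{
	return ((map[bit / SKETCH_BITS_PER_LONG] & sketch_bit_mask(bit)) != 0);
}

/* Hypothetical usage: a bit index above 31 lands in the right place. */
void sketch_bitmap_example(void)
{
	unsigned long map[4] = { 0 };

	sketch_set_bit(40, map);
	assert(sketch_test_bit(40, map));
}
```

With an int constant instead of 1UL, a shift count of 32 or more would be out of range on platforms where long is 64 bits wide.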
* Fix API mismatch exposed by lagg.
  When destroying a lagg, the driver tries to restore the old MAC
  address and fails due to an API mismatch.
  (alfred, 2013-11-02, 1 file, -2/+2)
* The r48589 promised to remove implicit inclusion of if_var.h soon.
  Prepare for this event by adding if_var.h to files that do need it.
  Also, include all headers that are currently pulled in only through
  implicit pollution via if_var.h.
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  (glebius, 2013-10-26, 3 files, -0/+3)
* Fix resource free.
  The order of releasing resources in mlxen was wrong, which caused a
  panic on reload of the module. The conf_ctx list should be released
  before the stat_ctx list, otherwise the leaves in the conf_ctx list
  won't be released because of the dependency.
  The fix is to change the order of the releases.
  Submitted by: Shahar Klein (shahark at mellanox.com)
  (alfred, 2013-10-17, 1 file, -3/+3)
* Fix __free_pages() in the linux shim.
  __free_pages() is actually supposed to take a "struct page *", not an
  address.
  (alfred, 2013-10-15, 1 file, -3/+3)
* Fix for when more than one NIC is present.
  The device name was incorrect due to a specific function we ported
  from the Linux driver that is not FBSD compatible. This resulted in a
  false sysctl registration and some more problematic issues. The patch
  basically revokes it altogether.
  Submitted by: Meny Yossefi (menyy mellanox.com)
  Approved by: re
  (alfred, 2013-10-10, 1 file, -34/+1)
* Remove redundant declaration of cmclass in
  sys/ofed/drivers/infiniband/core/ucm.c, to silence a gcc warning.
  Approved by: re (kib)
  X-MFC-With: r255932
  (dim, 2013-10-09, 1 file, -3/+0)
* Give an unnamed union in sys/ofed/include/rdma/ib_verbs.h a name, to
  silence a gcc warning.
  Approved by: re (gjb)
  MFC after: 3 days
  (dim, 2013-10-07, 1 file, -1/+1)
* Fixed kernel crash when running devinfo.
  ib_uverbs_cleanup_ucontext calls mutex_lock on xrcd_table_mutex, which
  was not initialized. Added the missing initialization for
  xrcd_table_mutex.
  Submitted by: Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-10-01, 1 file, -0/+2)
* Enable ib_dev.mmap function.
  Removed the ifdef linux from this function. Added stub function for
  contiguous pages to avoid compilation errors.
  Submitted by: Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-10-01, 3 files, -2/+73)
* Fixed 'Couldn't Create QP' issue when running rc_pingpong,
  uc_pingpong, srq_pingpong IBverbs.
  Removed references using 'ifdef __linux__' to qpg functions and
  related fields in struct ib_qp_init_attr.
  Submitted by: Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-10-01, 1 file, -0/+18)
* Fixed kernel crash when removing IPOIB_CM option from configuration
  file.
  Changed module init from module_init() to module_init_order() with
  SI_ORDER_MIDDLE flag.
  Submitted by: Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-10-01, 2 files, -2/+2)
* Fix mis-merge of upstream fix.
  We would accidentally make the string one byte too short.
  Submitted by: Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-10-01, 1 file, -4/+0)
* Update OFED to Linux 3.7 and update Mellanox drivers.
  Update the OFED Infiniband core to the version supplied in Linux
  version 3.7.
  The update to OFED consists almost entirely of additional defines and
  functions, with the exception of the additional parameters to
  ib_register_device() and the reg_user_mr callback.
  In addition the ibcore (Infiniband core) and ipoib (IP over
  Infiniband) have both been made into completely loadable modules to
  facilitate testing of the OFED stack in FreeBSD.
  Finally the Mellanox Infiniband drivers are now updated to the latest
  version shipping with Linux 3.7.
  Submitted by: Mellanox FreeBSD driver team:
                Oded Shanoon (odeds mellanox.com),
                Meny Yossefi (menyy mellanox.com),
                Orit Moskovich (oritm mellanox.com)
  Approved by: re
  (alfred, 2013-09-29, 96 files, -3500/+24925)
* Handle cases where capability rights are not provided.
  Reported by: kib
  (pjd, 2013-09-05, 1 file, -3/+9)
* Change m->pkthdr.header to m->pkthdr.PH_loc.ptr after r254804
  to transiently store pointers to packet headers.
  Sponsored by: The FreeBSD Foundation
  (andre, 2013-08-25, 1 file, -2/+2)
* Fix implementation of sock_getname.
  MFC after: 1 week
  (np, 2013-08-23, 1 file, -5/+5)
* Stop an ipoib interface before detaching it.
  PR: kern/181225
  Submitted by: Shahar Klein
  Obtained from: Mellanox
  MFC after: 1 week
  (jhb, 2013-08-20, 1 file, -0/+2)
* Add m_clrprotoflags() to clear protocol specific mbuf flags at up and
  downwards layer crossings.
  Consistently use it within IP, IPv6 and ethernet protocols.
  Discussed with: trociny, glebius
  (andre, 2013-08-19, 1 file, -1/+1)
* Make sendfile() a method in the struct fileops. Currently only
  vnode backed file descriptors have this method implemented.
  Reviewed by: kib
  Sponsored by: Nginx, Inc.
  Sponsored by: Netflix
  (glebius, 2013-08-15, 1 file, -0/+1)
* - Reserve a special AF for SDP. The one we were incorrectly using
    before was taken by another AF.
  Sponsored by: EMC / Isilon Storage Division
  (jeff, 2013-08-09, 1 file, -0/+2)
* - Correctly handle various edge cases in sysfs emulation.
  Sponsored by: EMC / Isilon Storage Division
  (jeff, 2013-08-09, 1 file, -4/+7)
* - Use the correct type in the linux bitops emulation.
  Submitted by: Maxim Ignatenko <gelraen.ua@gmail.com>
  (jeff, 2013-08-09, 1 file, -4/+9)
* Split the pagequeues per NUMA domains, and split pagedaemon process
  into threads each processing queue in a single domain. The structure
  of the pagedaemons and queues is kept intact, most of the changes come
  from the need for code to find an owning page queue for given page,
  calculated from the segment containing the page.

  The tie between NUMA domain and pagedaemon thread/pagequeue split is
  rather arbitrary, the multithreaded daemon could be allowed for the
  single-domain machines, or one domain might be split into several page
  domains, to further increase concurrency.

  Right now, each pagedaemon thread tries to reach the global target,
  precalculated at the start of the pass. This is not optimal, since it
  could cause excessive page deactivation and freeing. The code should
  be changed to re-check the global page deficit state in the loop after
  some number of iterations.

  The pagedaemons reach the quorum before starting the OOM, since one
  thread's inability to meet the target is normal for split queues. Only
  when all pagedaemons fail to produce enough reusable pages is OOM
  started by a single selected thread.

  Launder is modified to take into account the segments layout with
  regard to the region for which cleaning is performed.

  Based on the preliminary patch by jeff, sponsored by EMC / Isilon
  Storage Division.
  Reviewed by: alc
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  (kib, 2013-08-07, 1 file, -0/+1)
* Replace kernel virtual address space allocation with vmem. This
  provides transparent layering and better fragmentation.
  - Normalize functions that allocate memory to use kmem_*
  - Those that allocate address space are named kva_*
  - Those that operate on maps are named kmap_*
  - Implement recursive allocation handling for kmem_arena in vmem.
  Reviewed by: alc
  Tested by: pho
  Sponsored by: EMC / Isilon Storage Division
  (jeff, 2013-08-07, 3 files, -9/+9)
* Add a missing prototype.
  Pointy hat: me
  (jhb, 2013-07-29, 1 file, -0/+1)
* Various fixes to the mlxen(4) driver:
  - Remove an incorrect assertion that can trigger when downing an
    interface.
  - Stop the interface during detach to avoid panics when unloading the
    driver.
  - A few locking fixes to be more consistent with other FreeBSD
    drivers:
    - Protect if_drv_flags with the driver lock, not atomic ops
    - Hold the driver lock when adjusting multicast state.
    - Hold the driver lock while adjusting if_capenable.
  PR: kern/180791 [1,2]
  Submitted by: Shakar Klein @ Mellanox [1,2]
  MFC after: 3 days
  (jhb, 2013-07-29, 1 file, -21/+31)
* Avoid trashing IP fragments:
  - Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is
    set.
  - Only enable IP hardware checksums if CSUM_IP is set.
  PR: kern/180430
  Submitted by: Meny Yossefi <menyy@mellanox.com>
  MFC after: 1 week
  (jhb, 2013-07-25, 1 file, -2/+6)
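The two rules above amount to gating each offload on the flags the stack actually requested for the packet. The sketch below only illustrates that decision: the `SK_CSUM_*` bit values, the struct, and the function name are hypothetical stand-ins, not the kernel's CSUM_* definitions or the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for CSUM_IP, CSUM_TCP, CSUM_UDP. */
#define SK_CSUM_IP	0x01u
#define SK_CSUM_TCP	0x02u
#define SK_CSUM_UDP	0x04u

/* Which hardware checksum offloads to request for one packet. */
struct sk_tx_offload {
	bool ip_csum;	/* IP header checksum */
	bool l4_csum;	/* TCP or UDP checksum */
};

/* Enable an offload only when the matching flag is set on the packet. */
struct sk_tx_offload sk_pick_offload(uint32_t csum_flags)
{
	struct sk_tx_offload off;

	off.ip_csum = (csum_flags & SK_CSUM_IP) != 0;
	off.l4_csum = (csum_flags & (SK_CSUM_TCP | SK_CSUM_UDP)) != 0;
	return (off);
}

/* Hypothetical usage: a packet with no flags set gets no offload. */
void sk_offload_example(void)
{
	struct sk_tx_offload off = sk_pick_offload(0);

	assert(!off.ip_csum && !off.l4_csum);
}
```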
* rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST
  Also directly call swapper() at the end of mi_startup instead of
  relying on swapper being the last thing in sysinits order.
  Rationale:
  - "RUN_SCHEDULER" was misleading, scheduling already takes place at
    that stage
  - "scheduler" was misleading, the function swaps in the swapped out
    processes
  - another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
    invoked depending on its relative order with scheduler; this was not
    obvious and the bug actually used to exist
  Reviewed by: kib (earlier version)
  MFC after: 14 days
  (avg, 2013-07-24, 1 file, -3/+3)
* Rework the previous fix for the IB vs Ethernet sysctl handler to be
  more generic and apply to all sysfs attributes:
  - Use sysctl_handle_string() instead of reimplementing it.
  - Remove trailing newline from the current value before passing it to
    userland and append a newline to the new string value before passing
    it to the attribute's store function.
  - Don't leak the temporary buffer if the first error check triggers.
  - Revert earlier change to mlx4 port mode handler.
  PR: kern/174213
  Submitted by: Garrett Cooper
  Reviewed by: Shakar Klein @ Mellanox
  MFC after: 1 week
  (jhb, 2013-07-18, 2 files, -20/+18)
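The newline handling listed above can be sketched as two string helpers in userland C. This is an illustration under assumptions, not the sysfs wrapper itself: the `sk_` names are hypothetical, and "ib" is used only as an example attribute value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Drop one trailing newline, if present, as done before handing an
 * attribute value to userland. Returns the resulting length.
 */
size_t sk_strip_newline(char *s)
{
	size_t len = strlen(s);

	if (len > 0 && s[len - 1] == '\n')
		s[--len] = '\0';
	return (len);
}

/*
 * Append a newline, as done before handing a new value to a store
 * function. Returns 0 on success, or -1 if the buffer is too small.
 */
int sk_append_newline(char *s, size_t bufsize)
{
	size_t len = strlen(s);

	if (len + 2 > bufsize)
		return (-1);
	s[len] = '\n';
	s[len + 1] = '\0';
	return (0);
}

/* Hypothetical usage: strip for the reader, re-append for the store. */
void sk_newline_example(void)
{
	char buf[8] = "ib\n";

	assert(sk_strip_newline(buf) == 2);
	assert(sk_append_newline(buf, sizeof(buf)) == 0);
	assert(strcmp(buf, "ib\n") == 0);
}
```

Checking the buffer size before appending keeps the sketch from writing past the end of a full buffer.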
* Remove check forbidding requests that would result in one port being
  set to Ethernet and the subsequent port being set to IB.
  Submitted by: Shakar Klein @ Mellanox
  Tested by: Morgan Robertson <morganrobertson@gmail.com>
  MFC after: 1 week
  (jhb, 2013-07-17, 1 file, -3/+0)
* Allow mlx4 devices to switch from Ethernet to Infiniband (and vice
  versa):
  - Fix sysctl wrapper for sysfs attributes to properly handle new
    string values similar to sysctl_handle_string() (only copyin the
    user's supplied length and nul-terminate the string).
  - Don't check for a trailing newline when evaluating the desired
    operating mode of a mlx4 device.
  PR: kern/179999
  Submitted by: Shahar Klein <shahark@mellanox.com>
  MFC after: 1 week
  (jhb, 2013-07-08, 2 files, -5/+10)
* Store a reference to the vnode associated with a file descriptor in
  the linux_file structure and use it instead of directly accessing
  td_fpop when destroying the linux_file structure. The td_fpop pointer
  is not valid when a cdevpriv destructor is run, and the type-specific
  close method has already been called, so f_vnode may not be valid (and
  the vnode might have been recycled without our own reference).
  Tested by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
  MFC after: 1 week
  (jhb, 2013-06-11, 2 files, -1/+5)
* Fix a bunch of typos.
  PR: misc/174625
  Submitted by: Jeremy Chadwick <jdc@koitsu.org>
  (eadler, 2013-05-10, 1 file, -1/+1)
* According to the documentation, on Linux, cancel_delayed_work() does
  not drain (flush_workqueue() in Linux terms) but instead returns true
  if the work was removed before it is run, or false otherwise.
  Simulate this by removing the taskqueue_drain() and returning a value
  derived from taskqueue_cancel()'s return value. This would solve a
  witness warning caused by calling taskqueue_drain() with a
  non-sleepable lock held, like:
    taskqueue_drain with the following non-sleepable locks held:
    exclusive rw lle (lle) r = 0 (0xfffffe001450b410) locked @ /usr/src/sys/netinet/in.c:1484
    KDB: stack backtrace:
    db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff848d4f7690
    kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff848d4f7740
    witness_warn() at witness_warn+0x4a8/frame 0xffffff848d4f7800
    taskqueue_drain() at taskqueue_drain+0x3a/frame 0xffffff848d4f7840
    set_timeout() at set_timeout+0x4a/frame 0xffffff848d4f7860
    netevent_callback() at netevent_callback+0x16/frame 0xffffff848d4f7870
    arpintr() at arpintr+0x9b5/frame 0xffffff848d4f7930
  This does not affect kernels without OFED compiled in.
  Reported by: Garrett Cooper <yaneurabeya gmail com> (who also tested
               an earlier version of this patch, but bugs are mine)
  MFC after: 2 weeks
  (delphij, 2013-05-08, 1 file, -3/+3)