path: root/sys
Columns: commit message, author, age, files, lines
* marius | 2007-08-05 | 7 files | -143/+95
  - Divorce the IOTSBs, which so far were handled via a global list instead of per IOMMU, so we no longer need to program all of them identically in systems having multiple IOMMUs. This continues the rototilling of the nexus(4) done about 5 months ago, which amongst others changed nexus(4) and the drivers for host-to-foo bridges to provide bus_get_dma_tag methods, allowing DMA tags to be handled in a hierarchical way and linked with devices. This still doesn't move the silicon bug workarounds for Sabre (and in the uncommitted schizo(4) for Tomatillo) bridges into special bus_dma_tag_create() and bus_dmamap_sync() methods though, as w/o fully newbus'ified bus_dma_tag_create() and bus_dma_tag_destroy() this still requires too much hackery, i.e. per-child parent DMA tags in the parent driver.
  - Let the host-to-foo drivers supply the maximum physical address of the IOMMU accompanying the bridges. Previously iommu(4) hard-coded an upper limit of 16GB, which actually only applies to the IOMMUs of the Hummingbird and Sabre bridges. The Psycho variants as well as the U2S in fact can translate to up to 2TB, i.e. translate to 41-bit physical addresses. According to the recently available Tomatillo documentation these bridges even translate to 43-bit physical addresses, and the documentation hints at the Schizo bridges doing 43 bits as well. This fixes the issue the FreeBSD 6.0 todo list item "Max RAM on sparc64" was referring to and pretty much obsoletes the lack of support for bounce buffers on sparc64. Thanks to Nathan Whitehorn for pointing me at the Tomatillo manual.
  Approved by: re (kensmith)
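The address limits in the second item are powers of two, so they can be checked with a few lines of arithmetic. The sketch below is only an illustration of the figures quoted in the message; the table name and helper functions are hypothetical, and the 34-bit width for Hummingbird and Sabre is derived here from the stated 16GB limit rather than taken from the commit.

```python
# Physical address widths per bridge, as quoted in the commit message.
# The 34-bit entries are an inference from the stated 16GB limit.
# Schizo is omitted because the message only says the manual hints at 43 bits.
IOMMU_PA_BITS = {
    "Hummingbird": 34,
    "Sabre": 34,
    "Psycho": 41,
    "U2S": 41,
    "Tomatillo": 43,
}

GB = 1 << 30
TB = 1 << 40


def addressable_bytes(pa_bits: int) -> int:
    """Number of bytes reachable with a physical address of pa_bits bits."""
    return 1 << pa_bits


def max_physical_address(pa_bits: int) -> int:
    """Highest physical address representable in pa_bits bits."""
    return (1 << pa_bits) - 1
```

With these definitions, a 34-bit IOMMU covers 16GB and a 41-bit one covers 2TB, which matches the numbers in the message; a 43-bit one would cover 8TB.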
* marius | 2007-08-05 | 2 files | -69/+31
  o In order to reduce bug and code duplication fold handling of NICs requiring DC_TX_ALIGN or DC_TX_COALESCE, which was previously done in dc_start_locked(), into dc_encap().
  o In dc_encap():
    - If m_defrag() fails just drop the packet like other NIC drivers do. This should only happen when there's a mbuf shortage, in which case it was possible to end up with an IFQ full of packets which couldn't be processed as they couldn't be defragmented as they were taking up all the mbufs themselves. This includes adjusting dc_start_locked() to not try to prepend the mbuf (chain) if dc_encap() has freed it.
    - Likewise, if bus_dmamap_load_mbuf() fails as dc_dma_map_txbuf() failed, free the mbuf possibly allocated by the above call to m_defrag() and drop the packet.
  o In dc_txeof():
    - Don't clear IFF_DRV_OACTIVE unless there are at least 6 free TX descriptors. Further down the road dc_encap() will bail if there are only 5 or fewer free TX descriptors, causing dc_start_locked() to abort and prepend the dequeued mbuf again so it makes no sense to pretend we could process mbufs again when in fact we won't. While at it replace this magic 5 with a macro DC_TX_LIST_RSVD.
    - Just always assign idx to sc->dc_cdata.dc_tx_cons; it doesn't make much sense to exclude the idx == sc->dc_cdata.dc_tx_cons case.
  o In dc_dma_map_txbuf() there's no need to set sc->dc_cdata.dc_tx_err to error if the latter is != 0, bus_dmamap_load_mbuf() already returns the same error value in that case anyway.
  o For less overhead, convert to use bus_dmamap_load_mbuf_sg() for loading RX buffers.
  o Remove some banal and/or outdated comments.
  Approved by: re (kensmith)
  MFC after: 1 week
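The dc_txeof() item boils down to using one threshold on both sides of the transmit path. The following is a rough model of that rule, not the driver code: DC_TX_LIST_RSVD and its value 5 come from the message, while the ring size of 256 and the function names are assumptions made for the illustration.

```python
DC_TX_LIST_RSVD = 5    # reserve named in the commit message
DC_TX_LIST_CNT = 256   # assumed ring size, for illustration only


def free_descriptors(in_use: int) -> int:
    """Free TX descriptors left in a ring with in_use slots occupied."""
    return DC_TX_LIST_CNT - in_use


def encap_would_bail(free: int) -> bool:
    """Model of dc_encap() refusing work with 5 or fewer free descriptors."""
    return free <= DC_TX_LIST_RSVD


def may_clear_oactive(free: int) -> bool:
    """Model of dc_txeof() clearing IFF_DRV_OACTIVE only with 6 or more free."""
    return free > DC_TX_LIST_RSVD
```

Because both checks use the same constant, the "output active" flag is cleared exactly when a subsequent encapsulation attempt would not bail for lack of descriptors.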
* marius | 2007-08-05 | 1 file | -0/+1
  Initialize the rl_vlanctl field of the descriptors to zero (in order to clear RL_TDESC_VLANCTL_TAG). This fixes sending packets in the native VLAN when running both tagged and an untagged VLAN over the same trunk and descriptors are recycled.
  Approved by: re (kensmith)
  MFC after: 1 week
* kib | 2007-08-05 | 1 file | -5/+0
  Do not acquire Giant unconditionally around the calls to the cdevsw d_mmap methods. prep_cdevsw() already installs the shims that acquire/drop Giant for the methods of a driver that specified the D_NEEDGIANT flag.
  Reviewed by: alc
  Approved by: re (kensmith)
* thompsa | 2007-08-04 | 2 files | -2/+31
  - Ensure the path cost does not exceed 65535 in legacy STP mode.
  - If the path cost is calculated when the link is down, set a pending flag so it is calculated again when it comes back up.
  - To not use 00:00:00:00:00:00 as the bridge id, all interfaces are scanned and the lowest number wins. All zeros is too low.
  Approved by: re (rwatson)
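Two of these items are small selection rules that can be sketched directly. The code below is a simplified model under stated assumptions: the function names are hypothetical, the path cost is taken as an already computed integer, and "lowest number wins" is read here as "lowest MAC address that is not all zeros".

```python
LEGACY_STP_MAX_PATH_COST = 65535  # limit quoted in the commit message
ZERO_MAC = bytes(6)               # 00:00:00:00:00:00


def clamp_path_cost(cost: int, legacy_stp: bool) -> int:
    """Cap the path cost at 65535 when running in legacy STP mode."""
    if legacy_stp:
        return min(cost, LEGACY_STP_MAX_PATH_COST)
    return cost


def pick_bridge_address(macs):
    """Return the lowest interface MAC that is not all zeros, or None."""
    candidates = [m for m in macs if m != ZERO_MAC]
    return min(candidates) if candidates else None
```

Comparing `bytes` objects in Python is lexicographic, which for equal-length MAC addresses is the same as comparing them as big-endian numbers.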
* marcel | 2007-08-04 | 1 file | -6/+6
  Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add instruction-serialization after writing to cr.pta. Delay enabling interrupts until after we setup the clocks and after we program the task priority register.
  Approved by: re (blanket)
* marcel | 2007-08-04 | 1 file | -3/+5
  Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to the region registers and add instruction-serialization after writing to cr.pta.
  Approved by: re (blanket)
* marcel | 2007-08-04 | 1 file | -16/+18
  Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to cr.tpr.
  Approved by: re (blanket)
* marcel | 2007-08-04 | 1 file | -0/+1
  Add required data-serialization after writing to cr.itm and cr.itv.
  Approved by: re (blanket)
* marcel | 2007-08-04 | 1 file | -0/+12
  Add ia64_srlz_d() and ia64_srlz_i() functions to aid in serialization.
  Approved by: re (blanket)
* kib | 2007-08-04 | 1 file | -0/+1
  Set D_NEEDGIANT.
  Approved by: phk
  Approved by: re (kensmith)
* jeff | 2007-08-04 | 1 file | -1/+0
  - Fix one line that erroneously crept in my last commit.
  Approved by: re
* jeff | 2007-08-03 | 1 file | -114/+200
  - Share scheduler locks between hyper-threaded cores to protect the tdq_group structure. Hyper-threaded cores won't really benefit from separate locks anyway.
  - Separate out the migration case from sched_switch to simplify the main switch code. We only migrate here if called via sched_bind().
  - When preempted place the preempted thread back in the same queue at the head.
  - Improve the cpu group and topology infrastructure.
  Tested by: many on current@
  Approved by: re
* jeff | 2007-08-03 | 1 file | -1/+1
  - Set SW_PREEMPT when we preempt in critical_exit().
  Approved by: re
* bde | 2007-08-03 | 2 files | -2/+2
  Oops, fix the fix for the i/o size of the fsinfo block. Its log message explained why the size is 1 sector, but the code used a size of 1 cluster. I/o sizes larger than necessary may cause serious coherency problems in the buffer cache. Here I think there were only minor efficiency problems, since a too-large fsinfo buffer could only get far enough to overlap buffers for the same vnode (the device vnode), so mappings are coherent at the page level although not at the buffer level, and the former is probably enough due to our limited use of the fsinfo buffer.
  Approved by: re (kensmith)
* delphij | 2007-08-03 | 2 files | -21/+22
  MFp4 - Refine locking to eliminate some potential race/panics:
  - Copy before testing a pointer. This closes a race window.
  - Use msleep with the node interlock instead of tsleep.
  - Do proper locking around access to tn_vpstate.
  - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking.
  Suggested by: kib
  Submitted by: delphij
  Reviewed by: Howard Su
  Approved by: re (tmpfs blanket)
* peter | 2007-08-02 | 4 files | -36/+12
  Move mp_topology() from apic_init(i386) and apic_setup_local(amd64) to cpu_start_mp(). This is after we have read the cpuid registers to calculate the hyperthreading_cpus value for the sysctl that enables or disables hyperthread cores. Change mp_topology() to use that information rather than trying to do it itself. This solves the problem of ULE being incorrectly told that dual core Athlon64 X2 or Opteron cpus are hyperthreading cores. At the very least, we now have a single piece of code to identify hyperthreading.
  Obtained from: jhb
  Approved by: re (kensmith)
* kevlo | 2007-08-02 | 1 file | -2/+9
  Add the device ID for the VIA CX700 chipset.
  Approved by: re (hrs)
* avatar | 2007-08-02 | 1 file | -12/+60
  MFP4(123686): Fixing various ancontrol(8) related panics by dropping locks around copyin()/copyout().
  Reviewed by: sam, thompsa
  Tested by: dhw
  Approved by: re (kensmith)
* emax | 2007-08-01 | 1 file | -1/+6
  Call ttyld_close() in nmdmclose() to ensure that nmdm(4) closes line discipline installed onto /dev/nmdmX device.
  Reviewed by: julian
  Approved by: re (hrs)
  MFC after: 3 days
* mav | 2007-08-01 | 2 files | -6/+85
  Add 64bit statistic counters to the ng_ppp node. 64bit counters are needed to simplify traffic accounting and reduce system load at the big PPP concentrators.
  Approved by: re (rwatson), glebius (mentor)
* mav | 2007-08-01 | 1 file | -68/+137
  This patch improves fine-grained locking for the ng_ppp node. Until now the node's transmit path was completely unprotected and so wasn't thread safe in multilink mode. Its receive path was declared as WRITER as the simplest protection method, but this reduces performance when compression or encryption is enabled.
  Approved by: re (rwatson), glebius (mentor)
* thompsa | 2007-08-01 | 2 files | -34/+40
  Add a bridge interface flag called PRIVATE where any private port can not communicate with another private port. All unicast/broadcast/multicast layer2 traffic is blocked so it works much the same way as using firewall rules but scales better and is generally easier as firewall packages usually do not allow ARP blocking.
  An example usage would be having a number of customers on separate vlans bridged with a server network. All the vlans are marked private, they can all communicate with the server network unhindered, but can not exchange any traffic whatsoever with each other.
  Approved by: re (rwatson)
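The forwarding rule described for PRIVATE ports is a single predicate on the pair of ports involved. As a minimal sketch, assuming a hypothetical `Port` record with a boolean `private` attribute (neither is taken from the bridge code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical bridge member port; only the private flag matters here."""
    name: str
    private: bool = False


def may_forward(src: Port, dst: Port) -> bool:
    """Layer 2 traffic is blocked only when both ports are marked private."""
    return not (src.private and dst.private)
```

In the customer example from the message, each customer vlan would be a private port and the server network a normal one, so traffic flows between any customer and the servers but never between two customers.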
* peter | 2007-07-31 | 2 files | -1/+3
  Change TCPTV_MIN to be independent of HZ. While it was documented to be in ticks "for algorithm stability" when originally committed, it turns out that it has a significant impact in timing out connections. When we changed HZ from 100 to 1000, this had a big effect on reducing the time before dropping connections.
  To demonstrate, boot with kern.hz=100. ssh to a box on local ethernet and establish a reliable round-trip-time (ie: type a few commands). Then unplug the ethernet and press a key. Time how long it takes to drop the connection. The old behavior (with hz=100) caused the connection to typically drop between 90 and 110 seconds of getting no response.
  Now boot with kern.hz=1000 (default). The same test causes the ssh session to drop after just 9-10 seconds. This is a big deal on a wifi connection.
  With kern.hz=1000, change sysctl net.inet.tcp.rexmit_min from 3 to 30. Note how it behaves the same as when HZ was 100. Also, note that when booting with hz=100, net.inet.tcp.rexmit_min *used* to be 30.
  This commit changes TCPTV_MIN to be scaled with hz. rexmit_min should always be about 30. If you set hz to Really Slow(TM), there is a safety feature to prevent a value of 0 being used. This may be revised in the future, but for the time being, it restores the old, pre-hz=1000 behavior, which is significantly less annoying.
  As a workaround, to avoid rebooting or rebuilding a kernel, you can run "sysctl net.inet.tcp.rexmit_min=30" and add "net.inet.tcp.rexmit_min=30" to /etc/sysctl.conf. This is safe to run from 6.0 onwards.
  Approved by: re (rwatson)
  Reviewed by: andre, silby
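The effect described here follows from converting a fixed tick count to wall-clock time. The sketch below models the old and new behaviour; the constant name, the function names and the exact scaling formula are assumptions for illustration, since the message only says the value is scaled with hz, should come out at about 30, and is kept from reaching 0.

```python
OLD_TCPTV_MIN_TICKS = 3  # fixed tick count implied by "from 3 to 30" in the message


def ticks_to_ms(ticks: int, hz: int) -> float:
    """Wall-clock duration of a tick count at a given kern.hz."""
    return ticks * 1000 / hz


def rexmit_min_ticks(hz: int, min_ms: int = 30) -> int:
    """Assumed scaling: about 30 ms worth of ticks, never less than one tick."""
    return max(hz * min_ms // 1000, 1)
```

Under the old scheme the minimum is 30 ms at hz=100 but only 3 ms at hz=1000, which is the tenfold change the test procedure in the message exposes. With the scaled value it stays at 30 ms for both settings, and a very low hz still yields one tick rather than zero.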
* scottl | 2007-07-31 | 1 file | -12/+32
  Make the driver fully MPSAFE. This fixes some serious locking problems that could cause panics and corruption under moderate load. Many thanks to Matt Reimer, Tom McDonald, and the rest of the guys at VPOP.net for their help in identifying and testing this.
  Approved by: re
* scottl | 2007-07-31 | 2 files | -4/+0
  Fix locking mistakes in the error recovery paths of the AHC and AHD drivers.
  Approved by: re
* imp | 2007-07-31 | 1 file | -0/+44
  Add in all the USB devices and all the wireless goo. The KB9202 has only USB 1.1 speeds available, but this shouldn't hurt. Now that we have working usb support for this board, this is a natural followup.
  Approved by: re (kensmith)
* imp | 2007-07-31 | 3 files | -3/+36
  Make USB work on the KB9202{,A,B} boards. This has been in p4 for about 7 months. You must have JP6 in the 1-2 position to supply power to the USB devices, but I've used uftdi, uplcom and umass successfully. If you have it in 2-3, then nothing will show up. Also, if you have the FQPA packaging for the AT91RM9200 (like the KN9202 boards have), you will get the following message
    uhub0: device problem (IOERROR), disabling port 2
  due to a hardware erratum. It is safe to ignore as it is about pins that aren't brought out on the FQPA package and aren't properly terminated either. Alas, there's no register to read to tell the FQPA from the BGA versions.
  Submitted by: Daan Vreeken
  Approved by: re (kensmith)
* cognet | 2007-07-31 | 1 file | -1/+1
  MFppc:
  revision 1.66
  date: 2007/07/31 06:23:26; author: marcel; state: Exp; lines: +2 -2
  Fix backward compatibility of the "old" (i.e. FreeBSD6) lseek syscall. It was broken when a new lseek syscall was introduced. The problem is that we need to swap the 32-bit td_retval values for the __syscall indirect syscall when the actual syscall has a 32-bit return value. Hence, we need to exclude lseek(2). And this means the "old" lseek(2) as well -- which we didn't.
  Based on a patch from: grehan@
  Approved by: re (blanket)
* marcel | 2007-07-31 | 2 files | -4/+4
  Fix backward compatibility of the "old" (i.e. FreeBSD6) lseek syscall. It was broken when a new lseek syscall was introduced. The problem is that we need to swap the 32-bit td_retval values for the __syscall indirect syscall when the actual syscall has a 32-bit return value. Hence, we need to exclude lseek(2). And this means the "old" lseek(2) as well -- which we didn't.
  Based on a patch from: grehan@
  Approved by: re (rwatson)
* marcel | 2007-07-31 | 1 file | -1/+1
  Enable -Werror for ia64.
  Approved by: re (blanket)
* davidch | 2007-07-31 | 3 files | -119/+240
  - Fixed a problem that would cause kernel panics and "bce0: discard frame .." errors (especially when jumbo frames are enabled or in low memory systems) because the RX chain was corrupted when an mbuf was mapped to an unexpected number of buffers.
  - Fixed a problem that would cause kernel panics when an excessively fragmented TX mbuf couldn't be defragmented and was released by bce_tx_encap().
  Approved by: re(hrs)
  MFC after: 7 days
* marcel | 2007-07-30 | 1 file | -37/+62
  o Switch to physical addressing before dereferencing the VHPT bucket pointer. The virtual mapping may not be present in the translation cache. This will result in a nested TLB fault at a place we don't handle (and don't want to handle).
  o Make sure there's a stop after the rfi instruction, otherwise its behaviour is undefined.
  o Make sure we switch back to virtual addressing before doing a rfi. Behaviour is undefined otherwise.
  Approved by: re (blanket)
* marcel | 2007-07-30 | 3 files | -1/+87
  Add option EXCEPTION_TRACING, which enables KTR-like functionality for processor interruptions. This is especially useful to track unexpected nested TLB faults.
  Approved by: re (blanket)
* marcel | 2007-07-30 | 6 files | -177/+239
  Rework the interrupt code and add support for interrupt filtering (INTR_FILTER). This includes:
  o Save a pointer to the sapic structure and IRQ for every vector, so that we can quickly EOI, mask and unmask the interrupt.
  o Add locking to the sapic code now that we can reprogram a sapic on multiple CPUs at the same time.
  o Use u_int for the vector and IRQ. We only have 256 vectors, so using a 64-bit type for it is rather excessive.
  o Properly handle concurrent registration of a handler for the same vector.
  Since vectors have a corresponding priority, we should not map IRQs to vectors in a linear fashion, but rather pick a vector that has a priority in line with the interrupt type. This is left for later. The vector/IRQ interchange has been untangled as much as possible to make this easier.
  Approved by: re (blanket)
* marcel | 2007-07-30 | 4 files | -0/+27
  Explicitly map the VHPT on all processors. Previously we were merely lucky that the VHPT was mapped as a side-effect of mapping the kernel, but when there's enough physical memory, this may not at all be the case.
  Approved by: re (blanket)
* marcel | 2007-07-30 | 1 file | -5/+14
  Add casts to some of the more commonly used pointer-type atomic operations. We really should be able to make those inline functions, but this would break its use for sx_locks.
  Approved by: re (blanket)
* thompsa | 2007-07-30 | 2 files | -16/+57
  - Propagate the largest set of interface capabilities supported by all lagg ports to the lagg interface.
  - Use the MTU from the first interface as the lagg MTU, all extra interfaces must be the same.
  This fixes using a lagg interface for a vlan or enabling jumbo frames, etc.
  Approved by: re (kensmith)
  MFC After: 3 days
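"The largest set of capabilities supported by all ports" is the intersection of the per-port capability sets, which for bit flags is a bitwise AND. The sketch below illustrates that and the MTU rule; the flag names and values and both function names are hypothetical stand-ins rather than the kernel's definitions.

```python
from functools import reduce
from operator import and_

# Hypothetical capability bits, for illustration only.
CAP_RXCSUM = 0x01
CAP_TXCSUM = 0x02
CAP_VLAN_MTU = 0x08
CAP_JUMBO_MTU = 0x20


def lagg_capabilities(port_caps) -> int:
    """Capabilities common to every port: the AND of all per-port flag words."""
    port_caps = list(port_caps)
    return reduce(and_, port_caps) if port_caps else 0


def lagg_mtu(port_mtus) -> int:
    """MTU of the first port; every later port must match it."""
    first, *rest = port_mtus
    if any(mtu != first for mtu in rest):
        raise ValueError("all lagg ports must share the first port's MTU")
    return first
```

A capability such as jumbo frames is therefore advertised on the lagg interface only if every member port supports it.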
* njl | 2007-07-30 | 2 files | -2/+4
  Dynamically choose the quality of the ACPI timer depending on whether the fast or safe/slow method is in use. Fast remains at 1000, slow is now at 850 (always preferred to TSC). Since the HPET has proven slower than ACPI-fast on some systems, drop its quality to 900. In the future, it is hoped that HPET performance will improve as it is the main timer Intel supports. HPET may move back to 2000 in -current once RELENG_7 is branched to ensure that it gets tested.
  Approved by: re
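The quality numbers only matter relative to each other: the available timer with the highest quality is the one preferred. A small sketch of that ordering follows. The three ACPI and HPET values are the ones given in the message, while the TSC value of 800 is a placeholder assumption (the message only implies it is below 850), and the function name is hypothetical.

```python
# Quality values from the commit message, except TSC, which is an assumed
# placeholder chosen only to sit below the ACPI-safe value.
TIMER_QUALITY = {
    "ACPI-fast": 1000,
    "HPET": 900,
    "ACPI-safe": 850,
    "TSC": 800,
}


def preferred_timer(available):
    """Pick the available timer with the highest quality value."""
    return max(available, key=TIMER_QUALITY.__getitem__)
```

With these numbers, a system using the fast ACPI method prefers the ACPI timer over the HPET, while one limited to the safe method prefers the HPET if present and otherwise still ranks the ACPI timer above the TSC.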
* des | 2007-07-30 | 5 files | -12/+3
  Make tcpstates[] static, and make sure TCPSTATES is defined before <netinet/tcp_fsm.h> is included into any compilation unit that needs tcpstates[]. Also remove incorrect extern declarations and TCPDEBUG conditionals. This allows kernels both with and without TCPDEBUG to build, and unbreaks the tinderbox.
  Approved by: re (rwatson)
* dwmalone | 2007-07-29 | 2 files | -2/+4
  Mfi386 revision 1.239 of src/sys/i386/isa/clock.c. Seemingly some pc98 motherboards do not provide us with the correct day of week either. Ignore the day of week when setting the clock here too.
  Approved by: re (bmah)
  Requested from: nyan
  MFC after: 3 weeks
* bmah | 2007-07-29 | 1 file | -1/+1
  Fix a typo in a log message: s/Reveived/Received/.
  Approved by: re (rwatson)
* imp | 2007-07-29 | 1 file | -1/+1
  Add missing newline in printf.
  Submitted by: "R.Mahmatkhanov" cvs-src at yandex ru
  Approved by: re (blanket)
* marcel | 2007-07-29 | 1 file | -0/+8
  In pci_alloc_map(), restore the original value of the BAR for the duration of the function. The device we would otherwise have left in an useless state may just as well be the low-level console. When booting verbose, we do need it addressable if we want to avoid a MCA.
  Approved by: re (kensmith)
* mjacob | 2007-07-29 | 2 files | -1/+9
  Fix compilation problems- tcpstates is only available if TCPDEBUG is set.
  Approved by: re (in spirit)
* silby | 2007-07-28 | 1 file | -1/+1
  Fix a panic introduced in rev 1.126.
  Approved by: re (rwatson)
* andre | 2007-07-28 | 3 files | -8/+27
  Provide a sysctl to toggle reporting of TCP debug logging:
    sys.net.inet.tcp.log_debug = 1
  It defaults to enabled for the moment and is to be turned off for the next release like other diagnostics from development branches.
  It is important to note that sysctl sys.net.inet.tcp.log_in_vain uses the same logging function as log_debug. Enabling of the former also causes the latter to engage, but not vice versa.
  Use consistent terminology in tcp log messages:
  "ignored" means a segment contains invalid flags/information and is dropped without changing state or issuing a reply.
  "rejected" means a segment contains invalid flags/information but is causing a reply (usually RST) and may cause a state change.
  Approved by: re (rwatson)
* andre | 2007-07-28 | 1 file | -19/+49
  o Move setting/resetting logic of syncache timer from macro SYNCACHE_TIMEOUT to new function syncache_timeout().
  o Fix inverted timeout callout engagement logic to actually enable the timer for the bucket row. Before SYN|ACK was not retransmitted.
  o Simplify SYN|ACK retransmit timeout backoff calculation.
  o Improve logging of retransmit and timeout events.
  o Reset timeout when duplicate SYN arrives.
  o Add comments.
  o Rearrange SYN cookie statistics counting.
  Bug found by: silby
  Submitted by: silby (different version)
  Approved by: re (rwatson)
* andre | 2007-07-28 | 2 files | -17/+45
  o Move all detailed checks for RST in LISTEN state from tcp_input() to syncache_rst().
  o Fix tests for flag combinations of RST and SYN, ACK, FIN. Before a RST for a connection in syncache did not properly free the entry.
  o Add more detailed logging.
  Approved by: re (rwatson)
* rwatson | 2007-07-28 | 9 files | -21/+16
  Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove definition of NET_CALLOUT_MPSAFE, which is no longer required now that debug.mpsafenet has been removed.
  The once over: bz
  Approved by: re (kensmith)