path: root/sys/sparc64/include
Commit message (Collapse)AuthorAgeFilesLines
* Make the pmap stats writeable. It can be useful to clear them.jake2003-04-061-1/+1
|
* Use the vis block copy/zero functions for pmap_copy_page and pmap_zero_page.jake2003-04-062-7/+9
| | | | | | | | | | | | | These are called through function pointers so that different implementations can be provided for cheetah, where the block load instructions may or may not be a win, and so they can be disabled with the machdep.use_vis tunable. In terms of raw bandwidth the integer versions are faster, but not allocating lines in the L2 cache for useless data gives a measurable improvement in user time for the benchmarks I tested (mostly buildworld with -j8). As far as I can tell the instructions used are implemented on everything back to UltraSPARC I, so there should not be a problem with different cpu types.
* Add optimized block copy and zero functions using vis instructions, whichjake2003-04-031-0/+3
| | | | | | | | | | can do 64 bytes at a time and don't allocate lines in the L2 cache. These assume that everything is 64 byte aligned, and that there's more than 128 bytes of data (best for whole pages). The block load and store instructions don't follow normal memory ordering rules and require either a memory barrier or move between registers before the data can actually be used. This implementation correctly shuffles around 3 out of the 4 sets of registers in order to avoid memory barriers except for the last 2 blocks.
* - Add space for kernel floating point registers to the pcb. These will bejake2003-04-031-6/+8
| | | | | | | | | used to support block copy and zero operations in the kernel which use the floating point registers. - While I'm changing the size, improve the layout of struct pcb, sort by size, then alphabetical etc. - Add some assertions to validate assumptions made about how the pcb is allocated.
* - Add a flags field to struct pcb. Use this to keep track of whether orjake2003-04-011-0/+3
| | | | | not the pcb has floating point registers saved in it. - Implement get_mcontext and set_mcontext.
* - Rename pcb_fpstate to pcb_ufp (user floating point), and change it tojake2003-04-012-19/+5
| | | | | | a simple array of 64 ints. - Use a critical section when saving floating point state in cpu_fork instead of sched_lock.
* Rename pcb_fp to pcb_sp, so as to not be confused with floating pointjake2003-04-011-1/+1
| | | | state.
* Handle the fictitious pages created by the device pager. For fictitiousjake2003-03-271-0/+2
| | | | | | | | | | pages which represent actual physical memory we must strip off the fake page in order to allow illegal aliases to be detected. Otherwise we map uncacheable in the virtual and physical caches and set the side effect bit, as is required for mapping device memory. This fixes gstat on sparc64, which wants to mmap kernel memory through a character device.
* - Add vm_paddr_t, a physical address type. This is required for systemsjake2003-03-251-0/+1
| | | | | | | | | | | | | | | where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)
* - Remove unused cache flushing routines. These will not necessarily workjake2003-03-193-74/+30
| | | | | | | | | | | | | | | | | | | on future UltraSPARC cpus for which the data cache is not direct mapped. - Move UltraSPARC I and II (spitfire, blackbird, sapphire, sabre) specific functions to spitfire.c, and add cheetah.c for UltraSPARC III specific functions. Initially just cache flushing, but there are a few other functions that will need to move here. - Add an ipi handler for data cache flushing on UltraSPARC III. - Use function pointers to select the right cache flushing functions based on cpu_impl. With this it is possible to boot single user from an mfs root on UltraSPARC III systems, including spinning up secondary processors. There is currently no support for the host to pci bridge, and no documentation for it is publicly available. Thanks to Oleg Derevenetz for providing access to a system with UltraSPARC III+ cpus.
* Remove unused fields.jake2003-03-181-5/+1
|
* Made the prototypes for pmap_kenter and pmap_kremove MD. These functionsjake2003-03-161-0/+2
| | | | | | | | | are machine dependent because they are not required to update the tlb when mappings are added or removed, and doing so is machine dependent. In addition, an implementation may require that pages mapped with pmap_kenter have a backing vm_page_t, which is not necessarily true of all physical pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the physical address in order to make this requirement clear.
* Correctly set BUS_SPACE_MAXSIZE in all the busdma backends.mux2003-02-261-1/+1
| | | | | It was bogusly set to 64 * 1024 or 128 * 1024 because it was bogusly reused in the BUS_DMAMAP_NSEGS definition.
* Make the 'a' parameter of bus_space_write_multi_stream_*() a const pointer.obrien2003-02-241-3/+3
|
* The rest of our platforms make bus_space_write_multi_stream_2's 'a' aobrien2003-02-231-1/+1
| | | | const pointer.
* Add an empty bus_space_unmap() like Alpha has. puc(4) uses it.obrien2003-02-231-0/+11
|
* Implement fpclassify():mike2003-02-082-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | o Add a MD header private to libc called _fpmath.h; this header contains bitfield layouts of MD floating-point types. o Add a MI header private to libc called fpmath.h; this header contains bitfield layouts of MI floating-point types. o Add private libc variables to lib/libc/$arch/gen/infinity.c for storing NaN values. o Add __double_t and __float_t to <machine/_types.h>, and provide double_t and float_t typedefs in <math.h>. o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF, HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to <math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via <machine/float.h>. o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based on the size of its argument. __fpclassifyl() is never called on alpha because (sizeof(long double) == sizeof(double)), which is good since __fpclassifyl() can't deal with such a small `long double'. This was developed by David Schultz and myself with input from bde and fenner. PR: 23103 Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> (significant portions) Reviewed by: bde, fenner (earlier versions)
* Fix another mistake in the bus_dmamem_alloc_size() thingscottl2003-01-291-2/+2
| | | | Submitted by: tmm
* Fix some more missing dt_ prefixes for dma tag fields.scottl2003-01-291-4/+4
|
* Implement bus_dmamem_alloc_size() and bus_dmamem_free_size() asscottl2003-01-292-0/+34
| | | | | | | | | | | | | | | | | | | | counterparts to bus_dmamem_alloc() and bus_dmamem_free(). This allows the caller to specify the size of the allocation instead of it defaulting to the max_size field of the busdma tag. This is intended to aid in converting drivers to busdma. Lots of hardware cannot understand scatter/gather lists, which forces the driver to copy the i/o buffers to a single contiguous region before sending it to the hardware. Without these new methods, this would require a new busdma tag for each operation, or a complex internal allocator/cache for each driver. Allocations greater than PAGE_SIZE are rounded up to the next PAGE_SIZE by contigmalloc(), so this is not suitable for multiple static allocations that would be better served by a single fixed-length subdivided allocation. Reviewed by: jake (sparc64)
* Fixes for a number of problems in the IOMMU code:tmm2003-01-211-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | 1.) Fix an off-by-one in the DVMA space handling, which would make it possible to allocate one page beyond the end of the DVMA area. This page was aliased to the first page. Apparently, this bug was responsible for the trashed nvram/eeprom some people were reporting, in conjunction with a number of unfortunate coincidences. 2.) Fix broken boundary and lowaddr calculations. 3.) Fix a memory leak on an error path. 4.) Update an outdated comment to reflect the introduction of IOMMU_MAX_PRE, make the usage of IOMMU_MAX_PRE more consistent and KASSERT that the preallocation size is not 0. 5.) Fix a case where an error return was lost. 6.) When signalling an error to the caller by invoking the callback, do not use a segment pointer of NULL for compatibility with existing drivers. Also, increase the maximum segment number to 64; it is rather arbitrary, with the exception of the stack space consumed by the segment array. Special thanks go to Harti Brandt <brandt@fokus.fraunhofer.de> for spotting 4 and 5, and testing many iterations of patches. Pointy hats to: tmm
* Don't allow user process to set an invalid window state through sigreturn.jake2003-01-101-0/+1
| | | | Spotted by: tmm
* Implement bus_space_subregion.jake2003-01-081-0/+10
|
* Change the iommu code to be able to handle more than one DVMA area pertmm2003-01-062-8/+20
| | | | | | | | | | | map. Use this new feature to implement iommu_dvmamap_load_mbuf() and iommu_dvmamap_load_uio() functions in terms of a new helper function, iommu_dvmamap_load_buffer(). Reimplement the iommu_dvmamap_load() to use it, too. This requires some changes to the map format; in addition to that, remove unused or redundant members. Add SBus and Psycho wrappers for the new functions, and make them available through the respective DMA tags.
* - remove the unused parent DMA tag argument fromtmm2003-01-061-0/+2
| | | | | | | | | | _nexus_dmamap_load_buffer() - implement nexus_dmamap_load() in terms of _nexus_dmamap_load_buffer(). Note that this is untested, as this code is not currently used (but might be later for UPA devices). - move BUS_DMAMAP_NSEGS to bus_private.h - disable the ecache flushing in nexus_dmamap_sync(); it should not be needed, although the docs are not entirely clear on that.
* Prefix the members of struct bus_space_tag and struct bus_dma_tag withtmm2003-01-061-65/+66
| | | | a uniqifier. No functional changes.
* Look for the correct method in sparc64_dmamap_load_mbuf() andtmm2003-01-061-2/+2
| | | | sparc64_dmamap_load_uio().
* Some cleanup:tmm2003-01-061-1/+22
| | | | | | | | - move some constants into iommureg.h - correct some comments - use KASSERT() in one place instead of rolling our own - take a sanity check out of #ifdef DIAGNOSTIC - fix a syntax error in normally #ifdef'ed out debug code
* - Reorganize PMAP_STATS to scale a little better.jake2003-01-051-0/+19
| | | | - Add some more stats for things that are now considered interesting.
* Make imgact_elf32.c compile on sparc64.jake2003-01-051-0/+10
| | | | Obtained from: ia64
* Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,schweikh2003-01-012-4/+4
| | | | especially in troff files.
* Use memset instead of __builtin_memset. Apparently there's an inlinejake2002-12-291-1/+1
| | | | memset in libkern which causes problems; why that's there is beyond me.
* Define UMA_MD_SMALL_ALLOC so that uma_small_alloc and uma_small_free willjake2002-12-271-0/+2
| | | | | | | be used for zones that allocate objects of less than 1 page. The biggest advantage of this is that all of a sudden the majority of kernel malloc-ed data doesn't need kva allocated for it. Besides microbenchmarks I haven't seen a measurable performance improvement from doing this.
* - Change the way the direct mapped region is implemented to be generallyjake2002-12-232-36/+84
| | | | | | | | | | | | | | | | | | | | | useful for accessing more than 1 page of contiguous physical memory, and to use 4mb tlb entries instead of 8k. This requires that the system only use the direct mapped addresses when they have the same virtual colour as all other mappings of the same page, instead of being able to choose the colour and cachability of the mapping. - Adapt the physical page copying and zeroing functions to account for not being able to choose the colour or cachability of the direct mapped address. This adds a lot more cases to handle. Basically when a page has a different colour than its direct mapped address we have a choice between bypassing the data cache and using physical addresses directly, which requires a cache flush, or mapping it at the right colour, which requires a tlb flush. For now we choose to map the page and do the tlb flush. This will allow the direct mapped addresses to be used for more things that don't require normal pmap handling, including mapping the vm_page structures, the message buffer, temporary mappings for crash dumps, and will provide greater benefit for implementing uma_small_alloc, due to the much greater tlb coverage.
* - Add a spin lock to single thread cache invalidation and tlb flush ipis,jake2002-12-221-6/+12
| | | | | which allows ipis to be sent outside of Giant. - Remove the ap boot mutex, which is unused.
* MB_LEN_MAX is not MD, move it to the MI limits.h.tjr2002-12-222-2/+0
|
* - Add a pmap pointer to struct md_page, and use this to find the pmap thatjake2002-12-212-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | a mapping belongs to by setting it in the vm_page_t structure that backs the tsb page that the tte for a mapping is in. This allows the pmap that a mapping belongs to to be found without keeping a pointer to it in the tte itself. - Remove the pmap pointer from struct tte and use the space to make the tte pv lists doubly linked (TAILQs), like on other architectures. This makes entering or removing a mapping O(1) instead of O(n) where n is the number of pmaps a page is mapped by (including kernel_pmap). - Use atomic ops for setting and clearing bits in the ttes, now that they return the old value and can be easily used for this purpose. - Use __builtin_memset for zeroing ttes instead of bzero, so that gcc will inline it (4 inline stores using %g0 instead of a function call). - Initially set the virtual colour for all the vm_page_ts to be equal to their physical colour. This will be more useful once uma_small_alloc is implemented, but basically pages with virtual colour equal to physical colour are easier to handle at the pmap level because they can be safely accessed through cachable direct virtual to physical mappings with that colour, without fear of causing illegal dcache aliases. In total these changes give a minor performance improvement, about 1% reduction in system time during buildworld.
* Removed unused pmap_qenter_flags.jake2002-12-211-1/+0
|
* Make the atomic arithmetic functions return the old value, since they'rejake2002-12-211-40/+38
| | | | all implemented with cas anyway.
* Always initialize the UPA target module id in the interrupt mappingtmm2002-12-011-17/+18
| | | | | | | | | | | | register to the one of the processor doing the interrupt setup. This is required since this field is preinitialized to 0, but there exist machines which have no processor with a MID of 0 (e.g. e450s with 1 or 2 processors). Add some more macros for handling the interrupt mapping registers, and rename some existing ones for consistency. Approved by: re
* Move pmap_collect() out of the machine-dependent code, rename italc2002-11-131-2/+0
| | | | | | | | to reflect its new location, and add page queue and flag locking. Notes: (1) alpha, i386, and ia64 had identical implementations of pmap_collect() in terms of machine-independent interfaces; (2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
* - Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit()alc2002-11-111-2/+0
| | | | | | if we're removing write access from the page's PTEs. - Export pmap_remove_all() on alpha, i386, and ia64. (It's already exported on sparc64.)
* Add two new workarounds for firmware anomalies:tmm2002-11-072-1/+5
| | | | | | | | | | | | | | | | | 1. At least some Netra t1 models have PCI buses with no associated interrupt map, but obviously expect the PCI swizzle to be done with the interrupt number from the higher level as intpin. In this case, the mapping also needs to continue at parent bus nodes. To handle that, add a quirk table based on the "name" property of the root node to avoid breaking other boxen. This property is now retrieved and printed at boot. 2. On SPARCengine Ultra AX machines, interrupt numbers are not mapped at all, and full interrupt numbers (not just INOs) are given in the interrupt properties. This is more or less cosmetic; the PCI interrupt numbers would be wrong, but the psycho resource allocation method would pass the right numbers on anyway. Tested by: mux (1), Maxim Mazurok <maxim@km.ua> (2)
* Don peril sensitive sun glasses and change the default system call vectorjake2002-10-271-1/+24
| | | | | | | | | | | | | | | | | for sparc64 from trap #9 to trap #65. This is one of the ABI "blessed" system call vectors and is different from any other system that we might want to emulate, making the emulation easier by reducing the number of code paths that need to be shared. Compatibility with old applications is provided with COMPAT_FREEBSD4. Add defines for a few special traps that we may need to implement for compatibility with 32bit applications, and add comments on which vectors are used for what in other systems, and which are available. Pass magic flags to trap() for deprecated or unimplemented system call vectors so they will deliver SIGSYS instead of SIGILL. This piggy backs nicely with the recent sigaction(2) system call number change, and provided the rules are followed for upgrading past it, this change should not be noticed.
* Split 4.x and 5.x signal handling so that we can keep 4.x signalpeter2002-10-251-8/+0
| | | | | | | | | | | | | | | | handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound. Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc. Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re
* Initialize tick_MHz and related variables much earlier. After the lasttmm2002-10-251-1/+2
| | | | | | | revision of tick.c, this was done at SI_SUB_CLOCKS, which is too late because tick_MHz is required for DELAY() to work. Reviewed by: jake
* Greatly improve readability of trap() by using a table to convert betweenjake2002-10-251-0/+2
| | | | | | trap types and signals to send. Rearrange KASSERTs to better handle faults early before curthread is setup, or in the case that it gets corrupted or set to 0.
* - Expand struct trapframe to 256 bytes, make all fields fixed width and thejake2002-10-223-38/+92
| | | | | | | | | | | | | | same size. Add some fields that previously overlapped with something else or were missing. - Make struct regs and struct mcontext (minus floating point) the same as struct trapframe so converting between them is easy (null). - Add space for saving floating point state to struct mcontext. This requires that it be 64 byte aligned. - Add assertions that none of these structures change size, as they are part of the ABI. - Remove some dead code in sendsig(). - Save and restore %gsr in struct trapframe. Remember to restore %fsr. - Add some comments to exception.S.
* Add kernel dump support, based on the ia64 version (which was committedtmm2002-10-202-0/+94
| | | | | | | | | as sparc64/sparc64/dump_machdep.c a while back). Unlike ia64 (which uses ELF), sparc64 uses a homegrown format for the dumps (headers are required because the physical address and size of the tsb must be noted, and because physical memory may be discontiguous); ELF would not offer any advantages here. Reviewed by: jake
* Explicitly specify an alignment for struct pcb. While all regular pcb'stmm2002-10-191-1/+1
| | | | | are positioned and aligned by md code, dumppcb is just a static variable and requires this.