| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
better error reporting.
Submitted by: Matthew Fleming <matthew dot fleming at isilon dot com>
MFC After: 1 month
|
|
|
|
|
|
|
|
| |
kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1. Given that the Windows group limit is 1024, this range
should be sufficient for most applications.
MFC after: 1 month
|
|
|
|
|
|
|
|
|
| |
- name some columns more closely to the user space variables,
as we do for host.* or allow.* (in the listing) already.
- print pr_childmax (children.max).
- prefix hex values with 0x.
MFC after: 3 weeks
|
|
|
|
|
|
|
| |
address selection, even for IPv4, since r183571.
Pointed out by: Jase Thew (bazerka beardz.net)
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When renaming a directory it passes through several intermediate
states. First its new name will be created causing it to have two
names (from possibly different parents). Next, if it has different
parents, its value of ".." will be changed from pointing to the old
parent to pointing to the new parent. Concurrently, its old name
will be removed bringing it back into a consistent state. When fsck
encounters an extra name for a directory, it offers to remove the
"extraneous hard link"; when it finds that the names have been
changed but the update to ".." has not happened, it offers to rewrite
".." to point at the correct parent. Both of these changes were
considered unexpected so would cause fsck in preen mode or fsck in
background mode to fail with the need to run fsck manually to fix
these problems. Fsck running in preen mode or background mode now
corrects these expected inconsistencies that arise during directory
rename. The functionality added with this update is used by fsck
running in background mode to make these fixes.
Solution:
This update adds three new fsck sysctl commands to support background
fsck in correcting expected inconsistencies that arise from incomplete
directory rename operations. They are:
setcwd(dirinode) - set the current directory to dirinode in the
filesystem associated with the snapshot.
setdotdot(oldvalue, newvalue) - Verify that the inode number for ".."
in the current directory is oldvalue then change it to newvalue.
unlink(nameptr, oldvalue) - Verify that the inode number associated
with nameptr in the current directory is oldvalue then unlink it.
As with all other fsck sysctls, these new ones may only be used by
processes with appropriate priviledge.
Reported by: jeff
Security issues: rwatson
|
|
|
|
|
|
|
| |
r198561 | thompsa | 2009-10-28 15:25:22 -0600 (Wed, 28 Oct 2009) | 4 lines
Allow a scratch buffer to be set in order to be able to use setenv() while
booting, before dynamic kenv is running. A few platforms implement their own
scratch+sprintf handling to save data from the boot environment.
|
|
|
|
|
|
| |
for same key coalesce to same queue, this makes searching
path shorter and improves performance.
Also fix comments about shared PI-mutex.
|
|
|
|
|
|
| |
number of supplemental groups, not the total number of groups.
MFC after: 3 days
|
|
|
|
| |
Suggested by: jmallett
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While the name is pretentious, a good explanation of its targets is
reported in this 17 months old presentation e-mail:
http://lists.freebsd.org/pipermail/freebsd-arch/2008-August/008452.html
In order to implement it, the sq_type in sleepqueues is mandatory and not
only compiled along with INVARIANTS option. Additively, a new sleepqueue
function, sleepq_type() is added, returning the type of the sleepqueue
linked to a wchan.
Three new sysctls are added in order to configure the thread:
debug.deadlkres.slptime_threshold
debug.deadlkres.blktime_threshold
debug.deadlkres.sleepfreq
rappresenting the thresholds for sleep and block time that will lead to
a deadlock matching (when exceeded), while the sleepfreq rappresents the
number of seconds between 2 consecutive thread runnings.
In order to enable the deadlock resolver thread recompile your kernel
with the option DEADLKRES.
Reviewed by: jeff
Tested by: pho, Giovanni Trematerra
Sponsored by: Nokia Incorporated, Sandvine Incorporated
MFC after: 2 weeks
|
|
|
|
|
|
| |
PR: 128335
Submitted by: Mateusz Guzik <mjguzik@gmail.com>
MFC after: 2 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is not cleaned up on the wakeup but reset.
This is harmless mostly because td_slptick (and ki_slptime from
userland) should be analyzed only with the assumption that the thread
is actually sleeping (thus while the td_slptick is correctly set) but
without this invariant the number is nomore consistent.
- Move td_slptick from u_int to int in order to follow 'ticks' signedness
and wrap up accordingly [0]
[0] Submitted by: emaste
Sponsored by: Sandvine Incorporated
MFC 1 week
|
|
|
|
|
| |
Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sleeps/timeout may have left spourious lk_exslpfail counts on, so clean
it up even when accessing a shared queue acquisition, giving to
lk_exslpfail the value of 'upper limit'.
In the worst case scenario, infact (mixed
interruptible sleep / LK_SLEEPFAIL waiters) what may happen is that both
queues are awaken even if that's not necessary, but still no harm.
Reported by: Lucius Windschuh <lwindschuh at googlemail dot com>
Reviewed by: kib
Tested by: pho, Lucius Windschuh <lwindschuh at googlemail dot com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
now type sema_t is a structure which can be put in a shared memory area,
and multiple processes can operate it concurrently.
User can either use mmap(MAP_SHARED) + sem_init(pshared=1) or use sem_open()
to initialize a shared semaphore.
Named semaphore uses file system and is located in /tmp directory, and its
file name is prefixed with 'SEMD', so now it is chroot or jail friendly.
In simplist cases, both for named and un-named semaphore, userland code
does not have to enter kernel to reduce/increase semaphore's count.
The semaphore is designed to be crash-safe, it means even if an application
is crashed in the middle of operating semaphore, the semaphore state is
still safely recovered by later use, there is no waiter counter maintained
by userland code.
The main semaphore code is in libc and libthr only has some necessary stubs,
this makes it possible that a non-threaded application can use semaphore
without linking to thread library.
Old semaphore implementation is kept libc to maintain binary compatibility.
The kernel ksem API is no longer used in the new implemenation.
Discussed on: threads@
|
|
|
|
|
|
|
|
| |
It looks like I didn't implement this when I imported MPSAFE TTY.
Applications like mail(1) still use this. I think it's conceptually bad.
Tested by: Pete French <petefrench ticketswitch com>
MFC after: 2 weeks
|
| |
|
|
|
|
|
|
|
|
|
|
| |
processes to share semaphore by using shared memory area, in simplest case,
only one atomic operation is needed in userland, waiter flag is maintained by
kernel and userland only checks the flag, if the flag is set, user code enters
kernel and does a wakeup() call.
Move type definitions into file _umtx.h to minimize compiling time.
Also type names need to be prefixed with underline character, this would reduce
name conflict (still in progress).
|
|
|
|
|
|
| |
at add it again.
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
| |
r195175. Remove all definitions, documentation, and usage.
fifo_misc.c:
Remove all kqueue tests as fifo_io.c performs all those that
would have remained.
Reviewed by: rwatson
MFC after: 3 weeks
X-MFC note: don't change vlan_link_state() function signature
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
or equial then PSOCK, not less or equial. Higher priority has lesser
numerical value.
Existing test does not allow for swapout of the thread waiting for
advisory lock, for exiting child or sleeping for timeout. On the other
hand, high-priority waiters of VFS/VM events can be swapped out.
Tested by: pho
Reviewed by: jhb
MFC after: 1 week
|
|
|
|
| |
resource_list_release() will later release the resource instead of failing.
|
|
|
|
|
|
|
|
|
| |
active.
- Fix bus_generic_rl_(alloc|release)_resource() to not attempt to fetch a
resource list for grandchild devices, but just pass those requests up to
the parent directly. This worked by accident previously, but it is
better to not let bus drivers try to operate on devices they do not
manage.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces d_mmap() with the d_mmap2() implementation and also
changes the type of offset to vm_ooffset_t.
Purge d_mmap2().
All driver modules will need to be rebuilt since D_VERSION is also
bumped.
Reviewed by: jhb@
MFC after: Not in this lifetime...
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Fix some wrong usages.
Note: this does not affect generated binaries as this argument is not used.
PR: 137213
Submitted by: Eygene Ryabinkin (initial version)
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
| |
the namecache records. The reclamation is not enabled by default because
for typical workload it would make namecache unusable, but large nested
directory tree easily puts any process that accesses filesystem into 1
second wait for vlru.
Reported by: yar (long time ago)
MFC after: 3 days
|
|
|
|
| |
is not being used improperly.
|
|
|
|
| |
MFC after: 3 days
|
|
|
|
| |
MFC after: 3 days
|
|
|
|
| |
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
flag. Besides providing the redundand information, need to update both
vnode and object flags causes more acquisition of vnode interlock.
OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects.
Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for
vnode-backed vm objects.
Suggested and reviewed by: alc
Tested by: pho
MFC after: 3 weeks
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically this commit changes two things, which improves access to TTYs
in exceptional conditions. Basically the problem was that when you ran
jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the
node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if
you want to attach to screens quickly, use ssh(1), etc.
The fixes:
- Cache the cdev_priv of the controlling TTY in struct session. Change
devfs_access() to compare against the cdev_priv instead of the vnode.
This allows you to bypass UNIX permissions, even across different
mounts of devfs.
- Extend devfs_prison_check() to unconditionally expose the device node
of the controlling TTY, even if normal prison nesting rules normally
don't allow this. This actually allows you to interact with this
device node.
To be honest, I'm not really happy with this solution. We now have to
store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp).
In an ideal world, we should just get rid of the latter two and only use
s_ttyp, but this makes certian pieces of code very impractical (e.g.
devfs, kern_exit.c).
Reported by: Many people
|
| |
|
|
|
|
|
|
|
| |
Just like a similar change we made to the TTY code about half a year
ago, make these strings look similar.
Suggested by: Jille Timmermans <jille@quis.cx>
|
|
|
|
|
|
|
|
|
| |
threads are executing the eventhandler, sleep in this case to make it safe for
module unload. If the runcount was up then an entry would have been marked
EHE_DEAD_PRIORITY so use this as a trigger to do the wakeup in
eventhandler_prune_list().
Reviewed by: jhb
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct callout_cpu. From the comment in the file:
+ * There is one struct callout_cpu per cpu, holding all relevant
+ * state for the callout processing thread on the individual CPU.
+ * In particular:
+ * cc_ticks is incremented once per tick in callout_cpu().
+ * It tracks the global 'ticks' but in a way that the individual
+ * threads should not worry about races in the order in which
+ * hardclock() and hardclock_cpu() run on the various CPUs.
+ * cc_softclock is advanced in callout_cpu() to point to the
+ * first entry in cc_callwheel that may need handling. In turn,
+ * a softclock() is scheduled so it can serve the various entries i
+ * such that cc_softclock <= i <= cc_ticks .
Together with a smaller patch committed in september, this fixes a
bug that affects 8.0 with apps that rely on callouts to fire exactly
in the number of ticks specified (qemu among them).
Right now, callouts in 8.0 fire one tick late.
This was discussed in september with JeffR and jhb
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if (jailed(cred))
left. If you are running with a vnet (virtual network stack) those will
return true and defer you to classic IP-jails handling and thus things
will be "denied" or returned with an error.
Work around this problem by introducing another "jailed()" function,
jailed_without_vnet(), that also takes vnets into account, and permits
the calls, should the jail from the given cred have its own virtual
network stack.
We cannot change the classic jailed() call to do that, as it is used
outside the network stack as well.
Discussed with: julian, zec, jamie, rwatson (back in Sept)
MFC after: 5 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr), will
leave the waiters flag on forcing the owner to do a wakeup even when if
the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" (based on the wakeup algorithm) queue while the other
queue has real waiters on it, because nobody is going to wakeup the 2nd
queue waiters and they will sleep indefinitively.
A similar bug, is present, for lockmgr in the case the waiters are
sleeping with LK_SLEEPFAIL on. In this case, even if the waiters queue
is not empty, the waiters won't progress after being awake but they will
just fail, still not taking care of the 2nd queue waiters (as instead the
lock owned doing the wakeup would expect).
In order to fix this bug in a cheap way (without adding too much locking
and complicating too much the semantic) add a sleepqueue interface which
does report the actual number of waiters on a specified queue of a
waitchannel (sleepq_sleepcnt()) and use it in order to determine if the
exclusive waiters (or shared waiters) are actually present on the lockmgr
(or sx) before to give them precedence in the wakeup algorithm.
This fix alone, however doesn't solve the LK_SLEEPFAIL bug. In order to
cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
a lockmgr has and if all the waiters on the exclusive waiters queue are
LK_SLEEPFAIL just wake both queues.
The sleepq_sleepcnt() introduction and ABI breakage require
__FreeBSD_version bumping.
Reported by: avg, kib, pho
Reviewed by: kib
Tested by: pho
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
are not allocated by the device driver. These resources should still appear
allocated from the system's perspective so that their assigned ranges are
not reused by other resource requests. The PCI bus driver has used a hack
to effect this for a while now where it uses rman_set_device() to assign
devices to the PCI bus when they are first encountered and later assigns
them to the actual device when a driver allocates a BAR. A few downsides of
this approach is that it results in somewhat confusing devinfo -r output as
well as not being very easily portable to other bus drivers.
This commit adds generic support for "reserved" resources to the resource
list API used by many bus drivers to manage the resources of child devices.
A resource may be reserved via resource_list_reserve(). This will allocate
the resource from the bus' parent without activating it.
resource_list_alloc() recognizes an attempt to allocate a reserved resource.
When this happens it activates the resource (if requested) and then returns
the reserved resource. Similarly, when a reserved resource is released via
resource_list_release(), it is deactivated (if it is active) and the
resource is then marked reserved again, but is left allocated from the
bus' parent. To completely remove a reserved resource, a bus driver may
use resource_list_unreserve(). A bus driver may use resource_list_busy()
to determine if a reserved resource is allocated by a child device or if
it can be unreserved.
The PCI bus driver has been changed to use this framework instead of
abusing rman_set_device() to keep track of reserved vs allocated resources.
Submitted by: imp (an older version many moons ago)
MFC after: 1 month
|
|
|
|
|
|
|
| |
only affects cases where open(2) is being used improperly - i.e. when the user
specifies O_APPEND without O_WRONLY or O_RDWR.
Reviewed by: rwatson
|
|
|
|
|
| |
Reported and tested by: jh
MFC after: 2 weeks
|
|
|
|
|
| |
incorrectly returning EINVAL from acl_valid(3) for applications linked
against pre-8.0 libc.
|