Diffstat (limited to 'usr.sbin/nfsd')
-rw-r--r-- | usr.sbin/nfsd/Makefile | 7 |
-rw-r--r-- | usr.sbin/nfsd/Makefile.depend | 21 |
-rw-r--r-- | usr.sbin/nfsd/nfsd.8 | 240 |
-rw-r--r-- | usr.sbin/nfsd/nfsd.c | 1086 |
-rw-r--r-- | usr.sbin/nfsd/nfsv4.4 | 331 |
-rw-r--r-- | usr.sbin/nfsd/stablerestart.5 | 97 |
6 files changed, 1782 insertions, 0 deletions
diff --git a/usr.sbin/nfsd/Makefile b/usr.sbin/nfsd/Makefile new file mode 100644 index 0000000..2905067 --- /dev/null +++ b/usr.sbin/nfsd/Makefile @@ -0,0 +1,7 @@ +# @(#)Makefile 8.1 (Berkeley) 6/5/93 +# $FreeBSD$ + +PROG= nfsd +MAN= nfsd.8 nfsv4.4 stablerestart.5 + +.include <bsd.prog.mk> diff --git a/usr.sbin/nfsd/Makefile.depend b/usr.sbin/nfsd/Makefile.depend new file mode 100644 index 0000000..c0b7a14 --- /dev/null +++ b/usr.sbin/nfsd/Makefile.depend @@ -0,0 +1,21 @@ +# $FreeBSD$ +# Autogenerated - do NOT edit! + +DIRDEPS = \ + gnu/lib/csu \ + gnu/lib/libgcc \ + include \ + include/arpa \ + include/rpc \ + include/rpcsvc \ + include/xlocale \ + lib/${CSU_DIR} \ + lib/libc \ + lib/libcompiler_rt \ + + +.include <dirdeps.mk> + +.if ${DEP_RELDIR} == ${_DEP_RELDIR} +# local dependencies - needed for -jN in clean tree +.endif diff --git a/usr.sbin/nfsd/nfsd.8 b/usr.sbin/nfsd/nfsd.8 new file mode 100644 index 0000000..d014a01 --- /dev/null +++ b/usr.sbin/nfsd/nfsd.8 @@ -0,0 +1,240 @@ +.\" Copyright (c) 1989, 1991, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. 
+.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)nfsd.8 8.4 (Berkeley) 3/29/95 +.\" $FreeBSD$ +.\" +.Dd April 25, 2015 +.Dt NFSD 8 +.Os +.Sh NAME +.Nm nfsd +.Nd remote +.Tn NFS +server +.Sh SYNOPSIS +.Nm +.Op Fl ardute +.Op Fl n Ar num_servers +.Op Fl h Ar bindip +.Op Fl Fl maxthreads Ar max_threads +.Op Fl Fl minthreads Ar min_threads +.Sh DESCRIPTION +The +.Nm +utility runs on a server machine to service +.Tn NFS +requests from client machines. +At least one +.Nm +must be running for a machine to operate as a server. +.Pp +Unless otherwise specified, eight servers per CPU for +.Tn UDP +transport are started. +.Pp +The following options are available: +.Bl -tag -width Ds +.It Fl r +Register the +.Tn NFS +service with +.Xr rpcbind 8 +without creating any servers. +This option can be used along with the +.Fl u +or +.Fl t +options to re-register NFS if the rpcbind server is restarted. +.It Fl d +Unregister the +.Tn NFS +service with +.Xr rpcbind 8 +without creating any servers. +.It Fl n Ar threads +Specifies how many servers to create. This option is equivalent to specifying +.Fl Fl maxthreads +and +.Fl Fl minthreads +with their respective arguments to +.Ar threads . 
+.It Fl Fl maxthreads Ar threads +Specifies the maximum servers that will be kept around to service requests. +.It Fl Fl minthreads Ar threads +Specifies the minimum servers that will be kept around to service requests. +.It Fl h Ar bindip +Specifies which IP address or hostname to bind to on the local host. +This option is recommended when a host has multiple interfaces. +Multiple +.Fl h +options may be specified. +.It Fl a +Specifies that nfsd should bind to the wildcard IP address. +This is the default if no +.Fl h +options are given. +It may also be specified in addition to any +.Fl h +options given. +Note that NFS/UDP does not operate properly when +bound to the wildcard IP address whether you use -a or do not use -h. +.It Fl t +Serve +.Tn TCP NFS +clients. +.It Fl u +Serve +.Tn UDP NFS +clients. +.It Fl e +Ignored; included for backward compatibility. +.El +.Pp +For example, +.Dq Li "nfsd -u -t -n 6" +serves +.Tn UDP +and +.Tn TCP +transports using six daemons. +.Pp +A server should run enough daemons to handle +the maximum level of concurrency from its clients, +typically four to six. +.Pp +The +.Nm +utility listens for service requests at the port indicated in the +.Tn NFS +server specification; see +.%T "Network File System Protocol Specification" , +RFC1094, +.%T "NFS: Network File System Version 3 Protocol Specification" , +RFC1813 and +.%T "Network File System (NFS) Version 4 Protocol" , +RFC3530. +.Pp +If +.Nm +detects that +.Tn NFS +is not loaded in the running kernel, it will attempt +to load a loadable kernel module containing +.Tn NFS +support using +.Xr kldload 2 . +If this fails, or no +.Tn NFS +KLD is available, +.Nm +will exit with an error. +.Pp +If +.Nm +is to be run on a host with multiple interfaces or interface aliases, use +of the +.Fl h +option is recommended. +If you do not use the option NFS may not respond to +UDP packets from the same IP address they were sent to. 
+Use of this option +is also recommended when securing NFS exports on a firewalling machine such +that the NFS sockets can only be accessed by the inside interface. +The +.Nm ipfw +utility +would then be used to block nfs-related packets that come in on the outside +interface. +.Pp +If the server has stopped servicing clients and has generated a console message +like +.Dq Li "nfsd server cache flooded..." , +the value for vfs.nfsd.tcphighwater needs to be increased. +This should allow the server to again handle requests without a reboot. +Also, you may want to consider decreasing the value for +vfs.nfsd.tcpcachetimeo to several minutes (in seconds) instead of 12 hours +when this occurs. +.Pp +Unfortunately making vfs.nfsd.tcphighwater too large can result in the mbuf +limit being reached, as indicated by a console message +like +.Dq Li "kern.ipc.nmbufs limit reached" . +If you cannot find values of the above +.Nm sysctl +values that work, you can disable the DRC cache for TCP by setting +vfs.nfsd.cachetcp to 0. +.Pp +The +.Nm +utility has to be terminated with +.Dv SIGUSR1 +and cannot be killed with +.Dv SIGTERM +or +.Dv SIGQUIT . +The +.Nm +utility needs to ignore these signals in order to stay alive as long +as possible during a shutdown, otherwise loopback mounts will +not be able to unmount. +If you have to kill +.Nm +just do a +.Dq Li "kill -USR1 <PID of master nfsd>" +.Sh EXIT STATUS +.Ex -std +.Sh SEE ALSO +.Xr nfsstat 1 , +.Xr kldload 2 , +.Xr nfssvc 2 , +.Xr nfsv4 4 , +.Xr exports 5 , +.Xr stablerestart 5 , +.Xr gssd 8 , +.Xr ipfw 8 , +.Xr mountd 8 , +.Xr nfsiod 8 , +.Xr nfsrevoke 8 , +.Xr nfsuserd 8 , +.Xr rpcbind 8 +.Sh HISTORY +The +.Nm +utility first appeared in +.Bx 4.4 . +.Sh BUGS +If +.Nm +is started when +.Xr gssd 8 +is not running, it will service AUTH_SYS requests only. To fix the problem +you must kill +.Nm +and then restart it, after the +.Xr gssd 8 +is running. 
diff --git a/usr.sbin/nfsd/nfsd.c b/usr.sbin/nfsd/nfsd.c new file mode 100644 index 0000000..f58ed30 --- /dev/null +++ b/usr.sbin/nfsd/nfsd.c @@ -0,0 +1,1086 @@ +/* + * Copyright (c) 1989, 1993, 1994 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Rick Macklem at The University of Guelph. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +#ifndef lint +static const char copyright[] = +"@(#) Copyright (c) 1989, 1993, 1994\n\ + The Regents of the University of California. All rights reserved.\n"; +#endif /* not lint */ + +#ifndef lint +#if 0 +static char sccsid[] = "@(#)nfsd.c 8.9 (Berkeley) 3/29/95"; +#endif +static const char rcsid[] = + "$FreeBSD$"; +#endif /* not lint */ + +#include <sys/param.h> +#include <sys/syslog.h> +#include <sys/wait.h> +#include <sys/mount.h> +#include <sys/fcntl.h> +#include <sys/linker.h> +#include <sys/module.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/sysctl.h> +#include <sys/ucred.h> + +#include <rpc/rpc.h> +#include <rpc/pmap_clnt.h> +#include <rpcsvc/nfs_prot.h> + +#include <netdb.h> +#include <arpa/inet.h> +#include <nfsserver/nfs.h> +#include <nfs/nfssvc.h> + +#include <err.h> +#include <errno.h> +#include <signal.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sysexits.h> + +#include <getopt.h> + +static int debug = 0; + +#define NFSD_STABLERESTART "/var/db/nfs-stablerestart" +#define NFSD_STABLEBACKUP "/var/db/nfs-stablerestart.bak" +#define MAXNFSDCNT 256 +#define DEFNFSDCNT 4 +static pid_t children[MAXNFSDCNT]; /* PIDs of children */ +static int nfsdcnt; /* number of children */ +static int nfsdcnt_set; +static int minthreads; +static int maxthreads; +static int nfssvc_nfsd; /* Set to correct NFSSVC_xxx flag */ +static int stablefd = -1; /* Fd for the stable restart file */ +static int backupfd; /* Fd for the backup stable restart file */ +static const char *getopt_shortopts; +static const char *getopt_usage; + +static int minthreads_set; +static int maxthreads_set; + +static struct option longopts[] = { + { "debug", no_argument, &debug, 1 }, + { "minthreads", required_argument, &minthreads_set, 1 }, + { "maxthreads", required_argument, &maxthreads_set, 1 }, + { NULL, 0, NULL, 0} +}; + +static void cleanup(int); +static void child_cleanup(int); +static void killchildren(void); 
+static void nfsd_exit(int); +static void nonfs(int); +static void reapchild(int); +static int setbindhost(struct addrinfo **ia, const char *bindhost, + struct addrinfo hints); +static void start_server(int); +static void unregistration(void); +static void usage(void); +static void open_stable(int *, int *); +static void copy_stable(int, int); +static void backup_stable(int); +static void set_nfsdcnt(int); + +/* + * Nfs server daemon mostly just a user context for nfssvc() + * + * 1 - do file descriptor and signal cleanup + * 2 - fork the nfsd(s) + * 3 - create server socket(s) + * 4 - register socket with rpcbind + * + * For connectionless protocols, just pass the socket into the kernel via. + * nfssvc(). + * For connection based sockets, loop doing accepts. When you get a new + * socket from accept, pass the msgsock into the kernel via. nfssvc(). + * The arguments are: + * -r - reregister with rpcbind + * -d - unregister with rpcbind + * -t - support tcp nfs clients + * -u - support udp nfs clients + * -e - forces it to run a server that supports nfsv4 + * followed by "n" which is the number of nfsds' to fork off + */ +int +main(int argc, char **argv) +{ + struct nfsd_addsock_args addsockargs; + struct addrinfo *ai_udp, *ai_tcp, *ai_udp6, *ai_tcp6, hints; + struct netconfig *nconf_udp, *nconf_tcp, *nconf_udp6, *nconf_tcp6; + struct netbuf nb_udp, nb_tcp, nb_udp6, nb_tcp6; + struct sockaddr_in inetpeer; + struct sockaddr_in6 inet6peer; + fd_set ready, sockbits; + fd_set v4bits, v6bits; + int ch, connect_type_cnt, i, maxsock, msgsock; + socklen_t len; + int on = 1, unregister, reregister, sock; + int tcp6sock, ip6flag, tcpflag, tcpsock; + int udpflag, ecode, error, s; + int bindhostc, bindanyflag, rpcbreg, rpcbregcnt; + int nfssvc_addsock; + int longindex = 0; + const char *lopt; + char **bindhost = NULL; + pid_t pid; + + nfsdcnt = DEFNFSDCNT; + unregister = reregister = tcpflag = maxsock = 0; + bindanyflag = udpflag = connect_type_cnt = bindhostc = 0; + 
getopt_shortopts = "ah:n:rdtue"; + getopt_usage = + "usage:\n" + " nfsd [-ardtue] [-h bindip]\n" + " [-n numservers] [--minthreads #] [--maxthreads #]\n"; + while ((ch = getopt_long(argc, argv, getopt_shortopts, longopts, + &longindex)) != -1) + switch (ch) { + case 'a': + bindanyflag = 1; + break; + case 'n': + set_nfsdcnt(atoi(optarg)); + break; + case 'h': + bindhostc++; + bindhost = realloc(bindhost,sizeof(char *)*bindhostc); + if (bindhost == NULL) + errx(1, "Out of memory"); + bindhost[bindhostc-1] = strdup(optarg); + if (bindhost[bindhostc-1] == NULL) + errx(1, "Out of memory"); + break; + case 'r': + reregister = 1; + break; + case 'd': + unregister = 1; + break; + case 't': + tcpflag = 1; + break; + case 'u': + udpflag = 1; + break; + case 'e': + /* now a no-op, since this is the default */ + break; + case 0: + lopt = longopts[longindex].name; + if (!strcmp(lopt, "minthreads")) { + minthreads = atoi(optarg); + } else if (!strcmp(lopt, "maxthreads")) { + maxthreads = atoi(optarg); + } + break; + default: + case '?': + usage(); + }; + if (!tcpflag && !udpflag) + udpflag = 1; + argv += optind; + argc -= optind; + if (minthreads_set && maxthreads_set && minthreads > maxthreads) + errx(EX_USAGE, + "error: minthreads(%d) can't be greater than " + "maxthreads(%d)", minthreads, maxthreads); + + /* + * XXX + * Backward compatibility, trailing number is the count of daemons. + */ + if (argc > 1) + usage(); + if (argc == 1) + set_nfsdcnt(atoi(argv[0])); + + /* + * Unless the "-o" option was specified, try and run "nfsd". + * If "-o" was specified, try and run "nfsserver". 
+ */ + if (modfind("nfsd") < 0) { + /* Not present in kernel, try loading it */ + if (kldload("nfsd") < 0 || modfind("nfsd") < 0) + errx(1, "NFS server is not available"); + } + + ip6flag = 1; + s = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP); + if (s == -1) { + if (errno != EPROTONOSUPPORT && errno != EAFNOSUPPORT) + err(1, "socket"); + ip6flag = 0; + } else if (getnetconfigent("udp6") == NULL || + getnetconfigent("tcp6") == NULL) { + ip6flag = 0; + } + if (s != -1) + close(s); + + if (bindhostc == 0 || bindanyflag) { + bindhostc++; + bindhost = realloc(bindhost,sizeof(char *)*bindhostc); + if (bindhost == NULL) + errx(1, "Out of memory"); + bindhost[bindhostc-1] = strdup("*"); + if (bindhost[bindhostc-1] == NULL) + errx(1, "Out of memory"); + } + + if (unregister) { + unregistration(); + exit (0); + } + if (reregister) { + if (udpflag) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_udp); + if (ecode != 0) + err(1, "getaddrinfo udp: %s", gai_strerror(ecode)); + nconf_udp = getnetconfigent("udp"); + if (nconf_udp == NULL) + err(1, "getnetconfigent udp failed"); + nb_udp.buf = ai_udp->ai_addr; + nb_udp.len = nb_udp.maxlen = ai_udp->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_udp, &nb_udp)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_udp, &nb_udp))) + err(1, "rpcb_set udp failed"); + freeaddrinfo(ai_udp); + } + if (udpflag && ip6flag) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_udp6); + if (ecode != 0) + err(1, "getaddrinfo udp6: %s", gai_strerror(ecode)); + nconf_udp6 = getnetconfigent("udp6"); + if (nconf_udp6 == NULL) + err(1, "getnetconfigent udp6 failed"); + nb_udp6.buf = ai_udp6->ai_addr; + nb_udp6.len = nb_udp6.maxlen = 
ai_udp6->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_udp6, &nb_udp6)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_udp6, &nb_udp6))) + err(1, "rpcb_set udp6 failed"); + freeaddrinfo(ai_udp6); + } + if (tcpflag) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_tcp); + if (ecode != 0) + err(1, "getaddrinfo tcp: %s", gai_strerror(ecode)); + nconf_tcp = getnetconfigent("tcp"); + if (nconf_tcp == NULL) + err(1, "getnetconfigent tcp failed"); + nb_tcp.buf = ai_tcp->ai_addr; + nb_tcp.len = nb_tcp.maxlen = ai_tcp->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_tcp, &nb_tcp)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_tcp, &nb_tcp))) + err(1, "rpcb_set tcp failed"); + freeaddrinfo(ai_tcp); + } + if (tcpflag && ip6flag) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_tcp6); + if (ecode != 0) + err(1, "getaddrinfo tcp6: %s", gai_strerror(ecode)); + nconf_tcp6 = getnetconfigent("tcp6"); + if (nconf_tcp6 == NULL) + err(1, "getnetconfigent tcp6 failed"); + nb_tcp6.buf = ai_tcp6->ai_addr; + nb_tcp6.len = nb_tcp6.maxlen = ai_tcp6->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_tcp6, &nb_tcp6)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_tcp6, &nb_tcp6))) + err(1, "rpcb_set tcp6 failed"); + freeaddrinfo(ai_tcp6); + } + exit (0); + } + if (debug == 0) { + daemon(0, 0); + (void)signal(SIGHUP, SIG_IGN); + (void)signal(SIGINT, SIG_IGN); + /* + * nfsd sits in the kernel most of the time. It needs + * to ignore SIGTERM/SIGQUIT in order to stay alive as long + * as possible during a shutdown, otherwise loopback + * mounts will not be able to unmount. 
+ */ + (void)signal(SIGTERM, SIG_IGN); + (void)signal(SIGQUIT, SIG_IGN); + } + (void)signal(SIGSYS, nonfs); + (void)signal(SIGCHLD, reapchild); + (void)signal(SIGUSR2, backup_stable); + + openlog("nfsd", LOG_PID | (debug ? LOG_PERROR : 0), LOG_DAEMON); + + /* + * For V4, we open the stablerestart file and call nfssvc() + * to get it loaded. This is done before the daemons do the + * regular nfssvc() call to service NFS requests. + * (This way the file remains open until the last nfsd is killed + * off.) + * It and the backup copy will be created as empty files + * the first time this nfsd is started and should never be + * deleted/replaced if at all possible. It should live on a + * local, non-volatile storage device that does not do hardware + * level write-back caching. (See SCSI doc for more information + * on how to prevent write-back caching on SCSI disks.) + */ + open_stable(&stablefd, &backupfd); + if (stablefd < 0) { + syslog(LOG_ERR, "Can't open %s: %m\n", NFSD_STABLERESTART); + exit(1); + } + /* This system call will fail for old kernels, but that's ok. */ + nfssvc(NFSSVC_BACKUPSTABLE, NULL); + if (nfssvc(NFSSVC_STABLERESTART, (caddr_t)&stablefd) < 0) { + syslog(LOG_ERR, "Can't read stable storage file: %m\n"); + exit(1); + } + nfssvc_addsock = NFSSVC_NFSDADDSOCK; + nfssvc_nfsd = NFSSVC_NFSDNFSD; + + if (tcpflag) { + /* + * For TCP mode, we fork once to start the first + * kernel nfsd thread. The kernel will add more + * threads as needed. + */ + pid = fork(); + if (pid == -1) { + syslog(LOG_ERR, "fork: %m"); + nfsd_exit(1); + } + if (pid) { + children[0] = pid; + } else { + (void)signal(SIGUSR1, child_cleanup); + setproctitle("server"); + start_server(0); + } + } + + (void)signal(SIGUSR1, cleanup); + FD_ZERO(&v4bits); + FD_ZERO(&v6bits); + FD_ZERO(&sockbits); + + rpcbregcnt = 0; + /* Set up the socket for udp and rpcb register it. 
*/ + if (udpflag) { + rpcbreg = 0; + for (i = 0; i < bindhostc; i++) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + if (setbindhost(&ai_udp, bindhost[i], hints) == 0) { + rpcbreg = 1; + rpcbregcnt++; + if ((sock = socket(ai_udp->ai_family, + ai_udp->ai_socktype, + ai_udp->ai_protocol)) < 0) { + syslog(LOG_ERR, + "can't create udp socket"); + nfsd_exit(1); + } + if (bind(sock, ai_udp->ai_addr, + ai_udp->ai_addrlen) < 0) { + syslog(LOG_ERR, + "can't bind udp addr %s: %m", + bindhost[i]); + nfsd_exit(1); + } + freeaddrinfo(ai_udp); + addsockargs.sock = sock; + addsockargs.name = NULL; + addsockargs.namelen = 0; + if (nfssvc(nfssvc_addsock, &addsockargs) < 0) { + syslog(LOG_ERR, "can't Add UDP socket"); + nfsd_exit(1); + } + (void)close(sock); + } + } + if (rpcbreg == 1) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_udp); + if (ecode != 0) { + syslog(LOG_ERR, "getaddrinfo udp: %s", + gai_strerror(ecode)); + nfsd_exit(1); + } + nconf_udp = getnetconfigent("udp"); + if (nconf_udp == NULL) + err(1, "getnetconfigent udp failed"); + nb_udp.buf = ai_udp->ai_addr; + nb_udp.len = nb_udp.maxlen = ai_udp->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_udp, &nb_udp)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_udp, &nb_udp))) + err(1, "rpcb_set udp failed"); + freeaddrinfo(ai_udp); + } + } + + /* Set up the socket for udp6 and rpcb register it. 
*/ + if (udpflag && ip6flag) { + rpcbreg = 0; + for (i = 0; i < bindhostc; i++) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + if (setbindhost(&ai_udp6, bindhost[i], hints) == 0) { + rpcbreg = 1; + rpcbregcnt++; + if ((sock = socket(ai_udp6->ai_family, + ai_udp6->ai_socktype, + ai_udp6->ai_protocol)) < 0) { + syslog(LOG_ERR, + "can't create udp6 socket"); + nfsd_exit(1); + } + if (setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY, + &on, sizeof on) < 0) { + syslog(LOG_ERR, + "can't set v6-only binding for " + "udp6 socket: %m"); + nfsd_exit(1); + } + if (bind(sock, ai_udp6->ai_addr, + ai_udp6->ai_addrlen) < 0) { + syslog(LOG_ERR, + "can't bind udp6 addr %s: %m", + bindhost[i]); + nfsd_exit(1); + } + freeaddrinfo(ai_udp6); + addsockargs.sock = sock; + addsockargs.name = NULL; + addsockargs.namelen = 0; + if (nfssvc(nfssvc_addsock, &addsockargs) < 0) { + syslog(LOG_ERR, + "can't add UDP6 socket"); + nfsd_exit(1); + } + (void)close(sock); + } + } + if (rpcbreg == 1) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_DGRAM; + hints.ai_protocol = IPPROTO_UDP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_udp6); + if (ecode != 0) { + syslog(LOG_ERR, "getaddrinfo udp6: %s", + gai_strerror(ecode)); + nfsd_exit(1); + } + nconf_udp6 = getnetconfigent("udp6"); + if (nconf_udp6 == NULL) + err(1, "getnetconfigent udp6 failed"); + nb_udp6.buf = ai_udp6->ai_addr; + nb_udp6.len = nb_udp6.maxlen = ai_udp6->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_udp6, &nb_udp6)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_udp6, &nb_udp6))) + err(1, "rpcb_set udp6 failed"); + freeaddrinfo(ai_udp6); + } + } + + /* Set up the socket for tcp and rpcb register it. 
*/ + if (tcpflag) { + rpcbreg = 0; + for (i = 0; i < bindhostc; i++) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + if (setbindhost(&ai_tcp, bindhost[i], hints) == 0) { + rpcbreg = 1; + rpcbregcnt++; + if ((tcpsock = socket(AF_INET, SOCK_STREAM, + 0)) < 0) { + syslog(LOG_ERR, + "can't create tcp socket"); + nfsd_exit(1); + } + if (setsockopt(tcpsock, SOL_SOCKET, + SO_REUSEADDR, + (char *)&on, sizeof(on)) < 0) + syslog(LOG_ERR, + "setsockopt SO_REUSEADDR: %m"); + if (bind(tcpsock, ai_tcp->ai_addr, + ai_tcp->ai_addrlen) < 0) { + syslog(LOG_ERR, + "can't bind tcp addr %s: %m", + bindhost[i]); + nfsd_exit(1); + } + if (listen(tcpsock, -1) < 0) { + syslog(LOG_ERR, "listen failed"); + nfsd_exit(1); + } + freeaddrinfo(ai_tcp); + FD_SET(tcpsock, &sockbits); + FD_SET(tcpsock, &v4bits); + maxsock = tcpsock; + connect_type_cnt++; + } + } + if (rpcbreg == 1) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + ecode = getaddrinfo(NULL, "nfs", &hints, + &ai_tcp); + if (ecode != 0) { + syslog(LOG_ERR, "getaddrinfo tcp: %s", + gai_strerror(ecode)); + nfsd_exit(1); + } + nconf_tcp = getnetconfigent("tcp"); + if (nconf_tcp == NULL) + err(1, "getnetconfigent tcp failed"); + nb_tcp.buf = ai_tcp->ai_addr; + nb_tcp.len = nb_tcp.maxlen = ai_tcp->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_tcp, + &nb_tcp)) || (!rpcb_set(NFS_PROGRAM, 3, + nconf_tcp, &nb_tcp))) + err(1, "rpcb_set tcp failed"); + freeaddrinfo(ai_tcp); + } + } + + /* Set up the socket for tcp6 and rpcb register it. 
*/ + if (tcpflag && ip6flag) { + rpcbreg = 0; + for (i = 0; i < bindhostc; i++) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + if (setbindhost(&ai_tcp6, bindhost[i], hints) == 0) { + rpcbreg = 1; + rpcbregcnt++; + if ((tcp6sock = socket(ai_tcp6->ai_family, + ai_tcp6->ai_socktype, + ai_tcp6->ai_protocol)) < 0) { + syslog(LOG_ERR, + "can't create tcp6 socket"); + nfsd_exit(1); + } + if (setsockopt(tcp6sock, SOL_SOCKET, + SO_REUSEADDR, + (char *)&on, sizeof(on)) < 0) + syslog(LOG_ERR, + "setsockopt SO_REUSEADDR: %m"); + if (setsockopt(tcp6sock, IPPROTO_IPV6, + IPV6_V6ONLY, &on, sizeof on) < 0) { + syslog(LOG_ERR, + "can't set v6-only binding for tcp6 " + "socket: %m"); + nfsd_exit(1); + } + if (bind(tcp6sock, ai_tcp6->ai_addr, + ai_tcp6->ai_addrlen) < 0) { + syslog(LOG_ERR, + "can't bind tcp6 addr %s: %m", + bindhost[i]); + nfsd_exit(1); + } + if (listen(tcp6sock, -1) < 0) { + syslog(LOG_ERR, "listen failed"); + nfsd_exit(1); + } + freeaddrinfo(ai_tcp6); + FD_SET(tcp6sock, &sockbits); + FD_SET(tcp6sock, &v6bits); + if (maxsock < tcp6sock) + maxsock = tcp6sock; + connect_type_cnt++; + } + } + if (rpcbreg == 1) { + memset(&hints, 0, sizeof hints); + hints.ai_flags = AI_PASSIVE; + hints.ai_family = AF_INET6; + hints.ai_socktype = SOCK_STREAM; + hints.ai_protocol = IPPROTO_TCP; + ecode = getaddrinfo(NULL, "nfs", &hints, &ai_tcp6); + if (ecode != 0) { + syslog(LOG_ERR, "getaddrinfo tcp6: %s", + gai_strerror(ecode)); + nfsd_exit(1); + } + nconf_tcp6 = getnetconfigent("tcp6"); + if (nconf_tcp6 == NULL) + err(1, "getnetconfigent tcp6 failed"); + nb_tcp6.buf = ai_tcp6->ai_addr; + nb_tcp6.len = nb_tcp6.maxlen = ai_tcp6->ai_addrlen; + if ((!rpcb_set(NFS_PROGRAM, 2, nconf_tcp6, &nb_tcp6)) || + (!rpcb_set(NFS_PROGRAM, 3, nconf_tcp6, &nb_tcp6))) + err(1, "rpcb_set tcp6 failed"); + freeaddrinfo(ai_tcp6); + } + } + + if (rpcbregcnt == 0) { + syslog(LOG_ERR, 
"rpcb_set() failed, nothing to do: %m"); + nfsd_exit(1); + } + + if (tcpflag && connect_type_cnt == 0) { + syslog(LOG_ERR, "tcp connects == 0, nothing to do: %m"); + nfsd_exit(1); + } + + setproctitle("master"); + /* + * We always want a master to have a clean way to to shut nfsd down + * (with unregistration): if the master is killed, it unregisters and + * kills all children. If we run for UDP only (and so do not have to + * loop waiting waiting for accept), we instead make the parent + * a "server" too. start_server will not return. + */ + if (!tcpflag) + start_server(1); + + /* + * Loop forever accepting connections and passing the sockets + * into the kernel for the mounts. + */ + for (;;) { + ready = sockbits; + if (connect_type_cnt > 1) { + if (select(maxsock + 1, + &ready, NULL, NULL, NULL) < 1) { + error = errno; + if (error == EINTR) + continue; + syslog(LOG_ERR, "select failed: %m"); + nfsd_exit(1); + } + } + for (tcpsock = 0; tcpsock <= maxsock; tcpsock++) { + if (FD_ISSET(tcpsock, &ready)) { + if (FD_ISSET(tcpsock, &v4bits)) { + len = sizeof(inetpeer); + if ((msgsock = accept(tcpsock, + (struct sockaddr *)&inetpeer, &len)) < 0) { + error = errno; + syslog(LOG_ERR, "accept failed: %m"); + if (error == ECONNABORTED || + error == EINTR) + continue; + nfsd_exit(1); + } + memset(inetpeer.sin_zero, 0, + sizeof(inetpeer.sin_zero)); + if (setsockopt(msgsock, SOL_SOCKET, + SO_KEEPALIVE, (char *)&on, sizeof(on)) < 0) + syslog(LOG_ERR, + "setsockopt SO_KEEPALIVE: %m"); + addsockargs.sock = msgsock; + addsockargs.name = (caddr_t)&inetpeer; + addsockargs.namelen = len; + nfssvc(nfssvc_addsock, &addsockargs); + (void)close(msgsock); + } else if (FD_ISSET(tcpsock, &v6bits)) { + len = sizeof(inet6peer); + if ((msgsock = accept(tcpsock, + (struct sockaddr *)&inet6peer, + &len)) < 0) { + error = errno; + syslog(LOG_ERR, + "accept failed: %m"); + if (error == ECONNABORTED || + error == EINTR) + continue; + nfsd_exit(1); + } + if (setsockopt(msgsock, SOL_SOCKET, + 
SO_KEEPALIVE, (char *)&on, + sizeof(on)) < 0) + syslog(LOG_ERR, "setsockopt " + "SO_KEEPALIVE: %m"); + addsockargs.sock = msgsock; + addsockargs.name = (caddr_t)&inet6peer; + addsockargs.namelen = len; + nfssvc(nfssvc_addsock, &addsockargs); + (void)close(msgsock); + } + } + } + } +} + +static int +setbindhost(struct addrinfo **ai, const char *bindhost, struct addrinfo hints) +{ + int ecode; + u_int32_t host_addr[4]; /* IPv4 or IPv6 */ + const char *hostptr; + + if (bindhost == NULL || strcmp("*", bindhost) == 0) + hostptr = NULL; + else + hostptr = bindhost; + + if (hostptr != NULL) { + switch (hints.ai_family) { + case AF_INET: + if (inet_pton(AF_INET, hostptr, host_addr) == 1) { + hints.ai_flags = AI_NUMERICHOST; + } else { + if (inet_pton(AF_INET6, hostptr, + host_addr) == 1) + return (1); + } + break; + case AF_INET6: + if (inet_pton(AF_INET6, hostptr, host_addr) == 1) { + hints.ai_flags = AI_NUMERICHOST; + } else { + if (inet_pton(AF_INET, hostptr, + host_addr) == 1) + return (1); + } + break; + default: + break; + } + } + + ecode = getaddrinfo(hostptr, "nfs", &hints, ai); + if (ecode != 0) { + syslog(LOG_ERR, "getaddrinfo %s: %s", bindhost, + gai_strerror(ecode)); + return (1); + } + return (0); +} + +static void +set_nfsdcnt(int proposed) +{ + + if (proposed < 1) { + warnx("nfsd count too low %d; reset to %d", proposed, + DEFNFSDCNT); + nfsdcnt = DEFNFSDCNT; + } else if (proposed > MAXNFSDCNT) { + warnx("nfsd count too high %d; truncated to %d", proposed, + MAXNFSDCNT); + nfsdcnt = MAXNFSDCNT; + } else + nfsdcnt = proposed; + nfsdcnt_set = 1; +} + +static void +usage(void) +{ + (void)fprintf(stderr, "%s", getopt_usage); + exit(1); +} + +static void +nonfs(__unused int signo) +{ + syslog(LOG_ERR, "missing system call: NFS not available"); +} + +static void +reapchild(__unused int signo) +{ + pid_t pid; + int i; + + while ((pid = wait3(NULL, WNOHANG, NULL)) > 0) { + for (i = 0; i < nfsdcnt; i++) + if (pid == children[i]) + children[i] = -1; + } +} + +static 
void +unregistration(void) +{ + if ((!rpcb_unset(NFS_PROGRAM, 2, NULL)) || + (!rpcb_unset(NFS_PROGRAM, 3, NULL))) + syslog(LOG_ERR, "rpcb_unset failed"); +} + +static void +killchildren(void) +{ + int i; + + for (i = 0; i < nfsdcnt; i++) { + if (children[i] > 0) + kill(children[i], SIGKILL); + } +} + +/* + * Cleanup master after SIGUSR1. + */ +static void +cleanup(__unused int signo) +{ + nfsd_exit(0); +} + +/* + * Cleanup child after SIGUSR1. + */ +static void +child_cleanup(__unused int signo) +{ + exit(0); +} + +static void +nfsd_exit(int status) +{ + killchildren(); + unregistration(); + exit(status); +} + +static int +get_tuned_nfsdcount(void) +{ + int ncpu, error, tuned_nfsdcnt; + size_t ncpu_size; + + ncpu_size = sizeof(ncpu); + error = sysctlbyname("hw.ncpu", &ncpu, &ncpu_size, NULL, 0); + if (error) { + warnx("sysctlbyname(hw.ncpu) failed defaulting to %d nfs servers", + DEFNFSDCNT); + tuned_nfsdcnt = DEFNFSDCNT; + } else { + tuned_nfsdcnt = ncpu * 8; + } + return tuned_nfsdcnt; +} + +static void +start_server(int master) +{ + char principal[MAXHOSTNAMELEN + 5]; + struct nfsd_nfsd_args nfsdargs; + int status, error; + char hostname[MAXHOSTNAMELEN + 1], *cp; + struct addrinfo *aip, hints; + + status = 0; + gethostname(hostname, sizeof (hostname)); + snprintf(principal, sizeof (principal), "nfs@%s", hostname); + if ((cp = strchr(hostname, '.')) == NULL || + *(cp + 1) == '\0') { + /* If not fully qualified, try getaddrinfo() */ + memset((void *)&hints, 0, sizeof (hints)); + hints.ai_flags = AI_CANONNAME; + error = getaddrinfo(hostname, NULL, &hints, &aip); + if (error == 0) { + if (aip->ai_canonname != NULL && + (cp = strchr(aip->ai_canonname, '.')) != + NULL && *(cp + 1) != '\0') + snprintf(principal, sizeof (principal), + "nfs@%s", aip->ai_canonname); + freeaddrinfo(aip); + } + } + nfsdargs.principal = principal; + + if (nfsdcnt_set) + nfsdargs.minthreads = nfsdargs.maxthreads = nfsdcnt; + else { + nfsdargs.minthreads = minthreads_set ? 
minthreads : get_tuned_nfsdcount(); + nfsdargs.maxthreads = maxthreads_set ? maxthreads : nfsdargs.minthreads; + if (nfsdargs.maxthreads < nfsdargs.minthreads) + nfsdargs.maxthreads = nfsdargs.minthreads; + } + error = nfssvc(nfssvc_nfsd, &nfsdargs); + if (error < 0 && errno == EAUTH) { + /* + * This indicates that it could not register the + * rpcsec_gss credentials, usually because the + * gssd daemon isn't running. + * (only the experimental server with nfsv4) + */ + syslog(LOG_ERR, "No gssd, using AUTH_SYS only"); + principal[0] = '\0'; + error = nfssvc(nfssvc_nfsd, &nfsdargs); + } + if (error < 0) { + syslog(LOG_ERR, "nfssvc: %m"); + status = 1; + } + if (master) + nfsd_exit(status); + else + exit(status); +} + +/* + * Open the stable restart file and return the file descriptor for it. + */ +static void +open_stable(int *stable_fdp, int *backup_fdp) +{ + int stable_fd, backup_fd = -1, ret; + struct stat st, backup_st; + + /* Open and stat the stable restart file. */ + stable_fd = open(NFSD_STABLERESTART, O_RDWR, 0); + if (stable_fd < 0) + stable_fd = open(NFSD_STABLERESTART, O_RDWR | O_CREAT, 0600); + if (stable_fd >= 0) { + ret = fstat(stable_fd, &st); + if (ret < 0) { + close(stable_fd); + stable_fd = -1; + } + } + + /* Open and stat the backup stable restart file. */ + if (stable_fd >= 0) { + backup_fd = open(NFSD_STABLEBACKUP, O_RDWR, 0); + if (backup_fd < 0) + backup_fd = open(NFSD_STABLEBACKUP, O_RDWR | O_CREAT, + 0600); + if (backup_fd >= 0) { + ret = fstat(backup_fd, &backup_st); + if (ret < 0) { + close(backup_fd); + backup_fd = -1; + } + } + if (backup_fd < 0) { + close(stable_fd); + stable_fd = -1; + } + } + + *stable_fdp = stable_fd; + *backup_fdp = backup_fd; + if (stable_fd < 0) + return; + + /* Sync up the 2 files, as required. */ + if (st.st_size > 0) + copy_stable(stable_fd, backup_fd); + else if (backup_st.st_size > 0) + copy_stable(backup_fd, stable_fd); +} + +/* + * Copy the stable restart file to the backup or vice versa. 
+ */ +static void +copy_stable(int from_fd, int to_fd) +{ + int cnt, ret; + static char buf[1024]; + + ret = lseek(from_fd, (off_t)0, SEEK_SET); + if (ret >= 0) + ret = lseek(to_fd, (off_t)0, SEEK_SET); + if (ret >= 0) + ret = ftruncate(to_fd, (off_t)0); + if (ret >= 0) + do { + cnt = read(from_fd, buf, 1024); + if (cnt > 0) + ret = write(to_fd, buf, cnt); + else if (cnt < 0) + ret = cnt; + } while (cnt > 0 && ret >= 0); + if (ret >= 0) + ret = fsync(to_fd); + if (ret < 0) + syslog(LOG_ERR, "stable restart copy failure: %m"); +} + +/* + * Back up the stable restart file when indicated by the kernel. + */ +static void +backup_stable(__unused int signo) +{ + + if (stablefd >= 0) + copy_stable(stablefd, backupfd); +} + diff --git a/usr.sbin/nfsd/nfsv4.4 b/usr.sbin/nfsd/nfsv4.4 new file mode 100644 index 0000000..8d9bc80 --- /dev/null +++ b/usr.sbin/nfsd/nfsv4.4 @@ -0,0 +1,331 @@ +.\" Copyright (c) 2009 Rick Macklem, University of Guelph +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd July 1, 2013 +.Dt NFSV4 4 +.Os +.Sh NAME +.Nm NFSv4 +.Nd NFS Version 4 Protocol +.Sh DESCRIPTION +The NFS client and server provide support for the +.Tn NFSv4 +specification; see +.%T "Network File System (NFS) Version 4 Protocol RFC 3530" . +The protocol is somewhat similar to NFS Version 3, but differs in significant +ways. +It uses a single compound RPC that concatenates operations together. +Each of these operations is similar to an RPC of NFS Version 3. +The operations in the compound are performed in order, until one of +them fails (returns an error) and then the RPC terminates at that point. +.Pp +It has +integrated locking support, which implies that the server is no longer +stateless. +As such, the +.Nm +server remains in recovery mode for a grace period (always greater than the +lease duration the server uses) after a reboot. +During this grace period, clients may recover state but not perform other +open/lock state changing operations. +To provide for correct recovery semantics, a small file described by +.Xr stablerestart 5 +is used by the server during the recovery phase. +If this file is missing or empty, there is a backup copy maintained by +.Xr nfsd 8 +that will be used. If either file is missing, it will be +created by +.Xr nfsd 8 . +If both the file and the backup copy are empty, +it will result in the server starting without providing a grace period +for recovery.
+Note that recovery only occurs when the server +machine is rebooted, not when the +.Xr nfsd 8 +are just restarted. +.Pp +It provides several optional features not present in NFS Version 3: +.sp +.Bd -literal -offset indent -compact +- NFS Version 4 ACLs +- Referrals, which redirect subtrees to other servers + (not yet implemented) +- Delegations, which allow a client to operate on a file locally +.Ed +.Pp +The +.Nm +protocol does not use a separate mount protocol and assumes that the +server provides a single file system tree structure, rooted at the point +in the local file system tree specified by one or more +.sp 1 +.Bd -literal -offset indent -compact +V4: <rootdir> [-sec=secflavors] [host(s) or net] +.Ed +.sp 1 +line(s) in the +.Xr exports 5 +file. +(See +.Xr exports 5 +for details.) +The +.Xr nfsd 8 +allows a limited subset of operations to be performed on non-exported subtrees +of the local file system, so that traversal of the tree to the exported +subtrees is possible. +As such, the ``<rootdir>'' can be in a non-exported file system. +The exception is ZFS, which checks exports and, as such, all ZFS file systems +below the ``<rootdir>'' must be exported. +However, +the entire tree that is rooted at that point must be in local file systems +that are of types that can be NFS exported. +Since the +.Nm +file system is rooted at ``<rootdir>'', setting this to anything other +than ``/'' will result in clients being required to use different mount +paths for +.Nm +than for NFS Version 2 or 3. +Unlike NFS Version 2 and 3, Version 4 allows a client mount to span across +multiple server file systems, although not all clients are capable of doing +this. +.Pp +.Nm +uses names for users and groups instead of numbers. +On the wire, they +take the form: +.sp +.Bd -literal -offset indent -compact +<user>@<dns.domain> +.Ed +.sp +where ``<dns.domain>'' is not the same as the DNS domain used +for host name lookups, but is usually set to the same string. 
+Most systems set this ``<dns.domain>'' +to the domain name part of the machine's +.Xr hostname 1 +by default. +However, this can normally be overridden by a command line +option or configuration file for the daemon used to do the name<->number +mapping. +Under FreeBSD, the mapping daemon is called +.Xr nfsuserd 8 +and has a command line option that overrides the domain component of the +machine's hostname. +For use of +.Nm , +either client or server, this daemon must be running. +If this ``<dns.domain>'' is not set correctly or the daemon is not running, ``ls -l'' will typically +report a lot of ``nobody'' and ``nogroup'' ownerships. +.Pp +Although uid/gid numbers are no longer used in the +.Nm +protocol, they will still be in the RPC authentication fields when +using AUTH_SYS (sec=sys), which is the default. +As such, in this case both the user/group name and number spaces must +be consistent between the client and server. +.Pp +However, if you run +.Nm +with RPCSEC_GSS (sec=krb5, krb5i, krb5p), only names and KerberosV tickets +will go on the wire. +.Sh SERVER SETUP +To set up the NFS server that supports +.Nm , +you will need to either set the variables in +.Xr rc.conf 5 +as follows: +.sp +.Bd -literal -offset indent -compact +nfs_server_enable="YES" +nfsv4_server_enable="YES" +nfsuserd_enable="YES" +.Ed +.sp +or start +.Xr mountd 8 +and +.Xr nfsd 8 +without the ``-o'' option, which would force use of the old server. +The +.Xr nfsuserd 8 +daemon must also be running. +.Pp +You will also need to add at least one ``V4:'' line to the +.Xr exports 5 +file for +.Nm +to work. +.Pp +If the file systems you are exporting are only being accessed via +.Nm +there are a couple of +.Xr sysctl 8 +variables that you can change, which might improve performance. +.Bl -tag -width Ds +.It Cm vfs.nfsd.issue_delegations +when set non-zero, allows the server to issue Open Delegations to +clients. +These delegations permit the client to manipulate the file +locally on the client. 
+Unfortunately, at this time, client use of +delegations is limited, so performance gains may not be observed. +This can only be enabled when the file systems being exported to +.Nm +clients are not being accessed locally on the server and, if being +accessed via NFS Version 2 or 3 clients, these clients cannot be +using the NLM. +.It Cm vfs.nfsd.enable_locallocks +can be set to 0 to disable acquisition of local byte range locks. +Disabling local locking can only be done if neither local accesses +to the exported file systems nor the NLM is operating on them. +.El +.sp +Note that Samba server access would be considered ``local access'' for the above +discussion. +.Pp +To build a kernel with the NFS server that supports +.Nm +linked into it, the +.sp +.Bd -literal -offset indent -compact +options NFSD +.Ed +.sp +must be specified in the kernel's +.Xr config 5 +file. +.Sh CLIENT MOUNTS +To do an +.Nm +mount, specify the ``nfsv4'' option on the +.Xr mount_nfs 8 +command line. +This will force use of the client that supports +.Nm +plus set ``tcp'' and +.Nm . +.Pp +The +.Xr nfsuserd 8 +must be running, as above. +Also, since an +.Nm +mount uses the host uuid to identify the client uniquely to the server, +you cannot safely do an +.Nm +mount when +.sp +.Bd -literal -offset indent -compact +hostid_enable="NO" +.Ed +.sp +is set in +.Xr rc.conf 5 . +.sp +If the +.Nm +server that is being mounted on supports delegations, you can start the +.Xr nfscbd 8 +daemon to handle client side callbacks. +This will occur if +.sp +.Bd -literal -offset indent -compact +nfsuserd_enable="YES" +nfscbd_enable="YES" +.Ed +.sp +are set in +.Xr rc.conf 5 . +.sp +Without a functioning callback path, a server will never issue Delegations +to a client. +.sp +By default, the callback address will be set to the IP address acquired via +rtalloc() in the kernel and port# 7745. +To override the default port#, a command line option for +.Xr nfscbd 8 +can be used. 
+.sp +To get callbacks to work when behind a NAT gateway, a port for the callback +service will need to be set up on the NAT gateway and then the address +of the NAT gateway (host IP plus port#) will need to be set by assigning the +.Xr sysctl 8 +variable vfs.nfs.callback_addr to a string of the form: +.sp +N.N.N.N.N.N +.sp +where the first 4 Ns are the host IP address and the last two are the +port# in network byte order (all decimal #s in the range 0-255). +.Pp +To build a kernel with the client that supports +.Nm +linked into it, the option +.sp +.Bd -literal -offset indent -compact +options NFSCL +.Ed +.sp +must be specified in the kernel's +.Xr config 5 +file. +.Pp +Options can be specified for the +.Xr nfsuserd 8 +and +.Xr nfscbd 8 +daemons at boot time via the ``nfsuserd_flags'' and ``nfscbd_flags'' +.Xr rc.conf 5 +variables. +.Pp +NFSv4 mount(s) against exported volume(s) on the same host are not recommended, +since this can result in a hung NFS server. +It occurs when an nfsd thread tries to do an NFSv4 VOP_RECLAIM()/Close RPC +as part of acquiring a new vnode. +If all other nfsd threads are blocked waiting for lock(s) held by this nfsd +thread, then there isn't an nfsd thread to service the Close RPC. +.Sh FILES +.Bl -tag -width /var/db/nfs-stablerestart.bak -compact +.It Pa /var/db/nfs-stablerestart +NFS V4 stable restart file +.It Pa /var/db/nfs-stablerestart.bak +backup copy of the file +.El +.Sh SEE ALSO +.Xr stablerestart 5 , +.Xr mountd 8 , +.Xr nfscbd 8 , +.Xr nfsd 8 , +.Xr nfsdumpstate 8 , +.Xr nfsrevoke 8 , +.Xr nfsuserd 8 +.Sh BUGS +At this time, there is no recall of delegations for local file system +operations. +As such, delegations should only be enabled for file systems +that are being used solely as NFS export volumes and are not being accessed +via local system calls nor services such as Samba. 
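A minimal sketch of how the N.N.N.N.N.N string described in the CLIENT MOUNTS text above could be built from an IPv4 address and a port. The helper name format_callback_addr is hypothetical and is not part of nfsd, nfscbd, or the kernel; it only illustrates the encoding the man page describes, where the port's high byte comes first (network byte order).

```c
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (an illustration, not code from this commit):
 * format an IPv4 address and port as "N.N.N.N.N.N".
 * Returns 0 on success, -1 on a bad address or a too-small buffer.
 */
static int
format_callback_addr(const char *ip, uint16_t port, char *buf, size_t buflen)
{
	struct in_addr addr;
	const unsigned char *b;
	int n;

	/* inet_pton() stores the address in network byte order. */
	if (inet_pton(AF_INET, ip, &addr) != 1)
		return (-1);
	b = (const unsigned char *)&addr.s_addr;

	/* The port is split into its high byte, then its low byte. */
	n = snprintf(buf, buflen, "%u.%u.%u.%u.%u.%u",
	    (unsigned)b[0], (unsigned)b[1], (unsigned)b[2], (unsigned)b[3],
	    (unsigned)(port >> 8) & 0xffu, (unsigned)port & 0xffu);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (0);
}
```

For the default port 7745 the last two fields are 30 and 65, since 30 * 256 + 65 = 7745. Assuming a NAT gateway at the documentation address 203.0.113.5 that forwards that same port, the resulting string would be 203.0.113.5.30.65, which is the value to assign to vfs.nfs.callback_addr.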
diff --git a/usr.sbin/nfsd/stablerestart.5 b/usr.sbin/nfsd/stablerestart.5 new file mode 100644 index 0000000..7096053 --- /dev/null +++ b/usr.sbin/nfsd/stablerestart.5 @@ -0,0 +1,97 @@ +.\" Copyright (c) 2009 Rick Macklem, University of Guelph +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD$ +.\" +.Dd April 10, 2011 +.Dt STABLERESTART 5 +.Os +.Sh NAME +.Nm nfs-stablerestart +.Nd restart information for the +.Tn NFSv4 +server +.Sh SYNOPSIS +.Nm nfs-stablerestart +.Sh DESCRIPTION +The +.Nm +file holds information that allows the +.Tn NFSv4 +server to restart without always returning the NFSERR_NOGRACE error, as described in the +.Tn NFSv4 +server specification; see +.%T "Network File System (NFS) Version 4 Protocol RFC 3530, Section 8.6.3" . +.Pp +The first record in the file, as defined by struct nfsf_rec in +/usr/include/fs/nfs/nfsrvstate.h, holds the lease duration of the +last incarnation of the server and the number of boot times that follow. +Following this record are the previous boot times, as counted in the +first record. +The lease duration is used to set the grace period. +The boot times +are used to avoid the unlikely occurrence of a boot time being reused, +due to a TOD clock going backwards. This record and the previous boot times, with this boot time added, are rewritten at the +end of the grace period. +.Pp +The rest of the file consists of appended records, as defined by +struct nfst_rec in /usr/include/fs/nfs/nfsrvstate.h, each of which +represents one of two things. There are records which indicate that a +client successfully acquired state and records that indicate a client's state was revoked. +State revoke records indicate that state information +for a client was discarded, due to lease expiry and an otherwise +conflicting open or lock request being made by a different client. +These records can be used +to determine if clients might have done either of the +edge conditions. +.Pp +If a client might have done either edge condition or this file is +empty or corrupted, the server returns NFSERR_NOGRACE for any reclaim +request from the client. +.Pp +For correct operation of the server, it must be ensured that the file +is written to stable storage by the time a write op with IO_SYNC specified +has returned.
This might require hardware level caching to be disabled for +a local disk drive that holds the file, or similar. +.Sh FILES +.Bl -tag -width /var/db/nfs-stablerestart.bak -compact +.It Pa /var/db/nfs-stablerestart +NFSv4 stable restart file +.It Pa /var/db/nfs-stablerestart.bak +backup copy of the file +.El +.Sh SEE ALSO +.Xr nfsv4 4 , +.Xr nfsd 8 +.Sh BUGS +If the file is empty, the NFSv4 server has no choice but to return +NFSERR_NOGRACE for all reclaim requests. Although correct, this is +a highly undesirable occurrence, so the file should not be lost if +at all possible. The backup copy of the file is maintained +and used by the +.Xr nfsd 8 +to minimize the risk of this occurring. +To move the file, you must edit +the nfsd sources and recompile it. This was done to discourage +accidental relocation of the file.