Diffstat (limited to 'contrib/cvs/TODO')
-rw-r--r-- | contrib/cvs/TODO | 862 |
1 files changed, 0 insertions, 862 deletions
diff --git a/contrib/cvs/TODO b/contrib/cvs/TODO deleted file mode 100644 index 6ee6759..0000000 --- a/contrib/cvs/TODO +++ /dev/null @@ -1,862 +0,0 @@ -The "TODO" file! -*-Indented-Text-*- - -38. Think hard about using RCS state information to allow one to checkin - a new vendor release without having it be accessed until it has been - integrated into the local changes. - -39. Think about a version of "cvs update -j" which remembers what from - that other branch is already merged. This has pitfalls--it could - easily lead to invisible state which could confuse users very - rapidly--but having to create a tag or some such mechanism to keep - track of what has been merged is a pain. Take a look at PRCS 1.2. - PRCS 1.0 was particularly bad the way it handled the "invisible - state", but 1.2 is significantly better. - -52. SCCS has a feature that I would *love* to see in CVS, as it is very - useful. One may make a private copy of SCCS suid to a particular user, - so other users in the authentication list may check files in and out of - a project directory without mucking about with groups. Is there any - plan to provide a similar functionality to CVS? Our site (and, I'd - imagine, many other sites with large user bases) has decided against - having the user-groups feature of unix available to the users, due to - perceived administrative, technical and performance headaches. A tool - such as CVS with features that provide group-like functionality would - be a huge help. - -62. Consider using revision controlled files and directories to handle the - new module format -- consider a cvs command front-end to - add/delete/modify module contents, maybe. - -63. The "import" and vendor support commands (co -j) need to be documented - better. - -66. Length of the CVS temporary files must be limited to 14 characters for - System-V stupid support. As well as the length on the CVS.adm files. - -72. 
Consider re-design of the module -t options to use the file system more - intuitively. - -73. Consider an option (in .cvsrc?) to automatically add files that are new - and specified to commit. - -79. Might be nice to have some sort of interface to Sun's Translucent - (?) File System and tagged revisions. - -82. Maybe the import stuff should allow an arbitrary revision to be - specified. - -84. Improve the documentation about administration of the repository and - how to add/remove files and the use of symbolic links. - -85. Make symbolic links a valid thing to put under version control. - Perhaps use one of the tag fields in the RCS file? Note that we - can only support symlinks that are relative and within the scope of - the sources being controlled. - -93. Need to think hard about release and development environments. Think - about execsets as well. - -98. If diff3 bombs out (too many differences) cvs then thinks that the file - has been updated and is OK to be commited even though the file - has not yet been merged. - -100. Checked out files should have revision control support. Maybe. - -102. Perhaps directory modes should be propagated on all import check-ins. - Not necessarily uid/gid changes. - -103. setuid/setgid on files is suspect. - -104. cvs should recover nicely on unreadable files/directories. - -105. cvs should have administrative tools to allow for changing permissions - and modes and what not. In particular, this would make cvs a - more attractive alternative to rdist. - -107. It should be possible to specify a list of symbolic revisions to - checkout such that the list is processed in reverse order looking for - matches within the RCS file for the symbolic revision. If there is - not a match, the next symbolic rev on the list is checked, and so on, - until all symbolic revs are exhausted. This would allow one to, say, - checkout "4.0" + "4.0.3" + "4.0.3Patch1" + "4.0.3Patch2" to get the - most recent 4.x stuff. 
This is usually handled by just specifying the - right release_tag, but most people forget to do this. - -108. If someone creates a whole new directory (i.e. adds it to the cvs - repository) and you happen to have a directory in your source farm by - the same name, when you do your cvs update -d it SILENTLY does - *nothing* to that directory. At least, I think it was silent; - certainly, it did *not* abort my cvs update, as it would have if the - same thing had happened with a file instead of a directory. - -109. I had gotten pieces of the sys directory in the past but not a - complete tree. I just did something like: - - cvs get * - - Where sys was in * and got the message - - cvs get: Executing 'sys/tools/make_links sys' - sh: sys/tools/make_links: not found - - I suspect this is because I didn't have the file in question, - but I do not understand how I could fool it into getting an - error. I think a later cvs get sys seemed to work so perhaps - something is amiss in handling multiple arguments to cvs get? - -119. When importing a directory tree that is under SCCS/RCS control, - consider an option to have import checkout the SCCS/RCS files if - necessary. (This is if someone wants to import something which - is in RCS or SCCS without preserving the history, but makes sure - they do get the latest versions. It isn't clear to me how useful - that is -kingdon, June 1996). - -122. If Name_Repository fails, it currently causes CVS to die completely. It - should instead return NULL and have the caller do something reasonable - (??? -what is reasonable? I'm not sure there is a real problem here. - -kingdon, June 1996). - -123. Add a flag to import to not build vendor branches for local code. - (See `importb' tests in src/sanity.sh for more details). - -124. Anyway, I thought you might want to add something like the following - to the cvs man pages: - - BUGS - The sum of the sizes of a module key and its contents are - limited. See ndbm(3). - -126. 
Do an analysis to see if CVS is forgetting to close file descriptors. - Especially when committing many files (more than the open file limit - for the particular UNIX). - -127. Look at *info files; they should all be quiet if the files are not - there. Should be able to point at a RCS directory and go. - -130. cvs diff with no -r arguments does not need to look up the current RCS - version number since it only cares about what's in the Entries file. - This should make it much faster. - - It should ParseEntries itself and access the entries list much like - Version_TS does (sticky tags and sticky options may need to be - supported here as well). Then it should only diff the things that - have the wrong time stamp (the ones that look modified). - -134. Make a statement about using hard NFS mounts to your source - repository. Look into checking NULL fgets() returns with ferror() to - see if an error had occurred. (we should be checking for errors, quite - aside from NFS issues -kingdon, June 1996). - -137. Some sites might want CVS to fsync() the RCS ,v file to protect - against nasty hardware errors. There is a slight performance hit with - doing so, though, so it should be configurable in the .cvsrc file. - Also, along with this, we should look at the places where CVS itself - could be a little more synchronous so as not to lose data. - [[ I've done some of this, but it could use much more ]] - -138. Some people have suggested that CVS use a VPATH-like environment - variable to limit the amount of sources that need to be duplicated for - sites with giant source trees and no disk space. - -141. Import should accept modules as its directory argument. If we're - going to implement this, we should think hard about how modules - might be expanded and how to handle those cases. - -143. Update the documentation to show that the source repository is - something far away from the files that you work on. 
(People who - come from an RCS background are used to their `repository' being - _very_ close to their working directory.) - -144. Have cvs checkout look for the environment variable CVSPREFIX - (or CVSMODPREFIX or some such). If it's set, then when looking - up an alias in the modules database, first look it up with the - value of CVSPREFIX attached, and then look for the alias itself. - This would be useful when you have several projects in a single - repository. You could have aliases abc_src and xyz_src and - tell people working on project abc to put "setenv CVSPREFIX abc_" - in their .cshrc file (or equivalent for other shells). - Then they could do "cvs co src" to get a copy of their src - directory, not xyz's. (This should create a directory called - src, not abc_src.) - -145. After you create revision 1.1.1.1 in the previous scenario, if - you do "cvs update -r1 filename" you get revision 1.1, not - 1.1.1.1. It would be nice to get the later revision. Again, - this restriction comes from RCS and is probably hard to - change in CVS. Sigh. - - |"cvs update -r1 filename" does not tell RCS to follow any branches. CVS - |tries to be consistent with RCS in this fashion, so I would not change - |this. Within CVS we do have the flexibility of extending things, like - |making a revision of the form "-r1HEAD" find the most recent revision - |(branch or not) with a "1." prefix in the RCS file. This would get what - |you want maybe. - - This would be very useful. Though I would prefer an option - such as "-v1" rather than "-r1HEAD". This option might be - used quite often. - -146. The merging of files should be controlled via a hook so that programs - other than "rcsmerge" can be used, like Sun's filemerge or emacs's - emerge.el. (but be careful in making this work client/server--it means - doing the interactive merging at the end after the server is done). 
- (probably best is to have CVS do the non-interactive part and - tell the user about where the files are (.#foo.c.working and - .#foo.c.1.5 or whatever), so they can do the interactive part at - that point -kingdon, June 1996). - -149. Maybe there should be an option to cvs admin that allows a user to - change the Repository/Root file with some degree of error checking? - Something like "cvs admin reposmv /old/path /new/pretty/path". Before - it does the replace it check to see that the files - /new/pretty/path/<dir>/<files> exist. - - The obvious cases are where one moves the repository to another - machine or directory. But there are other cases, like where the - user might want to change from :pserver: to :ext:, use a different - server (if there are two server machines which share the - repository using a networked file system), etc. - - The status quo is a bit of a mess (as of, say, CVS 1.9). It is - that the -d global option has two moderately different uses. One - is to use a totally different repository (in which case we'd - probably want to give an error if it disagreed with CVS/Root, as - CVS 1.8 and earlier did). The other is the "reposmv" - functionality above (in which the two repositories really are the - same, and we want to update the CVS/Root files). In CVS 1.9 and - 1.10, -d rewrites the CVS/Root file (but not in subdirectories). - This behavior was not particularly popular and has been since - reverted. - - This whole area is a rather bad pile of individual decisions which - accumulated over time, some of them probably bad decisions with - hindsight. But we didn't get into this mess overnight, and we're - not going to get out of it overnight (that is, we need to come up - with a replacement behavior, document what parts of the status - quo are deprecated, probably circulate some unofficial patches, &c). - - (this item originally added 2 Feb 1992 but revised since). - -150. 
I have a customer request for a way to specify log message per - file, non-interactively before the commit, such that a single, fully - recursive commit prompts for one commit message, and concatenates the - per file messages for each file. In short, one commit, one editor - session, log messages allowed to vary across files within the commit. - Also, the per file messages should be allowed to be written when the - files are changed, which may predate the commit considerably. - - A new command seems appropriate for this. The state can be saved in the - CVS directory. I.e., - - % cvs message foo.c - Enter log message for foo.c - >> fixed an uninitialized variable - >> ^D - - The text is saved as CVS/foo.c,m (or some such name) and commit - is modified to append (prepend?) the text (if found) to the log - message specified at commit time. Easy enough. (having cvs - commit be non-interactive takes care of various issues like - whether to connect to the server before or after prompting for a - message (see comment in commit.c at call to start_server). Also - would clean up the kludge for what to do with the message from - do_editor if the up-to-date check fails (see commit.c client code). - - I'm not sure about the part above about having commit prompt - for an overall message--part of the point is having commit - non-interactive and somehow combining messages seems like (excess?) - hair. - - Would be nice to do this so it allows users more flexibility in - specifying messages per-directory ("cvs message -l") or per-tree - ("cvs message") or per-file ("cvs message foo.c"), and fixes the - incompatibility between client/server (per-tree) and - non-client/server (per-directory). - - A few interesting issues with this: (1) if you do a cvs update or - some other operation which changes the working directory, do you - need to run "cvs message" again (it would, of course, bring up - the old message which you could accept)? 
Probably yes, after all - merging in some conflicts might change the situation. (2) How do - you change the stored messages if you change your mind before the - commit (probably run "cvs message" again, as hinted in (1))? - -151. Also, is there a flag I am missing that allows replacing Ulrtx_Build - by Ultrix_build? I.E. I would like a tag replacement to be a one step - operation rather than a two step "cvs rtag -r Ulrtx_Build Ultrix_Build" - followed by "cvs rtag -d Ulrtx_Build" - -152. The "cvs -n" option does not work as one would expect for all the - commands. In particular, for "commit" and "import", where one would - also like to see what it would do, without actually doing anything. - -153. There should be some command (maybe I just haven't figured out - which one...) to import a source directory which is already - RCS-administered without losing all prior RCS gathered data. - Thus, it would have to examine the RCS files and choose a - starting version and branch higher than previous ones used. - (Check out rcs-to-cvs and see if it addresses this issue.) - -154. When committing the modules file, a pre-commit check should be done to - verify the validity of the new modules file before allowing it to be - committed. - -155. The options for "cvs history" are mutually exclusive, even though - useful queries can be done if they are not, as in specifying both - a module and a tag. A workaround is to specify the module, then - run the output through grep to only display lines that begin with - T, which are tag lines. (Better perhaps if we redesign the whole - "history" business -- check out doc/cvs.texinfo for the entire - rant.) - -156. Also, how hard would it be to allow continuation lines in the - {commit,rcs,log}info files? It would probably be useful with all of - the various flags that are now available, or if somebody has a lot of - files to put into a module. - -158. If I do a recursive commit and find that the same RCS file is checked - out (and modified!) 
in two different places within my checked-out - files (but within the realm of a single "commit"), CVS will commit the - first change, then overwrite that change with the second change. We - should catch this (typically unusual) case and issue an appropriate - diagnostic and die. - -160. The checks that the commit command does should be extended to make - sure that the revision that we will lock is not already locked by - someone else. Maybe it should also lock the new revision if the old - revision was already locked by the user as well, thus moving the lock - forward after the commit. - -163. The rtag/tag commands should have an option that removes the specified - tag from any file that is in the attic. This allows one to re-use a - tag (like "Mon", "Tue", ...) all the time and still have it tag the - real main-line code. - -165. The "import" command will create RCS files automatically, but will - screw-up when trying to create long file names on short file name - file systems. Perhaps import should be a bit more cautious. - -166. There really needs to be a "Getting Started" document which describes - some of the new CVS philosophies. Folks coming straight from SCCS or - RCS might be confused by "cvs import". Also need to explain: - - How one might setup their $CVSROOT - - What all the tags mean in an "import" command - - Tags are important; revision numbers are not - -170. Is there an "info" file that can be invoked when a file is checked out, or - updated ? What I want to do is to advise users, particularly novices, of - the state of their working source whenever they check something out, as - a sanity check. - - For example, I've written a perl script which tells you what branch you're - on, if any. Hopefully this will help guard against mistaken checkins to - the trunk, or to the wrong branch. I suppose I can do this in - "commitinfo", but it'd be nice to advise people before they edit their - files. 
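The "which branch am I on" check described in item 170 can be sketched in a few lines. This is only an illustration, not the perl script the item mentions: it assumes the per-directory sticky tag lives in CVS/Tag, with a leading `T` for a branch tag, `N` for a non-branch tag and `D` for a date, as the CVS manual describes for working-directory files. The helper name `describe_sticky_tag` is hypothetical, and per-file sticky tags recorded in CVS/Entries are not examined.

```python
import os
import tempfile

# First character of a CVS/Tag entry and what it denotes (assumed from the
# CVS manual's description of the working-directory administrative files).
TAG_KINDS = {"T": "branch", "N": "non-branch tag", "D": "date"}


def describe_sticky_tag(workdir):
    """Describe the per-directory sticky tag of workdir (hypothetical helper)."""
    tag_path = os.path.join(workdir, "CVS", "Tag")
    try:
        with open(tag_path) as f:
            line = f.readline().strip()
    except FileNotFoundError:
        line = ""
    if not line:
        # No CVS/Tag file (or an empty one): nothing sticky at directory level.
        return "no per-directory sticky tag (probably the trunk)"
    kind = TAG_KINDS.get(line[0])
    if kind is None:
        return "unrecognized CVS/Tag entry (ignored)"
    return f"sticky {kind}: {line[1:]}"


if __name__ == "__main__":
    # Build a throwaway working directory so the sketch runs without CVS.
    with tempfile.TemporaryDirectory() as wd:
        os.mkdir(os.path.join(wd, "CVS"))
        with open(os.path.join(wd, "CVS", "Tag"), "w") as f:
            f.write("Trel-1-branch\n")
        print(describe_sticky_tag(wd))
```

A wrapper script could print this line after each checkout or update; the item's actual request, a hook that CVS itself invokes at that point, is not something this sketch provides.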
- - It would also be nice if there was some sort of "verboseness" switch to - the checkout and update commands that could turn this invocation of the - script off, for mature users. - -173. Need generic date-on-branch handling. Currently, many commands - allow both -r and -D, but that's problematic for commands like diff - that interpret that as two revisions rather than a single revision. - Checkout and update -j takes tag:date which is probably a better - solution overall. - -174. I would like to see "cvs release" modified so that it only removes files - which are known to CVS - all the files in the repository, plus those which - are listed in .cvsignore. This way, if you do leave something valuable in - a source tree you can "cvs release -d" the tree and your non-CVS goodies - are still there. If a user is going to leave non-CVS files in their source - trees, they really should have to clean them up by hand. - -175. And, in the feature request department, I'd dearly love a command-line - interface to adding a new module to the CVSROOT/modules file. - -176. If you use the -i flag in the modules file, you can control access - to source code; this is a Good Thing under certain circumstances. I - just had a nasty thought, and on experiment discovered that the - filter specified by -i is _not_ run before a cvs admin command; as - this allows a user to go behind cvs's back and delete information - (cvs admin -o1.4 file) this seems like a serious problem. - -177. We've got some external vendor source that sits under a source code - hierarchy, and when we do a cvs update, it gets wiped out because - its tag is different from the "main" distribution. I've tried to - use "-I" to ignore the directory, as well as .cvsignore, but this - doesn't work. - -179. "cvs admin" does not log its actions with loginfo, nor does it check - whether the action is allowed with commitinfo. It should. - -180. 
"cvs edit" should show you who is already editing the files, - probably (that is, do "cvs editors" before executing, or some - similar result). (But watch out for what happens if the network - is down!). - -182. There should be a way to show log entries corresponding to -changes from tag "foo" to tag "bar". "cvs log -rfoo:bar" doesn't cut -it, because it erroneously shows the changes associated with the -change from the revision before foo to foo. I'm not sure that is ever -a useful or logical behavior ("cvs diff -r foo -r bar" gets this -right), but is compatibility an issue? See -http://www.cyclic.com/cvs/unoff-log.txt for an unofficial patch. - -183. "cvs status" should report on Entries.Static flag and CVS/Tag (how? -maybe a "cvs status -d" to give directory status?). There should also -be more documentation of how these get set and how/when to re-set them. - -184. Would be nice to implement the FreeBSD MD5-based password hash -algorithm in pserver. For more info see "6.1. DES, MD5, and Crypt" in -the FreeBSD Handbook, and src/lib/libcrypt/crypt.c in the FreeBSD -sources. Certainly in the context of non-unix servers this algorithm -makes more sense than the traditional unix crypt() algorithm, which -suffers from export control problems. - -185. A frequent complaint is that keyword expansion causes conflicts -when merging from one branch to another. The first step is -documenting CVS's existing features in this area--what happens with -various -k options in various places? The second step is thinking -about whether there should be some new feature and if so how it should -be designed. For example, here is one thought: - - rcs' co command needs a new -k option. The new option should expand - $Log entries without expanding $Revision entries. This would - allow cvs to use rcsmerge in such a way that joining branches into - main lines would neither generate extra collisions on revisions nor - drop log lines. 
- -The details of this are out of date (CVS no longer invokes "co", and -any changes in this area would be done by bypassing RCS rather than -modifying it), but even as to the general idea, I don't have a clear -idea about whether it would be good (see what I mean about the need -for better documentation? I work on CVS full-time, and even I don't -understand the state of the art on this subject). - -186. There is a frequent discussion of multisite features. - -* There may be some overlap with the client/server CVS, which is good -especially when there is a single developer at each location. But by -"multisite" I mean something in which each site is more autonomous, to -one extent or another. - -* Vendor branches are the closest thing that CVS currently has for -multisite features. They have fixable drawbacks (such as poor -handling of added and removed files), and more fundamental drawbacks -(when you import a vendor branch, you are importing a set of files, -not importing any knowledge of their version history outside the -current repository). - -* One approach would be to require checkins (or other modifications to -the repository) to succeed at a write quorum of sites (51%) before -they are allowed to complete. To work well, the network should be -reliable enough that one can typically get to that many sites. When a -server which has been out of touch reconnects, it would want to update -its data before doing anything else. Any of the servers can service -all requests locally, except perhaps for a check that they are -up-to-date. The way this differs from a run-of-the-mill distributed -database is that if one only allows reversible operations via this -mechanism (exclude "cvs admin -o", "cvs tag -d", &c), then each site -can back up the others, such that failures at one site, including -something like deleting all the sources, can be recovered from. 
Thus -the sites need not trust each other as much as for many shared -databases, and the system may be resilient to many types of -organizational failures. Sometimes I call this design the -"CVScluster" design. - -* Another approach is a master/slave one. Checkins happen at the -master site, and slave sites need to check whether their local -repository is up to date before relying on its information. - -* Another approach is to have each site own a particular branch. This -one is the most tolerant of flaky networks; if checkins happen at each -site independently there is no particular problem. The big question -is whether merges happen only manually, as with existing CVS branches, -or whether there is a feature whereby there are circumstances in which -merges from one branch to the other happen automatically (for example, -the case in which the branches have not diverged). This might be a -legitimate question to ask even quite aside from multisite features. - -187. Might want to separate out usage error messages and help -messages. The problem now is that if you specify an invalid option, -for example, the error message is lost among all the help text. In -the new regime, the error message would be followed by a one-line -message directing people to the appropriate help option ("cvs -H -<command>" or "cvs --help-commands" or whatever, according to the -situation). I'm not sure whether this change would be controversial -(as defined in HACKING), so there might be a need for further -discussion or other actions other than just coding. - -188. Option parsing and .cvsrc has at least one notable limitation. -If you want to set a global option only for some CVS commands, there -is no way to do it (for example, if one wants to set -q only for -"rdiff"). 
I am told that the "popt" package from RPM -(http://www.rpm.org) could solve this and other problems (for example, -if the syntax of option stuff in .cvsrc is similar to RPM, that would -be great from a user point of view). It would at least be worth a -look (it also provides a cleaner API than getopt_long). - -Another issue which may or may not be related is the issue of -overriding .cvsrc from the command line. The cleanest solution might -be to have options in mutually exclusive sets (-l/-R being a current -example, but --foo/--no-foo is a better way to name such options). Or -perhaps there is some better solution. - -189. Renaming files and directories is a frequently discussed topic. - -Some of the problems with the status quo: - -a. "cvs annotate" cannot operate on both the old and new files in a -single run. You need to run it twice, once for the new name and once -for the old name. - -b. "cvs diff" (or "cvs diff -N") shows a rename as a removal of the -old file and an addition of the new one. Some people would like to -see the differences between the file contents (but then how would we -indicate the fact that the file has been renamed? Certainly the -notion that "patch(1)" has of renames is as a removal and addition). - -c. "cvs log" should be able to show the changes between two -tags/dates, even in the presence of adds/removes/renames (I'm not sure -what the status quo is on this; see also item #182). - -d. Renaming directories is way too hard. - -Implementations: - -It is perhaps premature to try to design implementation details -without answering some of the above questions about desired behaviors -but several general implementations get mentioned. - -i. No fundamental changes (for example, a "cvs rename" command which -operated on directories could still implement the current recommended -practice for renaming directories, which is to rename each of the -files contained therein via an add and a remove). 
One thing to note -that the status quo gets right is proper merges, even with adds and -removals (Well, mostly right at least. There are a *LOT* of different -cases; see the testsuite for some of them). - -ii. Rename database. In this scheme the files in the repository -would have some arbitrary name, and then a separate rename database -would indicate the current correspondence between the filename in the -working directory and the actual storage. As far as I know this has -never been designed in detail for CVS. - -iii. A modest change in which the RCS files would contain some -information such as "renamed from X" or "renamed to Y". That is, this -would be generally similar to the log messages which are suggested -when one renames via an add and a removal, but would be -computer-parseable. I don't think anyone has tried to flesh out any -details here either. - -It is interesting to note that in solution ii. version numbers in the -"new file" start where the "old file" left off, while in solutions -i. and iii., version numbers restart from 1.1 each time a file is -renamed. Except perhaps in the case where we rename a file from foo -to bar and then back to foo. I'll shut up now. - -Regardless of the method we choose, we need to address how renames -affect existing CVS behaviors. For example, what happens when you -rename a file on a branch but not the trunk and then try to merge the -two? What happens when you rename a file on one branch and delete it -on another and try to merge the two? - -Ideally, we'd come up with a way to parameterize the problem and -simply write up a lookup table to determine the correct behavior. - -190. The meaning of the -q and -Q global options is very ad hoc; -there is no clear definition of which messages are suppressed by them -and which are not. Here is a classification of the current meanings -of -q; I don't know whether anyone has done a similar investigation of --Q: - - a. 
The "warm fuzzies" printed upon entering each directory (for - example, "cvs update: Updating sdir"). The need for these messages - may be decreased now that most of CVS uses ->fullname instead of - ->file in messages (a project which is *still* not 100% complete, - alas). However, the issue of whether CVS can offer status as it - runs is an important one. Of course from the command line it is - hard to do this well and one ends up with options like -q. But - think about emacs, jCVS, or other environments which could flash you - the latest status line so you can see whether the system is working - or stuck. - - b. Other cases where the message just offers information (rather - than an error) and might be considered unnecessarily verbose. These - have a certain point to them, although it isn't really clear whether - it should be the same option as the warm fuzzies or whether it is - worth the conceptual hair: - - add.c: scheduling %s `%s' for addition (may be an issue) - modules.c: %s %s: Executing '%s' (I can see how that might be noise, - but...) - remove.c: scheduling `%s' for removal (analogous to the add.c one) - update.c: Checking out %s (hmm, that message is a bit on the noisy side...) - (but the similar message in annotate is not affected by -q). - - c. Suppressing various error messages. This is almost surely - bogus. - - commit.c: failed to remove tag `%s' from `%s' (Questionable. - Rationale might be that we already printed another message - elsewhere but why would it be necessary to avoid - the extra message in such an uncommon case?) - commit.c: failed to commit dead revision for `%s' (likewise) - remove.c: file `%s' still in working directory (see below about rm - -f analogy) - remove.c: nothing known about `%s' (looks dubious to me, especially in - the case where the user specified it explicitly). - remove.c: removed `%s' (seems like an obscure enough case that I fail - to see the appeal of being cryptically concise here). 
- remove.c: file `%s' already scheduled for removal (now it is starting - to look analogous to the infamous rm -f option). - rtag.c: cannot find tag `%s' in `%s' (more rm -f like behavior) - rtag.c: failed to remove tag `%s' from `%s' (ditto) - tag.c: failed to remove tag %s from %s (see above about whether RCS_* - has already printed an error message). - tag.c: couldn't tag added but un-commited file `%s' (more rm -f - like behavior) - tag.c: skipping removed but un-commited file `%s' (ditto) - tag.c: cannot find revision control file for `%s' (ditto, but at first - glance seems even worse, as this would seem to be a "can't happen" - condition) - -191. Storing RCS files, especially binary files, takes rather more -space than it could, typically. - - The virtue of the status quo is that it is simple to implement. - Of course it is also simplest in terms of dealing with compatibility. - - Just storing the revisions as separate gzipped files is a common - technique. It also is pretty simple (no new algorithms, CVS - already has zlib around). Of course for some files (such as files - which are already compressed) the gzip step won't help, but - something which can at least sometimes avoid rewriting the entire - RCS file for each new revision would, I would think, be a big - speedup for large files. - - Josh MacDonald has written a tool called xdelta which produces - differences (that is, sufficient information to transform the old - to the new) which looks for common sequences of bytes, like RCS - currently does, but which is not based on lines. This seems to do - quite well for some kinds of files (e.g. FrameMaker documents, - text files), and not as well for others (anything which is already - compressed, executables). xdelta 1.10 also is faster than GNU diff. 
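To make the idea of a difference that is "not based on lines" concrete, here is a minimal sketch of a byte-oriented delta. It is not xdelta's algorithm; it substitutes Python's standard `difflib.SequenceMatcher` to find common byte runs, and the function names `make_delta` and `apply_delta` are hypothetical. The matcher is slow on large inputs, so this shows only the shape of the data, not a practical storage format.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Encode `new` as copy/insert instructions against `old` (illustrative only)."""
    delta = []
    # autojunk=False keeps the matcher from discarding frequently seen bytes.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # Bytes shared with the old revision: store only the offsets.
            delta.append(("copy", i1, i2))
        elif op in ("replace", "insert"):
            # Bytes that exist only in the new revision: store them literally.
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no instruction; the old bytes are simply not copied.
    return delta


def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new revision from the old one plus the delta."""
    out = bytearray()
    for instr in delta:
        if instr[0] == "copy":
            _, start, end = instr
            out += old[start:end]
        else:
            out += instr[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"\x00\x01HDR" + b"payload-block-" * 4 + b"tail"
    new = b"\x00\x02HDR" + b"payload-block-" * 4 + b"new-tail"
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(delta)
```

Because the instructions refer to byte offsets rather than line numbers, a small edit inside a binary file with no newlines still yields mostly "copy" instructions, which is the property the paragraph above attributes to byte-based differencing.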
- - Karl Fogel has thought some about using a difference technique - analogous to fractal compression (see the comp.compression FAQ for - more on fractal compression, including at least one patent to - watch for; I don't know how analogous Karl's ideas are to the - techniques described there). - - Quite possibly want some documented interface by which a site can - plug in their choice of external difference programs (with the - ability to choose the program based on filename, magic numbers, - or some such). - -192. "cvs update" using an absolute pathname does not work if the -working directory is not a CVS-controlled directory with the correct -CVSROOT. For example, the following will fail: - - cd /tmp - cvs -d /repos co foo - cd / - cvs update /tmp/foo - -It is possible to read the CVSROOT from the administrative files in -the directory specified by the absolute pathname argument to update. -In that case, the last command above would be equivalent to: - - cd /tmp/foo - cvs update . - -This can be problematic, however, if we ask CVS to update two -directories with different CVSROOTs. Currently, CVS has no way of -changing CVSROOT mid-stream. Consider the following: - - cd /tmp - cvs -d /repos1 co foo - cvs -d /repos2 co bar - cd / - cvs update /tmp/foo /tmp/bar - -To make that example work, we need to think hard about: - - - where and when CVSROOT-related variables get set - - who caches said variables for later use - - how the remote protocol should be extended to handle sending a new - repository mid-stream - - how the client should maintain connections to a variety of servers - in a single invocation. - -Because those issues are hairy, I suspect that having a change in -CVSROOT be an error would be a better move. - -193. The client relies on timestamps to figure out whether a file is -(maybe) modified. If something goes awry, then it ends up sending -entire files to the server to be checked, and this can be quite slow -especially over a slow network. 
A couple of things that can happen: -(a) other programs, like make, use timestamps, so one ends up needing -to do "touch foo" and otherwise messing with timestamps, (b) changing -the timezone offset (e.g. summer vs. winter or moving a machine) -should work on unix, but there may be problems with non-unix. - -Possible solutions: - - a. Store a checksum for each file in CVS/Entries or some such - place. What to do about hash collisions is interesting: using a - checksum, like MD5, large enough to "never" have collisions - probably works in practice (of course, if there is a collision then - all hell breaks loose because that code path was not tested, but - given the tiny, tiny probability of that I suppose this is only an - aesthetic issue). - - b. I'm not thinking of others, except storing the whole file in - CVS/Base, and I'm sure using twice the disk space would be - unpopular. - -194. CVS does not separate the "metadata" from the actual revision -history; it stores them both in the RCS files. Metadata means tags -and header information such as the number of the head revision. -Storing the metadata separately could speed up "cvs tag" enormously, -which is a big problem for large repositories. It could also probably -make CVS's locking much less in the way (see comment in do_recursion -about "two-pass design"). - -195. Many people using CVS over a slow link are interested in whether -the remote protocol could be any more efficient with network -bandwidth. This item is about one aspect of that--how the server -sends a new version of a file the client has a different version of, -or vice versa. - -a. Cases in which the status quo already sends a diff. For most text -files, this is probably already close to optimal. For binary files, -and anomalous (?) text files (e.g. those in which it would help to do -moves, as well as adds and deletes), it might be worth looking into other -difference algorithms (see item #191). - -b. 
Cases in which the status quo does not send a diff (e.g. "cvs -commit"). - -b1. With some frequency, people suggest rsync or a similar algorithm -(see ftp://samba.anu.edu.au/pub/rsync/). This could speed things up, -and in some ways involves the most minimal changes to the default CVS -paradigm. There are some downsides though: (1) there is an extra -network turnaround, (2) the algorithm needs to transmit some data to -discover what difference type programs can discover locally (although -this is only about 1% of the size of the files). - -b2. If one is willing to require that users use "cvs edit" before -editing a file on the client side (in some cases, a development -environment like emacs can make this fairly easy), then the Modified -request in the protocol could be extended to allow the client to just -send differences instead of entire files. In the degenerate case -(e.g. "cvs diff" without arguments) the required network traffic is -reduced to zero, and the client need not even contact the server. - -197. Analyze the difference between CVS_UNLINK & unlink_file. As far as I -can tell, unlink_file aborts in noexec mode and CVS_UNLINK does not. I'm not -sure it would be possible to remove even the use of temp files in noexec mode, -but most unlinks should probably be using unlink_file and not CVS_UNLINK. - -198. Remove references to deprecated cvs_temp_name function. - -199. Add test for login & logout functionality, including support for -backwards compatibility with old CVSROOTs. - -200. Make a 'cvs add' without write access a non-fatal error so that -the user's Entries file is updated and future 'cvs diffs' will work -properly. This should ease patch submission. - -201. cvs_temp_file should be creating temporary files in a privately owned -subdirectory of temp due to security issues on some systems. - -202. Enable rdiff to accept most diff options. Make rdiff output look -like diff's.
Make CVS diff garbage go to stderr and only standard diff -output go to stdout. - -203. Add val-tags additions to the tagging code. Don't remove the -update additions since val-tags could still be used as a cache when the -repository was imported from elsewhere (the tags weren't applied with a -version which wrote val-tags). - -204. Add test case for compression. A buf_shutdown error using compression -wasn't caught by the test suite. - -205. There are lots of cases where trailing slashes on directory names -and other non-canonical paths confuse CVS. Most of the cases that do -work are handled on an ad-hoc basis. We need to come up with a coherent -strategy to address path canonicalization and apply it consistently. - -208. Merge enhancements to the diff package back into the original GNU source. - -209. Go through this file and try to: - - a. Verify that items are still valid. - - b. Create test cases for valid items when they don't exist. - - c. Remove fixed and no longer applicable items. - -210. Explain to sanity.sh how to deal with paths with spaces and other odd -characters in them. - -211. Make sanity.sh run under the Win32 bash (cygwin) and maybe other Windex -environments (e.g. DGSS or whatever the MSVC portability environment is called). - -212. Autotestify (see autoconf source) sanity.sh. - -213. Examine desirability of updating the regex library (regex.{c,h}) to the -more recent versions that come with glibc and emacs. It might be worth waiting -for the emacs folks to get their act together and merge their changes into the -glibc version. - -214. Make options.h options configure script options instead. - -215. Add reditors and rwatchers commands. - - - Is an r* command abstraction layer possible here for the commands - where this makes sense? Would it be simpler? It seems to me the - major operational differences lie in the file list construction. - -218. Fix "checkout -d ." in client/server mode. - -221. Handle spaces in file/directory names.
(Most, if not all, of the -internal infrastructure already handles them correctly, but most of the -administrative file interfaces do not.) - -223. Internationalization support. This probably means using some kind -of universal character set (ISO 10646?) internally and converting on -input and output, which opens the locale can of worms. - -224. Better timezone handling. Many people would like to see times -output in local time rather than UTC, but that's tricky since the -conversion from internal form is currently done by the server who has no -idea what the user's timezone even is, let alone the rules for -converting to it. - - - On the contrary, I think the MT server response should be easily adaptable -for this purpose. It is defined in cvsclient.texi as processed by the client -if it knows how and printed to stdout otherwise. A "time" tag or the like -could be the usual CVS server UTC time string. An old client could just print -the time in UTC and a new client would know that it could convert the time to a -local time string according to the localization settings before printing it. - -225. Add support for --allow-root to server command. - -227. 'cvs release' should use the CVS/Root in the directory being released -when such is specified rather than $CVSROOT. In my work directory with no CVS -dir, a release of subdirectories causes the released projects to be tested -against my $CVSROOT environment variable, which often isn't correct but which -can complete without generating error messages if the project also exists in -the other CVSROOT. This happens a lot with my copies of the ccvs project. - -228. Consider adding -d to commit ala ci. - -229. Improve the locking code to use a random delay with exponential -backoff ala Ethernet and separate the notification interval from the -wait interval. - -230. Support for options like compression as part of the CVSROOT might be -nice. This should be fairly easy to implement now using the method options. - -234. 
Noop commands should be logged in the history file. Information can -still be obtained with noop commands, for instance via `cvs -n up -p', and -paranoid admins might appreciate this. Similarly, perhaps diff operations -should be logged.
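
Item 229 above asks for a lock wait that uses a random delay with exponential backoff and keeps the notification interval separate from the wait interval. The following is only a rough sketch of that idea, not CVS code: the names backoff_delays, wait_for_lock, try_lock, notify and notify_every are all hypothetical, and the base, cap and attempt counts are arbitrary assumptions chosen for illustration.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=6, rng=None):
    """Yield one randomized wait (in seconds) per retry attempt.

    Hypothetical helper: base, cap and attempts are illustrative values.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The window doubles on every attempt but never exceeds `cap`.
        window = min(cap, base * (2 ** attempt))
        # Pick a random point inside the window so that competing
        # processes do not all retry at the same moment.
        yield rng.uniform(0, window)


def wait_for_lock(try_lock, notify, sleep, notify_every=30.0, **kwargs):
    """Retry try_lock() with backoff; notify() runs on its own interval.

    try_lock, notify and sleep are caller-supplied callables (assumptions).
    """
    waited = 0.0
    next_notice = notify_every
    for delay in backoff_delays(**kwargs):
        if try_lock():
            return True
        sleep(delay)
        waited += delay
        # Notification is driven by total time waited, not by the
        # (variable) length of the individual waits.
        if waited >= next_notice:
            notify(waited)
            next_notice = waited + notify_every
    # One last attempt after the final wait.
    return try_lock()
```

A caller would pass time.sleep as sleep and a function that prints the usual "waiting for lock" message as notify. Passing these in as parameters is a choice made here so the sketch can be exercised without real waiting; it says nothing about how the CVS locking code is organized.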