Diffstat (limited to 'gnu/lib/libregex/ChangeLog')
-rw-r--r-- | gnu/lib/libregex/ChangeLog | 3030 |
1 files changed, 0 insertions, 3030 deletions
diff --git a/gnu/lib/libregex/ChangeLog b/gnu/lib/libregex/ChangeLog
deleted file mode 100644
index ef919d2..0000000
--- a/gnu/lib/libregex/ChangeLog
+++ /dev/null
@@ -1,3030 +0,0 @@
-Fri Apr 2 17:31:59 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * Released version 0.12.
-
- * regex.c (regerror): If errcode is zero, that's not a valid
- error code, according to POSIX, but return "Success."
-
- * regex.c (regerror): Remember to actually fetch the message
- from re_error_msg.
-
- * regex.c (regex_compile): Don't use the trick for ".*\n" on
- ".+\n". Since the latter involves laying an extra choice
- point, the backward jump isn't adjusted properly.
-
-Thu Mar 25 21:35:18 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * regex.c (regex_compile): In the handle_open and handle_close
- sections, clear pending_exact to zero.
-
-Tue Mar 9 12:03:07 1993 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
-
- * regex.c (re_search_2): In the loop which searches forward
- using fastmap, don't forget to cast the character from the
- string to an unsigned before using it as an index into the
- translate map.
-
-Thu Jan 14 15:41:46 1993 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)
-
- * regex.h: Never define const; let the callers do it.
- configure.in: Don't define USING_AUTOCONF.
-
-Wed Jan 6 20:49:29 1993 Jim Blandy (jimb@geech.gnu.ai.mit.edu)
-
- * regex.c (regerror): Abort if ERRCODE is out of range.
-
-Sun Dec 20 16:19:10 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * configure.in: Arrange to #define USING_AUTOCONF.
- * regex.h: If USING_AUTOCONF is #defined, don't mess with
- `const' at all; autoconf has taken care of it.
-
-Mon Dec 14 21:40:39 1992 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)
-
- * regex.h (RE_SYNTAX_AWK): Fix typo. From Arnold Robbins.
-
-Sun Dec 13 20:35:39 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * regex.c (compile_range): Fetch the range start and end by
- casting the pattern pointer to an `unsigned char *' before
- fetching through it.
-
-Sat Dec 12 09:41:01 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * regex.c: Undo change of 12/7/92; it's better for Emacs to
- #define HAVE_CONFIG_H.
-
-Fri Dec 11 22:00:34 1992 Jim Meyering (meyering@hal.gnu.ai.mit.edu)
-
- * regex.c: Define and use isascii-protected ctype.h macros.
-
-Fri Dec 11 05:10:38 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
-
- * regex.c (re_match_2): Undo Karl's November 10th change; it
- keeps the group in :\(.*\) from matching :/ properly.
-
-Mon Dec 7 19:44:56 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
-
- * regex.c: #include config.h if either HAVE_CONFIG_H or emacs
- is #defined.
-
-Tue Dec 1 13:33:17 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
-
- * regex.c [HAVE_CONFIG_H]: Include config.h.
-
-Wed Nov 25 23:46:02 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
-
- * regex.c (regcomp): Add parens around bitwise & for clarity.
- Initialize preg->allocated to prevent segv.
-
-Tue Nov 24 09:22:29 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
-
- * regex.c: Use HAVE_STRING_H, not USG.
- * configure.in: Check for string.h, not USG.
-
-Fri Nov 20 06:33:24 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.c (SIGN_EXTEND_CHAR) [VMS]: Back out of this change,
- since Roland Roberts now says it was a localism.
-
-Mon Nov 16 07:01:36 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.h (const) [!HAVE_CONST]: Test another cpp symbol (from
- Autoconf) before zapping const.
-
-Sun Nov 15 05:36:42 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
-
- * regex.c, regex.h: Changes for VMS from Roland B Roberts
- <roberts@nsrl31.nsrl.rochester.edu>.
-
-Thu Nov 12 11:31:15 1992 Karl Berry (karl@cs.umb.edu)
-
- * Makefile.in (distfiles): Include INSTALL.
-
-Tue Nov 10 09:29:23 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.c (re_match_2): At maybe_pop_jump, if at end of string
- and pattern, just quit the matching loop.
-
- * regex.c (LETTER_P): Rename to `WORDCHAR_P'.
-
- * regex.c (AT_STRINGS_{BEG,END}): Take `d' as an arg; change
- callers.
-
- * regex.c (re_match_2) [!emacs]: In wordchar and notwordchar
- cases, advance d.
-
-Wed Nov 4 15:43:58 1992 Karl Berry (karl@hal.gnu.ai.mit.edu)
-
- * regex.h (const) [!__STDC__]: Don't define if it's already defined.
-
-Sat Oct 17 19:28:19 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.c (bcmp, bcopy, bzero): Only #define if they are not
- already #defined.
-
- * configure.in: Use AC_CONST.
-
-Thu Oct 15 08:39:06 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.h (const) [!const]: Conditionalize.
-
-Fri Oct 2 13:31:42 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.h (RE_SYNTAX_ED): New definition.
-
-Sun Sep 20 12:53:39 1992 Karl Berry (karl@cs.umb.edu)
-
- * regex.[ch]: remove traces of `longest_p' -- dumb idea to put
- this into the pattern buffer, as it means parallelism loses.
-
- * Makefile.in (config.status): use sh to run configure --no-create.
-
- * Makefile.in (realclean): OK, don't remove configure.
-
-Sat Sep 19 09:05:08 1992 Karl Berry (karl@hayley)
-
- * regex.c (PUSH_FAILURE_POINT, POP_FAILURE_POINT) [DEBUG]: keep
- track of how many failure points we push and pop.
- (re_match_2) [DEBUG]: declare variables for that, and print results.
- (DEBUG_PRINT4): new macro.
-
- * regex.h (re_pattern_buffer): new field `longest_p' (to
- eliminate backtracking if the user doesn't need it).
- * regex.c (re_compile_pattern): initialize it (to 1).
- (re_search_2): set it to zero if register information is not needed.
- (re_match_2): if it's set, don't backtrack.
-
- * regex.c (re_search_2): update fastmap only after checking that
- the pattern is anchored.
-
- * regex.c (re_match_2): do more debugging at maybe_pop_jump.
-
- * regex.c (re_search_2): cast result of TRANSLATE for use in
- array subscript.
-
-Thu Sep 17 19:47:16 1992 Karl Berry (karl@geech.gnu.ai.mit.edu)
-
- * Version 0.11.
-
-Wed Sep 16 08:17:10 1992 Karl Berry (karl@hayley)
-
- * regex.c (INIT_FAIL_STACK): rewrite as statements instead of a
- complicated comma expr, to avoid compiler warnings (and also
- simplify).
- (re_compile_fastmap, re_match_2): change callers.
-
- * regex.c (POP_FAILURE_POINT): cast pop of regstart and regend
- to avoid compiler warnings.
-
- * regex.h (RE_NEWLINE_ORDINARY): remove this syntax bit, and
- remove uses.
- * regex.c (at_{beg,end}line_loc_p): go the last mile: remove
- the RE_NEWLINE_ORDINARY case which made the ^ in \n^ be an anchor.
-
-Tue Sep 15 09:55:29 1992 Karl Berry (karl@hayley)
-
- * regex.c (at_begline_loc_p): new fn.
- (at_endline_loc_p): simplify at_endline_op_p.
- (regex_compile): in ^/$ cases, call the above.
-
- * regex.c (POP_FAILURE_POINT): rewrite the fn as a macro again,
- as lord's profiling indicates the function is 20% of the time.
- (re_match_2): callers changed.
-
- * configure.in (AC_MEMORY_H): remove, since we never use memcpy et al.
-
-Mon Sep 14 17:49:27 1992 Karl Berry (karl@hayley)
-
- * Makefile.in (makeargs): include MFLAGS.
-
-Sun Sep 13 07:41:45 1992 Karl Berry (karl@hayley)
-
- * regex.c (regex_compile): in \1..\9 case, make it always
- invalid to use \<digit> if there is no preceding <digit>th subexpr.
- * regex.h (RE_NO_MISSING_BK_REF): remove this syntax bit.
-
- * regex.c (regex_compile): remove support for invalid empty groups.
- * regex.h (RE_NO_EMPTY_GROUPS): remove this syntax bit.
-
- * regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: define as alloca (0),
- to reclaim memory.
-
- * regex.h (RE_SYNTAX_POSIX_SED): don't bother with this.
-
-Sat Sep 12 13:37:21 1992 Karl Berry (karl@hayley)
-
- * README: incorporate emacs.diff.
-
- * regex.h (_RE_ARGS) [!__STDC__]: define as empty parens.
-
- * configure.in: add AC_ALLOCA.
-
- * Put test files in subdir test, documentation in subdir doc.
- Adjust Makefile.in and configure.in accordingly.
-
-Thu Sep 10 10:29:11 1992 Karl Berry (karl@hayley)
-
- * regex.h (RE_SYNTAX_{POSIX_,}SED): new definitions.
-
-Wed Sep 9 06:27:09 1992 Karl Berry (karl@hayley)
-
- * Version 0.10.
-
-Tue Sep 8 07:32:30 1992 Karl Berry (karl@hayley)
-
- * xregex.texinfo: put the day of month into the date.
-
- * Makefile.in (realclean): remove Texinfo-generated files.
- (distclean): remove empty sorted index files.
- (clean): remove dvi files, etc.
-
- * configure.in: test for more Unix variants.
-
- * fileregex.c: new file.
- Makefile.in (fileregex): new target.
-
- * iregex.c (main): move variable decls to smallest scope.
-
- * regex.c (FREE_VARIABLES): free reg_{,info_}dummy.
- (re_match_2): check that the allocation for those two succeeded.
-
- * regex.c (FREE_VAR): replace FREE_NONNULL with this.
- (FREE_VARIABLES): call it.
- (re_match_2) [REGEX_MALLOC]: initialize all our vars to NULL.
-
- * tregress.c (do_match): generalize simple_match.
- (SIMPLE_NONMATCH): new macro.
- (SIMPLE_MATCH): change from routine.
-
- * Makefile.in (regex.texinfo): make file readonly, so we don't
- edit it by mistake.
-
- * many files (re_default_syntax): rename to `re_syntax_options';
- call re_set_syntax instead of assigning to the variable where
- possible.
-
-Mon Sep 7 10:12:16 1992 Karl Berry (karl@hayley)
-
- * syntax.skel: don't use prototypes.
-
- * {configure,Makefile}.in: new files.
-
- * regex.c: include <string.h> `#if USG || STDC_HEADERS'; remove
- obsolete test for `POSIX', and test for BSRTING.
- Include <strings.h> if we are not USG or STDC_HEADERS.
- Do not include <unistd.h>. What did we ever need that for?
-
- * regex.h (RE_NO_EMPTY_ALTS): remove this.
- (RE_SYNTAX_AWK): remove from here, too.
- * regex.c (regex_compile): remove the check.
- * xregex.texinfo (Alternation Operator): update.
- * other.c (test_others): remove tests for this.
-
- * regex.h (RE_DUP_MAX): undefine if already defined.
-
- * regex.h: (RE_SYNTAX_POSIX*): redo to allow more operators, and
- define new syntaxes with the minimal set.
-
- * syntax.skel (main): used sscanf instead of scanf.
-
- * regex.h (RE_SYNTAX_*GREP): new definitions from mike.
-
- * regex.c (regex_compile): initialize the upper bound of
- intervals at the beginning of the interval, not the end.
- (From pclink@qld.tne.oz.au.)
-
- * regex.c (handle_bar): rename to `handle_alt', for consistency.
-
- * regex.c ({store,insert}_{op1,op2}): new routines (except the last).
- ({STORE,INSERT}_JUMP{,2}): macros to replace the old routines,
- which took arguments in different orders, and were generally weird.
-
- * regex.c (PAT_PUSH*): rename to `BUF_PUSH*' -- we're not
- appending info to the pattern!
-
-Sun Sep 6 11:26:49 1992 Karl Berry (karl@hayley)
-
- * regex.c (regex_compile): delete the variable
- `following_left_brace', since we never use it.
-
- * regex.c (print_compiled_pattern): don't print the fastmap if
- it's null.
-
- * regex.c (re_compile_fastmap): handle
- `on_failure_keep_string_jump' like `on_failure_jump'.
-
- * regex.c (re_match_2): in `charset{,_not' case, cast the bit
- count to unsigned, not unsigned char, in case we have a full
- 32-byte bit list.
-
- * tregress.c (simple_match): remove.
- (simple_test): rename as `simple_match'.
- (simple_compile): print the error string if the compile failed.
-
- * regex.c (DO_RANGE): rewrite as a function, `compile_range', so
- we can debug it. Change pattern characters to unsigned char
- *'s, and change the range variable to an unsigned.
- (regex_compile): change calls.
-
-Sat Sep 5 17:40:49 1992 Karl Berry (karl@hayley)
-
- * regex.h (_RE_ARGS): new macro to put in argument lists (if
- ANSI) or omit them (if K&R); don't declare routines twice.
-
- * many files (obscure_syntax): rename to `re_default_syntax'.
-
-Fri Sep 4 09:06:53 1992 Karl Berry (karl@hayley)
-
- * GNUmakefile (extraclean): new target.
- (realclean): delete the info files.
-
-Wed Sep 2 08:14:42 1992 Karl Berry (karl@hayley)
-
- * regex.h: doc fix.
-
-Sun Aug 23 06:53:15 1992 Karl Berry (karl@hayley)
-
- * regex.[ch] (re_comp): no const in the return type (from djm).
-
-Fri Aug 14 07:25:46 1992 Karl Berry (karl@hayley)
-
- * regex.c (DO_RANGE): declare variables as unsigned chars, not
- signed chars (from jimb).
-
-Wed Jul 29 18:33:53 1992 Karl Berry (karl@claude.cs.umb.edu)
-
- * Version 0.9.
-
- * GNUmakefile (distclean): do not remove regex.texinfo.
- (realclean): remove it here.
-
- * tregress.c (simple_test): initialize buf.buffer.
-
-Sun Jul 26 08:59:38 1992 Karl Berry (karl@hayley)
-
- * regex.c (push_dummy_failure): new opcode and corresponding
- case in the various routines. Pushed at the end of
- alternatives.
-
- * regex.c (jump_past_next_alt): rename to `jump_past_alt', for
- brevity.
- (no_pop_jump): rename to `jump'.
-
- * regex.c (regex_compile) [DEBUG]: terminate printing of pattern
- with a newline.
-
- * NEWS: new file.
-
- * tregress.c (simple_{compile,match,test}): routines to simplify all
- these little tests.
-
- * tregress.c: test for matching as much as possible.
-
-Fri Jul 10 06:53:32 1992 Karl Berry (karl@hayley)
-
- * Version 0.8.
-
-Wed Jul 8 06:39:31 1992 Karl Berry (karl@hayley)
-
- * regex.c (SIGN_EXTEND_CHAR): #undef any previous definition, as
- ours should always work properly.
-
-Mon Jul 6 07:10:50 1992 Karl Berry (karl@hayley)
-
- * iregex.c (main) [DEBUG]: conditionalize the call to
- print_compiled_pattern.
-
- * iregex.c (main): initialize buf.buffer to NULL.
- * tregress (test_regress): likewise.
-
- * regex.c (alloca) [sparc]: #if on HAVE_ALLOCA_H instead.
-
- * tregress.c (test_regress): didn't have jla's test quite right.
-
-Sat Jul 4 09:02:12 1992 Karl Berry (karl@hayley)
-
- * regex.c (re_match_2): only REGEX_ALLOCATE all the register
- vectors if the pattern actually has registers.
- (match_end): new variable to avoid having to use best_regend[0].
-
- * regex.c (IS_IN_FIRST_STRING): rename to FIRST_STRING_P.
-
- * regex.c: doc fixes.
-
- * tregess.c (test_regress): new fastmap test forwarded by rms.
-
- * tregress.c (test_regress): initialize the fastmap field.
-
- * tregress.c (test_regress): new test from jla that aborted
- in re_search_2.
-
-Fri Jul 3 09:10:05 1992 Karl Berry (karl@hayley)
-
- * tregress.c (test_regress): add tests for translating charsets,
- from kaoru.
-
- * GNUmakefile (common): add alloca.o.
- * alloca.c: new file, copied from bison.
-
- * other.c (test_others): remove var `buf', since it's no longer used.
-
- * Below changes from ro@TechFak.Uni-Bielefeld.DE.
-
- * tregress.c (test_regress): initialize buf.allocated.
-
- * regex.c (re_compile_fastmap): initialize `succeed_n_p'.
-
- * GNUmakefile (regex): depend on $(common).
-
-Wed Jul 1 07:12:46 1992 Karl Berry (karl@hayley)
-
- * Version 0.7.
-
- * regex.c: doc fixes.
-
-Mon Jun 29 08:09:47 1992 Karl Berry (karl@fosse)
-
- * regex.c (pop_failure_point): change string vars to
- `const char *' from `unsigned char *'.
-
- * regex.c: consolidate debugging stuff.
- (print_partial_compiled_pattern): avoid enum clash.
-
-Mon Jun 29 07:50:27 1992 Karl Berry (karl@hayley)
-
- * xmalloc.c: new file.
- * GNUmakefile (common): add it.
-
- * iregex.c (print_regs): new routine (from jimb).
- (main): call it.
-
-Sat Jun 27 10:50:59 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)
-
- * xregex.c (re_match_2): When we have accepted a match and
- restored d from best_regend[0], we need to set dend
- appropriately as well.
-
-Sun Jun 28 08:48:41 1992 Karl Berry (karl@hayley)
-
- * tregress.c: rename from regress.c.
-
- * regex.c (print_compiled_pattern): improve charset case to ease
- byte-counting.
- Also, don't distinguish between Emacs and non-Emacs
- {not,}wordchar opcodes.
-
- * regex.c (print_fastmap): move here.
- * test.c: from here.
- * regex.c (print_{{partial,}compiled_pattern,double_string}):
- rename from ..._printer. Change calls here and in test.c.
-
- * regex.c: create from xregex.c and regexinc.c for once and for
- all, and change the debug fns to be extern, instead of static.
- * GNUmakefile: remove traces of xregex.c.
- * test.c: put in externs, instead of including regexinc.c.
-
- * xregex.c: move interactive main program and scanstring to iregex.c.
- * iregex.c: new file.
- * upcase.c, printchar.c: new files.
-
- * various doc fixes and other cosmetic changes throughout.
-
- * regexinc.c (compiled_pattern_printer): change variable name,
- for consistency.
- (partial_compiled_pattern_printer): print other info about the
- compiled pattern, besides just the opcodes.
- * xregex.c (regex_compile) [DEBUG]: print the compiled pattern
- when we're done.
-
- * xregex.c (re_compile_fastmap): in the duplicate case, set
- `can_be_null' and return.
- Also, set `bufp->can_be_null' according to a new variable,
- `path_can_be_null'.
- Also, rewrite main while loop to not test `p != NULL', since
- we never set it that way.
- Also, eliminate special `can_be_null' value for the endline case.
- (re_search_2): don't test for the special value.
- * regex.h (struct re_pattern_buffer): remove the definition.
-
-Sat Jun 27 15:00:40 1992 Karl Berry (karl@hayley)
-
- * xregex.c (re_compile_fastmap): remove the `RE_' from
- `REG_RE_MATCH_NULL_AT_END'.
- Also, assert the fastmap in the pattern buffer is non-null.
- Also, reset `succeed_n_p' after we've
- paid attention to it, instead of every time through the loop.
- Also, in the `anychar' case, only clear fastmap['\n'] if the
- syntax says to, and don't return prematurely.
- Also, rearrange cases in some semblance of a rational order.
- * regex.h (REG_RE_MATCH_NULL_AT_END): remove the `RE_' from the name.
-
- * other.c: take bug reports from here.
- * regress.c: new file for them.
- * GNUmakefile (test): add it.
- * main.c (main): new possible test.
- * test.h (test_type): new value in enum.
-
-Thu Jun 25 17:37:43 1992 Karl Berry (karl@hayley)
-
- * xregex.c (scanstring) [test]: new function from jimb to allow some
- escapes.
- (main) [test]: call it (on the string, not the pattern).
-
- * xregex.c (main): make return type `int'.
-
-Wed Jun 24 10:43:03 1992 Karl Berry (karl@hayley)
-
- * xregex.c (pattern_offset_t): change to `int', for the benefit
- of patterns which compile to more than 2^15 bytes.
-
- * xregex.c (GET_BUFFER_SPACE): remove spurious braces.
-
- * xregex.texinfo (Using Registers): put in a stub to ``document''
- the new function.
- * regex.h (re_set_registers) [!__STDC__]: declare.
- * xregex.c (re_set_registers): declare K&R style (also move to a
- different place in the file).
-
-Mon Jun 8 18:03:28 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)
-
- * regex.h (RE_NREGS): Doc fix.
-
- * xregex.c (re_set_registers): New function.
- * regex.h (re_set_registers): Declaration for new function.
-
-Fri Jun 5 06:55:18 1992 Karl Berry (karl@hayley)
-
- * main.c (main): `return 0' instead of `exit (0)'. (From Paul Eggert)
-
- * regexinc.c (SIGN_EXTEND_CHAR): cast to unsigned char.
- (extract_number, EXTRACT_NUMBER): don't bother to cast here.
-
-Tue Jun 2 07:37:53 1992 Karl Berry (karl@hayley)
-
- * Version 0.6.
-
- * Change copyrights to `1985, 89, ...'.
-
- * regex.h (REG_RE_MATCH_NULL_AT_END): new macro.
- * xregex.c (re_compile_fastmap): initialize `can_be_null' to
- `p==pend', instead of in the test at the top of the loop (as
- it was, it was always being set).
- Also, set `can_be_null'=1 if we would jump to the end of the
- pattern in the `on_failure_jump' cases.
- (re_search_2): check if `can_be_null' is 1, not nonzero. This
- was the original test in rms' regex; why did we change this?
-
- * xregex.c (re_compile_fastmap): rename `is_a_succeed_n' to
- `succeed_n_p'.
-
-Sat May 30 08:09:08 1992 Karl Berry (karl@hayley)
-
- * xregex.c (re_compile_pattern): declare `regnum' as `unsigned',
- not `regnum_t', for the benefit of those patterns with more
- than 255 groups.
-
- * xregex.c: rename `failure_stack' to `fail_stack', for brevity;
- likewise for `match_nothing' to `match_null'.
-
- * regexinc.c (REGEX_REALLOCATE): take both the new and old
- sizes, and copy only the old bytes.
- * xregex.c (DOUBLE_FAILURE_STACK): pass both old and new.
- * This change from Thorsten Ohl.
-
-Fri May 29 11:45:22 1992 Karl Berry (karl@hayley)
-
- * regexinc.c (SIGN_EXTEND_CHAR): define as `(signed char) c'
- instead of relying on __CHAR_UNSIGNED__, to work with
- compilers other than GCC. From Per Bothner.
-
- * main.c (main): change return type to `int'.
-
-Mon May 18 06:37:08 1992 Karl Berry (karl@hayley)
-
- * regex.h (RE_SYNTAX_AWK): typo in RE_RE_UNMATCHED...
-
-Fri May 15 10:44:46 1992 Karl Berry (karl@hayley)
-
- * Version 0.5.
-
-Sun May 3 13:54:00 1992 Karl Berry (karl@hayley)
-
- * regex.h (struct re_pattern_buffer): now it's just `regs_allocated'.
- (REGS_UNALLOCATED, REGS_REALLOCATE, REGS_FIXED): new constants.
- * xregex.c (regexec, re_compile_pattern): set the field appropriately.
- (re_match_2): and use it. bufp can't be const any more.
-
-Fri May 1 15:43:09 1992 Karl Berry (karl@hayley)
-
- * regexinc.c: unconditionally include <sys/types.h>, first.
-
- * regex.h (struct re_pattern_buffer): rename
- `caller_allocated_regs' to `regs_allocated_p'.
- * xregex.c (re_compile_pattern): same change here.
- (regexec): and here.
- (re_match_2): reallocate registers if necessary.
-
-Fri Apr 10 07:46:50 1992 Karl Berry (karl@hayley)
-
- * regex.h (RE_SYNTAX{_POSIX,}_AWK): new definitions from Arnold.
-
-Sun Mar 15 07:34:30 1992 Karl Berry (karl at hayley)
-
- * GNUmakefile (dist): versionize regex.{c,h,texinfo}.
-
-Tue Mar 10 07:05:38 1992 Karl Berry (karl at hayley)
-
- * Version 0.4.
-
- * xregex.c (PUSH_FAILURE_POINT): always increment the failure id.
- (DEBUG_STATEMENT) [DEBUG]: execute the statement even if `debug'==0.
-
- * xregex.c (pop_failure_point): if the saved string location is
- null, keep the current value.
- (re_match_2): at fail, test for a dummy failure point by
- checking the restored pattern value, not string value.
- (re_match_2): new case, `on_failure_keep_string_jump'.
- (regex_compile): output this opcode in the .*\n case.
- * regexinc.c (re_opcode_t): define the opcode.
- (partial_compiled_pattern_pattern): add the new case.
-
-Mon Mar 9 09:09:27 1992 Karl Berry (karl at hayley)
-
- * xregex.c (regex_compile): optimize .*\n to output an
- unconditional jump to the ., instead of pushing failure points
- each time through the loop.
-
- * xregex.c (DOUBLE_FAILURE_STACK): compute the maximum size
- ourselves (and correctly); change callers.
-
-Sun Mar 8 17:07:46 1992 Karl Berry (karl at hayley)
-
- * xregex.c (failure_stack_elt_t): change to `const char *', to
- avoid warnings.
-
- * regex.h (re_set_syntax): declare this.
-
- * xregex.c (pop_failure_point) [DEBUG]: conditionally pass the
- original strings and sizes; change callers.
-
-Thu Mar 5 16:35:35 1992 Karl Berry (karl at claude.cs.umb.edu)
-
- * xregex.c (regnum_t): new type for register/group numbers.
- (compile_stack_elt_t, regex_compile): use it.
-
- * xregex.c (regexec): declare len as `int' to match re_search.
-
- * xregex.c (re_match_2): don't declare p1 twice.
-
- * xregex.c: change `while (1)' to `for (;;)' to avoid silly
- compiler warnings.
-
- * regex.h [__STDC__]: use #if, not #ifdef.
-
- * regexinc.c (REGEX_REALLOCATE): cast the result of alloca to
- (char *), to avoid warnings.
-
- * xregex.c (regerror): declare variable as const.
-
- * xregex.c (re_compile_pattern, re_comp): define as returning a const
- char *.
- * regex.h (re_compile_pattern, re_comp): likewise.
-
-Thu Mar 5 15:57:56 1992 Karl Berry (karl@hal)
-
- * xregex.c (regcomp): declare `syntax' as unsigned.
-
- * xregex.c (re_match_2): try to avoid compiler warnings about
- unsigned comparisons.
-
- * GNUmakefile (test-xlc): new target.
-
- * regex.h (reg_errcode_t): remove trailing comma from definition.
- * regexinc.c (re_opcode_t): likewise.
-
-Thu Mar 5 06:56:07 1992 Karl Berry (karl at hayley)
-
- * GNUmakefile (dist): add version numbers automatically.
- (versionfiles): new variable.
- (regex.{c,texinfo}): don't add version numbers here.
- * regex.h: put in placeholder instead of the version number.
-
-Fri Feb 28 07:11:33 1992 Karl Berry (karl at hayley)
-
- * xregex.c (re_error_msg): declare const, since it is.
-
-Sun Feb 23 05:41:57 1992 Karl Berry (karl at fosse)
-
- * xregex.c (PAT_PUSH{,_2,_3}, ...): cast args to avoid warnings.
- (regex_compile, regexec): return REG_NOERROR, instead
- of 0, on success.
- (boolean): define as char, and #define false and true.
- * regexinc.c (STREQ): cast the result.
-
-Sun Feb 23 07:45:38 1992 Karl Berry (karl at hayley)
-
- * GNUmakefile (test-cc, test-hc, test-pcc): new targets.
-
- * regex.inc (extract_number, extract_number_and_incr) [DEBUG]:
- only define if we are debugging.
-
- * xregex.c [_AIX]: do #pragma alloca first if necessary.
- * regexinc.c [_AIX]: remove the #pragma from here.
-
- * regex.h (reg_syntax_t): declare as unsigned, and redo the enum
- as #define's again. Some compilers do stupid things with enums.
-
-Thu Feb 20 07:19:47 1992 Karl Berry (karl at hayley)
-
- * Version 0.3.
-
- * xregex.c, regex.h (newline_anchor_match_p): rename to
- `newline_anchor'; dumb idea to change the name.
-
-Tue Feb 18 07:09:02 1992 Karl Berry (karl at hayley)
-
- * regexinc.c: go back to original, i.e., don't include
- <string.h> or define strchr.
- * xregex.c (regexec): don't bother with adding characters after
- newlines to the fastmap; instead, just don't use a fastmap.
- * xregex.c (regcomp): set the buffer and fastmap fields to zero.
-
- * xregex.texinfo (GNU r.e. compiling): have to initialize more
- than two fields.
-
- * regex.h (struct re_pattern_buffer): rename `newline_anchor' to
- `newline_anchor_match_p', as we're back to two cases.
- * xregex.c (regcomp, re_compile_pattern, re_comp): change
- accordingly.
- (re_match_2): at begline and endline, POSIX is not a special
- case anymore; just check newline_anchor_match_p.
-
-Thu Feb 13 16:29:33 1992 Karl Berry (karl at hayley)
-
- * xregex.c (*empty_string*): rename to *null_string*, for brevity.
-
-Wed Feb 12 06:36:22 1992 Karl Berry (karl at hayley)
-
- * xregex.c (re_compile_fastmap): at endline, don't set fastmap['\n'].
- (re_match_2): rewrite the begline/endline cases to take account
- of the new field newline_anchor.
-
-Tue Feb 11 14:34:55 1992 Karl Berry (karl at hayley)
-
- * regexinc.c [!USG etc.]: include <strings.h> and define strchr
- as index.
-
- * xregex.c (re_search_2): when searching backwards, declare `c'
- as a char and use casts when using it as an array subscript.
-
- * xregex.c (regcomp): if REG_NEWLINE, set
- RE_HAT_LISTS_NOT_NEWLINE. Set the `newline_anchor' field
- appropriately.
- (regex_compile): compile [^...] as matching a \n according to
- the syntax bit.
- (regexec): if doing REG_NEWLINE stuff, compile a fastmap and add
- characters after any \n's to the newline.
- * regex.h (RE_HAT_LISTS_NOT_NEWLINE): new syntax bit.
- (struct re_pattern_buffer): rename `posix_newline' to
- `newline_anchor', define constants for its values.
-
-Mon Feb 10 07:22:50 1992 Karl Berry (karl at hayley)
-
- * xregex.c (re_compile_fastmap): combine the code at the top and
- bottom of the loop, as it's essentially identical.
-
-Sun Feb 9 10:02:19 1992 Karl Berry (karl at hayley)
-
- * xregex.texinfo (POSIX Translate Tables): remove this, as it
- doesn't match the spec.
-
- * xregex.c (re_compile_fastmap): if we finish off a path, go
- back to the top (to set can_be_null) instead of returning
- immediately.
-
- * xregex.texinfo: changes from bob.
-
-Sat Feb 1 07:03:25 1992 Karl Berry (karl at hayley)
-
- * xregex.c (re_search_2): doc fix (from rms).
-
-Fri Jan 31 09:52:04 1992 Karl Berry (karl at hayley)
-
- * xregex.texinfo (GNU Searching): clarify the range arg.
-
- * xregex.c (re_match_2, at_endline_op_p): add extra parens to
- get rid of GCC 2's (silly, IMHO) warning about && within ||.
-
- * xregex.c (common_op_match_empty_string_p): use
- MATCH_NOTHING_UNSET_VALUE, not -1.
-
-Thu Jan 16 08:43:02 1992 Karl Berry (karl at hayley)
-
- * xregex.c (SET_REGS_MATCHED): only set the registers from
- lowest to highest.
-
- * regexinc.c (MIN): new macro.
- * xregex.c (re_match_2): only check min (num_regs,
- regs->num_regs) when we set the returned regs.
-
- * xregex.c (re_match_2): set registers after the first
- num_regs to -1 before we return.
-
-Tue Jan 14 16:01:42 1992 Karl Berry (karl at hayley)
-
- * xregex.c (re_match_2): initialize max (RE_NREGS, re_nsub + 1)
- registers (from rms).
-
- * xregex.c, regex.h: don't abbreviate `19xx' to `xx'.
-
- * regexinc.c [!emacs]: include <sys/types.h> before <unistd.h>.
- (from ro@thp.Uni-Koeln.DE).
-
-Thu Jan 9 07:23:00 1992 Karl Berry (karl at hayley)
-
- * xregex.c (*unmatchable): rename to `match_empty_string_p'.
- (CAN_MATCH_NOTHING): rename to `REG_MATCH_EMPTY_STRING_P'.
-
- * regexinc.c (malloc, realloc): remove prototypes, as they can
- cause clashes (from rms).
-
-Mon Jan 6 12:43:24 1992 Karl Berry (karl at claude.cs.umb.edu)
-
- * Version 0.2.
-
-Sun Jan 5 10:50:38 1992 Karl Berry (karl at hayley)
-
- * xregex.texinfo: bring more or less up-to-date.
- * GNUmakefile (regex.texinfo): generate from regex.h and
- xregex.texinfo.
- * include.awk: new file.
-
- * xregex.c: change all calls to the fn extract_number_and_incr
- to the macro.
-
- * xregex.c (re_match_2) [emacs]: in at_dot, use PTR_CHAR_POS + 1,
- instead of bf_* and sl_*. Cast d to unsigned char *, to match
- the declaration in Emacs' buffer.h.
- [emacs19]: in before_dot, at_dot, and after_dot, likewise.
-
- * regexinc.c: unconditionally include <sys/types.h>.
-
- * regexinc.c (alloca) [!alloca]: Emacs config files sometimes
- define this, so don't define it if it's already defined.
-
-Sun Jan 5 06:06:53 1992 Karl Berry (karl at fosse)
-
- * xregex.c (re_comp): fix type conflicts with regex_compile (we
- haven't been compiling this).
-
- * regexinc.c (SIGN_EXTEND_CHAR): use `__CHAR_UNSIGNED__', not
- `CHAR_UNSIGNED'.
-
- * regexinc.c (NULL) [!NULL]: define it (as zero).
-
- * regexinc.c (extract_number): remove the temporaries.
-
-Sun Jan 5 07:50:14 1992 Karl Berry (karl at hayley)
-
- * regex.h (regerror) [!__STDC__]: return a size_t, not a size_t *.
-
- * xregex.c (PUSH_FAILURE_POINT, ...): declare `destination' as
- `char *' instead of `void *', to match alloca declaration.
-
- * xregex.c (regerror): use `size_t' for the intermediate values
- as well as the return type.
-
- * xregex.c (regexec): cast the result of malloc.
-
- * xregex.c (regexec): don't initialize `private_preg' in the
- declaration, as old C compilers can't do that.
-
- * xregex.c (main) [test]: declare printchar void.
-
- * xregex.c (assert) [!DEBUG]: define this to do nothing, and
- remove #ifdef DEBUG's from around asserts.
-
- * xregex.c (re_match_2): remove error message when not debugging.
-
-Sat Jan 4 09:45:29 1992 Karl Berry (karl at hayley)
-
- * other.c: test the bizarre duplicate case in re_compile_fastmap
- that I just noticed.
-
- * test.c (general_test): don't test registers beyond the end of
- correct_regs, as well as regs.
-
- * xregex.c (regex_compile): at handle_close, don't assign to
- *inner_group_loc if we didn't push a start_memory (because the
- group number was too big). In fact, don't push or pop the
- inner_group_offset in that case.
-
- * regex.c: rename to xregex.c, since it's not the whole thing.
- * regex.texinfo: likewise.
- * GNUmakefile: change to match.
-
- * regex.c [DEBUG]: only include <stdio.h> if debugging.
-
- * regexinc.c (SIGN_EXTEND_CHAR) [CHAR_UNSIGNED]: if it's already
- defined, don't redefine it.
-
- * regex.c: define _GNU_SOURCE at the beginning.
- * regexinc.c (isblank) [!isblank]: define it.
- (isgraph) [!isgraph]: change conditional to this, and remove the
- sequent stuff.
-
- * regex.c (regex_compile): add `blank' character class.
-
- * regex.c (regex_compile): don't use a uchar variable to loop
- through all characters.
-
- * regex.c (regex_compile): at '[', improve logic for checking
- that we have enough space for the charset.
-
- * regex.h (struct re_pattern_buffer): declare translate as char
- * again. We only use it as an array subscript once, I think.
-
- * regex.c (TRANSLATE): new macro to cast the data character
- before subscripting.
- (num_internal_regs): rename to `num_regs'.
-
-Fri Jan 3 07:58:01 1992 Karl Berry (karl at hayley)
-
- * regex.h (struct re_pattern_buffer): declare `allocated' and
- `used' as unsigned long, since these are never negative.
-
- * regex.c (compile_stack_element): rename to compile_stack_elt_t.
- (failure_stack_element): similarly.
-
- * regexinc.c (TALLOC, RETALLOC): new macros to simplify
- allocation of arrays.
-
- * regex.h (re_*) [__STDC__]: don't declare string args unsigned
- char *; that makes them incompatible with string constants.
- (struct re_pattern_buffer): declare the pattern and translate
- table as unsigned char *.
- * regex.c (most routines): use unsigned char vs. char consistently.
-
- * regex.h (re_compile_pattern): do not declare the length arg as
- const.
- * regex.c (re_compile_pattern): likewise.
-
- * regex.c (POINTER_TO_REG): rename to `POINTER_TO_OFFSET'.
-
- * regex.h (re_registers): declare `start' and `end' as
- `regoff_t', instead of `int'.
- - * regex.c (regexec): if either of the malloc's for the register - information fail, return failure. - - * regex.h (RE_NREGS): define this again, as 30 (from jla). - (RE_ALLOCATE_REGISTERS): remove this. - (RE_SYNTAX_*): remove it from definitions. - (re_pattern_buffer): remove `return_default_num_regs', add - `caller_allocated_regs'. - * regex.c (re_compile_pattern): clear no_sub and - caller_allocated_regs in the pattern. - (regcomp): set caller_allocated_regs. - (re_match_2): do all register allocation at the end of the - match; implement new semantics. - - * regex.c (MAX_REGNUM): new macro. - (regex_compile): at handle_open and handle_close, if the group - number is too large, don't push the start/stop memory. - -Thu Jan 2 07:56:10 1992 Karl Berry (karl at hayley) - - * regex.c (re_match_2): if the back reference is to a group that - never matched, then goto fail, not really_fail. Also, don't - test if the pattern can match the empty string. Why did we - ever do that? - (really_fail): this label no longer needed. - - * regexinc.c [STDC_HEADERS]: use only this to test if we should - include <stdlib.h>. - - * regex.c (DO_RANGE, regex_compile): translate in all cases - except the single character after a \. - - * regex.h (RE_AWK_CLASS_HACK): rename to - RE_BACKSLASH_ESCAPE_IN_LISTS. - * regex.c (regex_compile): change use. - - * regex.c (re_compile_fastmap): do not translate the characters - again; we already translated them at compilation. (From ylo@ngs.fi.) - - * regex.c (re_match_2): in case for at_dot, invert sense of - comparison and find the character number properly. (From - worley@compass.com.) - (re_match_2) [emacs]: remove the cases for before_dot and - after_dot, since there's no way to specify them, and the code - is wrong (judging from this change). 
- -Wed Jan 1 09:13:38 1992 Karl Berry (karl at hayley) - - * psx-{interf,basic,extend}.c, other.c: set `t' as the first - thing, so that if we run them in succession, general_test's - kludge to see if we're doing POSIX tests works. - - * test.h (test_type): add `all_test'. - * main.c: add case for `all_test'. - - * regexinc.c (partial_compiled_pattern_printer, - double_string_printer): don't print anything if we're passed null. - - * regex.c (PUSH_FAILURE_POINT): do not scan for the highest and - lowest active registers. - (re_match_2): compute lowest/highest active regs at start_memory and - stop_memory. - (NO_{LOW,HIGH}EST_ACTIVE_REG): new sentinel values. - (pop_failure_point): return the lowest/highest active reg values - popped; change calls. - - * regex.c [DEBUG]: include <assert.h>. - (various routines) [DEBUG]: change conditionals to assertions. - - * regex.c (DEBUG_STATEMENT): new macro. - (PUSH_FAILURE_POINT): use it to increment num_regs_pushed. - (re_match_2) [DEBUG]: only declare num_regs_pushed if DEBUG. - - * regex.c (*can_match_nothing): rename to *unmatchable. - - * regex.c (re_match_2): at stop_memory, adjust argument reading. - - * regex.h (re_pattern_buffer): declare `can_be_null' as a 2-bit - bit field. - - * regex.h (re_pattern_buffer): declare `buffer' unsigned char *; - no, dumb idea. The pattern can have signed numbers. - - * regex.c (re_match_2): in maybe_pop_jump case, skip over the - right number of args to the group operators, and don't do - anything with endline if posix_newline is not set. - - * regex.c, regexinc.c (all the things we just changed): go back - to putting the inner group count after the start_memory, - because we need it in the on_failure_jump case in re_match_2. - But leave it after the stop_memory also, since we need it - there in re_match_2, and we don't have any way of getting back - to the start_memory. - - * regexinc.c (partial_compiled_pattern_printer): adjust argument - reading for start/stop_memory.
- * regex.c (re_compile_fastmap, group_can_match_nothing): likewise. - -Tue Dec 31 10:15:08 1991 Karl Berry (karl at hayley) - - * regex.c (bits list routines): remove these. - (re_match_2): get the number of inner groups from the pattern, - instead of keeping track of it at start and stop_memory. - Put the count after the stop_memory, not after the - start_memory. - (compile_stack_element): remove `fixup_inner_group' member, - since we now put it in when we can compute it. - (regex_compile): at handle_open, don't push the inner group - offset, and at handle_close, don't pop it. - - * regex.c (level routines): remove these, and their uses in - regex_compile. This was another manifestation of having to find - $'s that were endlines. - - * regex.c (regexec): this does searching, not matching (a - well-disguised part of the standard). So rewrite to use - `re_search' instead of `re_match'. - * psx-interf.c (test_regexec): add tests to, uh, match. - - * regex.h (RE_TIGHT_ALT): remove this; nobody uses it. - * regex.c: remove the code that was supposed to implement it. - - * other.c (test_others): ^ and $ never match newline characters; - RE_CONTEXT_INVALID_OPS doesn't affect anchors. - - * psx-interf.c (test_regerror): update for new error messages. - - * psx-extend.c: it's now ok to have an alternative be just a $, - so remove all the tests which supposed that was invalid. - -Wed Dec 25 09:00:05 1991 Karl Berry (karl at hayley) - - * regex.c (regex_compile): in handle_open, don't skip over ^ and - $ when checking for an empty group. POSIX has changed the - grammar. - * psx-extend.c (test_posix_extended): thus, move (^$) tests to - valid section. - - * regexinc.c (boolean): move from here to test.h and regex.c. - * test files: declare verbose, omit_register_tests, and - test_should_match as boolean. - - * psx-interf.c (test_posix_c_interface): remove the `c_'. - * main.c: likewise. 
- - * psx-basic.c (test_posix_basic): ^ ($) is an anchor after - (before) an open (close) group. - - * regex.c (re_match_2): in endline, correct precedence of - posix_newline condition. - -Tue Dec 24 06:45:11 1991 Karl Berry (karl at hayley) - - * test.h: incorporate private-tst.h. - * test files: include test.h, not private-tst.h. - - * test.c (general_test): set posix_newline to zero if we are - doing POSIX tests (unfortunately, it's difficult to call - regcomp in this case, which is what we should really be doing). - - * regex.h (reg_syntax_t): make this an enumeration type which - defines the syntax bits; renames re_syntax_t. - - * regex.c (at_endline_op_p): don't preincrement p; then if it's - not an empty string op, we lose. - - * regex.h (reg_errcode_t): new enumeration type of the error - codes. - * regex.c (regex_compile): return that type. - - * regex.c (regex_compile): in [, initialize - just_had_a_char_class to false; somehow I had changed this to - true. - - * regex.h (RE_NO_CONSECUTIVE_REPEATS): remove this, since we - don't use it, and POSIX doesn't require this behavior anymore. - * regex.c (regex_compile): remove it from here. - - * regex.c (regex_compile): remove the no_op insertions for - verify_and_adjust_endlines, since that doesn't exist anymore. - - * regex.c (regex_compile) [DEBUG]: use printchar to print the - pattern, so unprintable bytes will print properly. - - * regex.c: move re_error_msg back. - * test.c (general_test): print the compile error if the pattern - was invalid. - -Mon Dec 23 08:54:53 1991 Karl Berry (karl at hayley) - - * regexinc.c: move re_error_msg here. - - * regex.c (re_error_msg): the ``message'' for success must be - NULL, to keep the interface to re_compile_pattern the same. - (regerror): if the msg is null, use "Success". - - * rename most test files for consistency. Change Makefile - correspondingly. - - * test.c (most routines): add casts to (unsigned char *) when we - call re_{match,search}{,_2}. 
- -Sun Dec 22 09:26:06 1991 Karl Berry (karl at hayley) - - * regex.c (re_match_2): declare string args as unsigned char * - again; don't declare non-pointer args const; declare the - pattern buffer const. - (re_match): likewise. - (re_search_2, re_search): likewise, except don't declare the - pattern const, since we make a fastmap. - * regex.h [__STDC__]: change prototypes. - - * regex.c (regex_compile): return an error code, not a string. - (re_err_list): new table to map from error codes to string. - (re_compile_pattern): return an element of re_err_list. - (regcomp): don't test all the strings. - (regerror): just use the list. - (put_in_buffer): remove this. - - * regex.c (equivalent_failure_points): remove this. - - * regex.c (re_match_2): don't copy the string arguments into - non-const pointers. We never alter the data. - - * regex.c (re_match_2): move assignment to `is_a_jump_n' out of - the main loop. Just initialize it right before we do - something with it. - - * regex.[ch] (re_match_2): don't declare the int parameters const. - -Sat Dec 21 08:52:20 1991 Karl Berry (karl at hayley) - - * regex.h (re_syntax_t): new type; declare to be unsigned - (previously we used int, but since we do bit operations on - this, unsigned is better, according to H&S). - (obscure_syntax, re_pattern_buffer): use that type. - * regex.c (re_set_syntax, regex_compile): likewise. - - * regex.h (re_pattern_buffer): new field `posix_newline'. - * regex.c (re_comp, re_compile_pattern): set to zero. - (regcomp): set to REG_NEWLINE. - * regex.h (RE_HAT_LISTS_NOT_NEWLINE): remove this (we can just - check `posix_newline' instead.) - - * regex.c (op_list_type, op_list, add_op): remove these. - (verify_and_adjust_endlines): remove this. - (pattern_offset_list_type, *pattern_offset* routines): and these. - These things all implemented the nonleading/nontrailing position - code, which was very long, had a few remaining problems, and - is no longer needed. So... 
- - * regexinc.c (STREQ): new macro to abbreviate strcmp(,)==0, for - brevity. Change various places in regex.c to use it. - - * regex{,inc}.c (enum regexpcode): change to a typedef - re_opcode_t, for brevity. - - * regex.h (re_syntax_table) [SYNTAX_TABLE]: remove this; it - should only be in regex.c, I think, since we don't define it - in this case. Maybe it should be conditional on !SYNTAX_TABLE? - - * regexinc.c (partial_compiled_pattern_printer): simplify and - distinguish the emacs/not-emacs (not)wordchar cases. - -Fri Dec 20 08:11:38 1991 Karl Berry (karl at hayley) - - * regexinc.c (regexpcode) [emacs]: only define the Emacs opcodes - if we are ifdef emacs. - - * regex.c (BUF_PUSH*): rename to PAT_PUSH*. - - * regex.c (regex_compile): in $ case, go back to essentially the - original code for deciding endline op vs. normal char. - (at_endline_op_p): new routine. - * regex.h (RE_ANCHORS_ONLY_AT_ENDS, RE_CONTEXT_INVALID_ANCHORS, - RE_REPEATED_ANCHORS_AWAY, RE_NO_ANCHOR_AT_NEWLINE): remove - these. POSIX has simplified the rules for anchors in draft - 11.2. - (RE_NEWLINE_ORDINARY): new syntax bit. - (RE_CONTEXT_INDEP_ANCHORS): change description to be compatible - with POSIX. - * regex.texinfo (Syntax Bits): remove the descriptions. - -Mon Dec 16 08:12:40 1991 Karl Berry (karl at hayley) - - * regex.c (re_match_2): in jump_past_next_alt, unconditionally - goto no_pop. The only register we were finding was one which - enclosed the whole alternative expression, not one around an - individual alternative. So we were never doing what we - thought we were doing, and this way makes (|a) against the - empty string fail. - - * regex.c (regex_compile): remove `highest_ever_regnum', and - don't restore regnum from the stack; just put it into a - temporary to put into the stop_memory. Otherwise, groups - aren't numbered consecutively. - - * regex.c (is_in_compile_stack): rename to - `group_in_compile_stack'; remove unnecessary test for the - stack being empty. 
- - * regex.c (re_match_2): in on_failure_jump, skip no_op's before - checking for the start_memory, in case we were called from - succeed_n. - -Sun Dec 15 16:20:48 1991 Karl Berry (karl at hayley) - - * regex.c (regex_compile): in duplicate case, use - highest_ever_regnum instead of regnum, since the latter is - reverted at stop_memory. - - * regex.c (re_match_2): in on_failure_jump, if the * applied to - a group, save the information for that group and all inner - groups (by making it active), even though we're not inside it - yet. - -Sat Dec 14 09:50:59 1991 Karl Berry (karl at hayley) - - * regex.c (PUSH_FAILURE_ITEM, POP_FAILURE_ITEM): new macros. - Use them instead of copying the stack manipulating a zillion - times. - - * regex.c (PUSH_FAILURE_POINT, pop_failure_point) [DEBUG]: save - and restore a unique identification value for each failure point. - - * regexinc.c (partial_compiled_pattern_printer): don't print an - extra / after duplicate commands. - - * regex.c (regex_compile): in back-reference case, allow a back - reference to register `regnum'. Otherwise, even `\(\)\1' - fails, since regnum is 1 at the back-reference. - - * regex.c (re_match_2): in fail, don't examine the pattern if we - restored to pend. - - * test_private.h: rename to private_tst.h. Change includes. - - * regex.c (extend_bits_list): compute existing size for realloc - in bytes, not blocks. - - * regex.c (re_match_2): in jump_past_next_alt, the for loop was - missing its (empty) statement. Even so, some register tests - still fail, although in a different way than in the previous change. - -Fri Dec 13 15:55:08 1991 Karl Berry (karl at hayley) - - * regex.c (re_match_2): in jump_past_next_alt, unconditionally - goto no_pop, since we weren't properly detecting if the - alternative matched something anyway. No, we need to not jump - to keep the register values correct; just change to not look at - register zero and not test RE_NO_EMPTY_ALTS (which is a - compile-time thing). 
- - * regex.c (SET_REGS_MATCHED): start the loop at 1, since we never - care about register zero until the very end. (I think.) - - * regex.c (PUSH_FAILURE_POINT, pop_failure_point): go back to - pushing and popping the active registers, instead of only doing - the registers before a group: (fooq|fo|o)*qbar against fooqbar - fails, since we restore back into the middle of group 1, yet it - isn't active, because the previous restore clobbered the active flag. - -Thu Dec 12 17:25:36 1991 Karl Berry (karl at hayley) - - * regex.c (PUSH_FAILURE_POINT): do not call - `equivalent_failure_points' after all; it causes the registers - to be ``wrong'' (according to POSIX), and an infinite loop on - `((a*)*)*' against `ab'. - - * regex.c (re_compile_fastmap): don't push `pend' on the failure - stack. - -Tue Dec 10 10:30:03 1991 Karl Berry (karl at hayley) - - * regex.c (PUSH_FAILURE_POINT): if pushing same failure point that - is on the top of the stack, fail. - (equivalent_failure_points): new routine. - - * regex.c (re_match_2): add debug statements for every opcode we - execute. - - * regex.c (regex_compile/handle_close): restore - `fixup_inner_group_count' and `regnum' from the stack. - -Mon Dec 9 13:51:15 1991 Karl Berry (karl at hayley) - - * regex.c (PUSH_FAILURE_POINT): declare `this_reg' as int, so - unsigned arithmetic doesn't happen when we don't want to save - the registers. - -Tue Dec 3 08:11:10 1991 Karl Berry (karl at hayley) - - * regex.c (extend_bits_list): divide size by bits/block. - - * regex.c (init_bits_list): remove redundant assignment to - `bits_list_ptr'. - - * regexinc.c (partial_compiled_pattern_printer): don't do *p++ - twice in the same expr. - - * regex.c (re_match_2): at on_failure_jump, use the correct - pattern positions for getting the stuff following the start_memory. - - * regex.c (struct register_info): remove the bits_list for the - inner groups; make that a separate variable.
- -Mon Dec 2 10:42:07 1991 Karl Berry (karl at hayley) - - * regex.c (PUSH_FAILURE_POINT): don't pass `failure_stack' as an - arg; change callers. - - * regex.c (PUSH_FAILURE_POINT): print items in order they are - pushed. - (pop_failure_point): likewise. - - * regex.c (main): prompt for the pattern and string. - - * regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: declare as nothing; - remove #ifdefs from around calls. - - * regex.c (extract_number, extract_number_and_incr): declare static. - - * regex.c: remove the canned main program. - * main.c: new file. - * Makefile (COMMON): add main.o. - -Tue Sep 24 06:26:51 1991 Kathy Hargreaves (kathy at fosse) - - * regex.c (re_match_2): Made `pend' and `dend' not register variables. - Only set string2 to string1 if string1 isn't null. - Send address of p, d, regstart, regend, and reg_info to - pop_failure_point. - Put in more debug statements. - - * regex.c [debug]: Added global variable. - (DEBUG_*PRINT*): Only print if `debug' is true. - (DEBUG_DOUBLE_STRING_PRINTER): Changed DEBUG_STRING_PRINTER's - name to this. - Changed some comments. - (PUSH_FAILURE_POINT): Moved and added some debugging statements. - Was saving regstart on the stack twice instead of saving both - regstart and regend; remedied this. - [NUM_REGS_ITEMS]: Changed from 3 to 4, as now save lowest and - highest active registers instead of highest used one. - [NUM_NON_REG_ITEMS]: Changed name of NUM_OTHER_ITEMS to this. - (NUM_FAILURE_ITEMS): Use active registers instead of number 0 - through highest used one. - (re_match_2): Have pop_failure_point put things in the variables. - (pop_failure_point): Have it do what the fail case in re_match_2 - did with the failure stack, instead of throwing away the stuff - popped off. re_match_2 can ignore results when it doesn't - need them. - - -Thu Sep 5 13:23:28 1991 Kathy Hargreaves (kathy at fosse) - - * regex.c (banner): Changed copyright years to be separate. 
- - * regex.c [CHAR_UNSIGNED]: Put __ at both ends of this name. - [DEBUG, debug_count, *debug_p, DEBUG_PRINT_1, DEBUG_PRINT_2, - DEBUG_COMPILED_PATTERN_PRINTER, DEBUG_STRING_PRINTER]: - defined these for debugging. - (extract_number): Added this (debuggable) routine version of - the macro EXTRACT_NUMBER. Ditto for EXTRACT_NUMBER_AND_INCR. - (re_compile_pattern): Set return_default_num_regs if the - syntax bit RE_ALLOCATE_REGISTERS is set. - [REGEX_MALLOC]: Renamed USE_ALLOCA to this. - (BUF_POP): Got rid of this, as don't ever use it. - (regex_compile): Made the type of `pattern' not be register. - If DEBUG, print the pattern to compile. - (re_match_2): If had a `$' in the pattern before a `^' then - don't record the `^' as an anchor. - Put (enum regexpcode) before references to b, as suggested - [RE_NO_BK_BRACES]: Changed RE_NO_BK_CURLY_BRACES to this. - (remove_pattern_offset): Removed this unused routine. - (PUSH_FAILURE_POINT): Changed to only save active registers. - Put in debugging statements. - (re_compile_fastmap): Made `pattern' not a register variable. - Use routine for extracting numbers instead of macro. - (re_match_2): Made `p', `mcnt' and `mcnt2' not register variables. - Added `num_regs_pushed' for debugging. - Only malloc registers if the syntax bit RE_ALLOCATE_REGISTERS is set. - Put in debug statements. - Put the macro NOTE_INNER_GROUP's code inline, as it was - only called in one place. - For debugging, extract numbers using routines instead of macros. - In case fail: only restore pushed active registers, and added - debugging statements. - (pop_failure_point): Test for underfull stack. - (group_can_match_nothing, common_op_can_match_nothing): For - debugging, extract numbers using routines instead of macros. - (regexec): Changed formal parameters to not be prototypes. - Don't initialize `regs' or `private_preg' in their declarations.
- -Tue Jul 23 18:38:36 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h [RE_CONTEXT_INDEP_OPS]: Moved the anchor stuff out of - this bit. - [RE_UNMATCHED_RIGHT_PAREN_ORD]: Defined this bit. - [RE_CONTEXT_INVALID_ANCHORS]: Defined this bit. - [RE_CONTEXT_INDEP_ANCHORS]: Defined this bit. - Added RE_CONTEXT_INDEP_ANCHORS to all syntaxes which had - RE_CONTEXT_INDEP_OPS. - Took RE_ANCHORS_ONLY_AT_ENDS out of the POSIX basic syntax. - Added RE_UNMATCHED_RIGHT_PAREN_ORD to the POSIX extended - syntax. - Took RE_REPEATED_ANCHORS_AWAY out of the POSIX extended syntax. - Defined REG_NOERROR (which will probably have to go away again). - Changed the type `off_t' to `regoff_t'. - - * regex.c: Changed some comments. - (regex_compile): Added variable `had_an_endline' to keep track - of if hit a `$' since the beginning of the pattern or the last - alternative (if any). - Changed RE_CONTEXT_INVALID_OPS and RE_CONTEXT_INDEP_OPS to - RE_CONTEXT_INVALID_ANCHORS and RE_CONTEXT_INDEP_ANCHORS where - appropriate. - Put a `no_op' in the pattern if a repeat is only zero or one - times; in this case and if it is many times (whereupon a jump - backwards is pushed instead), keep track of the operator for - verify_and_adjust_endlines. - If RE_UNMATCHED_RIGHT_PAREN is set, make an unmatched - close-group operator match `)'. - Changed all error exits to exit (1). - (remove_pattern_offset): Added this routine, but don't use it. - (verify_and_adjust_endlines): At top of routine, if initialize - routines run out of memory, return true after setting - enough_memory false. - At end of endline, et al. case, don't set *p to no_op. - Repetition operators also set the level and active groups' - match statuses, unless RE_REPEATED_ANCHORS_AWAY is set. - (get_group_match_status): Put a return in front of call to get_bit. - (re_compile_fastmap): Changed is_a_succeed_n to a boolean. - If at end of pattern, then if the failure stack isn't empty, - go back to the failure point.
- In *jump* case, only pop the stack if what's on top of it is - where we've just jumped to. - (re_search_2): Return -2 instead of val if val is -2. - (group_can_match_nothing, alternative_can_match_nothing, - common_op_can_match_nothing): Now pass in reg_info for the - `duplicate' case. - (re_match_2): Don't skip over the next alternative also if - empty alternatives aren't allowed. - In fail case, if failed to a backwards jump that's part of a - repetition loop, pop the current failure point and use the - next one. - (pop_failure_point): Check that there's as many register items - on the failure stack as the stack says there are. - (common_op_can_match_nothing): Added variables `ret' and - `reg_no' so can set reg_info for the group encountered. - Also break without doing anything if hit a no_op or the other - kinds of `endline's. - If not done already, set reg_info in start_memory case. - Put in no_pop_jump for an optimized succeed_n of zero repetitions. - In succeed_n case, if the number isn't zero, then return false. - Added `duplicate' case. - -Sat Jul 13 11:27:38 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (REG_NOERROR): Added this error code definition. - - * regex.c: Took some redundant parens out of macros. - (enum regexpcode): Added jump_past_next_alt. - Wrapped some macros in `do..while (0)'. - Changed some comments. - (regex_compile): Use `fixup_alt_jump' instead of `fixup_jump'. - Use `maybe_pop_jump' instead of `maybe_pop_failure_jump'. - Use `jump_past_next_alt' instead of `no_pop_jump' when at the - end of an alternative. - (re_match_2): Used REGEX_ALLOCATE for the registers stuff. - In stop_memory case: Add more boolean tests to see if the - group is in a loop. - Added jump_past_next_alt case, which doesn't jump over the - next alternative if the last one didn't match anything. - Unfortunately, to make this work with, e.g., `(a+?*|b)*' - against `bb', I also had to pop the alternative's failure - point, which in turn broke backtracking!
- In fail case: Detect a dummy failure point by looking at - failure_stack.avail - 2, not stack[-2]. - (pop_failure_point): Only pop if the stack isn't empty; don't - give an error if it is. (Not sure yet this is correct.) - (group_can_match_nothing): Make it return a boolean instead of int. - Make it take an argument indicating the end of where it should look. - If find a group that can match nothing, set the pointer - argument to past the group in the pattern. - Took out cases which can share with alternative_can_match_nothing - and call common_op_can_match_nothing. - Took ++ out of switch, so could call common_op_can_match_nothing. - Wrote lots more for on_failure_jump case to handle alternatives. - Main loop now doesn't look for matching stop_memory, but - rather the argument END; return true if hit the matching - stop_memory; this way can call itself for inner groups. - (alternative_can_match_nothing): Added for alternatives. - (common_op_can_match_nothing): Added for previous two routines' - common operators. - (regerror): Returns a message saying there's no error if gets - sent REG_NOERROR. - -Wed Jul 3 10:43:15 1991 Kathy Hargreaves (kathy at hayley) - - * regex.c: Removed unnecessary enclosing parens from several macros. - Put `do..while (0)' around a few. - Corrected some comments. - (INIT_FAILURE_STACK_SIZE): Deleted in favor of using - INIT_FAILURE_ALLOC. - (INIT_FAILURE_STACK, DOUBLE_FAILURE_STACK, PUSH_PATTERN_OP, - PUSH_FAILURE_POINT): Made routines of the same name (but with all - lowercase letters) into these macros, so could use `alloca' - when USE_ALLOCA is defined. The reason is stated below for - bits lists. Deleted analogous routines. - (re_compile_fastmap): Added variable void *destination for - PUSH_PATTERN_OP. - (re_match_2): Added variable void *destination for REGEX_REALLOCATE. - Used the failure stack macros in place of the routines. 
- Detected a dummy failure point by inspecting the failure stack's - (avail - 2)th element, not failure_stack.stack[-2]. This bug - arose when used the failure stack macros instead of the routines. - - * regex.c [USE_ALLOCA]: Put this conditional around previous - alloca stuff and defined these to work differently depending - on whether or not USE_ALLOCA is defined: - (REGEX_ALLOCATE): Uses either `alloca' or `malloc'. - (REGEX_REALLOCATE): Uses either `alloca' or `realloc'. - (INIT_BITS_LIST, EXTEND_BITS_LIST, SET_BIT_TO_VALUE): Defined - macro versions of routines with the same name (only with all - lowercase letters) so could use `alloca' in re_match_2. This - is to prevent core leaks when C-g is used in Emacs and to make - things faster and avoid storage fragmentation. These things - have to be macros because the results of `alloca' go away with - the routine by which it's called. - (BITS_BLOCK_SIZE, BITS_BLOCK, BITS_MASK): Moved to above the - above-mentioned macros instead of before the routines defined - below regex_compile. - (set_bit_to_value): Compacted some code. - (reg_info_type): Changed inner_groups field to be bits_list_type - so could be arbitrarily long and thus handle arbitrary nesting. - (NOTE_INNER_GROUP): Put `do...while (0)' around it so could - use as a statement. - Changed code to use bits lists. - Added variable void *destination for REGEX_REALLOCATE (whose call - is several levels in). - Changed variable name of `this_bit' to `this_reg'. - (FREE_VARIABLES): Only define and use if USE_ALLOCA is defined. - (re_match_2): Use REGEX_ALLOCATE instead of malloc. - Instead of setting INNER_GROUPS of reg_info to zero, have to - use INIT_BITS_LIST and return -2 (and free variables if - USE_ALLOCA isn't defined) if it fails. - -Fri Jun 28 13:45:07 1991 Karl Berry (karl at hayley) - - * regex.c (re_match_2): set value of `dend' when we restore `d'. - - * regex.c: remove declaration of alloca.
- - * regex.c (MISSING_ISGRAPH): rename to `ISGRAPH_MISSING'. - - * regex.h [_POSIX_SOURCE]: remove these conditionals; always - define POSIX stuff. - * regex.c (_POSIX_SOURCE): change conditionals to use `POSIX' - instead. - -Sat Jun 1 16:56:50 1991 Kathy Hargreaves (kathy at hayley) - - * regex.*: Changed RE_CONTEXTUAL_* to RE_CONTEXT_*, - RE_TIGHT_VBAR to RE_TIGHT_ALT, RE_NEWLINE_OR to - RE_NEWLINE_ALT, and RE_DOT_MATCHES_NEWLINE to RE_DOT_NEWLINE. - -Wed May 29 09:24:11 1991 Karl Berry (karl at hayley) - - * regex.texinfo (POSIX Pattern Buffers): cross-reference the - correct node name (Match-beginning-of-line, not ..._line). - (Syntax Bits): put @code around all syntax bits. - -Sat May 18 16:29:58 1991 Karl Berry (karl at hayley) - - * regex.c (global): add casts to keep broken compilers from - complaining about malloc and realloc calls. - - * regex.c (isgraph) [MISSING_ISGRAPH]: change test to this, - instead of `#ifndef isgraph', since broken compilers can't - have both a macro and a symbol by the same name. - - * regex.c (re_comp, re_exec) [_POSIX_SOURCE]: do not define. - (regcomp, regfree, regexec, regerror) [_POSIX_SOURCE && !emacs]: - only define in this case. - -Mon May 6 17:37:04 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (re_search, re_search_2): Changed BUFFER to not be const. - - * regex.c (re_compile_pattern): `^' is in a leading position if - it precedes a newline. - (various routines): Added or changed header comments. - (double_pattern_offsets_list): Changed name from - `extend_pattern_offsets_list'. - (adjust_pattern_offsets_list): Changed return value from - unsigned to void. - (verify_and_adjust_endlines): Now returns `true' and `false' - instead of 1 and 0. - `$' is in a leading position if it follows a newline. - (set_bit_to_value, get_bit_value): Exit with error if POSITION < 0 - so now calling routines don't have to. 
- (init_failure_stack, inspect_failure_stack_top, - pop_failure_stack_top, push_pattern_op, double_failure_stack): - Now return value unsigned instead of boolean. - (re_search, re_search_2): Changed BUFP to not be const. - (re_search_2): Added variable const `private_bufp' to send to - re_match_2. - (push_failure_point): Made return value unsigned instead of boolean. - -Sat May 4 15:32:22 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (re_compile_fastmap): Added extern for this. - Changed some comments. - - * regex.c (re_compile_pattern): In case handle_bar: put invalid - pattern test before levels matching stuff. - Changed some comments. - Added optimizing test for detecting an empty alternative that - ends with a trailing '$' at the end of the pattern. - (re_compile_fastmap): Moved failure_stack stuff to before this - so could use it. Made its stack dynamic. - Made it return an int so that it could return -2 if its stack - couldn't be allocated. - Added to header comment (about the return values). - (init_failure_stack): Wrote so both re_match_2 and - re_compile_fastmap could use similar stacks. - (double_failure_stack): Added for above reasons. - (push_pattern_op): Wrote for re_compile_fastmap. - (re_search_2): Now return -2 if re_compile_fastmap does. - (re_match_2): Made regstart and regend type failure_stack_element*. - (push_failure_point): Made pattern_place and string_place type - failure_stack_element*. - Call double_failure_stack now. - Return true instead of 1. - -Wed May 1 12:57:21 1991 Kathy Hargreaves (kathy at hayley) - - * regex.c (remove_intervening_anchors): Avoid erroneously making - ops into no_op's by making them no_op only when they're beglines. - (verify_and_adjust_endlines): Don't make '$' a normal character - if it's before a newline. - Look for the endline op in *p, not p[1]. - (failure_stack_element): Added this declaration. - (failure_stack_type): Added this declaration.
- (INIT_FAILURE_STACK_SIZE, FAILURE_STACK_EMPTY, - FAILURE_STACK_PTR_EMPTY, REMAINING_AVAIL_SLOTS): Added for - failure stack. - (FAILURE_ITEM_SIZE, PUSH_FAILURE_POINT): Deleted. - (FREE_VARIABLES): Now free failure_stack.stack instead of stackb. - (re_match_2): deleted variables `initial_stack', `stackb', - `stackp', and `stacke' and added `failure_stack' to replace them. - Replaced calls to PUSH_FAILURE_POINT with those to - push_failure_point. - (push_failure_point): Added for re_match_2. - (pop_failure_point): Rewrote to use a failure_stack_type of stack. - (can_match_nothing): Moved definition to below re_match_2. - (bcmp_translate): Moved definition to below re_match_2. - -Mon Apr 29 14:20:54 1991 Kathy Hargreaves (kathy at hayley) - - * regex.c (enum regexpcode): Added codes endline_before_newline - and repeated_endline_before_newline so could detect these - types of endlines in the intermediate stages of a compiled - pattern. - (INIT_FAILURE_ALLOC): Renamed NFAILURES to this and set it to 5. - (BUF_PUSH): Put `do {...} while 0' around this. - (BUF_PUSH_2): Defined this to cut down on expansion of EXTEND_BUFFER. - (regex_compile): Changed some comments. - Now push endline_before_newline if find a `$' before a newline - in the pattern. - If a `$' might turn into an ordinary character, set laststart - to point to it. - In '^' case, if syntax bit RE_TIGHT_VBAR is set, then for `^' - to be in a leading position, it must be first in the pattern. - Don't have to check in one of the else clauses that it's not set. - If RE_CONTEXTUAL_INDEP_OPS isn't set but RE_ANCHORS_ONLY_AT_ENDS - is, make '^' a normal character if it isn't first in the pattern. - Can only detect at the end if a '$' after an alternation op is a - trailing one, so can't immediately detect empty alternatives - if a '$' follows a vbar. - Added a picture of the ``success jumps'' in alternatives. - Have to set bufp->used before calling verify_and_adjust_endlines. 
- Also do it before returning all error strings. - (remove_intervening_anchors): Now replaces the anchor with - repeated_endline_before_newline if it's an endline_before_newline. - (verify_and_adjust_endlines): Deleted SYNTAX parameter (could - use bufp's) and added GROUP_FORWARD_MATCH_STATUS so could - detect back references referring to empty groups. - Added variable `bend' to point past the end of the pattern buffer. - Added variable `previous_p' so wouldn't have to reinspect the - pattern buffer to see what op we just looked at. - Added endline_before_newline and repeated_endline_before_newline - cases. - When checking if in a trailing position, added case where '$' - has to be at the pattern's end if either of the syntax bits - RE_ANCHORS_ONLY_AT_ENDS or RE_TIGHT_VBAR are set. - Since `endline' can have the intermediate form `endline_in_repeat', - have to change it to `endline' if RE_REPEATED_ANCHORS_AWAY - isn't set. - Now disallow empty alternatives with trailing endlines in them - if RE_NO_EMPTY_ALTS is set. - Now don't make '$' an ordinary character if it precedes a newline. - Don't make it an ordinary character if it's before a newline. - Back references now affect the level matching something only if - they refer to nonempty groups. - (can_match_nothing): Now increment p1 in the switch, which - changes many of the cases, but makes the code more like what - it was derived from. - Adjust the return statement to reflect above. - (struct register_info): Made `can_match_nothing' field an int - instead of a bit so could have -1 in it if never set. - (MAX_FAILURE_ITEMS): Changed name from MAX_NUM_FAILURE_ITEMS. - (FAILURE_ITEM_SIZE): Defined how much space a failure item uses. - (PUSH_FAILURE_POINT): Changed variable `last_used_reg's name - to `highest_used_reg'. - Added variable `num_stack_items' and changed `len's name to - `stack_length'. - Test failure stack limit in terms of number of items in it, not - in terms of its length. 
rms' fix tested length against number - of items, which was a misunderstanding. - Use `realloc' instead of `alloca' to extend the failure stack. - Use shifts instead of multiplying by 2. - (FREE_VARIABLES): Free `stackb' instead of `initial_stack', as it - may have been reallocated. - (re_match_2): When mallocing `initial_stack', now multiply - the number of items wanted (what was there before) by - FAILURE_ITEM_SIZE. - (pop_failure_point): Need this procedure form of the macro of - the same name for debugging, so left it in and deleted the - macro. - (recomp): Don't free the pattern buffer's translate field. - -Mon Apr 15 09:47:47 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_DUP_MAX): Moved to outside of #ifdef _POSIX_SOURCE. - * regex.c (#include <sys/types.h>): Removed #ifdef _POSIX_SOURCE - condition. - (malloc, realloc): Made return type void* #ifdef __STDC__. - (enum regexpcode): Added endline_in_repeat for the compiler's - use; this never ends up on the final compiled pattern. - (INIT_PATTERN_OFFSETS_LIST_SIZE): Initial size for - pattern_offsets_list_type. - (pattern_offset_type): Type for pattern offsets. - (pattern_offsets_list_type): Type for keeping a list of - pattern offsets. - (anchor_list_type): Changed to above type. - (PATTERN_OFFSETS_LIST_PTR_FULL): Tests if a pattern offsets - list is full. - (ANCHOR_LIST_PTR_FULL): Changed to above. - (BIT_BLOCK_SIZE): Changed to BITS_BLOCK_SIZE and moved to - above bits list routines below regex_compile. - (op_list_type): Defined to be pattern_offsets_list_type. - (compile_stack_type): Changed offsets to be - pattern_offset_type instead of unsigned. - (pointer): Changed the name of all structure fields from this - to `avail'. - (COMPILE_STACK_FULL): Changed so the stack is full if `avail' - is equal to `size' instead of `size' - 1. - (GET_BUFFER_SPACE): Changed `>=' to `>' in the while statement. 
- (regex_compile): Added variable `enough_memory' so could check - that routine that verifies '$' positions could return an - allocation error. - (group_count): Deleted this variable, as `regnum' already does - this work. - (op_list): Added this variable to keep track of operations - needed for verifying '$' positions. - (anchor_list): Now initialize using routine - `init_pattern_offsets_list'. - Consolidated the three bits_list initializations. - In case '$': Instead of trying to go past constructs which can - follow '$', merely detect the special case where it has to be - at the pattern's end, fix up any fixup jumps if necessary, - record the anchor if necessary and add an `endline' (and - possibly two `no-op's) to the pattern; will call a routine at - the end to verify if it's in a valid position or not. - (init_pattern_offsets_list): Added to initialize pattern - offsets lists. - (extend_anchor_list): Renamed this extend_pattern_offsets_list - and renamed parameters and internal variables appropriately. - (add_pattern_offset): Added this routine which both - record_anchor_position and add_op call. - (adjust_pattern_offsets_list): Add this routine to adjust by - some increment all the pattern offsets a list of such after a - given position. - (record_anchor_position): Now send in offset instead of - calculating it and just call add_pattern_offset. - (adjust_anchor_list): Replaced by above routine. - (remove_intervening_anchors): If the anchor is an `endline' - then replace it with `endline_in_repeat' instead of `no_op'. - (add_op): Added this routine to call in regex_compile - wherever push something relevant to verifying '$' positions. - (verify_and_adjust_endlines): Added routine to (1) verify that - '$'s in a pattern buffer (represented by `endline') were in - valid positions and (2) whether or not they were anchors. - (BITS_BLOCK_SIZE): Renamed BIT_BLOCK_SIZE and moved to right - above bits list routines. 
- (BITS_BLOCK): Defines which array element of a bits list the - bit corresponding to a given position is in. - (BITS_MASK): Has a 1 where the bit (in a bit list array element) - for a given position is. - -Mon Apr 1 12:09:06 1991 Kathy Hargreaves (kathy at hayley) - - * regex.c (BIT_BLOCK_SIZE): Defined this for using with - bits_list_type, abstracted from level_list_type so could use - for more things than just the level match status. - (regex_compile): Renamed `level_list' variable to - `level_match_status'. - Added variable `group_match_status' of type bits_list_type. - Kept track of whether or not for all groups any of them - matched other than the empty string, so detect if a back - reference in front of a '^' made it nonleading or not. - Do this by setting a match status bit for all active groups - whenever leave a group that matches other than the empty string. - Could detect which groups are active by going through the - stack each time, but or-ing a bits list of active groups with - a bits list of group match status is faster, so make a bits - list of active groups instead. - Have to check that '^' isn't in a leading position before - going to normal_char. - Whenever set level match status of the current level, also set - the match status of all active groups. - Increase the group count and make that group active whenever - open a group. - When close a group, only set the next level down if the - current level matches other than the empty string, and make - the current group inactive. - At a back reference, only set a level's match status if the - group to which the back reference refers matches other than - the empty string. - (init_bits_list): Added to initialize a bits list. - (get_level_value): Deleted this. (Made into - get_level_match_status.) - (extend_bits_list): Added to extend a bits list. (Made this - from deleted routine `extend_level_list'.) - (get_bit): Added to get a bit value from a bits list. 
(Made - this from deleted routine `get_level_value'.) - (set_bit_to_value): Added to set a bit in a bits list. (Made - this from deleted routine `set_level_value'.) - (get_level_match_status): Added this to get the match status - of a given level. (Made from get_level_value.) - (set_this_level, set_next_lower_level): Made all routines - which set bits extend the bits list if necessary, thus they - now return an unsigned value to indicate whether or not the - reallocation failed. - (increase_level): No longer extends the level list. - (make_group_active): Added to mark as active a given group in - an active groups list. - (make_group_inactive): Added to mark as inactive a given group - in an active groups list. - (set_match_status_of_active_groups): Added to set the match - status of all currently active groups. - (get_group_match_status): Added to get a given group's match status. - (no_levels_match_anything): Removed the parameter LEVEL. - (PUSH_FAILURE_POINT): Added rms' bug fix and changed RE_NREGS - to num_internal_regs. - -Sun Mar 31 09:04:30 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_ANCHORS_ONLY_AT_ENDS): Added syntax so could - constrain '^' and '$' to only be anchors if at the beginning - and end of the pattern. - (RE_SYNTAX_POSIX_BASIC): Added the above bit. - - * regex.c (enum regexcode): Changed `unused' to `no_op'. - (this_and_lower_levels_match_nothing): Deleted forward reference. - (regex_compile): case '^': if the syntax bit RE_ANCHORS_ONLY_AT_ENDS - is set, then '^' is only an anchor if at the beginning of the - pattern; only record anchor position if the syntax bit - RE_REPEATED_ANCHORS_AWAY is set; the '^' is a normal char if - the syntax bit RE_ANCHORS_ONLY_AT_END is set and we're not at - the beginning of the pattern (and neither RE_CONTEXTUAL_INDEP_OPS - nor RE_CONTEXTUAL_INDEP_OPS syntax bits are set). - Only adjust the anchor list if the syntax bit - RE_REPEATED_ANCHORS_AWAY is set. 
- - * regex.c (level_list_type): Use to detect when '^' is - in a leading position. - (regex_compile): Added level_list_type level_list variable in - which we keep track of whether or not a grouping level (in its - current or most recent incarnation) matches anything besides the - empty string. Set the bit for the i-th level when detect it - should match something other than the empty string and the bit - for the (i-1)-th level when leave the i-th group. Clear all - bits for the i-th and higher levels if none of 0--(i - 1)-th's - bits are set when encounter an alternation operator on that - level. If no levels are set when hit a '^', then it is in a - leading position. We keep track of which level we're at by - increasing a variable current_level whenever we encounter an - open-group operator and decreasing it whenever we encounter a - close-group operator. - Have to adjust the anchor list contents whenever insert - something ahead of them (such as on_failure_jump's) in the - pattern. - (adjust_anchor_list): Adjusts the offsets in an anchor list by - a given increment starting at a given start position. - (get_level_value): Returns the bit setting of a given level. - (set_level_value): Sets the bit of a given level to a given value. - (set_this_level): Sets (to 1) the bit of a given level. - (set_next_lower_level): Sets (to 1) the bit of (LEVEL - 1) for a - given LEVEL. - (clear_this_and_higher_levels): Clears the bits for a given - level and any higher levels. - (extend_level_list): Adds sizeof(unsigned) more bits to a level list. - (increase_level): Increases by 1 the value of a given level variable. - (decrease_level): Decreases by 1 the value of a given level variable. - (lower_levels_match_nothing): Checks if any levels lower than - the given one match anything. - (no_levels_match_anything): Checks if any levels match anything. - (re_match_2): At case wordbeg: before looking at d-1, check that - we're not at the string's beginning. 
- At case wordend: Added some illuminating parentheses. - -Mon Mar 25 13:58:51 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NO_ANCHOR_AT_NEWLINE): Changed syntax bit name - from RE_ANCHOR_NOT_NEWLINE because an anchor never matches the - newline itself, just the empty string either before or after it. - (RE_REPEATED_ANCHORS_AWAY): Added this syntax bit for ignoring - anchors inside groups which are operated on by repetition - operators. - (RE_DOT_MATCHES_NEWLINE): Added this bit so the match-any-character - operator could match a newline when it's set. - (RE_SYNTAX_POSIX_BASIC): Set RE_DOT_MATCHES_NEWLINE in this. - (RE_SYNTAX_POSIX_EXTENDED): Set RE_DOT_MATCHES_NEWLINE and - RE_REPEATED_ANCHORS_AWAY in this. - (regerror): Changed prototypes to new POSIX spec. - - * regex.c (anchor_list_type): Added so could null out anchors inside - repeated groups. - (ANCHOR_LIST_PTR_FULL): Added for above type. - (compile_stack_element): Changed name from stack_element. - (compile_stack_type): Changed name from compile_stack. - (INIT_COMPILE_STACK_SIZE): Changed name from INIT_STACK_SIZE. - (COMPILE_STACK_EMPTY): Changed name from STACK_EMPTY. - (COMPILE_STACK_FULL): Changed name from STACK_FULL. - (regex_compile): Changed SYNTAX parameter to non-const. - Changed variable name `stack' to `compile_stack'. - If syntax bit RE_REPEATED_ANCHORS_AWAY is set, then naively put - anchors in a list when encounter them and then set them to - `unused' when detect they are within a group operated on by a - repetition operator. Need something more sophisticated than - this, as they should only get set to `unused' if they are in - positions where they would be anchors. Also need a better way to - detect contextually invalid anchors. - Changed some comments. - (is_in_compile_stack): Changed name from `is_in_stack'. - (extend_anchor_list): Added to do anchor stuff. - (record_anchor_position): Added to do anchor stuff. - (remove_intervening_anchors): Added to do anchor stuff. 
- (re_match_2): Now match a newline with the match-any-character - operator if RE_DOT_MATCHES_NEWLINE is set. - Compacted some code. - (regcomp): Added new POSIX newline information to the header - comment. - If REG_NEWLINE cflag is set, then now unset RE_DOT_MATCHES_NEWLINE - in syntax. - (put_in_buffer): Added to do new POSIX regerror spec. Called - by regerror. - (regerror): Changed to take a pattern buffer, error buffer and - its size, and return type `size_t', the size of the full error - message, and the first ERRBUF_SIZE - 1 characters of the full - error message in the error buffer. - -Wed Feb 27 16:38:33 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (#include <sys/types.h>): Removed this as new POSIX - standard has the user include it. - (RE_SYNTAX_POSIX_BASIC and RE_SYNTAX_POSIX_EXTENDED): Removed - RE_HAT_LISTS_NOT_NEWLINE as new POSIX standard has the cflag - REG_NEWLINE now set this. Similarly, added syntax bit - RE_ANCHOR_NOT_NEWLINE as this is now unset by REG_NEWLINE. - (RE_SYNTAX_POSIX_BASIC): Removed syntax bit - RE_NO_CONSECUTIVE_REPEATS as POSIX now allows them. - - * regex.c (#include <sys/types.h>): Added this as new POSIX - standard has the user include it instead of us putting it in - regex.h. - (extern char *re_syntax_table): Made into an extern so the - user could allocate it. - (DO_RANGE): If don't find a range end, now goto invalid_range_end - instead of unmatched_left_bracket. - (regex_compile): Made variable SYNTAX non-const.???? - Reformatted some code. - (re_compile_fastmap): Moved is_a_succeed_n's declaration to - inner braces. - Compacted some code. - (SET_NEWLINE_FLAG): Removed and put inline. - (regcomp): Made variable `syntax' non-const so can unset - RE_ANCHOR_NOT_NEWLINE syntax bit if cflag RE_NEWLINE is set. - If cflag RE_NEWLINE is set, set the RE_HAT_LISTS_NOT_NEWLINE - syntax bit and unset RE_ANCHOR_NOT_NEWLINE one of `syntax'. 
- -Wed Feb 20 16:33:38 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NO_CONSECUTIVE_REPEATS): Changed name from - RE_NO_CONSEC_REPEATS. - (REG_ENESTING): Deleted this POSIX return value, as the stack - is now unbounded. - (struct re_pattern_buffer): Changed some comments. - (re_compile_pattern): Changed a comment. - Deleted check on stack upper bound and corresponding error. - Now when there's no interval contents and it's the end of the - pattern, go to unmatched_left_curly_brace instead of end_of_pattern. - Removed nesting_too_deep error, as the stack is now unbounded. - (regcomp): Removed REG_ENESTING case, as the stack is now unbounded. - (regerror): Removed REG_ENESTING case, as the stack is now unbounded. - - * regex.c (MAX_STACK_SIZE): Deleted because don't need upper - bound on array indexed with an unsigned number. - -Sun Feb 17 15:50:24 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h: Changed and added some comments. - - * regex.c (init_syntax_once): Made `_' a word character. - (re_compile_pattern): Added a comment. - (re_match_2): Redid header comment. - (regexec): With header comment about PMATCH, corrected and - removed details found in regex.h, adding a reference. - -Fri Feb 15 09:21:31 1991 Kathy Hargreaves (kathy at hayley) - - * regex.c (DO_RANGE): Removed argument parentheses. - Now get untranslated range start and end characters and set - list bits for the translated (if at all) versions of them and - all characters between them. - (re_match_2): Now use regs->num_regs instead of num_regs_wanted - wherever possible. - (regcomp): Now build case-fold translate table using isupper - and tolower facilities so will work on foreign language characters. - -Sat Feb 9 16:40:03 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_HAT_LISTS_NOT_NEWLINE): Changed syntax bit name - from RE_LISTS_NOT_NEWLINE as it only affects nonmatching lists. 
- Changed all references to the match-beginning-of-string - operator to match-beginning-of-line operator, as this is what - it does. - (RE_NO_CONSEC_REPEATS): Added this syntax bit. - (RE_SYNTAX_POSIX_BASIC): Added above bit to this. - (REG_PREMATURE_END): Changed name to REG_EEND. - (REG_EXCESS_NESTING): Changed name to REG_ENESTING. - (REG_TOO_BIG): Changed name to REG_ESIZE. - (REG_INVALID_PREV_RE): Deleted this return POSIX value. - Added and changed some comments. - - * regex.c (re_compile_pattern): Now sets the pattern buffer's - `return_default_num_regs' field. - (typedef struct stack_element, stack_type, INIT_STACK_SIZE, - MAX_STACK_SIZE, STACK_EMPTY, STACK_FULL): Added for regex_compile. - (INIT_BUF_SIZE): Changed value from 28 to 32. - (BUF_PUSH): Changed name from BUFPUSH. - (MAX_BUF_SIZE): Added so could use in many places. - (IS_CHAR_CLASS_STRING): Replaced is_char_class with this. - (regex_compile): Added a stack which could grow dynamically - and which has struct elements. - Go back to initializing `zero_times_ok' and `many_time_ok' to - 0 and |=ing them inside the loop. - Now disallow consecutive repetition operators if the syntax - bit RE_NO_CONSEC_REPEATS is set. - Now detect trailing backslash when the compiler is expecting a - `?' or a `+'. - Changed calls to GET_BUFFER_SPACE which asked for 6 to ask for - 3, as that's all they needed. - Now check for trailing backslash inside lists. - Now disallow an empty alternative right before an end-of-line - operator. - Now get buffer space before leaving space for a fixup jump. - Now check if at pattern end when at open-interval operator. - Added some comments. - Now check if non-interval repetition operators follow an - interval one if the syntax bit RE_NO_CONSEC_REPEATS is set. - Now only check if what precedes an interval repetition - operator isn't a regular expression which matches one - character if the syntax bit RE_NO_CONSEC_REPEATS is set. 
- Now return "Unmatched [ or [^" instead of "Unmatched [". - (is_in_stack): Added to check if a given register number is in - the stack. - (re_match_2): If initial variable allocations fail, return -2, - instead of -1. - Now set reg's `num_regs' field when allocating regs. - Now before allocating them, free regs->start and end if they - aren't NULL and return -2 if either allocation fails. - Now use regs->num_regs instead of num_regs_wanted to control - regs loops. - Now increment past the newline when matching it with an - end-of-line operator. - (recomp): Added to the header comment. - Now return REG_ESUBREG if regex_compile returns "Unmatched [ - or [^" instead of doing so if it returns "Unmatched [". - Now return REG_BADRPT if in addition to returning "Missing - preceding regular expression", regex_compile returns "Invalid - preceding regular expression". - Now return new return value names (see regex.h changes). - (regexec): Added to header comment. - Initialize regs structure. - Now match whole string. - Now always free regs.start and regs.end instead of just when - the string matched. - (regerror): Now return "Regex error: Unmatched [ or [^.\n" - instead of "Regex error: Unmatched [.\n". - Now return "Regex error: Preceding regular expression either - missing or not simple.\n" instead of "Regex error: Missing - preceding regular expression.\n". - Removed REG_INVALID_PREV_RE case (it got subsumed into the - REG_BADRPT case). - -Thu Jan 17 09:52:35 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h: Changed a comment. - - * regex.c: Changed and added large header comments. - (re_compile_pattern): Now if detect that `laststart' for an - interval points to a byte code for a regular expression which - matches more than one character, make it an internal error. - (regerror): Return error message, don't print it. - -Tue Jan 15 15:32:49 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (regcomp return codes): Added GNU ones. - Updated some comments. 
- - * regex.c (DO_RANGE): Changed `obscure_syntax' to `syntax'. - (regex_compile): Added `following_left_brace' to keep track of - where pseudo interval following a valid interval starts. - Changed some instances that returned "Invalid regular - expression" to instead return error strings coinciding with - POSIX error codes. - Changed some comments. - Now consider only things between `[:' and `:]' to be possible - character class names. - Now a character class expression can't end a pattern; at - least a `]' must close the list. - Now if the syntax bit RE_NO_BK_CURLY_BRACES is set, then a - valid interval must be followed by yet another to get an error - for preceding an interval (in this case, the second one) with - a regular expression that matches more than one character. - Now if what follows a valid interval begins with an open - interval operator but doesn't begin a valid interval, then set - following_left_bracket to it, put it in C and go to - normal_char label. - Added some comments. - Return "Invalid character class name" instead of "Invalid - character class". - (regerror): Return messages for all POSIX error codes except - REG_ECOLLATE and REG_NEWLINE, along with all GNU error codes. - Added `break's after all cases. - (main): Call re_set_syntax instead of setting `obscure_syntax' - directly. - -Sat Jan 12 13:37:59 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (Copyright): Updated date. - (#include <sys/types.h>): Include unconditionally. - (RE_CANNOT_MATCH_NEWLINE): Deleted this syntax bit. - (RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_POSIX_EXTENDED): Removed - setting the RE_ANCHOR_NOT_NEWLINE syntax bit from these. - Changed and added some comments. - (struct re_pattern_buffer): Changed some flags from chars to bits. - Added field `syntax'; holds which syntax pattern was compiled with. - Added bit flag `return_default_num_regs'. - (externs for GNU and Berkeley UNIX routines): Added `const's to - parameter types to be compatible with POSIX. 
- (#define const): Added to support old C compilers. - - * regex.c (Copyright): Updated date. - (enum regexpcode): Deleted `newline'. - (regex_compile): Renamed re_compile_pattern to this, added a - syntax parameter so it can set the pattern buffer's `syntax' - field. - Made `pattern', and `size' `const's so could pass to POSIX - interface routines; also made `const' whatever interval - variables had to be to make this work. - Changed references to `obscure_syntax' to new parameter `syntax'. - Deleted putting `newline' in buffer when see `\n'. - Consider invalid character classes which have nothing wrong - except the character class name; if so, return character-class error. - (is_char_class): Added routine for regex_compile. - (re_compile_pattern): added a new one which calls - regex_compile with `obscure_syntax' as the actual parameter - for the formal `syntax'. - Gave this the old routine's header comments. - Made `pattern', and `size' `const's so could use POSIX interface - routine parameters. - (re_search, re_search_2, re_match, re_match_2): Changed - `pbufp' to `bufp'. - (re_search_2, re_match_2): Changed `mstop' to `stop'. - (re_search, re_search_2): Made all parameters except `regs' - `const's so could use POSIX interface routines parameters. - (re_search_2): Added private copies of `const' parameters so - could change their values. - (re_match_2): Made all parameters except `regs' `const's so - could use POSIX interface routines parameters. - Changed `size1' and `size2' parameters to `size1_arg' and - `size2_arg' and so could change; added local `size1' and - `size2' and set to these. - Added some comments. - Deleted `newline' case. - `begline' can also possibly match if `d' contains a newline; - if it does, we have to increment d to point past the newline. - Replaced references to `obscure_syntax' with `bufp->syntax'. - (re_comp, re_exec): Made parameter `s' a `const' so could use POSIX - interface routines parameters. 
- Now call regex_compile, passing `obscure_syntax' via the - `syntax' parameter. - (re_exec): Made local `len' a `const' so could pass to re_search. - (regcomp): Added header comment. - Added local `syntax' to set and pass to regex_compile rather - than setting global `obscure_syntax' and passing it. - Call regex_compile with its `syntax' parameter rather than - re_compile_pattern. - Return REG_ECTYPE if character-class error. - (regexec): Don't initialize `regs' to anything. - Made `private_preg' a nonpointer so could set to what the - constant `preg' points. - Initialize `private_preg's `return_default_num_regs' field to - zero because want to return `nmatch' registers, not however - many there are subexpressions in the pattern. - Also test if `nmatch' > 0 to see if should pass re_match `regs'. - -Tue Jan 8 15:57:17 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (struct re_pattern_buffer): Reworded comment. - - * regex.c (EXTEND_BUFFER): Also reset beg_interval. - (re_search_2): Return val if val = -2. - (NUM_REG_ITEMS): Listed items in comment. - (NUM_OTHER_ITEMS): Defined this for using in > 1 definition. - (MAX_NUM_FAILURE_ITEMS): Replaced `+ 2' with NUM_OTHER_ITEMS. - (NUM_FAILURE_ITEMS): As with definition above and added to - comment. - (PUSH_FAILURE_POINT): Replaced `* 2's with `<< 1's. - (re_match_2): Test with equality with 1 to see pbufp->bol and - pbufp->eol are set. - -Fri Jan 4 15:07:22 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (struct re_pattern_buffer): Reordered some fields. - Updated some comments. - Added not_bol and not_eol fields. - (extern regcomp, regexec, regerror): Added return types. - (extern regfree): Added `extern'. - - * regex.c (min): Deleted unused macro. - (re_match_2): Compacted some code. - Removed call to macro `min' from `for' loop. - Fixed so unused registers get filled with -1's. - Fail if the pattern buffer's `not_bol' field is set and - encounter a `begline'. 
- Fail if the pattern buffer's `not_eol' field is set and - encounter an `endline'. - Deleted redundant check for empty stack in fail case. - Don't free pattern buffer's components in re_comp. - (regexec): Initialize variable regs. - Added `private_preg' pattern buffer so could set `not_bol' and - `not_eol' fields and hand to re_match. - Deleted naive attempt to detect anchors. - Set private pattern buffer's `not_bol' and `not_eol' fields - according to eflags value. - `nmatch' must also be > 0 for us to bother allocating - registers to send to re_match and filling pmatch - with their results after the call to re_match. - Send private pattern buffer instead of argument to re_match. - If use the registers, always free them and then set them to NULL. - (regerror): Added this Posix routine. - (regfree): Added this Posix routine. - -Tue Jan 1 15:02:45 1991 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NREGS): Deleted this definition, as now the user - can choose how many registers to have. - (REG_NOTBOL, REG_NOTEOL): Defined these Posix eflag bits. - (REG_NOMATCH, REG_BADPAT, REG_ECOLLATE, REG_ECTYPE, - REG_EESCAPE, REG_ESUBREG, REG_EBRACK, REG_EPAREN, REG_EBRACE, - REG_BADBR, REG_ERANGE, REG_ESPACE, REG_BADRPT, REG_ENEWLINE): - Defined these return values for Posix's regcomp and regexec. - Updated some comments. - (struct re_pattern_buffer): Now typedef this as regex_t - instead of the other way around. - (struct re_registers): Added num_regs field. Made start and - end fields pointers to char instead of fixed size arrays. - (regmatch_t): Added this Posix register type. - (regcomp, regexec, regerror, regfree): Added externs for these - Posix routines. - - * regex.c (enum boolean): Typedefed this. - (re_pattern_buffer): Reformatted some comments. - (re_compile_pattern): Updated some comments. 
- Always push start_memory and its attendant number whenever - encounter a group, not just when its number is less than the - previous maximum number of registers; same for stop_memory. - Get 4 bytes of buffer space instead of 2 when pushing a - set_number_at. - (can_match_nothing): Added this to elaborate on and replace - code in re_match_2. - (reg_info_type): Made can_match_nothing field a bit instead of int. - (MIN): Added for re_match_2. - (re_match_2 macros): Changed all `for' loops which used - RE_NREGS to now use num_internal_regs as upper bounds. - (MAX_NUM_FAILURE_ITEMS): Use num_internal_regs instead of RE_NREGS. - (POP_FAILURE_POINT): Added check for empty stack. - (FREE_VARIABLES): Added this to free (and set to NULL) - variables allocated in re_match_2. - (re_match_2): Rearranged parameters to be in order. - Added variables num_regs_wanted (how many registers the user wants) - and num_internal_regs (how many groups there are). - Allocated initial_stack, regstart, regend, old_regstart, - old_regend, reginfo, best_regstart, and best_regend---all - which used to be fixed size arrays. Free them all and return - -1 if any fail. - Free above variables if starting position pos isn't valid. - Changed all `for' loops which used RE_NREGS to now use - num_internal_regs as upper bounds---except for the loops which - fill regs; then use num_regs_wanted. - Allocate regs if the user has passed it and wants more than 0 - registers filled. - Set regs->start[i] and regs->end[i] to -1 if either - regstart[i] or regend[i] equals -1, not just the first. - Free allocated variables before returning. - Updated some comments. - (regcomp): Return REG_ESPACE, REG_BADPAT, REG_EPAREN when - appropriate. - Free translate array. - (regexec): Added this Posix interface routine. - -Mon Dec 24 14:21:13 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: If _POSIX_SOURCE is defined then #include <sys/types.h>. - Added syntax bit RE_CANNOT_MATCH_NEWLINE. 
- Defined Posix cflags: REG_EXTENDED, REG_NEWLINE, REG_ICASE, and - REG_NOSUB. - Added fields re_nsub and no_sub to struct re_pattern_buffer. - Typedefed regex_t to be `struct re_pattern_buffer'. - - * regex.c (CHAR_SET_SIZE): Defined this to be 256 and replaced - incidences of this value with this constant. - (re_compile_pattern): Added switch case for `\n' and put - `newline' into the pattern buffer when encounter this. - Increment the pattern_buffer's `re_nsub' field whenever open a - group. - (re_match_2): Match a newline with `newline'---provided the - syntax bit RE_CANNOT_MATCH_NEWLINE isn't set. - (regcomp): Added this Posix interface routine. - (enum test_type): Added interface_test tag. - (main): Added Posix interface test. - -Tue Dec 18 12:58:12 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (struct re_pattern_buffer): reformatted so would fit - in texinfo documentation. - -Thu Nov 29 15:49:16 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NO_EMPTY_ALTS): Added this bit. - (RE_SYNTAX_POSIX_EXTENDED): Added above bit. - - * regex.c (re_compile_pattern): Disallow empty alternatives only - when RE_NO_EMPTY_ALTS is set, not when RE_CONTEXTUAL_INVALID_OPS is. - Changed RE_NO_BK_CURLY_BRACES to RE_NO_BK_PARENS when testing - for empty groups at label handle_open. - At label handle_bar: disallow empty alternatives if RE_NO_EMPTY_ALTS - is set. - Rewrote some comments. - - (re_compile_fastmap): cleaned up code. - - (re_search_2): Rewrote comment. - - (struct register_info): Added field `inner_groups'; it records - which groups are inside of the current one. - Added field can_match_nothing; it's set if the current group - can match nothing. - Added field ever_match_something; it's set if current group - ever matched something. - - (INNER_GROUPS): Added macro to access inner_groups field of - struct register_info. - - (CAN_MATCH_NOTHING): Added macro to access can_match_nothing - field of struct register_info. 
- - (EVER_MATCHED_SOMETHING): Added macro to access - ever_matched_something field of struct register_info. - - (NOTE_INNER_GROUP): Defined macro to record that a given group - is inside of all currently active groups. - - (re_match_2): Added variables *p1 and mcnt2 (multipurpose). - Added old_regstart and old_regend arrays to hold previous - register values if they need be restored. - Initialize added fields and variables. - case start_memory: Find out if the group can match nothing. - Save previous register values in old_restart and old_regend. - Record that current group is inside of all currently active - groups. - If the group is inside a loop and it ever matched anything, - restore its registers to values before the last failed match. - Restore the registers for the inner groups, too. - case duplicate: Can back reference to a group that never - matched if it can match nothing. - -Thu Nov 29 11:12:54 1990 Karl Berry (karl at hayley) - - * regex.c (bcopy, ...): define these if either _POSIX_SOURCE or - STDC_HEADERS is defined; same for including <stdlib.h>. - -Sat Oct 6 16:04:55 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (struct re_pattern_buffer): Changed field comments. - - * regex.c (re_compile_pattern): Allow a `$' to precede an - alternation operator (`|' or `\|'). - Disallow `^' and/or `$' in empty groups if the syntax bit - RE_NO_EMPTY_GROUPS is set. - Wait until have parsed a valid `\{...\}' interval expression - before testing RE_CONTEXTUAL_INVALID_OPS to see if it's - invalidated by that. - Don't use RE_NO_BK_CURLY_BRACES to test whether or not a validly - parsed interval expression is invalid if it has no preceding re; - rather, use RE_CONTEXTUAL_INVALID_OPS. 
- If an interval parses, but there is no preceding regular - expression, yet the syntax bit RE_CONTEXTUAL_INDEP_OPS is set, - then that interval can match the empty regular expression; if - the bit isn't set, then the characters in the interval - expression are parsed as themselves (sans the backslashes). - In unfetch_interval case: Moved PATFETCH to above the test for - RE_NO_BK_CURLY_BRACES being set, which would force a goto - normal_backslash; the code at both normal_backsl and normal_char - expect a character in `c.' - -Sun Sep 30 11:13:48 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: Changed some comments to use the terms used in the - documentation. - (RE_CONTEXTUAL_INDEP_OPS): Changed name from `RE_CONTEXT_INDEP_OPS'. - (RE_LISTS_NOT_NEWLINE): Changed name from `RE_HAT_NOT_NEWLINE.' - (RE_ANCHOR_NOT_NEWLINE): Added this syntax bit. - (RE_NO_EMPTY_GROUPS): Added this syntax bit. - (RE_NO_HYPHEN_RANGE_END): Deleted this syntax bit. - (RE_SYNTAX_...): Reformatted. - (RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_EXTENDED): Added syntax bits - RE_ANCHOR_NOT_NEWLINE and RE_NO_EMPTY_GROUPS, and deleted - RE_NO_HYPHEN_RANGE_END. - (RE_SYNTAX_POSIX_EXTENDED): Added syntax bit RE_DOT_NOT_NULL. - - * regex.c (bcopy, bcmp, bzero): Define if _POSIX_SOURCE is defined. - (_POSIX_SOURCE): ifdef this, #include <stdlib.h> - (#ifdef emacs): Changed comment of the #endif for the its #else - clause to be `not emacs', not `emacs.' - (no_pop_jump): Changed name from `jump'. - (pop_failure_jump): Changed name from `finalize_jump.' - (maybe_pop_failure_jump): Changed name from `maybe_finalize_jump'. - (no_pop_jump_n): Changed name from `jump_n.' - (EXTEND_BUFFER): Use shift instead of multiplication to double - buf->allocated. - (DO_RANGE, recompile_pattern): Added macro to set the list bits - for a range. - (re_compile_pattern): Fixed grammar problems in some comments. - Checked that RE_NO_BK_VBAR is set to make `$' valid before a `|' - and not set to make it valid before a `\|'. 
- Checked that RE_NO_BK_PARENS is set to make `$' valid before a ')' - and not set to make it valid before a `\)'. - Disallow ranges starting with `-', unless the range is the - first item in a list, rather than disallowing ranges which end - with `-'. - Disallow empty groups if the syntax bit RE_NO_EMPTY_GROUPS is set. - Disallow nothing preceding `{' and `\{' if they represent the - open-interval operator and RE_CONTEXTUAL_INVALID_OPS is set. - (register_info_type): typedef-ed this using `struct register_info.' - (SET_REGS_MATCHED): Compacted the code. - (re_match_2): Made it fail if back reference a group which we've - never matched. - Made `^' not match a newline if the syntax bit - RE_ANCHOR_NOT_NEWLINE is set. - (really_fail): Added this label so could force a final fail that - would not try to use the failure stack to recover. - -Sat Aug 25 14:23:01 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS. - (global): Rewrote comments and rebroke some syntax #define lines. - - * regex.c (isgraph): Added definition for sequents. - (global): Now refer to character set lists as ``lists.'' - Rewrote comments containing ``\('' or ``\)'' to now refer to - ``groups.'' - (RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS. - - (re_compile_pattern): Expanded header comment. - -Sun Jul 15 14:50:25 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_CONTEX_INDEP_OPS): the comment's sense got turned - around when we changed how it read; changed it to be correct. - -Sat Jul 14 16:38:06 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NO_EMPTY_BK_REF): changed name to - RE_NO_MISSING_BK_REF, as this describes it better. - - * regex.c (re_compile_pattern): changed RE_NO_EMPTY_BK_REF - to RE_NO_MISSING_BK_REF, as above. 
- -Thu Jul 12 11:45:05 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h (RE_NO_EMPTY_BRACKETS): removed this syntax bit, as - bracket expressions should *never* be empty regardless of the - syntax. Removes this bit from RE_SYNTAX_POSIX_BASIC and - RE_SYNTAX_POSIX_EXTENDED. - - * regex.c (SET_LIST_BIT): in the comment, now refer to character - sets as (non)matching sets, as bracket expressions can now match - other things in addition to characters. - (re_compile_pattern): refer to groups as such instead of `\(...\)' - or somesuch, because groups can now be enclosed in either plain - parens or backslashed ones, depending on the syntax. - In the '[' case, added a boolean just_had_a_char_class to detect - whether or not a character class begins a range (which is invalid). - Restore way of breaking out of a bracket expression to original way. - Add way to detect a range if the last thing in a bracket - expression was a character class. - Took out check for c != ']' at the end of a character class in - the else clause, as it had already been checked in the if part - that also checked the validity of the string. - Set or clear just_had_a_char_class as appropriate. - Added some comments. Changed references to character sets to - ``(non)matching lists.'' - -Sun Jul 1 12:11:29 1990 Karl Berry (karl at hayley) - - * regex.h (BYTEWIDTH): moved back to regex.c. - - * regex.h (re_compile_fastmap): removed declaration; this - shouldn't be advertised. - -Mon May 28 15:27:53 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c (ifndef Sword): Made comments more specific. - (global): include <stdio.h> so can write fatal messages on - standard error. Replaced calls to assert with fprintfs to - stderr and exit (1)'s. - (PREFETCH): Reformatted to make more readable. - (AT_STRINGS_BEG): Defined to test if we're at the beginning of - the virtual concatenation of string1 and string2. 
- (AT_STRINGS_END): Defined to test if at the end of the virtual - concatenation of string1 and string2. - (AT_WORD_BOUNDARY): Defined to test if are at a word boundary. - (IS_A_LETTER(d)): Defined to test if the contents of the pointer D - is a letter. - (re_match_2): Rewrote the wordbound, notwordbound, wordbeg, wordend, - begbuf, and endbuf cases in terms of the above four new macros. - Called SET_REGS_MATCHED in the matchsyntax, matchnotsyntax, - wordchar, and notwordchar cases. - -Mon May 14 14:49:13 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c (re_search_2): Fixed RANGE to not ever take STARTPOS - outside of virtual concatenation of STRING1 and STRING2. - Updated header comment as to this. - (re_match_2): Clarified comment about MSTOP in header. - -Sat May 12 15:39:00 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c (re_search_2): Checked for out-of-range STARTPOS. - Added comments. - When searching backwards, not only get the character with which - to compare to the fastmap from string2 if the starting position - >= size1, but also if size1 is zero; this is so won't get a - segmentation fault if string1 is null. - Reformatted code at label advance. - -Thu Apr 12 20:26:21 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: Added #pragma once and #ifdef...endif __REGEXP_LIBRARY. - (RE_EXACTN_VALUE): Added for search.c to use. - Reworded some comments. - - regex.c: Punctuated some comments correctly. - (NULL): Removed this. - (RE_EXACTN_VALUE): Added for search.c to use. - (<ctype.h>): Moved this include to top of file. - (<assert.h>): Added this include. - (struct regexpcode): Assigned 0 to unused and 1 to exactn - because of RE_EXACTN_VALUE. - Added comment. - (various macros): Lined up backslashes near end of line. - (insert_jump): Cleaned up the header comment. - (re_search): Corrected the header comment. - (re_search_2): Cleaned up and completed the header comment. - (re_max_failures): Updated comment. 
- (struct register_info): Constructed as bits so as to save space - on the stack when pushing register information. - (IS_ACTIVE): Macro for struct register_info. - (MATCHED_SOMETHING): Macro for struct register_info. - (NUM_REG_ITEMS): How many register information items for each - register we have to push on the stack at each failure. - (MAX_NUM_FAILURE_ITEMS): If push all the registers on failure, - this is how many items we push on the stack. - (PUSH_FAILURE_POINT): Now pushes whether or not the register is - currently active, and whether or not it matched something. - Checks that there's enough space allocated to accomodate all the - items we currently want to push. (Before, a test for an empty - stack sufficed because we always pushed and popped the same - number of items). - Replaced ``2'' with MAX_NUM_FAILURE_POINTS when ``2'' refers - to how many things get pushed on the stack each time. - When copy the stack into the newly allocated storage, now only copy - the area in use. - Clarified comment. - (POP_FAILURE_POINT): Defined to use in places where put number - of registers on the stack into a variable before using it to - decrement the stack, so as to not confuse the compiler. - (IS_IN_FIRST_STRING): Defined to check if a pointer points into - the first string. - (SET_REGS_MATCHED): Changed to use the struct register_info - bits; also set the matched-something bit to false if the - register isn't currently active. (This is a redundant setting.) - (re_match_2): Cleaned up and completed the header comment. - Updated the failure stack comment. - Replaced the ``2'' with MAX_NUM_FAILURE_ITEMS in the static - allocation of initial_stack, because now more than two (now up - to MAX_FAILURE_ITEMS) items get pushed on the failure stack each - time. - Ditto for stackb. 
- Trashed restart_seg1, regend_seg1, best_regstart_seg1, and - best_regend_seg1 because they could have erroneous information - in them, such as when matching ``a'' (in string1) and ``ab'' (in - string2) with ``(a)*ab''; before using IS_IN_FIRST_STRING to see - whether or not the register starts or ends in string1, - regstart[1] pointed past the end of string1, yet regstart_seg1 - was 0! - Added variable reg_info of type struct register_info to keep - track of currently active registers and whether or not they - currently match anything. - Commented best_regs_set. - Trashed reg_active and reg_matched_something and put the - information they held into reg_info; saves space on the stack. - Replaced NULL with '\000'. - In begline case, compacted the code. - Used assert to exit if had an internal error. - In begbuf case, because now force the string we're working on - into string2 if there aren't two strings, now allow d == string2 - if there is no string1 (and the check for that is size1 == 0!); - also now succeeds if there aren't any strings at all. - (main, ifdef canned): Put test type into a variable so could - change it while debugging. - -Sat Mar 24 12:24:13 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c (GET_UNSIGNED_NUMBER): Deleted references to num_fetches. - (re_compile_pattern): Deleted num_fetches because could keep - track of the number of fetches done by saving a pointer into the - pattern. - Added variable beg_interval to be used as a pointer, as above. - Assert that beg_interval points to something when it's used as above. - Initialize succeed_n's to lower_bound because re_compile_fastmap - needs to know it. - (re_compile_fastmap): Deleted unnecessary variable is_a_jump_n. - Added comment. - (re_match_2): Put number of registers on the stack into a - variable before using it to decrement the stack, so as to not - confuse the compiler. - Updated comments. - Used error routine instead of printf and exit. 
- In exactn case, restored longer code from ``original'' regex.c - which doesn't test translate inside a loop. - - * regex.h: Moved #define NULL and the enum regexpcode definition - and to regex.c. Changed some comments. - - regex.c (global): Updated comments about compiling and for the - re_compile_pattern jump routines. - Added #define NULL and the enum regexpcode definition (from - regex.h). - (enum regexpcode): Added set_number_at to reset the n's of - succeed_n's and jump_n's. - (re_set_syntax): Updated its comment. - (re_compile_pattern): Moved its heading comment to after its macros. - Moved its include statement to the top of the file. - Commented or added to comments of its macros. - In start_memory case: Push laststart value before adding - start_memory and its register number to the buffer, as they - might not get added. - Added code to put a set_number_at before each succeed_n and one - after each jump_n; rewrote code in what seemed a more - straightforward manner to put all these things in the pattern so - the succeed_n's would correctly jump to the set_number_at's of - the matching jump_n's, and so the jump_n's would correctly jump - to after the set_number_at's of the matching succeed_n's. - Initialize succeed_n n's to -1. - (insert_op_2): Added this to insert an operation followed by - two integers. - (re_compile_fastmap): Added set_number_at case. - (re_match_2): Moved heading comment to after macros. - Added mention of REGS to heading comment. - No longer turn a succeed_n with n = 0 into an on_failure_jump, - because n needs to be reset each time through a loop. - Check to see if a succeed_n's n is set by its set_number_at. - Added set_number_at case. - Updated some comments. - (main): Added another main to run posix tests, which is compiled - ifdef both test and canned. (Old main is still compiled ifdef - test only). 
- -Tue Mar 19 09:22:55 1990 Kathy Hargreaves (kathy at hayley) - - * regex.[hc]: Change all instances of the word ``legal'' to - ``valid'' and all instances of ``illegal'' to ``invalid.'' - -Sun Mar 4 12:11:31 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: Added syntax bit RE_NO_EMPTY_RANGES which is set if - an ending range point has to collate higher or equal to the - starting range point. - Added syntax bit RE_NO_HYPHEN_RANGE_END which is set if a hyphen - can't be an ending range point. - Set to two above bits in RE_SYNTAX_POSIX_BASIC and - RE_SYNTAX_POSIX_EXTENDED. - - regex.c: (re_compile_pattern): Don't allow empty ranges if the - RE_NO_EMPTY_RANGES syntax bit is set. - Don't let a hyphen be a range end if the RE_NO_HYPHEN_RANGE_END - syntax bit is set. - (ESTACK_PUSH_2): renamed this PUSH_FAILURE_POINT and made it - push all the used registers on the stack, as well as the number - of the highest numbered register used, and (as before) the two - failure points. - (re_match_2): Fixed up comments. - Added arrays best_regstart[], best_regstart_seg1[], best_regend[], - and best_regend_seg1[] to keep track of the best match so far - whenever reach the end of the pattern but not the end of the - string, and there are still failure points on the stack with - which to backtrack; if so, do the saving and force a fail. - If reach the end of the pattern but not the end of the string, - but there are no more failure points to try, restore the best - match so far, set the registers and return. - Compacted some code. - In stop_memory case, if the subexpression we've just left is in - a loop, push onto the stack the loop's on_failure_jump failure - point along with the current pointer into the string (d). - In finalize_jump case, in addition to popping the failure - points, pop the saved registers. - In the fail case, restore the registers, as well as the failure - points. 
- -Sun Feb 18 15:08:10 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c: (global): Defined a macro GET_BUFFER_SPACE which - makes sure you have a specified number of buffer bytes - allocated. - Redefined the macro BUFPUSH to use this. - Added comments. - - (re_compile_pattern): Call GET_BUFFER_SPACE before storing or - inserting any jumps. - - (re_match_2): Set d to string1 + pos and dend to end_match_1 - only if string1 isn't null. - Force exit from a loop if it's around empty parentheses. - In stop_memory case, if found some jumps, increment p2 before - extracting address to which to jump. Also, don't need to know - how many more times can jump_n. - In begline case, d must equal string1 or string2, in that order, - only if they are not null. - In maybe_finalize_jump case, skip over start_memorys' and - stop_memorys' register numbers, too. - -Thu Feb 15 15:53:55 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c (BUFPUSH): off by one goof in deciding whether to - EXTEND_BUFFER. - -Wed Jan 24 17:07:46 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: Moved definition of NULL to here. - Got rid of ``In other words...'' comment. - Added to some comments. - - regex.c: (re_compile_pattern): Tried to bulletproof some code, - i.e., checked if backward references (e.g., p[-1]) were within - the range of pattern. - - (re_compile_fastmap): Fixed a bug in succeed_n part where was - getting the amount to jump instead of how many times to jump. - - (re_search_2): Changed the name of the variable ``total'' to - ``total_size.'' - Condensed some code. - - (re_match_2): Moved the comment about duplicate from above the - start_memory case to above duplicate case. - - (global): Rewrote some comments. - Added commandline arguments to testing. - -Wed Jan 17 11:47:27 1990 Kathy Hargreaves (kathy at hayley) - - * regex.c: (global): Defined a macro STORE_NUMBER which stores a - number into two contiguous bytes. 
Also defined STORE_NUMBER_AND_INCR - which does the same thing and then increments the pointer to the - storage place to point after the number. - Defined a macro EXTRACT_NUMBER which extracts a number from two - continguous bytes. Also defined EXTRACT_NUMBER_AND_INCR which - does the same thing and then increments the pointer to the - source to point to after where the number was. - -Tue Jan 16 12:09:19 1990 Kathy Hargreaves (kathy at hayley) - - * regex.h: Incorporated rms' changes. - Defined RE_NO_BK_REFS syntax bit which is set when want to - interpret back reference patterns as literals. - Defined RE_NO_EMPTY_BRACKETS syntax bit which is set when want - empty bracket expressions to be illegal. - Defined RE_CONTEXTUAL_ILLEGAL_OPS syntax bit which is set when want - it to be illegal for *, +, ? and { to be first in an re or come - immediately after a | or a (, and for ^ not to appear in a - nonleading position and $ in a nontrailing position (outside of - bracket expressions, that is). - Defined RE_LIMITED_OPS syntax bit which is set when want +, ? - and | to always be literals instead of ops. - Fixed up the Posix syntax. - Changed the syntax bit comments from saying, e.g., ``0 means...'' - to ``If this bit is set, it means...''. - Changed the syntax bit defines to use shifts instead of integers. - - * regex.c: (global): Incorporated rms' changes. - - (re_compile_pattern): Incorporated rms' changes - Made it illegal for a $ to appear anywhere but inside a bracket - expression or at the end of an re when RE_CONTEXTUAL_ILLEGAL_OPS - is set. Made the same hold for $ except it has to be at the - beginning of an re instead of the end. - Made the re "[]" illegal if RE_NO_EMPTY_BRACKETS is set. - Made it illegal for | to be first or last in an re, or immediately - follow another | or a (. - Added and embellished some comments. - Allowed \{ to be interpreted as a literal if RE_NO_BK_CURLY_BRACES - is set. 
- Made it illegal for *, +, ?, and { to appear first in an re, or - immediately follow a | or a ( when RE_CONTEXTUAL_ILLEGAL_OPS is set. - Made back references interpreted as literals if RE_NO_BK_REFS is set. - Made recursive intervals either illegal (if RE_NO_BK_CURLY_BRACES - isn't set) or interpreted as literals (if is set), if RE_INTERVALS - is set. - Made it treat +, ? and | as literals if RE_LIMITED_OPS is set. - Cleaned up some code. - -Thu Dec 21 15:31:32 1989 Kathy Hargreaves (kathy at hayley) - - * regex.c: (global): Moved RE_DUP_MAX to regex.h and made it - equal 2^15 - 1 instead of 1000. - Defined NULL to be zero. - Moved the definition of BYTEWIDTH to regex.h. - Made the global variable obscure_syntax nonstatic so the tests in - another file could use it. - - (re_compile_pattern): Defined a maximum length (CHAR_CLASS_MAX_LENGTH) - for character class strings (i.e., what's between the [: and the - :]'s). - Defined a macro SET_LIST_BIT(c) which sets the bit for C in a - character set list. - Took out comments that EXTEND_BUFFER clobbers C. - Made the string "^" match itself, if not RE_CONTEXT_IND_OPS. - Added character classes to bracket expressions. - Change the laststart pointer saved with the start of each - subexpression to point to start_memory instead of after the - following register number. This is because the subexpression - might be in a loop. - Added comments and compacted some code. - Made intervals only work if preceded by an re matching a single - character or a subexpression. - Made back references to nonexistent subexpressions illegal if - using POSIX syntax. - Made intervals work on the last preceding character of a - concatenation of characters, e.g., ab{0,} matches abbb, not abab. - Moved macro PREFETCH to outside the routine. - - (re_compile_fastmap): Added succeed_n to work analogously to - on_failure_jump if n is zero and jump_n to work analogously to - the other backward jumps. 
- - (re_match_2): Defined macro SET_REGS_MATCHED to set which - current subexpressions had matches within them. - Changed some comments. - Added reg_active and reg_matched_something arrays to keep track - of in which subexpressions currently have matched something. - Defined MATCHING_IN_FIRST_STRING and replaced ``dend == end_match_1'' - with it to make code easier to understand. - Fixed so can apply * and intervals to arbitrarily nested - subexpressions. (Lots of previous bugs here.) - Changed so won't match a newline if syntax bit RE_DOT_NOT_NULL is set. - Made the upcase array nonstatic so the testing file could use it also. - - (main.c): Moved the tests out to another file. - - (tests.c): Moved all the testing stuff here. - -Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley) - - * regex.c: (re_compile_pattern): Defined RE_DUP_MAX, the maximum - number of times an interval can match a pattern. - Added macro GET_UNSIGNED_NUMBER (used to get below): - Added variables lower_bound and upper_bound for upper and lower - bounds of intervals. - Added variable num_fetches so intervals could do backtracking. - Added code to handle '{' and "\{" and intervals. - Added to comments. - - (store_jump_n): (Added) Stores a jump with a number following the - relative address (for intervals). - - (insert_jump_n): (Added) Inserts a jump_n. - - (re_match_2): Defined a macro ESTACK_PUSH_2 for the error stack; - it checks for overflow and reallocates if necessary. - - * regex.h: Added bits (RE_INTERVALS and RE_NO_BK_CURLY_BRACES) - to obscure syntax to indicate whether or not - a syntax handles intervals and recognizes either \{ and - \} or { and } as operators. Also added two syntaxes - RE_SYNTAX_POSIX_BASIC and RE_POSIX_EXTENDED and two command codes - to the enumeration regexpcode; they are succeed_n and jump_n. 
- -Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley) - - * regex.c: (re_compile_pattern): Defined INIT_BUFF_SIZE to get rid - of repeated constants in code. Tested with value 1. - Renamed PATPUSH as BUFPUSH, since it pushes things onto the - buffer, not the pattern. Also made this macro extend the buffer - if it's full (so could do the following): - Took out code at top of loop that checks to see if buffer is going - to be full after 10 additions (and reallocates if necessary). - - (insert_jump): Rearranged declaration lines so comments would read - better. - - (re_match_2): Compacted exactn code and added more comments. - - (main): Defined macros TEST_MATCH and MATCH_SELF to do - testing; took out loop so could use these instead. - -Tue Oct 24 20:57:18 1989 Kathy Hargreaves (kathy at hayley) - - * regex.c (re_set_syntax): Gave argument `syntax' a type. - (store_jump, insert_jump): made them void functions. - -Local Variables: -mode: indented-text -left-margin: 8 -version-control: never -End: