| Commit message (Collapse) | Author | Age | Files | Lines |
| |
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
which is implemented as a splay tree of extents. This structure can also
store character classes (à la wctype(3)), but that ability is not yet
fully used.
- Mappings between characters are stored in a new "cmap" structure, which
is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
particular class; instead, next() determines them on the fly using
nextwctype(3).
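
To illustrate why extents scale better than an array indexed by character value, here is a minimal sketch. It is not the code from this commit: it swaps the splay tree for a sorted array of extents with binary search, and the names `struct extent`, `struct extent_set`, and `extent_set_contains` are hypothetical. The point it shows is that memory grows with the number of contiguous runs, not with the size of the character set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical names; these are not the identifiers used in the tr source. */
struct extent {
	wchar_t lo;	/* first character in the run (inclusive) */
	wchar_t hi;	/* last character in the run (inclusive) */
};

struct extent_set {
	const struct extent *ext;	/* sorted by lo, non-overlapping */
	size_t n;			/* number of extents */
};

/*
 * Binary search for an extent that contains ch.
 * A sorted array stands in for the splay tree described above.
 */
static bool
extent_set_contains(const struct extent_set *s, wchar_t ch)
{
	size_t left = 0, right = s->n;

	while (left < right) {
		size_t mid = left + (right - left) / 2;
		const struct extent *e = &s->ext[mid];

		assert(e->lo <= e->hi);
		if (ch < e->lo)
			right = mid;
		else if (ch > e->hi)
			left = mid + 1;
		else
			return (true);
	}
	return (false);
}
```

With this layout, a set such as "digits and ASCII lowercase letters" needs two extents regardless of how large wchar_t is, whereas an array indexed by character value needs one slot per possible character.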
| |
makes one malloc() unnecessary, removes two bzero() calls, and makes the code
more readable.
"Bright ideas come only _after_ commits."
| |
The 1st one is relatively minor: according to our own manpage, the upper and
lower classes must be sorted, but currently they are not.
The 2nd one is serious:
tr '[:lower:]' '[:upper:]'
(and vice versa) currently works only if the upper and lower classes
have exactly the same number of elements. When that is not true, as in
many ISO8859-x locales, which have more lowercase than uppercase letters,
tr may do nasty things.
See this page
http://www.opengroup.org/onlinepubs/007908799/xcu/tr.html
for a detailed description of the desired tr behaviour in such cases.
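
As a rough illustration of the problem, the sketch below maps each character through its own case pair instead of pairing the N-th lowercase letter with the N-th uppercase letter. This is one way to get the behaviour the referenced page describes, not necessarily what this commit does, and the helper name `map_lower_to_upper` is hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: translate one character for
 * tr '[:lower:]' '[:upper:]' without relying on class positions.
 */
static wint_t
map_lower_to_upper(wint_t ch)
{
	/* Only members of the lower class are candidates for mapping. */
	if (!iswlower(ch))
		return (ch);

	/*
	 * towupper() returns its argument unchanged when the locale
	 * defines no uppercase counterpart, so such letters pass through.
	 */
	return (towupper(ch));
}
```

Because each character is looked up individually, a lowercase letter without an uppercase counterpart is left alone rather than shifting every later pairing by one position.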
| |
of a previous commit implementing equivalence classes.