[lug] grep question
Jeffrey Haemer
jeffrey.haemer at gmail.com
Tue Jun 12 09:21:35 MDT 2007
Collins,
> OK, read your explanation (LANG=C), but I have LANG=en_US.UTF-8, and I
> don't have the problem.
Lucky you!
Posix made the, oh, interesting choice of specifying an I18N --
"internationalization," because there are 18 letters in the word between 'i'
and 'n' -- mechanism without specifying a behavior.
With the exception of LANG=C and its synonyms, you could, I think,
legitimately create a character set with Tibetan characters only, and call
it en_US.UTF-8. It would be left to the marketplace to select against your
distro.
LANG=C permits portability, LANG=something_else permits flexibility. (I
have the nagging suspicion that the Linux Standards Base may now specify
some of these other values, but I confess I don't know.)
In my experience, the behavior Chip complained about -- the opposite
behavior from yours -- appears to be a typical, default, Linux desktop
behavior. ASCII is wired into my firmware, so this breaks lots of scripts I
write. In self-defense, my (GNU) makefiles often say "export LANG := C"
early on.
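
The same self-defense works in a plain shell script. This is only a
sketch, not what my makefiles literally do: it sets LC_ALL rather than
LANG (my choice here, because LC_ALL takes precedence over LANG and over
the individual LC_* variables), and the four letters fed to sort are made
up for illustration.

```shell
#!/bin/sh
# Pin the locale for everything this script runs.
# LC_ALL overrides LANG and the individual LC_* settings.
LC_ALL=C
export LC_ALL

# In the C locale, sort collates by byte value, so the ASCII
# capitals come out ahead of the lower-case letters.
sorted=$(printf '%s\n' b A a B | sort | tr -d '\n')
echo "$sorted"
```

With the locale pinned, this prints ABab on any box; without it, the
order is whatever the local collating sequence says.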
> I find it hard to believe that any distro
> would have a LANG= setting that would include lower case characters in
> the range A-Z.
Don't start me on things that I find hard to believe. :-)
For arbitrary character sets, '[A-Z]' means "Any character that falls
between 'A' and 'Z,' inclusive, in the collating sequence."
For guaranteed lower case, Posix offers the
impossible-to-remember-much-less-type expression "[[:lower:]]".
I am, as Dave Barry would say, not making this up. And yes, you can use
this with your grep: try " grep '[[:lower:]]' "
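
Here is a small sketch of the two portable ways out: force the C locale
so that '[A-Z]' means the 26 ASCII capitals, or ask for the character
class by name. The sample words are made up for illustration, and the
variable names are just mine.

```shell
# In the C locale, [A-Z] covers exactly the ASCII capitals,
# so only the line containing a capital letter matches.
c_range=$(printf '%s\n' apple Banana | LC_ALL=C grep '[A-Z]')

# [[:lower:]] selects lower-case letters by class rather than by
# position in the collating sequence.
lower=$(printf '%s\n' apple BANANA | LC_ALL=C grep '[[:lower:]]')

echo "C-locale [A-Z] matched: $c_range"
echo "[[:lower:]] matched: $lower"
```

The first grep prints Banana and the second prints apple. Drop the
LC_ALL=C from the first one and the answer depends on your distro's
collating sequence, which is the whole problem.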
> OTOH, a distro with /usr/bin/grep is already quite non-standard.
Unusual, yes. Non-standard, no. I have run on Posix-conforming boxes in
which /usr/bin was just a symlink to /bin. The first time I saw it, I found
it, well, hard to believe.
Look at it this way: You can't teach an old dog new tricks. Me, I try to
learn something new every day as a sentinel; when I finally succeed, I'll
know I'm finally entering my second childhood.
--
Jeffrey Haemer <jeffrey.haemer at gmail.com>
720-837-8908 [cell]
http://goyishekop.blogspot.com