GREP(1V)                 USER COMMANDS                   GREP(1V)

NAME
     grep, egrep, fgrep - search a file for a string  or  regular
     expression

SYNOPSIS
     grep [ -bchilnsvw ] [ -e expression ] [ filename... ]

     egrep [ -bchilnsv ] [ -e expression ] [ -f filename ]
          [ expression ] [ filename... ]

     fgrep [ -bchilnsvx ] [ -e string ] [ -f filename ]
          [ string ] [ filename... ]

SYSTEM V SYNOPSIS
     /usr/5bin/grep [ -bchilnsvw ] [ -e expression ]
          [ filename... ]

AVAILABILITY
     The System V version of this command is available with the
     System V software installation option.  Refer to Installing
     SunOS 4.1 for information on how to install optional
     software.

DESCRIPTION
     Commands of the grep family search the input filenames (the
     standard input by default) for lines matching a pattern.
     Normally, each line found is copied to the standard output.
     grep patterns are limited regular expressions in the style
     of ed(1).  egrep patterns are full regular expressions
     including alternation.  fgrep patterns are fixed strings;
     no regular expression metacharacters are supported.

     In general, egrep is the fastest of these programs.

     Take care when using the characters `$', `*', `[', `^', `|',
     `(', `)', and `\' in the expression, as these characters are
     also meaningful to the shell.  It is safest to enclose the
     entire expression argument in single quotes '...'.
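
     As a minimal sketch of the quoting advice above (the sample
     lines are made up for illustration), single quotes pass the
     pattern to grep untouched, and the backslash then makes `$'
     an ordinary character for grep itself:

```shell
# Hypothetical sample input: three lines, one of which reads "cost: $5".
input=$(printf 'cost: $5\ncost: 5\nprice $5 each\n')

# Single quotes stop the shell from expanding $5 or treating \ specially;
# grep then sees \$ and matches a literal dollar sign.
quoted=$(printf '%s\n' "$input" | grep 'cost: \$5')

printf '%s\n' "$quoted"
```

     Without the quotes the shell would replace $5 with its fifth
     positional parameter before grep ever saw the pattern.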

     When any of the grep utilities is applied to more  than  one
     input file, the name of the file is displayed preceding each
     line  which  matches  the  pattern.   The  filename  is  not
     displayed  when processing a single file, so if you actually
     want the filename to appear, use /dev/null as a second  file
     in the list.
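
     The /dev/null technique can be sketched as follows; the
     scratch directory and the file name notes.txt are
     assumptions made for the example:

```shell
# Work in a scratch directory so the example leaves nothing behind.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/notes.txt"

# One input file: only the matching line is displayed.
single=$(cd "$dir" && grep 'beta' notes.txt)

# Two input files (the second never matches): the filename prefix appears.
named=$(cd "$dir" && grep 'beta' notes.txt /dev/null)

printf '%s\n%s\n' "$single" "$named"
rm -rf "$dir"
```

     The first line displayed is the bare match; the second
     carries the notes.txt: prefix.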

OPTIONS
     -b   Precede each line by the block number on which  it  was
          found.  This is sometimes useful in locating disk block
          numbers by context.

     -c   Display a count of matching lines rather than
          displaying the lines which match.

Sun Release 4.1   Last change: 30 November 1988

     -h   Do not display filenames.

     -i   Ignore the case of letters in making comparisons - that
          is, upper and lower case are considered identical.

     -l   List only the names of files with matching lines (once)
          separated by NEWLINE characters.

     -n   Precede each line by its relative line  number  in  the
          file.

     -s   Work silently, that is, display  nothing  except  error
          messages.   This  is  useful  for  checking  the  error
          status.

     -v   Invert the search to display only lines that do not
          match.

     -w   Search for the expression as a word as if surrounded by
          \< and \>.  This applies to grep only.

     -x   Display only those lines which match exactly - that is,
          only lines which match in their entirety.  This applies
          to fgrep only.

     -e expression
          Same as a simple expression argument, but useful when
          the expression begins with a `-'.

     -e string
          For fgrep the argument is a literal character string.

     -f filename
          Take the regular expression (egrep) or a list of
          strings separated by NEWLINE (fgrep) from filename.
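
     Several of the options above can be tried on a small file.
     This is only a sketch: the file fruit.txt and its contents
     are invented for the example, and -b and -s are left out
     because their behavior varies between versions.

```shell
dir=$(mktemp -d)
printf 'Apple\nbanana\napple pie\n-dash\n' > "$dir/fruit.txt"

count=$(grep -c 'apple' "$dir/fruit.txt")         # -c: count matching lines
numbered=$(grep -n 'apple' "$dir/fruit.txt")      # -n: prefix the line number
nocase=$(grep -i -c 'apple' "$dir/fruit.txt")     # -i: ignore case
inverted=$(grep -v -c 'apple' "$dir/fruit.txt")   # -v: lines that do not match
dash=$(grep -e '-dash' "$dir/fruit.txt")          # -e: pattern starting with `-'
files=$(cd "$dir" && grep -l 'banana' fruit.txt)  # -l: file names only

printf '%s\n' "$count" "$numbered" "$nocase" "$inverted" "$dash" "$files"
rm -rf "$dir"
```

     Without -e, the pattern -dash would be taken as a string of
     options rather than as an expression.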

SYSTEM V OPTIONS
     The -s option to grep indicates that error messages for
     nonexistent or unreadable files should be suppressed, not
     that all messages except for error messages should be
     suppressed.
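
     Because -s means different things in the two versions, a
     script that only wants the status can discard the output
     instead.  The sketch below assumes the usual convention
     that grep exits with status 0 when some line matched and 1
     when none did; the file users.txt is made up.

```shell
dir=$(mktemp -d)
printf 'root\ndaemon\n' > "$dir/users.txt"

# Throw away the matching lines and look only at the exit status.
if grep 'daemon' "$dir/users.txt" > /dev/null; then
    found=yes
else
    found=no
fi

if grep 'nobody' "$dir/users.txt" > /dev/null; then
    missing=no
else
    missing=yes
fi

printf 'daemon found: %s, nobody missing: %s\n' "$found" "$missing"
rm -rf "$dir"
```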

REGULAR EXPRESSIONS
     The following one-character regular expressions match a
     single character:

     c    An ordinary character (not one of the special
          characters discussed below) is a one-character regular
          expression that matches that character.

     \c   A backslash (\) followed by any special character is a
          one-character regular expression that matches the
          special character itself.  The special characters are:

               o    `.', `*', `[', and `\' (period, asterisk,
                    left square bracket, and backslash,
                    respectively), which are always special,
                    except when they appear within square
                    brackets ([]).

               o    `^' (caret or circumflex), which is special
                    at the beginning of an entire regular
                    expression, or when it immediately follows
                    the left of a pair of square brackets ([]).

               o    `$' (currency symbol), which is special at
                    the end of an entire regular expression.

     A backslash followed by one of `<', `>', `(', `)',  `{',  or
     `}',  represents  a  special operator in the regular expres-
     sion; see below.

     .    A `.' (period) is a  one-character  regular  expression
          that matches any character except NEWLINE.

     [string]
          A non-empty string of characters enclosed in square
          brackets is a one-character regular expression that
          matches any one character in that string.  If, however,
          the first character of the string is a `^' (a
          circumflex or caret), the one-character regular
          expression matches any character except NEWLINE and the
          remaining characters in the string.  The `^' has this
          special meaning only if it occurs first in the string.
          The `-' (minus) may be used to indicate a range of
          consecutive ASCII characters; for example, [0-9] is
          equivalent to [0123456789].  The `-' loses this special
          meaning if it occurs first (after an initial `^', if
          any) or last in the string.  The `]' (right square
          bracket) does not terminate such a string when it is
          the first character within it (after an initial `^', if
          any); that is, []a-f] matches either `]' (a right
          square bracket) or one of the letters a through f
          inclusive.  The four characters `.', `*', `[', and `\'
          stand for themselves within such a string of
          characters.
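
     The bracket rules above can be checked with a few made-up
     input lines.  This is a sketch only; the sample data is not
     from the original text.

```shell
# Hypothetical input: four short lines.
input=$(printf 'a1\nb]\nzz\n7.\n')

# A range: lines containing any digit.
digits=$(printf '%s\n' "$input" | grep -c '[0-9]')

# `]' placed first is an ordinary member: lines ending in `]' or a-f.
bracket=$(printf '%s\n' "$input" | grep '[]a-f]$')

# `^' placed first negates: lines whose first character is not a digit.
nodigit=$(printf '%s\n' "$input" | grep -c '^[^0-9]')

# `.' stands for itself inside brackets: lines containing a period.
dot=$(printf '%s\n' "$input" | grep '[.]')

printf '%s\n' "$digits" "$bracket" "$nodigit" "$dot"
```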

     The following rules may be used to construct regular expres-
     sions:

     *    A one-character regular expression followed by `*' (an
          asterisk) is a regular expression that matches zero or
          more occurrences of the one-character regular
          expression.  If there is any choice, the longest
          leftmost string that permits a match is chosen.

     \(and\)
          A regular expression  enclosed  between  the  character
          sequences \( and \) matches whatever the unadorned reg-
          ular expression matches.  This applies only to grep.

     \n   The expression \n matches the same string of characters
          as was matched by an expression enclosed between \( and
          \) earlier in the same regular expression.  Here n is a
          digit; the sub-expression specified is that beginning
          with the nth occurrence of \( counting from the left.
          For example, the expression ^\(.*\)\1$ matches a line
          consisting of two repeated appearances of the same
          string.
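
     The back-reference example above can be tried directly; the
     three input lines are invented for the sketch:

```shell
# Hypothetical input: two doubled strings and one that is not doubled.
input=$(printf 'abab\nabc\nxyzxyz\n')

# \(.*\) remembers a string and \1 requires the same string again,
# so only lines made of one string repeated twice are displayed.
doubled=$(printf '%s\n' "$input" | grep '^\(.*\)\1$')

printf '%s\n' "$doubled"
```

     The line abc is rejected because it cannot be split into
     two identical halves.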

  Concatenation
     The  concatenation  of  regular  expressions  is  a  regular
     expression  that  matches  the  concatenation of the strings
     matched by each component of the regular expression.

     \<   The sequence \< in a regular expression constrains  the
          one-character  regular expression immediately following
          it only to  match  something  at  the  beginning  of  a
          "word";  that is, either at the beginning of a line, or
          just before a letter, digit, or underline and  after  a
          character not one of these.

     \>   The sequence \> in a regular expression constrains  the
          one-character  regular expression immediately following
          it only to match something at the end of a "word"; that
          is, either at the end of a line, or just before a char-
          acter which is neither a letter, digit, nor underline.

     \{m\}
     \{m,\}
     \{m,n\}
          A regular expression followed by \{m\}, \{m,\}, or
          \{m,n\} matches a range of occurrences of the regular
          expression.  The values of m and n must be non-negative
          integers less than 256; \{m\} matches exactly m
          occurrences; \{m,\} matches at least m occurrences;
          \{m,n\} matches any number of occurrences between m and
          n inclusive.  Whenever a choice exists, the regular
          expression matches as many occurrences as possible.

     ^    A circumflex or caret (^) at the beginning of an entire
          regular expression constrains that regular expression
          to match an initial segment of a line.

     $    A currency symbol ($) at the end of an entire regular
          expression constrains that regular expression to match
          a final segment of a line.

     The construction

          ^entire regular expression$

     constrains the entire regular expression to match the entire
     line.
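
     The word, interval, and anchoring operators above can be
     sketched with made-up input.  One assumption to note: \<
     and \> are accepted by the grep described here and by
     common current versions, but not every grep is required to
     support them.

```shell
words=$(printf 'the cat sat\nconcatenate\ncat_nap\n')
numbers=$(printf '12\n123\n12345\nid 123\n')

# \< and \> confine the match to a whole word; -w has the same effect.
word=$(printf '%s\n' "$words" | grep '\<cat\>')
same=$(printf '%s\n' "$words" | grep -w 'cat')

# Intervals, anchored with ^ and $ so the whole line must match.
exact=$(printf '%s\n' "$numbers" | grep '^[0-9]\{3\}$')        # exactly 3 digits
atleast=$(printf '%s\n' "$numbers" | grep -c '^[0-9]\{3,\}$')  # 3 or more
range=$(printf '%s\n' "$numbers" | grep -c '^[0-9]\{2,3\}$')   # 2 or 3

# Without the anchors, any line containing three digits in a row matches.
unanchored=$(printf '%s\n' "$numbers" | grep -c '[0-9]\{3\}')

printf '%s\n' "$word" "$same" "$exact" "$atleast" "$range" "$unanchored"
```

     The line cat_nap is not displayed because the underline
     counts as part of a word.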

     egrep accepts regular expressions of the same sort grep
     does, except for \(, \), \n, \<, \>, \{, and \}, with the
     addition of:

               *    A regular expression (not just a
                    one-character regular expression) followed by
                    `*' (an asterisk) is a regular expression that
                    matches zero or more occurrences of the
                    regular expression.  If there is any choice,
                    the longest leftmost string that permits a
                    match is chosen.

               +    A regular expression followed by `+' (a plus
                    sign) is a regular expression that matches
                    one or more occurrences of the regular
                    expression.  If there is any choice, the
                    longest leftmost string that permits a match
                    is chosen.

               ?    A regular expression followed by `?' (a
                    question mark) is a regular expression that
                    matches zero or one occurrence of the regular
                    expression.  If there is any choice, the
                    longest leftmost string that permits a match
                    is chosen.

               |    Alternation:    two    regular    expressions
                    separated  by  `|'  or NEWLINE match either a
                    match for  the  first  or  a  match  for  the
                    second.

               ()   A regular expression enclosed in  parentheses
                    matches a match for the regular expression.

     The order of precedence of operators at the same parenthesis
     level is `[ ]' (character classes), then `*' `+' `?'
     (closures), then concatenation, then `|' (alternation) and
     NEWLINE.
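
     The egrep additions and the precedence rule can be sketched
     as follows.  An assumption to note: the example spells the
     command as grep -E, which is how current POSIX systems
     select egrep-style expressions; on the system described
     here the command would be egrep itself.  The input lines
     are made up.

```shell
input=$(printf 'color\ncolour\ncolr\nab\nabab\ncat\ndog\n')

# `?' makes the preceding expression optional: color and colour.
optional=$(printf '%s\n' "$input" | grep -E -c '^colou?r$')

# `+' applied to a parenthesized expression: ab and abab.
plus=$(printf '%s\n' "$input" | grep -E -c '^(ab)+$')

# Alternation inside parentheses, anchored on both sides.
alt=$(printf '%s\n' "$input" | grep -E '^(cat|dog)$')

# `|' binds loosest, so this reads as (^ab) or (og$).
prec=$(printf '%s\n' "$input" | grep -E -c '^ab|og$')

printf '%s\n' "$optional" "$plus" "$alt" "$prec"
```

     In the last pattern the anchors apply to one alternative
     each, which is why ab, abab, and dog all match.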

---------------Regex info from ARCHIE

 (4) "regex" This is the DEFAULT search method.

     ed(1) regular expressions. Searches the database with the user
     (search) string which is given in the form of an ed(1) regular
     expression.

 NOTE: Unless specifically anchored to the beginning (with ^) or end
 (with $) of a line, ed(1) regular expressions have ".*" prepended and
 appended to them. For example, it is NOT NECESSARY to say

                prog .*xnlock.*

 since
                prog xnlock

 will suffice. Thus the regex match becomes a simple substring match.
 
 
 An "ed(1) regular expression" (from here on called RE) is the particular
 type of regular expression used in the "ed" editor under Unix.  For those
 who are interested in all the gory details of REs see the help for
 "regex" (which is incomplete, at the moment :-(), otherwise what follows
 should be sufficient for most needs.

 A regular expression is a convenient way to search for a set of specific
 strings matching a pattern.  To be able to specify such a pattern with
 only the ordinary set of printable characters we have to co-opt some of
 them.  For example, in an RE the period means _any_ single character,
 while an asterisk, '*', means zero or more occurrences of the *PRECEDING*
 RE.

 For example:

   knob      - matches any string containing the substring 'knob'

   a*splat   - matches strings that contain zero or more a's followed by the
               string 'splat'

   #.*#      - would match anything containing a '#' followed by zero or more
               occurrences of _any_ character, followed by another '#'

 Other special characters that may be useful are '[' and ']', which are
 used together.  They can be used to specify either a set of characters
 to match or a set of characters to not match.  An example of the first
 case is:

    [abcd]

 which matches any one of the four letters, while an example of the
 second case is:

    [^abcd]

 in which the '^' _in_the_first_position_ means that any character _not_
 in the list will be matched.  As well, ranges can be specified with a
 '-'.

    [a-z]

 matches any lower case letter and,

    [^a-z]

 matches any character other than a lower case letter.  Furthermore, you
 can specify multiple ranges such as:

    [%@a-z0-9]

 or

    [^A-Za-z]

 meaning: match '%' or '@' or any lower case letter or digit, and match
 any character other than a letter, respectively.

 When you want to match a character which has a special meaning you should
 precede it by a backslash, '\'.

 Some final examples of REs are:

    [Mm]ac\.txt         - match anything containing the string "Mac.txt" or
                          "mac.txt"

    [^aeiou][^aeiou]*   - match any string containing at least one
                          non-vowel; anchored as ^[^aeiou][^aeiou]*$ it
                          matches only strings consisting entirely of
                          non-vowels

    foo-v[0-9]\.tar\.Z  - match "foo-v0.tar.Z" through "foo-v9.tar.Z"
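
 These REs can be tried outside archie as well.  The sketch below uses
 grep as a stand-in, on the assumption that its unanchored matching
 behaves like the substring search described above; the file names are
 made up for the example.

```shell
# Hypothetical list of file names, one per line.
names=$(printf 'Mac.txt\nold-mac.txt.bak\nmacXtxt\nfoo-v3.tar.Z\nfoo-v10.tar.Z\nxyz\naeiou\n')

# The escaped period matches only a real '.', anywhere in the name.
mac=$(printf '%s\n' "$names" | grep '[Mm]ac\.txt')

# [0-9] matches exactly one digit, so foo-v10.tar.Z is not found.
tar=$(printf '%s\n' "$names" | grep 'foo-v[0-9]\.tar\.Z')

# Unanchored: every name with at least one non-vowel somewhere.
anynonvowel=$(printf '%s\n' "$names" | grep -c '[^aeiou][^aeiou]*')

# Anchored at both ends: only names made up entirely of non-vowels.
allnonvowel=$(printf '%s\n' "$names" | grep '^[^aeiou][^aeiou]*$')

printf '%s\n' "$mac" "$tar" "$anynonvowel" "$allnonvowel"
```

 The difference between the last two results shows how much the
 implied ".*" at each end widens a search.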


 Good luck, and remember that many things can be found with only a simple
 substring (e.g. latex).