Grep

From Jonathan Gardner's Tech Wiki

I love grep, and I use it all the time. A closely related program is find.

This is my favorite way to use it, when I am looking for any file that contains a particular string:

grep -rI 'the string' .

Explanation:

  • -r: Recurse through directories.
  • -I: Ignore binary files.
  • 'the string': The string to find. I always use single quotes.
  • .: The directory to search through.
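
To try the command without touching real files, here is a small sketch that builds a throwaway directory first. The directory layout and file names are made up for illustration, and mktemp is assumed to be available.

```shell
# Build a scratch tree (hypothetical file names) to search through.
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
printf 'nothing here\n' > "$demo/src/a.txt"
printf 'this has the string in it\n' > "$demo/src/lib/b.txt"
# A file containing a NUL byte, which grep treats as binary.
printf 'the string\0\n' > "$demo/src/blob.bin"

# -r descends into src/ and src/lib/; -I leaves blob.bin out of the results.
found=$(cd "$demo" && grep -rI 'the string' .)
echo "$found"

rm -rf "$demo"
```

Only the text file is reported, with its path relative to the directory given on the command line.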

Here are the patterns I use most often in grep (see Regex):

  • \|: Alternatives.
  • *: The Kleene star: zero or more of the preceding.
  • \+: One or more.
  • \?: Zero or one. In other words, optional.
  • \<, \>: Word boundaries. For instance, if I were looking for the word 'python' and not 'pythons' or 'unpythonic', I would type '\<python\>'.
  • \(, \): Grouping parens.
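
A quick way to see a few of these in action is to feed grep a short word list. This is only a sketch: the words are made up, and it assumes GNU grep, since \|, \? and \< \> are GNU extensions to the basic syntax.

```shell
# Sample input: one word per line (made up for illustration).
words=$(printf '%s\n' python pythons unpythonic perl color colour)

# \< and \> keep 'pythons' and 'unpythonic' out.
whole=$(printf '%s\n' "$words" | grep '\<python\>')

# \| picks either alternative.
either=$(printf '%s\n' "$words" | grep 'perl\|pythons')

# \? makes the 'u' optional, so both spellings match.
optional=$(printf '%s\n' "$words" | grep 'colou\?r')

printf '%s\n' "$whole" "$either" "$optional"
```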

Sometimes I want a case-insensitive search (rarely though). Just use the -i option to grep.

I never, never use egrep. I am too used to grep's regex syntax.

-A NUM Show NUM lines after the matching line.
-B NUM Show NUM lines before the matching line.
-C NUM Show NUM lines around the matching line.
--color Color the matching text (--colour also works). Some people do "alias cgrep='grep --color'". I don't.
-c Just show how many lines matched.
-I Ignore binary files.
-i Ignore case.
-l Show only the names of files that contain a match.
-L Show only the names of files that contain no match.
-n Show the line numbers.
-o Show only the match.
-r Recurse through directories.
-v Show lines that don't match the expression. This is great for using with tail when you don't want to see certain lines.
-P Use Perl-compatible regular expressions (PCRE).
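
Several of these options are easiest to understand side by side on the same input. The sketch below runs them against a small made-up log file; the file contents and variable names are hypothetical.

```shell
# A tiny fake log to search (contents invented for the example).
log=$(mktemp)
printf '%s\n' 'INFO start' 'DEBUG tick' 'ERROR disk full' 'INFO retry' 'DEBUG tick' > "$log"

count=$(grep -c 'DEBUG' "$log")        # how many lines matched
numbered=$(grep -n 'ERROR' "$log")     # matching line with its line number
quiet=$(grep -v 'DEBUG' "$log")        # every line except the DEBUG ones
after=$(grep -A 1 'ERROR' "$log")      # the match plus one line after it
only=$(grep -o 'disk [a-z]*' "$log")   # just the matched text, not the line
nocase=$(grep -i 'error' "$log")       # lowercase pattern, uppercase text

printf '%s\n' "$count" "$numbered" "$only"
rm -f "$log"
```

For the tail case, the same -v filter would sit at the end of a pipe, as in tail -f some.log | grep -v 'DEBUG', where some.log is a placeholder name.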

Patterns

See regex.

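In grep's default (basic) regex syntax, many operators (grouping, alternation, +, ?, and braces) need a backslash in front of them. Don't forget that bash expands variables, backticks, and some backslash sequences inside double quotes, so use single quotes if possible.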
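
The quoting difference is easy to demonstrate. In this sketch the variable USD is hypothetical; it simply happens to share its name with text inside the pattern.

```shell
# A hypothetical variable whose name also appears in the pattern text.
USD=dollars

single='price $USD'   # single quotes: the text is passed along exactly as typed
double="price $USD"   # double quotes: bash substitutes $USD before grep runs

printf '%s\n' "$single"
printf '%s\n' "$double"

# The input line contains a literal dollar sign.
line='price $USD'
# grep -c exits non-zero when nothing matches, hence the '|| true'.
hits_single=$(printf '%s\n' "$line" | grep -c 'price $USD' || true)
hits_double=$(printf '%s\n' "$line" | grep -c "price $USD" || true)

printf '%s %s\n' "$hits_single" "$hits_double"
```

With single quotes grep sees the dollar sign and finds the line. With double quotes it is handed 'price dollars' and finds nothing.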

Characters

[chars] Any one of the listed chars. You can also use POSIX groups inside.
. Any char.
\w Any word char.
\W Any non-word char.
\char The literal char, if char is a special character.

POSIX groups can be used inside the square brackets. These are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. Don't forget that these can be combined, as in [[:lower:][:digit:]%].
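
As a sketch of the combined form, the following filters a few made-up sample lines down to those made entirely of lowercase letters, digits, or percent signs.

```shell
# Sample lines (invented for the example).
samples=$(printf '%s\n' 'abc123' '50%' 'ABC' 'a b')

# ^ and $ pin the match to the whole line; * repeats the bracket expression.
matches=$(printf '%s\n' "$samples" | grep '^[[:lower:][:digit:]%]*$')

printf '%s\n' "$matches"
```

'ABC' is rejected because of the uppercase letters, and 'a b' because of the space.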

Grouping

\(pattern\) Basic grouping.
\| Alternation.
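
Grouping matters most when it limits how far an alternation or a multiplier reaches. A minimal sketch, assuming GNU grep (for \| and \+) and using made-up input words:

```shell
# The group confines the alternation to a single letter in the middle.
grey=$(printf '%s\n' gray grey griy | grep 'gr\(a\|e\)y')

# \(ab\)\+ repeats the whole group, not just the 'b'.
repeated=$(printf '%s\n' ab abab abb | grep '^\(ab\)\+$')

printf '%s\n' "$grey" "$repeated"
```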

Positions

^ Start of line.
$ End of line.
\< \> Beginning, end of word.
\b Word boundary.
\B Anywhere that is not a word boundary.
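
The anchors are easiest to compare on the same few lines. This sketch uses invented input and assumes GNU grep, where \b and \B are available.

```shell
# Sample lines (made up for illustration).
lines=$(printf '%s\n' 'error: disk' 'no error' 'errors found' 'terror')

starts=$(printf '%s\n' "$lines" | grep '^error')     # line begins with 'error'
ends=$(printf '%s\n' "$lines" | grep 'error$')       # line ends with 'error'
word=$(printf '%s\n' "$lines" | grep '\berror\b')    # 'error' as a whole word
inside=$(printf '%s\n' "$lines" | grep '\Berror')    # 'error' not at a word start

printf '%s\n' "$starts" "$ends" "$word" "$inside"
```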


Multipliers

* Zero or more.
\+ One or more.
\? Zero or one.
\{n\} n times
\{n,m\} n to m times, inclusive
\{n,\} n or more times

These are all greedy. Non-greedy matching is only available with -P, where you put a '?' (no backslash) after the multiplier, as in '.*?'.
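
The counted forms and greediness can be checked with a short sketch like this one. The inputs are made up, and the last command assumes a grep built with PCRE support; if -P is unavailable it simply produces nothing.

```shell
# Sample numbers of different lengths (invented for the example).
nums=$(printf '%s\n' 7 42 123 4567 89012)

exact=$(printf '%s\n' "$nums" | grep '^[0-9]\{3\}$')     # exactly three digits
range=$(printf '%s\n' "$nums" | grep '^[0-9]\{2,4\}$')   # two to four digits
atleast=$(printf '%s\n' "$nums" | grep '^[0-9]\{4,\}$')  # four or more digits

# Greedy: with -o you can see .* run all the way to the last '>' on the line.
greedy=$(printf '%s\n' '<a><b>' | grep -o '<.*>')

# Non-greedy needs -P: .*? stops at the first '>' it can.
lazy=$(printf '%s\n' '<a><b>' | grep -oP '<.*?>' 2>/dev/null || true)

printf '%s\n' "$exact" "$range" "$atleast" "$greedy" "$lazy"
```

Greediness only shows up when you look at the matched text itself (-o or --color). Which lines match stays the same either way.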