regex – Grep and Python

The natural question is: why not just use grep? But assuming you can't…

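import re
import sys

# usage: python grep.py PATTERN FILE
with open(sys.argv[2], "r") as f:
    for line in f:
        if re.search(sys.argv[1], line):
            print(line, end="")

Things to note:

  • re.search instead of re.match, so the pattern is found anywhere in the line (re.match only matches at the start)
  • end="" stops print from adding a second newline (each line already ends with one); in Python 2 the equivalent was a trailing comma after print
  • sys.argv[0] is the Python file name, so the real arguments start at index 1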
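To make the first point concrete, here is a small sketch of how the two functions differ. The sample line and patterns are made up for illustration:

```python
import re

line = "error: disk full\n"

# re.match only succeeds if the pattern matches at the start of the string.
print(re.match("disk", line))                # None
print(re.match("error", line) is not None)   # True

# re.search scans the whole string, which is what grep does.
print(re.search("disk", line) is not None)   # True
```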

This doesn't handle multiple file arguments (as grep does) or expand wildcards (as the Unix shell would). If you want that functionality, you can get it with the following:

import glob
import re
import sys

for arg in sys.argv[2:]:
    for path in glob.iglob(arg):
        with open(path, "r") as f:
            for line in f:
                if re.search(sys.argv[1], line):
                    print(line, end="")  # line already ends with a newline
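If you would rather reuse the loop than run it only from the command line, one possible sketch is to wrap it in a generator. The function name grep_files and its parameters are my own choice, not part of the original answer, and compiling the pattern once is an optional refinement:

```python
import glob
import re


def grep_files(pattern, file_patterns):
    """Yield (path, line) for every matching line in every matching file."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    for file_pattern in file_patterns:
        # iglob expands wildcards lazily; a plain name that does not
        # exist simply yields nothing
        for path in glob.iglob(file_pattern):
            with open(path, "r") as f:
                for line in f:
                    if regex.search(line):
                        yield path, line
```

Called as grep_files(sys.argv[1], sys.argv[2:]), it should give the same lines as the script above, plus the name of the file each line came from.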

Concise and memory efficient:

#!/usr/bin/env python
# file: grep.py
import re, sys, collections

collections.deque(map(sys.stdout.write,(l for l in sys.stdin if re.search(sys.argv[1],l))),maxlen=0)

It works like egrep (without too much error handling), e.g.:

cat input-file | grep.py RE

And here is the one-liner:

cat input-file | python -c 'import re,sys,collections;collections.deque(map(sys.stdout.write,(l for l in sys.stdin if re.search(sys.argv[1],l))),maxlen=0)' RE

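Note that collections.deque is required in Python 3 because map is lazy there: it returns an iterator, and nothing is written until something consumes it. A deque with maxlen=0 consumes the whole iterator without storing any of the results.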
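A small sketch of that behaviour, using a list's append method in place of sys.stdout.write so the effect is easy to inspect (the names written, lazy and sink are only for illustration):

```python
import collections

written = []

# In Python 3, map returns a lazy iterator: nothing has run yet.
lazy = map(written.append, ["a\n", "b\n"])
print(written)    # []

# A deque with maxlen=0 pulls every item through and keeps none of them.
sink = collections.deque(lazy, maxlen=0)
print(written)    # ['a\n', 'b\n']
print(len(sink))  # 0
```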

Adapted from a grep in Python.

Accepts a list of filenames via sys.argv[2:] and does no exception handling:

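#!/usr/bin/env python3
import os
import re
import sys

for path in filter(os.path.isfile, sys.argv[2:]):
    with open(path) as f:
        for line in f:
            # search, not match: grep finds the pattern anywhere in the line
            if re.search(sys.argv[1], line):
                print(line, end="")  # line already ends with a newline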

sys.argv[1] (the pattern) and sys.argv[2:] (the filenames) line up this way when you run the script as a standalone executable, which means running

chmod +x

on it first.
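Since the script above does no exception handling, a bad pattern or an unreadable file ends in a traceback. Here is a minimal sketch of one way to handle that, assuming you want grep's exit status convention (0 for a match, 1 for no match, 2 for an error). The function name main and the message wording are my own, not from the original:

```python
import re
import sys


def main(argv):
    """Return a grep-style exit status: 0 match, 1 no match, 2 error."""
    if len(argv) < 3:
        sys.stderr.write("usage: %s PATTERN FILE...\n" % argv[0])
        return 2
    try:
        regex = re.compile(argv[1])
    except re.error as exc:
        sys.stderr.write("bad pattern: %s\n" % exc)
        return 2

    matched = False
    failed = False
    for path in argv[2:]:
        try:
            with open(path) as f:
                for line in f:
                    if regex.search(line):
                        sys.stdout.write(line)
                        matched = True
        except OSError as exc:
            # missing file, permission denied, directory, and so on
            sys.stderr.write("%s: %s\n" % (path, exc))
            failed = True

    if failed:
        return 2
    return 0 if matched else 1
```

To use it as a script, call sys.exit(main(sys.argv)) at the bottom of the file.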
