Using grep in python
First of all, you are not iterating over the file properly. You can simply use for b in f: without the .readline() stuff.
Then your code will blow up in your face as soon as the filename contains any characters that have a special meaning in the shell. Use subprocess.call instead of os.system() and pass an argument list.
Here's a fixed version:
import subprocess

with open('query.txt', 'r') as f:
    for line in f:
        line = line.rstrip()  # remove trailing whitespace such as '\n'
        # Pass an argument list so that no shell interprets the query.
        subprocess.call(['/bin/grep', line, 'my2.txt'])
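To see why the argument list matters, here is a small sketch. It assumes a child Python process as a stand-in for grep, and echo_argument is a hypothetical helper invented for this illustration. Because no shell is involved, the argument should reach the child exactly as written, including spaces, quotes and semicolons.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child Python process that writes its first argument back verbatim."""
    # Hypothetical helper for illustration; the child stands in for grep.
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    # An argument list is handed to the child directly, without a shell,
    # so characters such as ';', '$' and quotes are not interpreted.
    output = subprocess.check_output([sys.executable, "-c", child_code, arg])
    return output.decode()


tricky = "two two; echo 'oops' $HOME"
print(echo_argument(tricky) == tricky)
```

With os.system() the same string would be handed to the shell, which would treat the part after the semicolon as a second command.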
However, you can improve your code even more by not calling grep at all. Read my2.txt into a string instead and then use the re module to perform the search. If you do not need a regex at all, you can simply use if line in my2_content.
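A minimal sketch of that idea, assuming the contents of my2.txt have already been read into my2_content (here a literal string stands in for open('my2.txt').read()). The function name find_matching_lines is made up for this example.

```python
import re

# Stand-in for the contents of my2.txt; in practice this would come from
# open('my2.txt').read().
my2_content = "One two\ntwo two\ntwo three\n"


def find_matching_lines(pattern, text):
    """Return the lines of text that contain a match for the regex pattern."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


# Regex search, line by line, similar in spirit to what grep prints.
print(find_matching_lines(r"t.o two", my2_content))

# If no regex is needed, a plain substring test is enough.
print("three" in my2_content)
```

The substring test only says whether the query occurs somewhere in the file; the line-by-line version is closer to grep because it returns the matching lines.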
Your code scans the whole my2.txt file for each query in query.txt.
You want to:
- read all queries into a list
- iterate once over all lines of the text file and check each line against all queries.
Try this code:
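# Read all queries into a list once.
with open('query.txt', 'r') as f:
    queries = [l.strip() for l in f]

# Scan the text file a single time, checking every line against every query.
with open('my2.txt', 'r') as f:
    for line in f:
        for query in queries:
            if query in line:
                print(query, line.rstrip('\n'))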
This isn't actually a good way to use Python, but if you have to do something like that, then do it correctly:
from __future__ import with_statement  # only needed on Python 2.5
import subprocess

def grep_lines(filename, query_filename):
    # Run grep once per query line, passing arguments as a list (no shell).
    with open(query_filename, 'rb') as myfile:
        for line in myfile:
            subprocess.call(['/bin/grep', line.strip(), filename])

grep_lines('my2.txt', 'query.txt')
And hope that your file doesn't contain any characters that have special meanings in regular expressions =)
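If the search moves into Python's re module, one way to sidestep that problem is re.escape, sketched below. The helper name line_matches_literally is hypothetical. When staying with grep itself, its -F option treats each pattern as a fixed string rather than a regular expression.

```python
import re


def line_matches_literally(query, line):
    """Return True if line contains query as a literal string, not as a regex."""
    # re.escape backslash-escapes characters that are special in regexes,
    # so '.' only matches a real dot.
    return re.search(re.escape(query), line) is not None


# Unescaped, the '.' in the query matches any character.
print(re.search("a.b", "axb") is not None)

# Escaped, only a literal "a.b" counts as a match.
print(line_matches_literally("a.b", "axb"))
print(line_matches_literally("a.b", "xa.by"))
```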
Also, you might be able to do this with grep alone:
grep -f query.txt my2.txt
It works like this:
~ $ cat my2.txt
One two
two two
two three
~ $ cat query.txt
two two
three
~ $ python bar.py
two two
two three