python – When should I ever use file.read() or file.readlines()?

The short answer to your question is that each of these three methods of reading a file has a different use case. f.read() reads the file as a single string, and so allows relatively easy file-wide manipulations, such as a file-wide regex search or substitution.
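As a rough illustration of the file-wide case, here is a minimal Python 3 sketch that reads a whole file with f.read() and applies one regex substitution across line boundaries. The helper name replace_in_file_text, the sample text, and the pattern are all made up for this example.

```python
import os
import re
import tempfile

def replace_in_file_text(path, pattern, replacement):
    """Read the whole file as one string and apply a regex substitution to it."""
    with open(path, "r") as f:
        text = f.read()  # entire file contents as a single str
    # re.DOTALL lets "." match newlines, so the pattern can span several lines
    return re.sub(pattern, replacement, text, flags=re.DOTALL)

# Demo on a throwaway temporary file (hypothetical content)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("start\n<!-- a\nmulti-line comment -->\nend\n")
    demo_path = tmp.name

cleaned = replace_in_file_text(demo_path, r"<!--.*?-->\n", "")
os.remove(demo_path)
print(cleaned)
```

A match that spans two lines like this is awkward to do line by line, which is where holding the whole file in one string pays off, provided the file fits comfortably in memory.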

f.readline() reads a single line of the file, allowing the user to parse one line without necessarily reading the entire file. Using f.readline() also makes it easier to apply logic while reading than a plain line-by-line iteration does, such as when a file changes format partway through.
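To make the "format changes partway through" case concrete, here is a small Python 3 sketch. It assumes a hypothetical layout: key=value lines, then a blank line, then comma-separated rows. The function name and the sample data are invented, and io.StringIO stands in for a real file object.

```python
import io

def parse_two_part_file(f):
    """Parse 'key=value' lines up to a blank line, then comma-separated rows."""
    settings = {}
    # Part 1: read one line at a time until the blank separator line
    while True:
        line = f.readline()
        if line == "" or line.strip() == "":  # "" means end of file
            break
        key, _, value = line.strip().partition("=")
        settings[key] = value
    # Part 2: the rest of the file uses a different format
    rows = []
    while True:
        line = f.readline()
        if line == "":  # end of file
            break
        rows.append(line.strip().split(","))
    return settings, rows

# io.StringIO behaves like an open text file for this purpose
sample = io.StringIO("name=demo\nversion=2\n\n1,2,3\n4,5,6\n")
settings, rows = parse_two_part_file(sample)
print(settings)
print(rows)
```

Each loop stops exactly where its section ends, so the second parser picks up at the right line without any bookkeeping.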

Using the syntax for line in f: allows the user to iterate over the file line by line as noted in the question.

(As noted in the other answer, this documentation is a very good read):

It was previously claimed that f.readline() could be used to skip a line during a for loop iteration. However, this doesn't work in Python 2.7, and is perhaps a questionable practice, so this claim has been removed.
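For what it's worth, in Python 3 one way to skip a line mid-loop is to call next() on the same file object, since the for loop and next() share one iterator. This swaps in a different technique from the f.readline() approach mentioned above. The function name and marker value are hypothetical, and io.StringIO stands in for a real file.

```python
import io

def lines_skipping_after(f, marker):
    """Collect lines, dropping the line that immediately follows each marker line."""
    kept = []
    for line in f:
        stripped = line.rstrip("\n")
        kept.append(stripped)
        if stripped == marker:
            # Advance the same iterator by one line; the None default
            # avoids a StopIteration if the marker is the last line.
            next(f, None)
    return kept

sample = io.StringIO("a\nSKIP-NEXT\nb\nc\n")
kept = lines_skipping_after(sample, "SKIP-NEXT")
print(kept)
```

Whether this is good style is debatable; it is easy to miss when reading the loop, so a comment at the next() call helps.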

Hope this helps!

When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory.
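If the file might not fit in memory, passing a size to read() keeps each read bounded. Below is a minimal Python 3 sketch of that idea. The function name and the 4096-byte chunk size are arbitrary choices for illustration, and io.BytesIO stands in for a file opened in binary mode.

```python
import io

def count_bytes_in_chunks(f, chunk_size=4096):
    """Consume a file in fixed-size chunks instead of one big read()."""
    total = 0
    while True:
        chunk = f.read(chunk_size)  # returns at most chunk_size bytes
        if not chunk:  # empty result means end of file
            break
        total += len(chunk)
    return total

sample = io.BytesIO(b"x" * 10000)
total = count_bytes_in_chunks(sample, chunk_size=4096)
print(total)
```

Only one chunk is held at a time, so memory use stays roughly constant regardless of file size.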

Sorry for all the edits!

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

for line in f:
    print line,

This is the first line of the file.
Second line of the file

Note that readline() is not comparable to reading all lines in a for-loop: it reads one line per call, and that carries an overhead, as others have already pointed out.

I ran timeit on two identical snippets, one with a for-loop and the other with readlines(). You can see my snippet below:

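def test_read_file_1():
    # "ml/" is the path as it appears in the original post (truncated there)
    f = open("ml/", "r")
    for line in f.readlines():
        pass  # loop body not shown in the original post

def test_read_file_2():
    f = open("ml/", "r")
    for line in f:
        pass  # loop body not shown in the original post

def test_time_read_file():
    from timeit import timeit
    # Run each reader 1,000,000 times and report the total seconds
    duration_1 = timeit(lambda: test_read_file_1(), number=1000000)
    duration_2 = timeit(lambda: test_read_file_2(), number=1000000)
    print("duration using readlines():", duration_1)
    print("duration using for-loop:", duration_2)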

And the results:

duration using readlines(): 78.826229238
duration using for-loop: 69.487692794

The bottom line, I would say, is that the for-loop is faster, but when either would work, I'd rather use readlines().
