Python enumerate() tqdm progress-bar when reading a file?

You're on the right track. You're using tqdm correctly, but avoid printing each line inside the loop, since that output competes with the bar for the terminal. You'll also want to wrap only the outermost for loop in tqdm, not the inner ones, like so:

import re

from tqdm import tqdm

with open(file_path, "r") as f:
    for i, line in enumerate(tqdm(f)):
        if start <= i <= end:
            # separate name for the inner counter so it does not shadow the line index i
            for offset in range(0, line_size, batch_size):
                # pause if a file named "pause" is found in the current dir
                re_batch = {}
                for j in range(batch_size):
                    re_batch[j] = re.search(line, last_span)
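
If some output inside the loop is still needed, one option is tqdm.write, which prints the message and then redraws the bar instead of printing through it. The sketch below is only an illustration: the function name report_matches and the "error" keyword are assumptions, not part of the question's code.

```python
import sys

from tqdm import tqdm


def report_matches(lines, keyword="error", out=sys.stderr):
    """Count lines containing keyword, logging each one without breaking the bar."""
    hits = 0
    for i, line in enumerate(tqdm(lines, file=out)):
        if keyword in line:
            # tqdm.write clears the bar, prints the message, then redraws the bar
            tqdm.write(f"line {i}: {line.rstrip()}", file=out)
            hits += 1
    return hits
```

Passing an open file object as lines works the same way as passing a list, since tqdm only needs something it can iterate over.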

Some notes on using enumerate and its usage in tqdm here.

I ran into this as well: tqdm does not display a progress bar because it has not been told how many lines the file object contains.

The for loop will iterate over lines, reading until the next newline character is encountered.
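
A small sketch of why this happens: when no total is given, tqdm tries len() on the iterable, and a file-like object has no length. The helper name inferred_total is made up for this illustration, and the bar is sent to an in-memory buffer so nothing is printed.

```python
import io

from tqdm import tqdm


def inferred_total(iterable):
    """Return the total tqdm works out for an iterable, or None if it cannot."""
    # Draw the bar into a throwaway buffer; only the inferred total is of interest.
    with tqdm(iterable, file=io.StringIO()) as pbar:
        return pbar.total


lines = ["first\n", "second\n", "third\n"]
print(inferred_total(lines))                        # 3: a list has a length
print(inferred_total(io.StringIO("".join(lines))))  # None: a file-like object does not
```

With total left as None, tqdm can only show a running count and rate, not a bar with a percentage.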

To get a progress bar, you first need to scan the file and count its lines, then pass that count to tqdm as the total:

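from tqdm import tqdm

# first pass: count the lines so tqdm knows the total
with open("myfile.txt", "r") as f:
    num_lines = sum(1 for _ in f)

# second pass: iterate with a progress bar
with open("myfile.txt", "r") as f:
    for line in tqdm(f, total=num_lines):
        print(line)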
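
If reading the file twice is too expensive, one possible alternative is to measure progress in bytes instead of lines, because the file size is known up front. This is a sketch under assumptions rather than part of the answer above: the generator name iter_lines_with_progress and the utf-8 default are illustrative, and the file is assumed to be uncompressed.

```python
import os

from tqdm import tqdm


def iter_lines_with_progress(path, encoding="utf-8"):
    """Yield decoded lines from path while a byte-based progress bar advances."""
    total_bytes = os.path.getsize(path)
    # Binary mode, so len(raw_line) is the exact number of bytes consumed.
    with open(path, "rb") as f, tqdm(total=total_bytes, unit="B", unit_scale=True) as pbar:
        for raw_line in f:
            pbar.update(len(raw_line))
            yield raw_line.decode(encoding)
```

It can be used as a drop-in for the second loop, for example: for line in iter_lines_with_progress("myfile.txt"): ...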

I'm trying to do the same thing on a file containing all Wikipedia articles, so I don't want to count the total lines before starting processing. Also, it's a bz2-compressed file, so the len of the decompressed line overestimates the number of bytes read in that iteration, so…

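import bz2
from pathlib import Path

from tqdm import tqdm

# total is the size of the compressed file on disk
with tqdm(total=Path(filepath).stat().st_size) as pbar:
    # open the raw file ourselves so its position in the compressed data can be read
    with open(filepath, "rb") as raw, bz2.open(raw) as fin:
        for line in fin:
            # raw.tell() counts compressed bytes; fin.tell() would count decompressed ones
            pbar.update(raw.tell() - pbar.n)

    # used this to figure out the attributes of the pbar instance
    # print(vars(pbar))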
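
To see how far off len(line) can be for a compressed file, the sketch below compares the size on disk with the summed length of the decompressed lines. The function name compressed_vs_decompressed is made up for this illustration, and the size of the gap depends entirely on how compressible the data is.

```python
import bz2
import os


def compressed_vs_decompressed(path):
    """Return (bytes on disk, total length of the decompressed lines)."""
    on_disk = os.path.getsize(path)
    with bz2.open(path, "rb") as fin:
        decompressed = sum(len(line) for line in fin)
    return on_disk, decompressed
```

For text such as a Wikipedia dump the second number is typically several times the first, which is why a bar whose total is the file size on disk has to be advanced by compressed bytes, not by len(line).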

Thank you, Yohan Kuanke, for your deleted answer. If moderators undelete it, you can crib mine.
