Find a file in python

os.walk is the answer. This will find the first match:

import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)

And this will find all matches:

def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result

And this will match a pattern:

import os, fnmatch
def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result

find('*.txt', '/path/to/dir')

In Python 3.4 or newer you can use pathlib to do recursive globbing:

>>> import pathlib
>>> sorted(pathlib.Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
 PosixPath('docs/conf.py'),
 PosixPath('pathlib.py'),
 PosixPath('setup.py'),
 PosixPath('test_pathlib.py')]

Reference: https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
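
As a rough sketch, the pathlib approach can be wrapped in helpers that mirror the os.walk versions. The function names find_first and find_all_paths are made up for illustration; Path.rglob(pattern) behaves like glob with "**/" put in front of the pattern.

```python
import pathlib

def find_first(name, path):
    """Return the first file called `name` under `path`, or None."""
    # rglob yields matches lazily, so next() stops at the first hit.
    matches = (p for p in pathlib.Path(path).rglob(name) if p.is_file())
    return next(matches, None)

def find_all_paths(pattern, path):
    """Return every file under `path` whose name matches `pattern`, sorted."""
    return sorted(p for p in pathlib.Path(path).rglob(pattern) if p.is_file())
```

Note that rglob treats its argument as a glob pattern, so a file name containing characters such as [ or * would need different handling.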

In Python 3.5 or newer you can also do recursive globbing like this:

>>> import glob
>>> glob.glob('**/*.txt', recursive=True)
['2.txt', 'sub/3.txt']

Reference: https://docs.python.org/3/library/glob.html#glob.glob
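
To search below a directory other than the current one, something like the following might work. It is only a sketch: iter_matches is a hypothetical name, and it uses glob.iglob so that results are produced lazily instead of being collected into one list.

```python
import glob
import os

def iter_matches(pattern, path):
    """Yield files under `path` whose names match `pattern`."""
    # glob.escape keeps special characters in `path` from being read as wildcards.
    # "**" only descends into subdirectories when recursive=True is passed.
    full_pattern = os.path.join(glob.escape(path), "**", pattern)
    for match in glob.iglob(full_pattern, recursive=True):
        if os.path.isfile(match):
            yield match
```

By default glob does not match names that start with a dot, so hidden files and directories are skipped by this version.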

I used a version of os.walk and, on a larger directory, got times of around 3.5 sec. I tried two other solutions with no great improvement, then just did:

import subprocess
paths = [line[2:] for line in subprocess.check_output("find . -iname '*.txt'", shell=True).splitlines()]

While it's POSIX-only, it ran in about 0.25 sec.

From this, I believe it's entirely possible to optimise the whole search a lot in a platform-independent way, but this is where I stopped the research.
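
One platform-independent direction that could be tried is walking the tree with os.scandir and matching names case-insensitively, roughly like find -iname. This is an untimed sketch, not a measured improvement: find_iname is a hypothetical name, it assumes Python 3.6 or newer, and unlike find it reports only entries that are not directories.

```python
import fnmatch
import os

def find_iname(pattern, path):
    """Return non-directory entries under `path` matching `pattern`, ignoring case."""
    pattern = pattern.lower()
    result = []
    stack = [path]  # directories still to be scanned
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif fnmatch.fnmatchcase(entry.name.lower(), pattern):
                        result.append(entry.path)
        except OSError:
            # Directories that cannot be read are skipped.
            continue
    return result
```

os.walk itself has been built on os.scandir since Python 3.5, so any gain over os.walk is not guaranteed and would need measuring on the directory in question.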
