regex – How to do sed like text replace with python?

You can do that like this:

import re

with open('/etc/apt/sources.list', 'r') as sources:
    lines = sources.readlines()
with open('/etc/apt/sources.list', 'w') as sources:
    for line in lines:
        sources.write(re.sub(r'^# deb', 'deb', line))

The with statement ensures that the file is closed correctly, and re-opening the file in 'w' mode empties the file before you write to it. re.sub(pattern, replace, string) is the equivalent of s/pattern/replace/g in sed/perl: it replaces every match in the string unless you pass a count argument.
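To make the sed correspondence concrete, here is a minimal sketch of re.sub on plain strings. The sample line and its URL are made up for illustration; only re.sub and its count parameter come from the standard library.

```python
import re

# Hypothetical sample line; the URL is a placeholder, not a real mirror.
line = "# deb http://example.invalid/ubuntu stable main\n"

# Rough equivalent of: sed 's/^# deb/deb/'
uncommented = re.sub(r"^# deb", "deb", line)

# re.sub replaces every match by default (like sed's g flag);
# count=1 limits it to the first match (like sed without g).
all_matches = re.sub(r"a", "A", "banana")
first_only = re.sub(r"a", "A", "banana", count=1)

print(uncommented, end="")
print(all_matches, first_only)
```

Because the pattern is anchored with ^, applying it line by line (as in the loop above) only touches lines that start with "# deb".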

Edit: fixed syntax in example

Authoring a homegrown sed replacement in pure Python with no external commands or additional dependencies is a noble task laden with noble landmines. Who would have thought?

Nonetheless, it is feasible. It's also desirable. We've all been there, people: I need to munge some plaintext files, but I only have Python, two plastic shoelaces, and a moldy can of bunker-grade Maraschino cherries. Help.

In this answer, we offer a best-of-breed solution cobbling together the awesomeness of prior answers without all of that unpleasant not-awesomeness. As plundra notes, David Miller's otherwise top-notch answer writes the desired file non-atomically and hence invites race conditions (e.g., from other threads and/or processes attempting to concurrently read that file). That's bad. Plundra's otherwise excellent answer solves that issue while introducing yet more, including numerous fatal encoding errors, a critical security vulnerability (failing to preserve the permissions and other metadata of the original file), and premature optimization replacing regular expressions with low-level character indexing. That's also bad.

Awesomeness, unite!

import re, shutil, tempfile

def sed_inplace(filename, pattern, repl):
    '''
    Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
    `sed -i -e 's/${pattern}/${repl}/' "${filename}"`.
    '''
    # For efficiency, precompile the passed regular expression.
    pattern_compiled = re.compile(pattern)

    # For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
    # writing with updating). This is usually a good thing. In this case,
    # however, binary writing imposes non-trivial encoding constraints trivially
    # resolved by switching to text writing. Let's do that.
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
        with open(filename) as src_file:
            for line in src_file:
                tmp_file.write(pattern_compiled.sub(repl, line))

    # Overwrite the original file with the munged temporary file in a
    # manner preserving file attributes (e.g., permissions).
    shutil.copystat(filename, tmp_file.name)
    shutil.move(tmp_file.name, filename)

# Do it for Johnny.
sed_inplace('/etc/apt/sources.list', r'^# deb', 'deb')
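One caveat worth sketching: shutil.move() is only a single atomic rename when the temporary file and the target live on the same filesystem, and the default temporary directory often does not. The variant below is a hedged sketch, not part of the original answer. The name sed_inplace_atomic and the encoding parameter are hypothetical additions; it assumes the target's directory is writable and that the file is UTF-8 unless told otherwise.

```python
import os
import re
import shutil
import tempfile

def sed_inplace_atomic(filename, pattern, repl, encoding="utf-8"):
    """Sketch: like sed_inplace(), but the temp file is created next to the target."""
    pattern_compiled = re.compile(pattern)
    # Creating the temp file in the target's own directory keeps both paths
    # on the same filesystem, so os.replace() can swap them in one rename.
    target_dir = os.path.dirname(os.path.abspath(filename))
    tmp_name = None
    try:
        with tempfile.NamedTemporaryFile(
            mode="w", encoding=encoding, dir=target_dir, delete=False
        ) as tmp_file:
            tmp_name = tmp_file.name
            with open(filename, encoding=encoding) as src_file:
                for line in src_file:
                    tmp_file.write(pattern_compiled.sub(repl, line))
        # Copy permissions and timestamps from the original before swapping.
        shutil.copystat(filename, tmp_name)
        os.replace(tmp_name, filename)
    except BaseException:
        # Assumption: on any failure, remove the partial temp file and re-raise.
        if tmp_name is not None and os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

It would be called the same way, e.g. sed_inplace_atomic('/etc/apt/sources.list', r'^# deb', 'deb'). Like the original, copystat() does not carry over file ownership, so a file owned by another user may end up owned by whoever runs the script.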

massedit.py (http://github.com/elmotec/massedit) does the scaffolding for you, leaving just the regex to write. It's still in beta, but we are looking for feedback.

python -m massedit -e "re.sub(r'^# deb', 'deb', line)" /etc/apt/sources.list

will show the differences (before/after) in diff format.

Add the -w option to write the changes to the original file:

python -m massedit -e "re.sub(r'^# deb', 'deb', line)" -w /etc/apt/sources.list

Alternatively, you can now use the API:

>>> import massedit
>>> filenames = ['/etc/apt/sources.list']
>>> massedit.edit_files(filenames, ["re.sub(r'^# deb', 'deb', line)"], dry_run=True)
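If massedit is not available, the dry-run idea (show a diff, write nothing) can be approximated with the standard library alone. This is a rough sketch using difflib, not massedit's implementation; preview_substitution and the sample lines are hypothetical.

```python
import difflib
import re

def preview_substitution(lines, pattern, repl, name="sources.list"):
    """Return unified-diff lines for the substitution without writing anything."""
    edited = [re.sub(pattern, repl, line) for line in lines]
    return list(
        difflib.unified_diff(lines, edited, fromfile=name, tofile=name + " (edited)")
    )

# Hypothetical file contents; the URL is a placeholder.
original = ["# deb http://example.invalid stable main\n", "# a comment\n"]
diff = preview_substitution(original, r"^# deb", "deb")
print("".join(diff), end="")
```

Changed lines appear prefixed with - and +, while untouched lines such as "# a comment" appear as context.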
