python – replace the punctuation with whitespace

It is easy to achieve by changing your maketrans like this:

import string
tweet = "I am tired! I like fruit...and milk"
# Map each punctuation character to a space (on Python 2, use string.maketrans instead)
translator = str.maketrans(string.punctuation, " " * len(string.punctuation))
print(tweet.translate(translator))

On Python 3 (I ran it on 3.5.2) the function to call is str.maketrans; on Python 2.x it is string.maketrans, which takes the same two arguments.
Hope that it works on yours too.
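One thing to be aware of: translate swaps each punctuation character for its own space, so "fruit...and" ends up with three spaces in the middle. A minimal sketch of one way to tidy that up on Python 3, assuming you also want leading and trailing whitespace removed (the function name punctuation_to_space is just for illustration):

```python
import string

# Map every ASCII punctuation character to a space (Python 3 spelling).
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def punctuation_to_space(text):
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    spaced = text.translate(_PUNCT_TO_SPACE)
    # split() with no arguments drops empty fields, so repeated spaces vanish.
    return " ".join(spaced.split())


print(punctuation_to_space("I am tired! I like fruit...and milk"))
# I am tired I like fruit and milk
```

The split/join step also turns tabs and newlines into single spaces, which may or may not be what you want for multi-line text.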

Here is a regex based solution that has been tested under Python 3.5.1. I think it is both simple and succinct.

import re

tweet = "I am tired! I like fruit...and milk"
clean = re.sub(r"""
               [,.;@#?!&$]+  # Accept one or more copies of punctuation
               \ *           # plus zero or more copies of a space,
               """,
               " ",          # and replace it with a single space
               tweet, flags=re.VERBOSE)
print(tweet + "\n" + clean)

Results:

I am tired! I like fruit...and milk
I am tired I like fruit and milk

Compact version:

tweet = "I am tired! I like fruit...and milk"
clean = re.sub(r"[,.;@#?!&$]+\ *", " ", tweet)
print(tweet + "\n" + clean)
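The character class above only lists the punctuation I expected to see in a tweet. If you would rather cover all ASCII punctuation, something like the following might work. It is a sketch, not a tested drop-in: the name clean_text is made up, and I have folded whitespace into the class so that runs such as "! (" collapse to one space.

```python
import re
import string

# Character class covering all ASCII punctuation plus whitespace.
# re.escape protects characters such as ], \, ^ and - inside the class.
_PUNCT_OR_SPACE = re.compile("[" + re.escape(string.punctuation) + r"\s]+")


def clean_text(text):
    """Replace each run of punctuation and/or whitespace with one space."""
    return _PUNCT_OR_SPACE.sub(" ", text).strip()


tweet = "I am tired! I like fruit...and milk"
print(clean_text(tweet))
```

Note that string.punctuation includes the apostrophe, so a word like "don't" is split in two; trim the class if that matters for your data.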

There are a few ways to approach this problem. I have one that works, but believe it is suboptimal. Hopefully someone who knows regex better will come along and improve the answer or offer a better one.

Your question is tagged python-3.x, but your code is Python 2.x, so my code is 2.x as well. I have included a commented-out line that works in 3.x.

#!/usr/bin/env python

import re

tweet = "I am tired! I like fruit...and milk"
# print tweet

# Deleting the punctuation outright glues neighbouring words together
clean_words = tweet.translate(None, ",.;@#?!&$")  # Python 2
# clean_words = tweet.translate(str.maketrans("", "", ",.;@#?!&$"))  # Python 3
print(clean_words)  # Does not handle fruit...and

regex_sub = re.sub(r"[,.;@#?!&$]+", " ", tweet)  # + means match one or more
print(regex_sub)  # extra space between tired and I

regex_sub = re.sub(r"\s+", " ", regex_sub)  # Replaces any run of whitespace with one space
print(regex_sub)  # looks good
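To make the "does not handle fruit...and" comment concrete, here is a small Python 3 only sketch that puts deletion and replacement side by side. The variable names are mine, and it assumes the same nine punctuation characters as above.

```python
PUNCT = ",.;@#?!&$"
tweet = "I am tired! I like fruit...and milk"

# Three-argument maketrans: every character in the third string is deleted.
deleted = tweet.translate(str.maketrans("", "", PUNCT))

# Two-argument maketrans: each character maps to the one at the same position.
spaced = tweet.translate(str.maketrans(PUNCT, " " * len(PUNCT)))

print(deleted)  # I am tired I like fruitand milk
print(spaced)   # I am tired  I like fruit   and milk
```

Deletion loses the word boundary, while replacement keeps it but leaves one space per punctuation character, which is why the regex version needs the second pass.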
