python – How to use a variable inside a regular expression?

You have to build the regex as a string:

import re
import sys

TEXTO = sys.argv[1]
my_regex = r"\b(?=\w)" + re.escape(TEXTO) + r"\b(?!\w)"

if re.search(my_regex, subject, re.IGNORECASE):
    ...  # handle the match

Note the use of re.escape so that if your text has special characters, they won't be interpreted as such.
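To make the effect of re.escape concrete, here is a minimal sketch. The helper name contains_word and the sample strings are made up for illustration and are not part of the question:

```python
import re

def contains_word(word, subject):
    """Return True if `word` occurs in `subject` as a whole word (case-insensitive).

    `contains_word` is a hypothetical helper used only for this sketch.
    """
    # Build the pattern from raw-string pieces plus the escaped variable.
    pattern = r"\b(?=\w)" + re.escape(word) + r"\b(?!\w)"
    return re.search(pattern, subject, re.IGNORECASE) is not None

print(contains_word("var", "Set VAR to 1"))  # True
print(contains_word("var", "variable"))      # False: "var" is only a prefix here
print(contains_word("a.b", "a.b"))           # True
print(contains_word("a.b", "axb"))           # False: the dot is matched literally

# Without re.escape, the dot would act as a wildcard and "axb" would match.
unescaped = r"\b(?=\w)" + "a.b" + r"\b(?!\w)"
print(re.search(unescaped, "axb") is not None)  # True
```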

From Python 3.6 on you can also use literal string interpolation (f-strings). In your particular case the solution would be:

if re.search(rf"\b(?=\w){TEXTO}\b(?!\w)", subject, re.IGNORECASE):
    ...  # do something

EDIT:

Since there have been some questions in the comments on how to deal with special characters, I'd like to extend my answer:

raw strings (r):

One of the main concepts you have to understand when dealing with special characters in regular expressions is to distinguish between string literals and the regular expression itself. It is very well explained here:

In short:

Let's say that instead of finding a word boundary \b after TEXTO you want to match the literal string \boundary. Then you have to write:

TEXTO = "Var"
subject = r"Var\boundary"

if re.search(rf"\b(?=\w){TEXTO}\\boundary(?!\w)", subject, re.IGNORECASE):
    print("match")

This only works because we are using a raw string (the regex is preceded by r); otherwise we would have to write \\\\boundary in the regex (four backslashes). Additionally, without the r prefix, \b would no longer be interpreted as a word boundary but as a backspace character!
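The difference between the string literal and the regex it produces can be checked directly. The following is a small sketch; the variable names raw_pattern and plain_pattern are chosen here for illustration only:

```python
import re

# In a normal string literal, "\b" is a single backspace character (0x08).
print(len("\b"), "\b" == "\x08")   # 1 True
# In a raw string literal, r"\b" is two characters: a backslash and "b".
print(len(r"\b"))                  # 2

subject = r"Var\boundary"            # contains one literal backslash
raw_pattern = r"Var\\boundary"       # regex \\ matches one literal backslash
plain_pattern = "Var\\\\boundary"    # the same regex without the r prefix

print(raw_pattern == plain_pattern)               # True
print(re.search(raw_pattern, subject) is not None)  # True

# Without r, the pattern starts with a backspace character, so it cannot match "Var".
print(re.search("\bVar", "Var"))   # None
print(re.search(r"\bVar", "Var") is not None)  # True
```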

re.escape:

Basically puts a backslash in front of any special character. Hence, if you expect a special character in TEXTO, you need to write:

if re.search(rf"\b(?=\w){re.escape(TEXTO)}\b(?!\w)", subject, re.IGNORECASE):
    print("match")

NOTE: For any version >= Python 3.7: !, ", %, ', ,, /, :, ;, <, =, >, @, and ` are not escaped. Only special characters with meaning in a regex are still escaped. _ is not escaped since Python 3.3 (see the Python documentation for re.escape).
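A quick way to see what re.escape does on your interpreter is to print a few samples. This is only a sketch with made-up sample strings; the exact output for characters such as % or @ depends on the Python version, as noted above:

```python
import re

# Sample inputs chosen for illustration only.
samples = ["a.b*c", "C++", "50%", "user@example.com", "snake_case"]

for text in samples:
    escaped = re.escape(text)
    # Whatever the version, the escaped pattern matches the original text literally.
    assert re.fullmatch(escaped, text)
    print(f"{text!r:>20} -> {escaped}")
```

On any Python 3 version the first line shows a.b*c turned into a\.b\*c; on 3.7 and later, 50% should come back unchanged.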

Curly braces:

If you want to use quantifiers within the regular expression using f-strings, you have to use double curly braces. Let's say you want to match TEXTO followed by exactly 2 digits:

if re.search(rf"\b(?=\w){re.escape(TEXTO)}\d{{2}}\b(?!\w)", subject, re.IGNORECASE):
    print("match")
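Printing the resulting pattern shows how the doubled braces collapse into a normal quantifier. A minimal sketch, assuming a sample value of "item" for TEXTO and made-up subject strings:

```python
import re

TEXTO = "item"  # sample value, assumed for illustration

# {{2}} in the f-string becomes the literal quantifier {2} in the regex.
pattern = rf"\b(?=\w){re.escape(TEXTO)}\d{{2}}\b(?!\w)"
print(pattern)  # \b(?=\w)item\d{2}\b(?!\w)

print(bool(re.search(pattern, "order ITEM42 shipped", re.IGNORECASE)))   # True
print(bool(re.search(pattern, "order item7 shipped", re.IGNORECASE)))    # False: one digit
print(bool(re.search(pattern, "order item123 shipped", re.IGNORECASE)))  # False: three digits
```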

Alternatively, with %-formatting:

if re.search(r"\b(?=\w)%s\b(?!\w)" % TEXTO, subject, re.IGNORECASE):

This will insert what is in TEXTO into the regex as a string. If TEXTO may contain special characters, pass re.escape(TEXTO) instead.
