python – Find the similarity metric between two strings

There is a built-in: SequenceMatcher from the standard-library difflib module.

from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

Using it:

>>> similar("Apple", "Appel")
0.8
>>> similar("Apple", "Mango")
0.0
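
A common next step is picking the closest match from a list of candidates. The following is a minimal sketch, assuming a made-up word list named fruits and a misspelled query; get_close_matches is part of difflib and ranks candidates by the same SequenceMatcher ratio.

```python
from difflib import SequenceMatcher, get_close_matches

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

# Hypothetical sample data, used only for illustration.
fruits = ["Apple", "Mango", "Maple", "Grape"]
query = "Appel"

# Rank every candidate against the query, best match first.
ranked = sorted(fruits, key=lambda word: similar(query, word), reverse=True)
print(ranked[0])

# get_close_matches does the ranking for us and drops
# every candidate whose ratio is below the cutoff.
close = get_close_matches(query, fruits, n=2, cutoff=0.6)
print(close)
```

The cutoff of 0.6 is the library default; raising it makes the matching stricter.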

You may be looking for an algorithm that measures the distance between two strings. Here are some to consider:

  1. Hamming distance
  2. Levenshtein distance
  3. Damerau–Levenshtein distance
  4. Jaro–Winkler distance
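
To make the first two entries concrete, here is a minimal pure-Python sketch of Hamming and Levenshtein distance. The function names are my own, and the code is meant for illustration rather than speed; a dedicated library will be faster on long strings.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed
    # so far (before the current character) and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i chars of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(hamming_distance("karolin", "kathrin"))
print(levenshtein_distance("kitten", "sitting"))
```

Hamming distance only applies to strings of the same length, which is why the sketch raises an error otherwise; Levenshtein distance has no such restriction.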

Solution #1: Python built-in

Use SequenceMatcher from difflib.

pros: part of the Python standard library, so no extra package is needed.
cons: limited; it offers a single similarity measure, while many other good string-similarity algorithms exist.

Example:

>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75
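
It can help to see where the ratio comes from. According to the difflib documentation, ratio() is 2.0*M/T, where M is the number of matched elements and T is the combined length of both sequences. A small sketch that recomputes it by hand (the variable names are my own):

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
s = SequenceMatcher(None, a, b)

# Each block is (start in a, start in b, size); the final block
# is a zero-length sentinel, so it adds nothing to the sum.
blocks = s.get_matching_blocks()
matched = sum(block.size for block in blocks)

total = len(a) + len(b)
manual_ratio = 2.0 * matched / total

print(blocks)
print(manual_ratio, s.ratio())
```

Here the only real match is the run "bcd", so M is 3 and T is 8.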

Solution #2: jellyfish library

It's a very good library with good coverage and few issues.
It supports:
– Levenshtein Distance
– Damerau-Levenshtein Distance
– Jaro Distance
– Jaro-Winkler Distance
– Match Rating Approach Comparison
– Hamming Distance

pros: easy to use, wide range of supported algorithms, well tested.
cons: not part of the standard library, so it must be installed separately (pip install jellyfish).


>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
>>> # newer jellyfish releases name the next function jaro_similarity
>>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
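
If installing a package is not an option, the Jaro measure is short enough to write by hand. The following is a pure-Python sketch of the textbook definition, not jellyfish's own implementation; the function name and structure are my own, and results may differ from a library in edge cases.

```python
def jaro_similarity(s1, s2):
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and
    # no further apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that line up in a different order
    # are counted as half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(x != y for x, y in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


print(round(jaro_similarity("jellyfish", "smellyfish"), 4))
print(round(jaro_similarity("MARTHA", "MARHTA"), 4))
```

Unlike the edit distances above, a higher Jaro value means the strings are more alike.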
