FuzzyWuzzy
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
Requirements
Python 2.7 or higher
difflib
python-Levenshtein (optional, provides a 4-10x speedup in String Matching, though may result in differing results for certain cases)
For testing
pycodestyle
hypothesis
pytest
Using PIP via PyPI
pip install fuzzywuzzy
or the following to install python-Levenshtein too
pip install fuzzywuzzy[speedup]
Using PIP via Github
pip install git+git://github.com/seatgeek/[email protected]#egg=fuzzywuzzy
Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)
git+ssh://[email protected]/seatgeek/[email protected]#egg=fuzzywuzzy
Manually via GIT
git clone git://github.com/seatgeek/fuzzywuzzy.git fuzzywuzzy
cd fuzzywuzzy
python setup.py install
Usage
>>> from fuzzywuzzy import fuzz
>>> from fuzzywuzzy import process
Simple Ratio
>>> fuzz.ratio("this is a test", "this is a test!")
97
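As a rough illustration of what a simple ratio measures, here is a minimal sketch built only on difflib from the standard library (listed under Requirements). The name simple_ratio is an assumption made for this example and is not part of fuzzywuzzy; with python-Levenshtein installed, the package's own scores may differ in some cases.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings (illustrative only)."""
    if not a or not b:
        # Treat an empty input as "no similarity" in this sketch.
        return 0
    # SequenceMatcher.ratio() is 2 * matches / (len(a) + len(b)).
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


print(simple_ratio("this is a test", "this is a test!"))
```

The two strings share 14 matching characters out of 29 in total, so the score is 2 * 14 / 29, rounded to a whole percentage.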
Partial Ratio
>>> fuzz.partial_ratio("this is a test", "this is a test!")
100
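The idea behind a partial ratio is to score the shorter string against the best-matching substring of the longer one. The sketch below does this with a brute-force sliding window over difflib; it is a simplification chosen for readability, not the package's actual implementation, and partial_ratio_sketch is a hypothetical name.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(a: str, b: str) -> int:
    """Best 0-100 score of the shorter string against any equal-length
    window of the longer string (brute force, illustrative only)."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if not shorter:
        return 0
    best = 0.0
    # Slide a window the size of the shorter string across the longer one.
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


print(partial_ratio_sketch("this is a test", "this is a test!"))
```

Because "this is a test" appears verbatim inside "this is a test!", one window matches exactly and the trailing "!" no longer lowers the score.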
Token Sort Ratio
>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100
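Token sorting makes the comparison insensitive to word order: both strings are split into tokens, the tokens are sorted, and the rejoined strings are compared. A minimal sketch, assuming whitespace tokenization and lowercasing (the package's own preprocessing may do more); token_sort_ratio_sketch is a hypothetical name.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Compare two strings after sorting their whitespace-separated tokens."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    if not sorted_a or not sorted_b:
        return 0
    return int(round(100 * SequenceMatcher(None, sorted_a, sorted_b).ratio()))


print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

Both inputs sort to "a bear fuzzy was wuzzy", so the swapped first two words stop affecting the score.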
Token Set Ratio
>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
100
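A token set comparison additionally ignores repeated words. One way to sketch it: split each string into a set of tokens, build strings from the shared tokens and from the shared tokens plus each side's leftovers, and keep the best pairwise score. This is an illustration under those assumptions rather than the package's code, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    """0-100 difflib similarity; empty input scores 0 in this sketch."""
    if not a or not b:
        return 0
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_ratio_sketch(a: str, b: str) -> int:
    """Compare token sets so that duplicates and word order are ignored."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))
    combined_a = (common + " " + only_a).strip()
    combined_b = (common + " " + only_b).strip()
    # Keep the most favourable of the three pairwise comparisons.
    return max(
        _ratio(common, combined_a),
        _ratio(common, combined_b),
        _ratio(combined_a, combined_b),
    )


print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```

Both inputs reduce to the same set of four tokens, so the duplicated "fuzzy" that lowers the token sort score has no effect here.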
Process
>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
[('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
('Dallas Cowboys', 90)
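Conceptually, extracting matches means scoring the query against every choice and returning the top results. The sketch below shows that ranking step with a plain case-insensitive difflib ratio; extract_sketch and ratio are hypothetical names, and because the package's default scorer is more elaborate, the scores printed here need not equal the ones shown above.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Case-insensitive 0-100 similarity (illustrative scorer)."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def extract_sketch(query, choices, limit=5):
    """Return up to `limit` (choice, score) pairs, best score first."""
    scored = [(choice, ratio(query, choice)) for choice in choices]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
for choice, score in extract_sketch("new york jets", choices, limit=2):
    print(choice, score)
```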
You can also pass additional parameters to the extractOne method to make it use a specific scorer. A typical use case is to match file paths:
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)
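A scorer is simply a callable that takes two strings and returns a score, so swapping it changes which candidate wins. The sketch below makes that concrete with a deliberately simple, hypothetical scorer (token_overlap) that is not part of fuzzywuzzy; extract_one_sketch is likewise an assumed name, and the two paths are the ones from the example above.

```python
import re
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Case-insensitive 0-100 difflib similarity (illustrative default)."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def token_overlap(query: str, candidate: str) -> int:
    """Hypothetical scorer: percentage of query words found in the candidate."""
    query_tokens = set(re.findall(r"\w+", query.lower()))
    candidate_tokens = set(re.findall(r"\w+", candidate.lower()))
    if not query_tokens:
        return 0
    return int(round(100 * len(query_tokens & candidate_tokens) / len(query_tokens)))


def extract_one_sketch(query, choices, scorer=simple_ratio):
    """Return the single best (choice, score) pair under the given scorer."""
    return max(((c, scorer(query, c)) for c in choices), key=lambda pair: pair[1])


songs = [
    "/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3",
    "/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3",
]
print(extract_one_sketch("System of a down - Hypnotize - Heroin", songs, scorer=token_overlap))
```

With token_overlap, only the second path contains every word of the query, so it is selected regardless of how much extra text surrounds the match.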
Known Ports
FuzzyWuzzy is being ported to other languages too! Here are a few ports we know about:
Java: xpresso’s fuzzywuzzy implementation
Java: fuzzywuzzy (java port)
Rust: fuzzyrusty (Rust port)
JavaScript: fuzzball.js (JavaScript port)
C++: Tmplt/fuzzywuzzy
C#: fuzzysharp (.Net port)
Go: go-fuzzywuzzy (Go port)
Free Pascal: FuzzyWuzzy.pas (Free Pascal port)
Kotlin multiplatform: FuzzyWuzzy-Kotlin
R: fuzzywuzzyR (R port)