Team Matcher

Fuzzy-match football team names across data feeds. Try the open-source algorithm we built and use in production at Scorecast. It handles abbreviations, accents, alternate-language names, swapped home/away sides, and inconsistent league names.

Try an example:

Feed A (your data): home team, away team, league / competition, kickoff (local)

Feed B (candidate): home team, away team, league / competition, kickoff (local)

Final score: 1.000 / 1.000 (high confidence, time bonus +0.200)

Component scores

Home team similarity (×0.4): 1.000

Away team similarity (×0.4): 1.000

League similarity (×0.2): 0.000

Kickoff-time bonus (additive, max +0.20): 0.200

base = 0.4 × 1.000 + 0.4 × 1.000 + 0.2 × 0.000 = 0.800; final = 0.800 + 0.200 (time bonus) = 1.000
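The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the weighting shown here, not the package's code: the function name `combine_scores` is hypothetical, and capping the final score at 1.0 is an assumption suggested by the "/ 1.000" display.

```python
def combine_scores(home_sim, away_sim, league_sim, time_bonus=0.0):
    """Return (base, final) for one pair of fixtures.

    The weights 0.4 / 0.4 / 0.2 come from the component table; the cap
    at 1.0 is an assumption, not confirmed behaviour of the package.
    """
    base = 0.4 * home_sim + 0.4 * away_sim + 0.2 * league_sim
    final = min(1.0, base + time_bonus)
    return base, final


# Values from the example: both team names match, the league names do not.
base, final = combine_scores(1.0, 1.0, 0.0, time_bonus=0.2)
print(round(base, 3), round(final, 3))  # 0.8 1.0
```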

Tokens after normalization & stop-word filter

A: Man Utd → manchester united

B: Manchester United FC → manchester united

A: Liverpool → liverpool

B: Liverpool FC → liverpool

Use this in your code

The exact algorithm shown above is open-source under the MIT license. Pure Python, zero dependencies, fully typed.

$ pip install team-matcher

from team_matcher import match_fixture


Why team-name matching is hard

Anyone who has joined data from two different sports providers has hit the wall: Man Utd vs Manchester United FC, Real Madrid CF vs Real Madrid, Bayern München vs FC Bayern Munich. Naive equality fails. Off-the-shelf string distance (e.g. difflib.SequenceMatcher) is fragile — it scores Manchester United vs Manchester City at over 80%, which is dangerously high.
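The problem is easy to reproduce with the standard library alone. The snippet below compares two different clubs and then two spellings of the same club using `difflib.SequenceMatcher`; the helper name `raw_ratio` is just for illustration.

```python
from difflib import SequenceMatcher


def raw_ratio(a: str, b: str) -> float:
    """Plain character-level similarity, with no normalization at all."""
    return SequenceMatcher(None, a, b).ratio()


# Two different clubs that share the long prefix "Manchester ".
different_clubs = raw_ratio("Manchester United", "Manchester City")

# The same club, written two ways by two feeds.
same_club = raw_ratio("Man Utd", "Manchester United FC")

print(f"different clubs: {different_clubs:.4f}")  # above 0.80
print(f"same club:       {same_club:.4f}")
```

On raw strings the two different clubs score higher than the two spellings of the same club, which is the opposite of what a matcher needs.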

How the algorithm works

1. Tokenize

Strip accents, drop parentheticals such as (W) or (II), drop age tags such as U21, then split on whitespace and punctuation.
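A minimal sketch of this step, using only the standard library. The regular expressions and the function name `tokenize` are assumptions made for illustration and may differ from what the package does. Lowercasing is included because the demo tokens are shown in lowercase.

```python
import re
import unicodedata

PARENTHETICAL = re.compile(r"\([^)]*\)")   # "(W)", "(II)", ...
AGE_TAG = re.compile(r"\bu\d{2}\b")        # "u21", "u19", ... (after lowercasing)
SEPARATORS = re.compile(r"[^a-z0-9]+")     # whitespace and punctuation


def tokenize(name: str) -> list[str]:
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    text = unaccented.lower()
    text = PARENTHETICAL.sub(" ", text)
    text = AGE_TAG.sub(" ", text)
    # Split on anything that is not a letter or digit; discard empty pieces.
    return [tok for tok in SEPARATORS.split(text) if tok]


print(tokenize("Bayern München (W)"))   # ['bayern', 'munchen']
print(tokenize("Atlético Madrid U21"))  # ['atletico', 'madrid']
```

Note that this sketch only removes accents. It does not map alternate-language spellings onto each other (for example München and Munich), and it does not expand abbreviations such as Man Utd.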

2. Filter stop-words

Drop generic designators (fc, sc, cf, real, atletico) and language particles. Keep distinguishing words like united, city.
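The filter itself can be as simple as a set lookup. In the sketch below, the designators come from the list above, while the language particles (`de`, `la`, `el`) are assumed examples, as is the fallback that keeps the original tokens when every token is a stop-word.

```python
# Designators named in the text, plus a few assumed language particles.
STOP_WORDS = frozenset({"fc", "sc", "cf", "real", "atletico", "de", "la", "el"})


def filter_stop_words(tokens):
    """Drop generic tokens but never return an empty list for a non-empty name."""
    kept = [tok for tok in tokens if tok not in STOP_WORDS]
    # Assumption: if everything was filtered out, fall back to the raw tokens
    # so the name can still be compared with something.
    return kept or list(tokens)


print(filter_stop_words(["manchester", "united", "fc"]))  # ['manchester', 'united']
print(filter_stop_words(["real", "madrid", "cf"]))        # ['madrid']
```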

3. Hybrid similarity

sim = 0.4 × jaccard + 0.6 × containment. Containment normalises by the smaller token set, so a name whose tokens are a subset of the other's, such as Olancho vs Olancho FC, gets a containment of 1.0.
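The formula can be written over token sets as follows. This is a sketch rather than the package's implementation: the function names are hypothetical, and returning 0.0 for empty sets is an assumption.

```python
def jaccard(a: set, b: set) -> float:
    """Shared tokens divided by all distinct tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a: set, b: set) -> float:
    """Shared tokens divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def hybrid_similarity(a: set, b: set) -> float:
    return 0.4 * jaccard(a, b) + 0.6 * containment(a, b)


# Identical token sets after filtering: both terms are 1.0.
print(round(hybrid_similarity({"olancho"}, {"olancho"}), 3))  # 1.0

# One shared token out of three distinct tokens: 0.4 * 1/3 + 0.6 * 1/2.
print(round(hybrid_similarity({"manchester", "united"},
                              {"manchester", "city"}), 3))    # 0.433
```

With this weighting, Manchester United vs Manchester City lands well below the raw character-level ratio, because the differing token counts for as much as the shared one.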

4. Kickoff-time bonus

League names are wildly inconsistent across feeds, so league similarity alone is an unreliable signal. If the two fixtures kick off within 30 minutes of each other, add up to +0.20 to the score. This single rule typically raises cross-feed match rates from ~10% to over 65%.
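One way to implement such a bonus is sketched below. The 30-minute window and the +0.20 maximum come from the text. The linear taper from the full bonus at identical kickoffs down to zero at the edge of the window is an assumption, since the text only says "up to", and the name `kickoff_bonus` is hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)
MAX_BONUS = 0.20


def kickoff_bonus(a, b):
    """Additive bonus for fixtures that kick off close together."""
    if a is None or b is None:
        return 0.0  # no bonus when either feed lacks a kickoff time
    gap = abs(a - b)
    if gap > WINDOW:
        return 0.0
    # Assumed shape: full bonus at zero gap, tapering linearly to 0 at 30 min.
    return MAX_BONUS * (1 - gap / WINDOW)


kickoff = datetime(2026, 4, 27, 19, 45)
print(kickoff_bonus(kickoff, kickoff))                          # 0.2
print(kickoff_bonus(kickoff, kickoff + timedelta(minutes=45)))  # 0.0
```

Both datetimes need to be in the same timezone convention (both naive or both timezone-aware). Subtracting a naive datetime from an aware one raises a `TypeError`, so feeds that report local kickoff times should be normalised first.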

Open-source & self-hostable

The algorithm above is published as an MIT-licensed Python package with zero dependencies and a full test suite. Drop it into any data pipeline.

pip install team-matcher

from datetime import datetime
from team_matcher import Candidate, match_fixture

candidates = [
    Candidate(
        home="Manchester United FC",
        away="Liverpool FC",
        league="Premier League",
        kickoff=datetime(2026, 4, 27, 19, 45),
        payload="match_id_123",
    ),
]
m = match_fixture("Man Utd", "Liverpool", "EPL",
                  candidates, kickoff=datetime(2026, 4, 27, 19, 45))
print(m.score, m.candidate.payload)  # 1.0 match_id_123
