Tracks NOT in the library were matched to a DIFFERENT song by the SAME artist
and reported with high confidence instead of as missing — e.g. "Dani
California" -> "Californication" (Red Hot Chili Peppers), "Under The Bridge"
-> "Around the World".
Root cause: _calculate_track_confidence scores 0.5*title + 0.5*artist. A
same-artist comparison always yields artist = 1.0, so the title score is the
only thing that can tell two of an artist's songs apart — but that score is a
SequenceMatcher CHARACTER ratio, which over-credits unrelated titles that
share a long substring ("californi…" = 0.67) or just a stopword ("the" =
0.62). With the flat 0.5 artist term, any title clearing the weak 0.6 char
floor scores at least 0.5*0.6 + 0.5 = 0.80, well over the 0.7 sync threshold.
Reproduced on dev: the two reported pairs score 0.81/0.83.
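Repro sketch (not the project code): assuming the title score is a plain
difflib.SequenceMatcher ratio over lowercased titles and the artist score is
1.0 for a same-artist comparison, the 0.5*title + 0.5*artist formula can be
checked in isolation. char_ratio and naive_confidence are hypothetical names
used only for illustration.

```python
from difflib import SequenceMatcher


def char_ratio(a: str, b: str) -> float:
    # Character-level similarity; lowercasing is an assumed normalization.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_confidence(title_a: str, title_b: str, artist_score: float = 1.0) -> float:
    # Flat 50/50 blend as described above; artist_score is 1.0 for same artist.
    return 0.5 * char_ratio(title_a, title_b) + 0.5 * artist_score


REPORTED_PAIRS = [
    ("Dani California", "Californication"),
    ("Under The Bridge", "Around the World"),
]

for wanted, matched in REPORTED_PAIRS:
    title = char_ratio(wanted, matched)
    total = naive_confidence(wanted, matched)
    print(f"{wanted!r} -> {matched!r}: title={title:.3f} confidence={total:.3f}")
```

Under these assumptions both pairs clear the 0.6 title floor and the 0.7 sync
threshold even though the titles name different songs.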
Fix: new core/text/title_match.py:titles_plausibly_same, called in
_calculate_track_confidence right before the floor. It accepts a pair only
when it's near-identical char-wise (>=0.85, so typos / punctuation / casing
like "Beleive"->"Believe", "HUMBLE."->"Humble" still match) OR the titles
share at least one significant (non-stopword) word. Two different songs by the
same artist usually share no content word, so they're rejected and the real
track is correctly reported missing. ("the" is a stopword, so it no longer
counts as overlap; it is what leaked "Under The Bridge"/"Around the World".)
Scoped deliberately: the word-overlap test fires ONLY when at least one side
has 2+ content words. For single-word titles there is no other word to share,
so it defers to the existing char floor — otherwise legitimate stylized
spellings ("Grey"/"Gray", "Tonite"/"Tonight", "4ever"/"Forever") would become
new false-negatives. Verified those still match. The few single-word variants
that do score low (Ok/Okay, Thru/Through) were already rejected by the
pre-existing length-ratio penalty, not by this gate.
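Sketch of the gate and its scoping as described above. This is NOT the code
in core/text/title_match.py: the stopword list, the tokenizer, and the name
titles_plausibly_same_sketch are assumptions made for illustration; only the
0.85 near-identical cutoff and the 2+ content-word rule come from this
description.

```python
import re
from difflib import SequenceMatcher

# Assumed stopword list; the real list may differ.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "in", "on", "to", "for", "with", "at", "by"}
)
NEAR_IDENTICAL = 0.85  # char ratio at or above this is accepted outright


def content_words(title: str) -> set:
    # Lowercase, split on non-alphanumerics, drop stopwords (assumed tokenizer).
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return {t for t in tokens if t not in STOPWORDS}


def titles_plausibly_same_sketch(a: str, b: str) -> bool:
    # 1) Near-identical char-wise: typos, punctuation, casing.
    if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= NEAR_IDENTICAL:
        return True
    words_a, words_b = content_words(a), content_words(b)
    # 2) Scoping: with fewer than 2 content words on both sides there is no
    #    other word to share, so defer to the existing char floor.
    if max(len(words_a), len(words_b)) < 2:
        return True
    # 3) Otherwise require at least one shared content word.
    return bool(words_a & words_b)
```

With these assumptions the two reported pairs are rejected, while "Grey"/"Gray",
"Tonite"/"Tonight" and "4ever"/"Forever" pass through to the char floor
unchanged.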
Both reported false positives now score 0.33/0.31 -> missing. Does NOT address
the harder case of two different same-artist songs that DO share a content
word (e.g. "Believe"/"Believer"); that case is pre-existing and no worse than
before. Any residual error fails safe: a false-missing is
re-downloaded/wishlisted, whereas the old behavior silently substituted the
wrong song.
Tests: tests/test_title_match_guard.py (14 tests): pure-guard unit tests + a
13-pair battery driving the REAL _calculate_track_confidence (genuine matches
stay >=0.7, same-artist different songs drop below), plus an explicit
no-regression test for stylized single-word spellings. 292 matching/sync tests
pass.