SoulSync


BoulderBadgeDad 174513d351 Fix #769 : playlist sync matched wrong same-artist track with high confidence

Tracks NOT in the library were matched to a DIFFERENT song by the SAME artist and reported with high confidence instead of as missing, e.g. "Dani California" -> "Californication" (Red Hot Chili Peppers), "Under The Bridge" -> "Around the World".

Root cause: _calculate_track_confidence scores 0.5*title + 0.5*artist. A same-artist comparison always yields artist = 1.0, so the title score is the only thing that can tell two of an artist's songs apart. But that score is a SequenceMatcher CHARACTER ratio, which over-credits unrelated titles that share a long substring ("californi…" = 0.67) or just a stopword ("the" = 0.62). With the flat 0.5 artist term, anything clearing the weak 0.6 char floor lands at ~0.81-0.83, well over the 0.7 sync threshold. Reproduced on dev: both reported pairs score 0.81/0.83.

Fix: new core/text/title_match.py:titles_plausibly_same, called in _calculate_track_confidence right before the floor. It accepts a pair only when it's near-identical char-wise (>=0.85, so typos / punctuation / casing like "Beleive"->"Believe", "HUMBLE."->"Humble" still match) OR the titles share at least one significant (non-stopword) word. Two different songs by the same artist share no content word, so they're rejected and the real track is correctly reported missing. ("the" is a stopword; that's what leaked "Under The Bridge"/"Around the World".)

Scoped deliberately: the word-overlap test fires ONLY when at least one side has 2+ content words. For single-word titles there is no other word to share, so it defers to the existing char floor. Otherwise legitimate stylized spellings ("Grey"/"Gray", "Tonite"/"Tonight", "4ever"/"Forever") would become new false-negatives. Verified those still match. The few single-word variants that do score low (Ok/Okay, Thru/Through) were already rejected by the pre-existing length-ratio penalty, not by this gate.
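The root-cause arithmetic described above can be checked with a few lines of Python. This is only a sketch: `char_ratio` and `naive_confidence` are hypothetical stand-ins, not the project's `_calculate_track_confidence`, and lowercasing is an assumed normalization. It uses the standard-library `difflib.SequenceMatcher` to show how a character ratio plus a flat 0.5 artist term pushes both reported pairs over the 0.7 sync threshold.

```python
from difflib import SequenceMatcher

SYNC_THRESHOLD = 0.7  # sync threshold quoted in the commit message


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity (lowercasing is an assumed normalization)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_confidence(title_a: str, title_b: str, artist_score: float = 1.0) -> float:
    """Hypothetical stand-in for the pre-fix 0.5*title + 0.5*artist score."""
    return 0.5 * char_ratio(title_a, title_b) + 0.5 * artist_score


# The two false positives reported in the issue (same artist, so artist = 1.0).
pairs = [
    ("Dani California", "Californication"),
    ("Under The Bridge", "Around the World"),
]

for wanted, found in pairs:
    title = char_ratio(wanted, found)
    total = naive_confidence(wanted, found)
    verdict = "MATCH" if total >= SYNC_THRESHOLD else "missing"
    print(f"{wanted!r} vs {found!r}: title={title:.3f} confidence={total:.3f} -> {verdict}")
```

Both pairs come out as MATCH here, which is the behavior the commit sets out to remove.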
Both reported false positives now score 0.33/0.31 -> missing. Does NOT address the harder case of two different same-artist songs that DO share a content word (e.g. "Believe"/"Believer"); that case is pre-existing and unworsened. Any residual error fails safe: a false-missing is re-downloaded/wishlisted, vs the old behavior which silently substituted the wrong song.

Tests: tests/test_title_match_guard.py (14), with pure-guard unit tests + a 13-pair battery driving the REAL _calculate_track_confidence (genuine matches stay >=0.7, same-artist different songs drop below), plus an explicit no-regression test for stylized single-word spellings. 292 matching/sync tests pass.

2 weeks ago
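The guard described in the commit message could look roughly like the following. This is a minimal sketch under stated assumptions, not the contents of core/text/title_match.py: the stopword list, the tokenizer, and the helper names `_tokens` and `_content_words` are all hypothetical. Only the 0.85 near-identical threshold, the "share a content word" rule, and the 2+ content-word scoping come from the commit text.

```python
import re
from difflib import SequenceMatcher

# Assumed stopword list; the real list used by the project is not shown.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}
NEAR_IDENTICAL = 0.85  # near-identical char threshold from the commit message


def _tokens(title):
    """Lowercase alphanumeric tokens (hypothetical normalization)."""
    return re.findall(r"[a-z0-9]+", title.lower())


def _content_words(title):
    """Significant (non-stopword) words of a title."""
    return {word for word in _tokens(title) if word not in STOPWORDS}


def titles_plausibly_same(a, b):
    """Return False only when two titles look like different songs."""
    norm_a, norm_b = " ".join(_tokens(a)), " ".join(_tokens(b))
    # Near-identical character-wise: typos, punctuation, casing.
    if SequenceMatcher(None, norm_a, norm_b).ratio() >= NEAR_IDENTICAL:
        return True
    words_a, words_b = _content_words(a), _content_words(b)
    # Single-word titles on both sides: defer to the existing char floor.
    if max(len(words_a), len(words_b)) < 2:
        return True
    # Otherwise require at least one shared content word.
    return bool(words_a & words_b)


print(titles_plausibly_same("Dani California", "Californication"))
print(titles_plausibly_same("Under The Bridge", "Around the World"))
print(titles_plausibly_same("HUMBLE.", "Humble"))
print(titles_plausibly_same("Grey", "Gray"))
```

In this sketch the caller would treat a False result as "not a match" before applying the character floor. As the commit notes, a pair such as "Believe"/"Believer" still passes the guard.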
__init__.py	basic db structure	10 months ago
music_database.py	Fix #769 : playlist sync matched wrong same-artist track with high confidence	2 weeks ago
personalized_schema.py	Personalized pipeline: auto-refresh stale snapshots after watchlist scan	4 weeks ago