mirror of https://github.com/Nezreka/SoulSync.git
main
dev
video
fix/disable-beatport-features
johnbaumb-discover-redesign
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
2.1
2.2
2.3
2.4.0
2.4.1
2.4.2
2.5.0
2.5.1
2.5.2
2.5.3
2.5.4
2.5.5
2.5.6
2.5.7
2.5.9
2.6.0
2.6.1
2.6.2
2.6.3
2.6.4
2.6.5
2.6.6
2.6.7
2.6.8
2.6.9
2.7.0
2.7.1
2.7.2
2.7.3
2.7.4
2.7.5
2.7.6
2.7.7
2.7.8
2.7.9
2.8.0
v0.65
${ noResults }
3 Commits (2fcdfd3145d415355b9202c406d8985fda8eb003)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
bba0836324 |
Fix #768: playlist sync editor refusing to match certain tracks
Three compounding bugs hit tracks whose source metadata is YouTube/streaming- shaped — title "Artist - Song", artist "Official Artist"/"Artist - Topic"/ "ArtistVEVO" (reported: "Arctic Monkeys - Do I Wanna Know?" by "Official Arctic Monkeys"). Server-agnostic — affects Plex/Jellyfin/Navidrome, not just the reporter's Navidrome. Bug A — the match fails. The confidence scorer and the editor's reconcile both compared the raw "Artist - Song" title against the library's clean "Song"; the length-ratio penalty + floor drove it to ~0.18 (NO-MATCH), so the track showed unmatched while its server copy showed as an orphan "extra". New pure core/text/source_title.py (clean_source_artist / strip_artist_prefix / canonical_source_track) strips the channel/video decoration, applied at BOTH matching seams: services/sync_service._find_track_in_media_server (tries raw then canonical, keeps the best) and the editor reconcile. Conservative: a title prefix is stripped only when it equals the artist, so "Self-Titled", "Jay-Z", and "Marvin Gaye" (by another artist) are untouched, and the canonical form is an additional best-of candidate so it can only help. Bug B — manual matches never persisted. get_server_playlist_tracks built the per-source entry WITHOUT source_track_id, so "Find & add" posted an empty id and _persist_find_and_add_match returned early. The match reverted to "extra" on reload and re-adding looped. The editor's 3-pass matcher is now lifted to a pure, tested core.sync.playlist_reconcile.reconcile_playlist that includes source_track_id (the frontend at pages-extra.js:1836 already reads + sends it). Bug C — manual match duplicated + delete wiped all copies. "Find & add" always inserted, so linking a source to an already-present server track appended a duplicate (pos 72, 73...); remove filtered out EVERY entry with the target id. New pure core.sync.playlist_edit (plan_playlist_add: link-don't-duplicate when the target is already present; remove_one_occurrence: drop a single copy) wired into the Plex/Jellyfin/Navidrome add + remove branches. Tests (extreme): tests/test_source_title.py (35), tests/test_playlist_reconcile.py (11 — incl. the reported case, parity for override/exact/fuzzy/extra, and duplicate-server handling), tests/test_playlist_edit.py (12). 286 matching/sync tests still pass. Caveats: the sync_service change and the add/remove/editor endpoints are read-verified, not executed against a live media server (none in CI). The pure cores they call are exhaustively unit-tested; output-shape parity of the reconcile lift is covered. Delete removes the first matching copy (duplicates are identical, so harmless). |
4 weeks ago |
|
|
174513d351 |
Fix #769: playlist sync matched wrong same-artist track with high confidence
Tracks NOT in the library were matched to a DIFFERENT song by the SAME artist
and reported with high confidence instead of as missing — e.g. "Dani
California" -> "Californication" (Red Hot Chili Peppers), "Under The Bridge"
-> "Around the World".
Root cause: _calculate_track_confidence scores 0.5*title + 0.5*artist. A
same-artist comparison always yields artist = 1.0, so the title score is the
only thing that can tell two of an artist's songs apart — but that score is a
SequenceMatcher CHARACTER ratio, which over-credits unrelated titles that
share a long substring ("californi…" = 0.67) or just a stopword ("the" =
0.62). With the flat 0.5 artist term, anything clearing the weak 0.6 char
floor lands at ~0.81-0.83, well over the 0.7 sync threshold. Reproduced on
dev: both reported pairs score 0.81/0.83.
Fix: new core/text/title_match.py:titles_plausibly_same, called in
_calculate_track_confidence right before the floor. It accepts a pair only
when it's near-identical char-wise (>=0.85, so typos / punctuation / casing
like "Beleive"->"Believe", "HUMBLE."->"Humble" still match) OR the titles
share at least one significant (non-stopword) word. Two different songs by the
same artist share no content word, so they're rejected and the real track is
correctly reported missing. ("the" is a stopword — that's what leaked "Under
The Bridge"/"Around the World".)
Scoped deliberately: the word-overlap test fires ONLY when at least one side
has 2+ content words. For single-word titles there is no other word to share,
so it defers to the existing char floor — otherwise legitimate stylized
spellings ("Grey"/"Gray", "Tonite"/"Tonight", "4ever"/"Forever") would become
new false-negatives. Verified those still match. The few single-word variants
that do score low (Ok/Okay, Thru/Through) were already rejected by the
pre-existing length-ratio penalty, not by this gate.
Both reported false positives now score 0.33/0.31 -> missing. Does NOT address
the harder case of two different same-artist songs that DO share a content
word (e.g. "Believe"/"Believer") — pre-existing and unworsened. Any residual
error fails safe: a false-missing is re-downloaded/wishlisted, vs the old
behavior which silently substituted the wrong song.
Tests: tests/test_title_match_guard.py (14) — pure-guard unit tests + a
13-pair battery driving the REAL _calculate_track_confidence (genuine matches
stay >=0.7, same-artist different songs drop below), plus an explicit
no-regression test for stylized single-word spellings. 292 matching/sync tests
pass.
|
4 weeks ago |
|
|
feb6778af4 |
Address Cin review: extract helpers, indexed pool fetch, tidy nits
Three changes folded into one perf+cleanup pass: 1. Indexed fast path for the per-artist pool fetch. The previous `search_tracks(artist=name)` call hit `unidecode_lower(artists.name) LIKE ?`, a function-in-WHERE that can't use `idx_artists_name`. New `MusicDatabase.get_artist_tracks_indexed` does a two-step lookup: exact-name match (indexed) plus a case-insensitive fallback, then `tracks WHERE artist_id IN (...)` via `idx_tracks_artist_id`. Drops per-artist fetch from seconds to milliseconds for the common case. The sync helper falls back to the old LIKE-based `search_tracks` only when the indexed lookup finds nothing, preserving diacritic recall and `tracks.track_artist` feature-artist matches with zero regression. 2. Public text-normalization helper. Lifted the body of `MusicDatabase._normalize_for_comparison` into `core/text/normalize.py:normalize_for_comparison` so callers outside the database layer (matching engine, sync pool, future import-side comparisons) don't reach across the module boundary into a leading-underscore "private" method. The DB method now delegates, so existing internal call sites stay untouched. Sync's lazy pool now imports the public helper. 3. Artist-name walker extracted. `_artist_name` at module level in `services/sync_service.py` replaces two near-identical inline str-or-dict-or-fallback walkers (one in `sync_playlist`, one in `_find_track_in_media_server`). Returns `''` for None instead of the literal string `'None'`. Plus three small tidies from the same review: - `_POOL_FETCH_LIMIT = 10000` constant in place of the literal at the pool-fetch call site. - Trimmed the verbose docstring + comment block on the pool helper. - Set-intersection predicate for the trigger-shape reset in `core/automation/api.py` instead of a two-line `or` chain. Also removed the duplicate `_get_active_media_client()` call at sync_service.py:212/214 — pre-existing wart that was sitting in the same block I was editing. Tests: 21 new tests across `tests/database/`, `tests/sync/`, and `tests/text/`, plus updates to the existing pool tests to cover the new fast/fallback split. Full suite stays green (3953 passing). |
1 month ago |