Users manually match an album to the regular edition, but enrichment/
repair keeps treating it as the deluxe (missing songs, renumbered tracks).
Root cause: an album has TWO identities — the enrichment match
(spotify_album_id, which manual-match sets and the worker already honors)
and a SEPARATE canonical version pin (canonical_album_id, added by #777).
The canonical pin is what track-number repair / reorganize / missing-track
detection actually read, and library_manual_match never wrote it — so it
was resolved independently and landed on the deluxe edition.
(So #777 did NOT solve #758: it added canonical pinning, but manual
matches didn't write the pin.)
Fix: a manual ALBUM match on a canonical-recognised source now also pins
AND locks the canonical version to the chosen release:
- new canonical_locked column (same migration pattern as the other
canonical cols).
- set_album_canonical(..., locked=False) gains an atomic WHERE-clause
guard: an auto write can't overwrite a locked pin; a manual write
(locked=True) always wins. get_album_canonical exposes `locked`.
- library_manual_match pins canonical for album matches via the pure
should_pin_manual_canonical(entity_type, source).
The auto resolve job already skips already-pinned albums, so the lock is
protected on two fronts; the new guard also covers any future
re-resolution. A new manual match still overrides.
18 tests: the pure gate (+ a sync-invariant test vs _ALBUM_ID_COLUMNS)
and the DB lock seam (auto can't clobber a manual lock; manual overrides;
auto-over-auto still works). Additive — locked defaults False, so the
auto path is unchanged unless a manual lock exists. Full suite clean.
#798 follow-up. The enrichment worker's own loop already bridges to the
no-creds Spotify Free source during a rate-limit ban (its guard checks
is_spotify_metadata_available()). But the resume button's pre-check
(_spotify_resume_pre_check) blocked resume on ANY rate-limit with no
awareness of Free — so a Free-opted-in user who got rate-limited was
locked out of restarting the worker, unable to fall through to the free
API.
Fix: the resume guard now mirrors the worker. Block only when
rate-limited AND nothing can serve (plain auth, no Free) — where resuming
would just sleep out the ban. When Free is available it serves during the
ban (is_spotify_authenticated() is False while banned, so
is_spotify_metadata_available() reports the free source), so resume is
allowed and the worker bridges via Free, then returns to real auth once
the ban lifts. Stays real-API-first; Free is only the bridge.
The rule is pinned in a pure helper should_block_rate_limited_resume()
next to the other gate functions, with 3 tests. Full suite clean (only
pre-existing soundcloud /app env failures remain).
A mis-grouped library track sits under a 'Various Artists' / '[Unknown
Album]' record while the file itself is correctly tagged. Write Tags
reads the DB and stamped that junk over the file — destroying the
correct tags. (Rematching never helped: a match only stores a source-ID
pointer, it never changes the local name/title the writer reads.)
Guard: never replace a real file value with a placeholder. Added at both
seams so preview and write agree:
- build_tag_diff marks such a field protected/no-change (the preview no
longer shows a wrong overwrite, and has_changes reflects reality),
- write_tags_to_file reads the file's current values and skips the
placeholder-over-real fields, preserving the file.
Field-agnostic and direction-safe: the guard fires ONLY when the DB
value is a placeholder AND the file holds a real one. A legitimate value
still writes — including a genuine 'Various Artists' album artist on a
real compilation, where the file has no conflicting real value, so the
guard doesn't fire. Every write_tags_to_file caller writes DB->file as a
correction, so blocking placeholder-over-real is correct for all of them
(the download post-process uses a different path, embed_source_ids).
23 new tests (placeholder detection, the guard fn, build_tag_diff on the
screenshot-#2 scenario, end-to-end FLAC write preserving real values and
still overwriting with real ones). 113 existing repair/retag/tag tests
pass.
The post-scan reconcile previously ran AFTER the worker's 'finished'
signal, which flips db_update_state status to 'finished'. Automations
wait for a scan by polling that status, so they stopped waiting before
the reconcile ran — and the dashboard/Tools card showed "Completed" then
flipped to "Reading file tags…". For incremental scans this was
invisible (sub-second); for a full refresh it was a real gap (a chained
automation would fire minutes before the IDs were filled).
Fix: the worker now routes completion through _emit_finished(), which
runs self.post_scan_hook (the reconcile) FIRST, then emits 'finished'.
The hook is injected by the web layer (it owns path resolution). So:
- status stays 'running' through the reconcile,
- the reconcile pushes its phase ("Reading file tags for N new tracks…")
and per-track progress through the SAME db_update_state callbacks the
scan already uses — so automations, the dashboard card, and the Tools
page all see it for free and wait for it,
- 'finished' is emitted exactly once, AFTER the reconcile — race-free, no
status blip a poll could catch,
- best-effort: a hook exception never blocks 'finished', so a scan can't
get stranded as perpetually 'running'.
Both scan entry points (_run_database_update_task, _run_deep_scan_task)
set the hook before run()/run_deep_scan(); the redundant post-run calls
are removed.
5 ordering tests pin the contract (hook-before-finished, finished still
fires without a hook, hook exception doesn't block finished, hook gets
the worker). Full suite clean (only pre-existing soundcloud /app env
failures remain).
Extends the manual "Import IDs from File Tags" backfill so newly-scanned
files get their embedded provider IDs pulled into the DB automatically —
no button press needed to keep up with new music.
How it works:
- insert_or_update_media_track now returns 'inserted' / 'updated' / False
(truthy-compatible; existing `if track_success` callers unaffected) so
the scan worker can tell a genuinely new row from an update.
- DatabaseUpdateWorker collects the ids it newly INSERTED this run
(self._new_track_ids) across all insert paths (Plex/Jellyfin/deep).
- After run()/run_deep_scan(), web_server calls _reconcile_after_scan(),
which gap-fills embedded IDs for just those new tracks. Runs as a
post-scan pass (the scan loop itself is untouched/fast — the media
server API never exposes these custom IDs, so the file must be read
once regardless; batching at the end keeps it out of the hot loop and
best-effort so it can never abort a scan). A progress phase ("Reading
file tags for N new tracks…") surfaces the full-refresh tail.
Shared engine:
- New reconcile_library() in core does the paging + lazy parent-map
loading (only loads albums/artists actually referenced — cheap when
scoped to a few new tracks) + per-page commits. BOTH the manual button
and the scan hook call it, so there's one tested orchestration, no
duplication. The backfill job was refactored onto it.
Same hardened safety: gap-fill only, atomically guarded against
overwrite, schema-introspected, idempotent. Scoped to new arrivals for
incremental/deep; full refresh re-inserts everything as new (recovering
the IDs a full-refresh wipe destroys).
+10 reconcile tests (reconcile_library scope/idempotency/progress/stop +
the engine). Full suite clean (only pre-existing soundcloud /app env
failures remain).
Files SoulSync (or MusicBrainz Picard) already tagged carry Spotify /
iTunes / MusicBrainz / Deezer / Tidal / AudioDB / Genius / Last.fm IDs in
their metadata. Enrichment workers gate their queues on
{provider}_match_status IS NULL, so reading those IDs back and gap-filling
the {provider}_id + match_status='matched' columns lets the workers skip
the API lookup entirely — big API savings on an already-tagged library.
New manual job in Tools -> Database & Scanning ("Import IDs from File
Tags"): scans every library file, reads embedded IDs, fills any that are
missing in the DB. Background job + progress card, mirroring the
write-tags-batch pattern.
core/library/embedded_id_reconcile.py (pure + tested):
- plan_reconcile(): gap-fill plan for a track + its album + artist. Only
empty id columns are planned; a disagreeing embedded id is a conflict,
never applied.
- apply_reconcile_plan(): one guarded UPDATE per id column —
WHERE id=? AND (col IS NULL OR col=''). The guard makes the fill atomic:
if an enrichment worker matched the same entity between our read and
this write, the UPDATE affects 0 rows instead of clobbering it. Columns
are introspected so a schema missing a provider's columns is skipped.
- reconcile_track_row(): per-track orchestration (id extraction, plan ->
apply, keeping the in-memory parent maps fresh for sibling tracks).
Job hardening: paged track scan (bounded memory), per-page commits (don't
starve concurrent workers), per-file try/finally (one bad file can't abort
the run), counters from real rowcount.
Scope: 19 column-fills across 8 providers. MB *recording* (track) id is
left out (UFID frame the reader doesn't surface; Vorbis key ambiguous) —
MB album+artist are covered. Amazon/ASIN deliberately excluded (ASIN is a
different namespace than the worker's amazon_id). All target columns
verified against the live schema.
Purely additive: new module, two new endpoints, one new Tools card —
no existing behavior changed. 20 unit tests (incl. the concurrency guard).
Full suite clean (only pre-existing soundcloud /app env failures remain).
AcoustID returns a recording's title/artist in their ORIGINAL script
(e.g. "久石譲" for Joe Hisaishi) while SoulSync's expected metadata is
romanized/English. A correct download then fails verification on two
walls: the title can never clear the 0.70 similarity bar cross-script,
and the only skip path that ignores the title required a near-perfect
0.95 fingerprint plus a resolved alias. Result: every non-English
artist trips it. Two complementary fixes, per the reporter's two ideas.
Graceful fix (automatic):
- New pure core/matching/script_compat.py detects when two strings are
in genuinely different writing systems (CJK/Hangul/Cyrillic/Greek/
Arabic/Hebrew/Thai vs Latin). Accented Latin (Beyoncé, Sigur Rós)
stays Latin — no false trigger.
- acoustid_verification.py: when the EXPECTED artist and the matched
artist span scripts AND the artist is confirmed via the existing
MusicBrainz alias bridge, SKIP instead of quarantine, without the
0.95 floor (the 0.80 trust floor already gates the fingerprint).
- Deliberately narrow: keyed on the ARTIST spanning scripts + being
confirmed. A same-script artist with only a cross-script title keeps
the stricter 0.95 floor, so the #607 wrong-file protection (Kendrick
R.O.T.C, low-fingerprint Japanese-title) is untouched.
Per-request toggle (manual escape hatch):
- New "Skip AcoustID verification" checkbox in the download-missing
modal beside "Force Download All".
- skip_acoustid threads request -> batch -> per-track track_info ->
download context (same path as _playlist_folder_mode), landing on
the existing _skip_quarantine_check='acoustid' bypass. No new
mechanism; only the AcoustID gate is bypassed (integrity/bit-depth
still run).
Tests:
- tests/matching/test_script_compat.py — script-boundary cases.
- test_acoustid_skip_logic.py — Joe Hisaishi SKIPs at 0.85; unconfirmed
cross-script artist still FAILs; same-script low-fingerprint still
FAILs.
- test_downloads_candidates.py — toggle injects the bypass; absent
toggle keeps verification.
Full suite: 5169 passed; only pre-existing soundcloud /app env failures
remain. Zero regressions.
Per the cleaner model: the free source only runs for users who explicitly picked
'Spotify Free' — not for every connected user. _free_wanted() is now just
_free_selected() (dropped the has-credentials auto-trigger). So:
- Plain 'Spotify' user, rate-limited -> waits out the ban as before (no surprise
background scraping, no ToS exposure for people who never chose free).
- 'Spotify Free' user, no auth -> free serves.
- 'Spotify Free' user who also connects an account -> official when healthy,
free bridges only during a rate-limit, then switches back.
Rewrote the metadata-source help text as a plain per-source list with a clear
note on how Spotify Free + a connected account interact. Gate tests updated to
pin the opt-in behavior (plain-Spotify ratelimit = no bridge; Spotify-Free
ratelimit = bridge).
Consistency fix: Spotify Free is now its own entry in the metadata-source
dropdown (alongside Spotify / iTunes / Deezer / MusicBrainz) instead of a
side-toggle. Stored as fallback_source='spotify' + spotify_free=true so all
downstream 'spotify' routing and the spotify_* columns are unchanged.
Refined gate model (no toggle):
- Connected user (has credentials) -> official; bridges to free AUTOMATICALLY
during a rate-limit ban (no opt-in needed).
- No-auth user -> must pick 'Spotify Free' in the dropdown; then free serves.
- Never opted into Spotify (no creds, didn't pick it) -> free never runs, so no
surprise scraping. _free_wanted() = has_credentials OR picked-spotify-free is
the guard.
- AUTHED + healthy -> official always; free never opens.
UI: dropdown gains 'Spotify Free (no credentials)' (selectable when the package
is installed — surfaced via status.free_installed, since selecting it is the
opt-in and can't depend on having selected it); load/save map the dropdown value
to the (fallback_source, spotify_free) pair; old checkbox removed.
Gate model pinned by 6 scenario tests (connected/healthy, connected/ratelimited
bridge, no-auth picked, no-auth not-opted-in, package-missing). 117 tests green.
Adds an opt-in no-creds Spotify metadata path: SpotifyClient serves SpotipyFree
data (real Spotify IDs) when metadata.spotify_free is enabled AND SpotipyFree is
installed AND there's no real Spotify auth. The data lands in the SAME spotify_*
columns, so the enrichment worker, search, and #775 lookups work UNCHANGED —
they just receive free-sourced data. The worker's only change is its availability
gate.
- core/spotify_free_metadata.py: SpotifyFreeMetadataClient + normalize_artist +
pure gate fns (should_use_free_fallback / should_offer_spotify_metadata /
spotify_free_installed).
- SpotifyClient: _free_enabled (opt-in, default OFF) / _free_available /
is_spotify_metadata_available / _free_active + in-client routing for
search_artists/tracks + get_album/artist/track_details/album_tracks/artist_albums.
- 3 scoped availability gates use is_spotify_metadata_available(): enrichment
worker loop, search resolve_client, watchlist. The ~40 discovery/user-library
sites stay auth-only (no sprawl, no user-data risk).
Authed users are untouched (gate closed when auth healthy). UI surfacing comes
next. Pure gates + normalizer tested; orchestrator resolve test added.
The #799 uniqueness-guard added a source_id_conflict(self.db, ...) call to the
MusicBrainz worker's artist path. Two TestWorkerAliasEnrichment fixtures build
the worker via __new__ and set only .database, not .db, so the new call raised
AttributeError and the artist was marked 'error'. Mirror the third fixture
(which already sets worker.db). Production always sets self.db in __init__ —
test-only gap exposed by the new code path.
Enumerates all 64 flag combinations and asserts should_rediscover matches the
verbatim pre-refactor inline logic for every case except the one intended fix
(manual_match beating a stale wing_it_fallback). Guarantees the auto Playlist
Pipeline behaves identically post-refactor — no regression.
A manually-fixed mirrored track silently reverted to 'Wing It' after re-running
discovery. Two compounding causes:
- extra_data is MERGED on save (update_mirrored_track_extra_data), and the
manual-fix DB write (web_server.py) didn't clear the prior wing_it_fallback
flag — so a track fixed after being a Wing It stub kept wing_it_fallback=True.
- the Playlist Pipeline pre-scan checked wing_it_fallback BEFORE manual_match
(if/elif), so the stale flag won: the track was re-discovered and, on a miss,
fell back to Wing It — discarding the user's pick.
Fix: extracted the pre-scan gate into core.discovery.manual_match.should_rediscover
(manual_match checked FIRST = authoritative, regardless of leftover flags), and
the manual-fix write now also clears wing_it_fallback/unmatched_by_user. Behavior
is identical for every other branch — only the manual-vs-wing-it ordering changes.
Tested at the seam incl. the exact regression (wing_it_fallback + manual_match
both set -> skip). 227 discovery tests green.
Ships the source-id cleanup to all users: a marker-gated one-time migration in
MusicDatabase init clears any source id (deezer/spotify/itunes/musicbrainz/
discogs/audiodb/qobuz/tidal) shared across differently-named artists — the
enrichment-corruption signature. Same-name cross-server duplicates are left
untouched (DISTINCT-name check). Cleared rows re-derive correct ids on the next
enrichment pass; the now name-guarded workers won't re-corrupt.
Runs once (CREATE TABLE _source_id_dedupe_v1 marker), idempotent, per-column
try/except so a missing column can't abort it. Test forces a re-run and asserts
corruption is cleared while a legit same-name dup survives.
Two complementary fixes to stop distinct artists ending up with the same source
id (the near-name collisions: ODESZA/odessa, Blance/Blanke, Lady A/Lady Gaga,
plus MusicBrainz's combined-score weak matches like Grant/Amy Grant):
- core/worker_utils.accept_artist_match() / source_id_conflict(): one shared,
tested gate. Rejects artist matches below 0.85 (stricter than the 0.80 used
for album/track titles, since short artist names false-positive easily) AND
refuses to store a source id a DIFFERENTLY-named artist already holds. A
same-named holder (one act across two media servers) is still allowed.
- Routed every artist-match worker through it: deezer, qobuz, tidal, discogs,
itunes, spotify (its scorer now uses the 0.85 threshold), audiodb, and
musicbrainz (conflict guard only — its matcher is combined-score, so the
guard is the net that catches its weak-name matches).
Centralizing in worker_utils avoids the copy-paste that let the original
album/track overwrite bug live in four workers at once. 17 new gate tests.
core/maintenance/dedupe_source_ids.py + scripts/dedupe_source_ids.py: find
source-id clusters held by differently-named artists (the enrichment-corruption
signature) and clear the id + match-status on those rows so the now-name-checked
workers re-derive each correctly on the next enrichment pass. Same-name
duplicates (one artist across two media servers) are left untouched.
Dry-run by default; --apply to write. 8 seam tests cover detection (corrupt vs
legit), dry-run safety, apply behaviour, and the no-op case.
The blind 'correct the parent artist's source id from an album/track match'
logic was copy-pasted into four enrichment workers; the Deezer fix only covered
one. AudioDB, Qobuz, and Tidal had the identical bug and would corrupt their own
id columns (and re-corrupt after any cleanup).
All three now gate the correction on a name match between the result's artist
and the parent artist (audiodb reads result['strArtist']; qobuz/tidal thread the
result artist name in from their callers, as Deezer does). Regression tests
cover mismatch-skips and match-corrects for each.
Root cause of the duplicate deezer_id corruption: when enriching an album or
track, _verify_artist_id 'corrected' the parent artist's deezer_id to the
search result's primary-artist id whenever they differed — with NO name check.
For a collaboration/compilation track (e.g. one our library credits to Jorja
Smith that lives on Kendrick Lamar's curated 'Black Panther' album), the result
resolves to Kendrick's album, so Kendrick's id (525046) got written onto Jorja,
Vince Staples, SOB X RBE, etc. — many artists ending up with the same id.
Now the correction only fires when the result's primary-artist NAME matches the
parent artist (the album/track-artist path now mirrors _process_artist, which
already name-checks). Mismatches are logged and skipped as collab/compilation.
Note: this prevents new corruption; existing wrong ids in a library aren't
auto-repaired (per-artist enrichment preserves an existing deezer_id).
A pasted Deezer artist link (or any Deezer-source artist click) opened the
wrong artist's header: deezer_id 525046 is stamped on 4 library rows (Kendrick
+ 3 others — an enrichment-corruption bug), and the library-upgrade lookup did
WHERE deezer_id=? LIMIT 1, grabbing an arbitrary row (Jorja Smith) while the
discography loaded fresh from Deezer (Kendrick) — a Frankenstein page.
find_library_artist_for_source now detects when a source id maps to >1 library
artist and refuses to guess: it skips the id-based upgrade (still allowing the
name fallback), so the caller renders the source artist directly — landing on
the correct artist. Unique ids are unaffected (no regression).
The underlying enrichment bug that writes one source id onto multiple artists
is separate and still worth a follow-up.
Spotify/Apple/MusicBrainz/Deezer artist links now resolve via each source's
get-by-id (get_artist / Deezer get_artist_info), shaped to the artist card and
rendered as an artist result that opens the artist detail page through the
existing flow. Album/track link handling is unchanged; bare IDs still rejected.
Follow-up to the bare-ID footgun: a bare number like 525046 carries no
source and no entity type, so it resolved to whatever album happened to own
that id (a user pasting Kendrick's Deezer artist id got an unrelated album).
Now the resolver accepts provider URLs (and the explicit spotify: URI) only;
a bare/unrecognized string is rejected and the dropdown surfaces a hint to
paste a full link. URL parsing + album/track resolution are unchanged.
New 'Link / ID' input on the Search page: paste a Spotify / Apple Music /
MusicBrainz / Deezer URL (or a bare ID) and it's looked up directly on the
owning source — no fuzzy search, no scoring.
- core/search/by_id.py: source-agnostic parser (URL domain/path or bare-ID
format -> source,kind,id; numeric IDs fan out, first hit wins) + per-source
get-by-id dispatch + adapters projecting each provider's dict onto the
standard album/track card shape.
- /api/enhanced-search/by-id: thin additive route over resolve_identifier.
- Frontend: dedicated input that adopts the resolved source as active and
renders through the existing dropdown + download/import flow.
Purely additive — existing files are insertion-only; the resolver runs only
behind the new route. 29 seam tests cover parsing, shaping, fan-out, and
not-found.
The album-bundle path COPIES slskd's completed files into private staging (then
on to the library) but never removed slskd's originals, so they piled up in the
download folder. (copy, not move, is correct for the torrent/usenet bundle paths
— those clients keep seeding — so the shared copier can't just always delete.)
Add an opt-in remove_source to copy_audio_files_atomically that deletes each
source ONLY after it copies successfully (never on a failed stage), and set it
for the Soulseek path only. Torrent/usenet keep their originals.
Tests: keeps source by default / removes when requested / keeps on failed copy.
#768 added canonical_source_track to the live-sync matcher and the playlist
editor reconcile, but NOT to the two paths that actually run for file/CSV
mirrored playlists: the discovery worker (core/discovery/playlist.py) and the
DB-only matcher (core/discovery/sync.py). YouTube playlists are cleaned at
ingest, so they matched; file playlists fed the raw 'Arctic Monkeys - Do I
Wanna Know?' title into search+scoring and never matched the library's clean
'Do I Wanna Know?' → reported missing / shown as 'extra'.
Add a conservative canonical best-of to both: score with the raw title AND the
canonicalized one, keep the better. canonical_source_track only strips an
'<artist> - ' prefix when it equals the artist, so it can only add a candidate.
Tests: _canonical_best_score seam (file-style match / clean title scored once /
keeps original when better).
start_playlist_sync validated the resolved mode with 'if sync_mode not in
(replace, append): sync_mode = replace' — a pre-existing clamp two lines below
the config read I added, which silently downgraded a configured 'reconcile' to
'replace'. So config=reconcile resolved correctly then got clobbered, and every
sync ran replace regardless of the setting (and incognito didn't help — it's
backend, not cache).
Replace the hand-rolled clamp with a pure normalize_sync_mode(requested,
configured) helper (VALID_SYNC_MODES includes reconcile) so the resolution is
testable and can't silently drop a mode again. Regression tests cover
reconcile-from-config, request-overrides-config, and unknown->replace.
Replace mode (default) deletes + recreates the server playlist every sync,
which wipes its custom image, description, and identity. Add an opt-in
'reconcile' sync mode that edits the existing playlist in place — adds the
tracks now in the source, removes the ones gone — without destroying the
object, so the user's custom art/description survive.
- Pure planner plan_playlist_reconcile(current, desired) -> {add, remove}.
- Per-client reconcile_playlist: Plex addItems/removeItems on the same object;
Navidrome Subsonic updatePlaylist delta (songIdToAdd / descending
songIndexToRemove); Jellyfin add + remove-by-PlaylistItemId on /Playlists/{id}/Items.
- sync_service: reconcile branch with a replace FALLBACK (if a server's in-place
edit is unavailable/fails, sync still succeeds destructively — logged loudly).
- Default stays 'replace' (no behavior change). New Settings > Playlist sync mode
picker (replace/reconcile/append) backed by playlist_sync.mode; per-request
sync_mode still overrides.
- Reconcile skips the post-sync source-image push so a custom poster isn't
re-clobbered (the bug).
Tests: planner (add/remove/dedupe/order/empty) + reconcile-or-replace dispatch
(success / false-fallback / exception-fallback / no-method). Per-server in-place
API calls need dev validation against real Plex/Jellyfin/Navidrome.
NOTE: opt-in only; default behavior unchanged.
Consistency follow-up: the Filler picked art via metadata source-priority +
its own prefer_source knob, ignoring metadata_enhancement.album_art_order —
the explicit cover-art source list that post-process embed and the Library
Re-tag job now use. So 'cover art sources' meant two different things.
Prefer select_preferred_art_url (the configured order) first; fall back to the
existing prefer_source / source-priority loop when no order is configured
(the default) — non-breaking, existing behavior unchanged for those users.
Help text updated. Test: configured order wins + skips the source loop.
User feedback (Sokhi): after changing cover-art sources, re-tag should
re-download fresh covers from THEM. The job took cover_url only from the
matched metadata source's album image, ignoring the user's configured
cover-art order. Now prefer select_preferred_art_url (the same
metadata_enhancement.album_art_order the post-process embed honors), falling
back to the source image when no order is configured (non-breaking).
'replace' cover mode already force-refreshes art on every matched album, and
the embed replaces existing art (no duplicate pictures) — so 'replace' + a
configured art order = fresh covers from those sources. Help text updated.
Tests: prefers configured source URL / falls back to source image when unset.
Find & Add on the playlist-sync page only wrote sync_match_cache, which is
DELETEd wholesale after every DB scan — so the source->library pairing (and
the user's manual matches) reverted to 'extra'/red-dot on the next shallow
scan. The three match stores (sync_match_cache, manual_library_track_matches,
discovery extra_data) were disconnected and all pointed at tracks.id, which a
rescan re-keys (esp. Jellyfin/Navidrome GUIDs).
Unify the match so it's one durable fact, recorded once, honored everywhere:
- Find & Add also writes a durable manual_library_track_matches row (one-way;
the manual-match tool has no playlist to act on, so no reverse). Carries the
library file path.
- New library_file_path column (idempotent migration) + find_track_id_by_file_path:
re-resolve a stale library_track_id after a rescan re-keys the track, and
self-heal the row.
- The sync compare display's override lookup now falls back to the durable
manual match (resolve_durable_match_server_id) when sync_match_cache misses —
so the pairing persists across a scan instead of reverting to a red dot.
Purely additive: only adds matches when the cache returns nothing.
Tests: durable resolver (valid / stale-reresolve+self-heal / no-match / not-in-
playlist / missing-methods), file_path persistence + find_track_id_by_file_path.
Follow-on to 07801aeb (the orphaned-file delete committed alone because the
git add aborted on the already-removed pathspec). Removes the two RetagDeps
tests in test_lyrics_reembed_from_sidecar (the dataclass was deleted with the
old Retag Tool) and a no-placeholder f-string in test_library_retag_job.
Removing the old Retag Tool (d91e6a38) deleted core/library/retag.py but left
its tests behind. tests/library/test_retag.py imported the gone module at
module top, so pytest aborted collection of the ENTIRE suite (CI red on every
run). tests/test_lyrics_reembed_from_sidecar.py had two RetagDeps tests for the
same removed dataclass (lazy imports — would fail at runtime once collection
proceeded).
Delete the orphaned test file + drop the two dead RetagDeps tests (the rest of
the lyrics-sidecar file is unrelated and stays). Also drop a pointless
f-string in test_library_retag_job. Suite now collects all 5008 tests.
A bare host like '192.168.1.5:8080' or 'qbittorrent.lan:8080' (no scheme)
is what users naturally type, but requests then raises 'No connection
adapters were found for ...' — it can't pick an http/https adapter, and a
bare host:port even gets misparsed as scheme=host. This surfaced as the
generic 'qbittorrent probe failed' with a 'login error: No connection
adapters were found' in the logs.
Add normalize_client_url() in torrent_clients/base: default a missing scheme
to http:// (+ trim trailing slash), and route all three adapters'
_load_config through it. Transmission normalizes the base before appending
/transmission/rpc.
Tests: normalizer unit cases + per-adapter regression (bare host -> http://).
Note: usenet adapters (sabnzbd/nzbget) share the same pattern and need the
same treatment in a follow-up.
Follow-up hardening to #789. The selection was keyed purely by folder name,
so renaming a music folder in Navidrome silently reverted the scan to all
libraries. Now persist the folder id (stable across renames) as the primary
key alongside the name (kept for display + back-compat), and restore by id
first with a name fallback. Self-heals on reconnect: pre-id installs and
drifted/renamed names get the id + fresh name written back, so the settings
dropdown keeps highlighting the right folder.
Tests: restore-by-id-after-rename (+ name heal), name-fallback self-heals id,
no-drift writes nothing.
The saved music-folder selection was silently dropped on every reconnect.
_setup_client's restore step called the public get_music_folders(), which
starts with ensure_connection() — but we're already inside ensure_connection()
at that point (_is_connecting=True, _connection_attempted not yet set), so the
re-entrant call bailed and returned []. The restore matched nothing,
music_folder_id stayed None, and the per-call musicFolderId filters all
no-op'd → scans imported every library regardless of the user's choice.
Surfaces after any restart or settings save (reload_config resets the state).
Split get_music_folders() into the public method (does the connection check)
and a non-reentrant _fetch_music_folders() seam; the restore now calls the
seam directly (connection is already established + ping succeeded by then).
Regression + seam tests added.
- depth setting (light = core tags + matched source ids; full = same
multi-source enrichment cascade a fresh download gets, run additively
via embed_source_ids). Threaded through scan/finding/auto-apply and the
repair_worker fix handler.
- source now defaults to 'auto' (= your source priority / active source)
instead of blank.
- give native <option> popups a solid dark background (were white-on-white).
- tests for full-depth full_meta payload + enrich invocation + light no-op.
The job was the odd one out — auto_fix=False, no dry_run setting, so it never
showed the 'Dry Run' badge the other jobs do (the badge keys off
settings.dry_run === true). Aligned it to the standard pattern:
- auto_fix=True + dry_run setting defaulting True. Default behavior is unchanged
(findings only, nothing written) AND it now shows the Dry Run badge.
- Turning dry_run off makes the scan auto-apply in place (result.auto_fixed),
no finding — the opt-in 'just retag it' mode.
- Extracted a shared apply_track_plans() used by both the scan auto-apply and
the repair_worker fix handler (handler now resolves Docker paths then
delegates — one code path, no duplication).
Tests: dry_run=False auto-applies + writes + no finding; existing dry-run
finding/skip/apply tests still green. 410 passing.
Closes the kettui gap — the orchestration was unproven. Injected-fake seam
tests (temp sqlite + real empty track files, no metadata APIs / no real tag
writes):
- embed_known_source_ids: builds the right canonical id_tags from flat db keys,
honors the musicbrainz embed gate, no-ops when there's nothing to write.
- library_retag scan: produces a detailed finding with the per-track old->new
diff + stamped source ids, and skips an album that's already correct.
- _add_source_ids: per-source key mapping.
- _fix_library_retag apply: writes each track's payload, and reports failure
when files are unreachable.
476 tests pass; ruff clean.
The old per-download Retag Tool was limited (only native-pipeline downloads,
100-group cap, manual per-group) and did the wrong thing — it moved/reorganized
files instead of just tagging. It's superseded by the new Library Re-tag job
(whole-library, in-place) + the enhanced-library 'Write Tags' button.
Removed: the post-download record_retag_download ingestion hook (stops writing
retag_groups on every download), core/library/retag.py, the web_server state +
deps + /api/retag/* endpoints + the tool:retag WebSocket emit, the dashboard
card + both modals (index.html), the core.js socket handler, and the tools-page
wiring + help entry (wishlist-tools.js). Updated the import-pipeline test.
Verified: web_server parses, app + core imports OK, 392 tests pass, no live
references to removed symbols.
Left as inert (harmless) for a careful follow-up sweep: the retag_groups/
retag_tracks tables + their DB CRUD methods (no longer written/read), and the
now-orphaned retag JS helper functions (no entry point/wiring/socket calls them;
interspersed with wishlist functions, so not blind-deleted).
The testable core for the new library-wide re-tag job. Given a source album's
metadata + tracklist and the library tracks' current file tags, it:
- matches source tracks to library tracks (disc+track number, then title sim),
- computes the per-field diff (old -> new) for the dry-run finding,
- builds the minimal write_tags_to_file payload — only fields that actually
change under the chosen mode (overwrite vs fill-missing), so applying never
touches unrelated/unchanged tags.
No IO/network/DB — 10 unit tests cover matching, both modes, blank-source
fields, and the album-artist/track-count payload mapping.
Verification found a non-additive edge: embed_album_art_metadata uses FLAC
add_picture(), which APPENDS — so applying to an album where some tracks already
had art would have added a duplicate embedded picture. The apply now checks each
file and skips any that already carry art (shared _audio_has_art helper), so it
only ever ADDS art to files missing it. Test covers the skip (no re-embed).
Previously the filler only flagged albums whose DB thumb_url was empty and, on
apply, only updated that DB thumb_url — so albums whose files had no embedded
art and no cover.jpg (but whose DB row had a URL) were never found, and even
'applying' art never touched the files. That's the reported 'doesn't scan all
albums' gap.
New core.metadata.art_apply (reuses the post-processing standard so the user's
album_art_order is honored):
- album_has_art_on_disk(): cheap-first check — folder cover.jpg/folder.jpg
sidecar, then embedded art in a representative track (FLAC/ID3/MP4/Vorbis).
- apply_art_to_album_files(): embeds via embed_album_art_metadata + writes
cover.jpg via download_cover_art; only ADDS art (never rewrites the user's
tags); read-only/unwritable files are skipped + counted, never crash.
Scan now examines every titled album and flags it when art is missing in the DB
OR on disk. Apply embeds into the album's audio files + writes cover.jpg in
addition to the DB thumbnail (media-server-only albums fall back to DB-only).
Tests cover sidecar/embedded detection, the cheap-first short-circuit, and the
apply orchestration (embeds each file + cover.jpg; read-only failures counted).
The title/artist fallback search took results[0]'s artwork unconditionally, so
a loose full-text match returned the wrong album's cover (the 'new sources give
incorrect art' reports). Now it pulls a few results and only accepts one whose
title matches (subset, to allow Deluxe/Remaster) AND whose artist matches
exactly — the artist being the strong guard against wrong covers. Falls back to
an exact title match when a result carries no artist.
The album's own stored source-id path is unchanged (that id is authoritative).
Tests: wrong-artist rejected, skips wrong result for a matching one, + unit
coverage of the matcher (deluxe/feat/stopwords accepted, wrong artist/title
rejected).
qBittorrent 5.2.0 changed /api/v2/auth/login to return HTTP 204 (No Content)
on success instead of HTTP 200 with body 'Ok.'. The adapter required the body
to equal 'Ok.', so every login on 5.2.0+ failed with 'HTTP 204 body=' — the
connection probe and all torrent actions were broken.
Treat login as successful on the SID auth cookie and/or a success body: 'Ok.'
(<=5.1) or an empty HTTP 204 (>=5.2.0). Still reject bad creds, which
qBittorrent reports as HTTP 200 + 'Fails.' (not a 4xx).
Tests: 204-empty -> success, SID-cookie+empty-body -> success, 'Fails.' (even
with a stale cookie) -> failure.
resolve_mirrored_playlist tried the mirrored-playlists primary key FIRST for
any all-digit ref. Deezer upstream ids are all-numeric, so a Deezer playlist id
was mistaken for the PK and the organize-by-playlist toggle resolved a wrong row
(or nothing) — the toggle silently wouldn't save / 'Open in Mirrored' missed.
Resolve by (source, source_playlist_id) first, fall back to PK only when the
source lookup misses. Thread the batch/wishlist source through the download-path
callers so numeric upstream ids resolve correctly there too. Spotify (base62
ids) is unaffected.
Seam tests: numeric Deezer id resolves by source (not PK), spotify alphanumeric
by source, PK fallback still works, profile-scoped, empty refs -> None.
Adds get_recommendation_sources() — for each recommended similar artist it
resolves the polymorphic similar_artists.source_artist_id back to the display
names of the user's OWN artists (library + watchlist) that list it, by matching
against every provider-id column on both tables. The /api/discover/similar-artists
endpoint now attaches a 'because' array per recommendation so the UI can show
'because you have X, Y, Z' instead of just a count.
Seam tests cover: library + watchlist resolution across different provider-id
columns, dedup + name-sort, max_per cap, orphan source omission, profile scoping.
The worker's WARNING observability proved the '38 errors' were almost all
MusicMap returning 404 (artist has no map page) — a genuine not-found, not a
fetch failure. But iter_musicmap_similar_artist_events flattened every
RequestException to status_code 502, and the worker maps 400/404 -> not_found
/ everything-else -> error, so these inflated the error count.
Surface the real HTTP status from the exception's response (404 stays 404),
falling back to 502 only when there's no response (timeout/connection drop,
which is correctly still an error eligible for retry).
Regression tests: 404 -> 404 (not_found), timeout -> 502 (error), 500 stays
error, plus an end-to-end worker check that a 404 result marks 'not_found'
and stores nothing.
Verified against live data: 1312/1313 stored similars carry a metadata source id,
but 1 slipped through name-only (a match on a source with no id column, e.g.
discogs). Enforce the standard: process_artist now SKIPS any similar whose match
doesn't map to a storable id column (spotify/itunes/deezer/musicbrainz) instead
of writing a useless name-only row. Regression test covers discogs-match + no-id
cases. Now 100% of newly-stored similars are actionable.
The kettui move: 38/79 fetches errored on the first live run, but they were
logged at DEBUG only — invisible in app.log, so the cause (rate-limit vs
no-providers vs bug) is unprovable. process_artist now returns a (status, count,
detail) triple carrying the error reason (status code + message / exception),
and the worker logs the first 15 errors per session at WARNING (rest DEBUG) +
keeps _last_error. No blind pacing tweak — let it run, read the real reason, then
fix the proven cause. Seam tests updated + assert the reason is captured.
SoulSync standalone matches library tracks without Plex fetchItem,
reports missing counts correctly, and skips server playlist writes.
Automation re-syncs when the mirror grows; after sync finishes, starts
organize download (organize-by-playlist) or wishlist processing.
UI: Spotify URL playlist-folder controls, organize toggle layout in the
discovery modal, reload organize preference when reopening Download Missing.
Co-authored-by: Cursor <cursoragent@cursor.com>