Expand matched MusicBrainz release groups into concrete releases for specific album searches so import users can choose the correct edition by track count, format, country, and disambiguation. Preserve distinct MusicBrainz release IDs instead of deduping same-title variants, carry release metadata through import matching, and surface those details on album result cards. Add coverage for variant preservation and release-group expansion.
Keep full refresh moving when post-clear VACUUM hits a transient disk I/O error, and retry clear_server_data once when the clear step itself sees the same transient SQLite failure.
Retry metadata cache maintenance writes once on transient disk I/O errors so first-attempt cache jobs do not fail when an immediate retry would succeed.
Tests cover best-effort VACUUM, clear retry behavior, and cache maintenance retry behavior.
Add a disk-backed image cache with hashed browser URLs, SQLite metadata, size/type validation, stale fallback, and per-image fetch locking. Route normalized artwork through /api/image-cache while keeping /api/image-proxy as a compatibility shim, and align browser max-age with the image cache TTL. Add focused tests for cache behavior and image URL normalization.
S-Bryce reported that for some artists (Vocaloid producers, JP indie
acts, niche Western indie) the artist detail page was missing whole
release-groups visible on musicbrainz.org. Downloaded tracks from
those release-groups appeared in artist track counts but were not
bound to any visible album / single card — orphan "ghost" tracks the
user couldn't browse to.
Two duplicated bugs fed each other:
1. `core/musicbrainz_search.py` browsed MB release-groups with
`release_types=['album', 'ep', 'single']`. MB's primary-type
vocabulary is {Album, Single, EP, Broadcast, Other} — music
videos, one-off web releases, and broadcast singles use Other.
Pre-fix the filter dropped them at the API layer.
2. Three sites duplicated the same "raw primary-type → internal
album_type" mapping with slightly different vocabularies and all
silently defaulted unknown values (including 'Other') to 'album':
core/musicbrainz_search.py `_map_release_type`
core/metadata/types.py inline `{single:single, ep:ep}.get(...)`
core/metadata/cache.py Deezer-specific record_type guard
Letting Other through the filter without a real mapper would have
placed music videos in the Albums view alongside LPs — visually
misleading.
Fix shape:
- New `core/metadata/release_type.py` — single canonical mapper
consumed by every provider's raw→Album projection. Knows the full
MB vocabulary including 'other' and 'broadcast'; routes both into
the singles bucket since they're functionally single-track
releases. Compilation secondary-type override preserved (MB's
canonical Greatest-Hits pattern is `primary=Album,
secondary=[Compilation]`).
- `core/musicbrainz_search.py` `_map_release_type` becomes a thin
alias for the new helper so the six internal call sites stay
intact. API filter gains 'other'.
- `core/metadata/types.py` Album projection drops its inline mini-
mapper and calls the canonical helper. Now also handles the
compilation secondary-type override it was previously missing.
- The Deezer-specific cache.py guard stays as-is — Deezer's
record_type vocabulary is closed (album|single|ep), not affected
by this issue.
Verified end-to-end against MB for S-Bryce's artist (`46196b9c-affa-
4616-b53b-e967c8bd70e0`, inabakumori): pre-fix returned 22 release-
groups; post-fix returns 27, with the 5 extra all landing in the
Singles section with album_type='single' as intended.
23 new unit tests pin the mapper contract (case-insensitive primary
types, compilation secondary override, Other/Broadcast → single,
unknown → album default preserved, defensive empty/None inputs).
2 new tests in test_musicbrainz_search pin the API filter inclusion
of 'other' and the round-trip into the Singles bucket. All 516
existing metadata tests still green — refactor leaves historical
behaviour for {album, ep, single, compilation} unchanged.
Register MusicBrainz as a first-class metadata source alongside Deezer, iTunes, Spotify, Discogs, and Hydrabase. Expose the shared client through metadata services, add the settings option, and expand the MusicBrainz search adapter with source-compatible artist, album, track, and detail methods.
Carry MusicBrainz IDs through similar-artist discovery, recommended artists, artist map serialization, and personalized playlist selection. Update DB migrations and lookup filters so similar_artist_musicbrainz_id is preserved on older schemas and used for source requirements and library exclusion.
Normalize MusicBrainz album adapter output for import context and add regression coverage for registry mapping, typed album conversion, and similar-artist filtering. Verified by user with 120 focused tests passing.
- Artist cards, hero section, and enhanced view now show Amazon Music badges
when amazon_id is populated (AMAZON_LOGO_URL constant, orange #FF9900 brand)
- Enhanced view artist and album match status rows include amazon_match_status
chip with click-to-rematch via openManualMatchModal
- getServiceUrl: added amazon (album/track ASIN → music.amazon.com) and fixed
missing discogs entries; serviceLabels adds tidal/qobuz/amazon
- Enhanced view enhanced-artist-id-badges includes amazon_id entry
- DB SELECTs for library artists list and artist detail now return amazon_id;
both response dicts include the field
- watchlist_artists migration adds amazon_artist_id column
- Watchlist config GET: amazon_artist_id in SELECT/WHERE/response (index 18)
- Watchlist artists list response includes amazon_artist_id
- link-provider endpoint: amazon added to valid_providers and col_map
- _populateLinkedProviderSection: amazonId param + Amazon Music source row
- Watchlist card source badges render Amazon pill (watchlist-source-amazon CSS)
- _openSourceSearch labels map includes amazon
- service_search: amazon_worker injected via init(); _search_service amazon branch
uses search_artists/albums/tracks, same {id,name,image,extra} return shape
- _SERVICE_ID_COLUMNS: amazon → amazon_id for artist/album/track
- _init_service_search call passes amazon_worker_obj
- amazon_client._fetch_album_metas: 5-minute TTL cache per ASIN — cached hits
skip _rate_limit() and HTTP call entirely; fixes ~10s artist detail load
- registry.py: removed amazon from METADATA_SOURCE_PRIORITY and
METADATA_SOURCE_LABELS — T2Tunes has no discography API, cannot serve as a
primary metadata source; Amazon remains a download source + ASIN enricher
- Settings metadata source dropdown and help text updated accordingly
Wires AmazonClient into the metadata source registry following the
exact same pattern as DeezerClient. No existing source paths touched.
- Add get_album_metadata / get_artist_info / get_artist_albums_list
aliases to AmazonClient (mirrors DeezerClient interface aliases)
- Register amazon in METADATA_SOURCE_PRIORITY and METADATA_SOURCE_LABELS
- Add _get_amazon_factory() + get_amazon_client() to registry.py
- Add amazon branch to get_client_for_source(); thread amazon_client_factory
kwarg through get_primary_client() and get_primary_source_status()
- Re-export get_amazon_client from the core.metadata_service shim
- Add Amazon Music option to Settings metadata source dropdown
- 3530 tests pass
The first token-leak fix scrubbed the artwork URL fixer's own log
calls. This catches three more sites that ALSO leaked tokens, plus
one upstream gap that let URL-encoded tokens slip through the
redactor.
Three sites in `web_server.py` (artist endpoint at line 8765-8773):
- "Artist image before fix: '...'" -- logged the raw image_url with
the auth token in plain form.
- "Artist image after fix: '...'" -- logged the URL-encoded form
after it had been wrapped in the image proxy
(`/api/image-proxy?url=<percent-encoded-token>`).
- "Final artist data being sent: {...}" -- dumped the entire
artist_info dict on every render, including the image_url field.
All three were dev-time debug noise. Removed entirely. The "No
artist image URL found" warning at line 8770 stays (no URL, just
the artist name).
One site in `core/discovery/sync.py:402`:
- "[PLAYLIST IMAGE] image_url=..." -- logged the playlist poster URL
during sync. Same auth-token leak risk for Plex / Jellyfin
playlists. Changed to log only `has_image=True/False`.
Upstream gap in `_redact_url_secrets`:
- The original regex only matched plain query params (`?key=value`).
When an auth-bearing URL gets wrapped inside another URL's query
string (our `/api/image-proxy?url=<encoded>` flow) the auth params
end up percent-encoded -- `%3FX-Plex-Token%3D...` -- and slipped
through.
- New second pattern catches the URL-encoded form. Both passes run
on every redact call; idempotent.
Verified manually:
/api/image-proxy?url=...%3FX-Plex-Token%3DABC...
-> /api/image-proxy?url=...%3FX-Plex-Token%3D***REDACTED***
6 artwork tests pass.
The artwork URL normalizer was logging the full constructed media-
server URL on every cover-art lookup at INFO level, including the
auth query params (X-Plex-Token / X-Emby-Token / Subsonic t+s+p).
Those lines pile up in app.log on disk -- anyone with read access to
the log file gains full read access to the user's media server.
Also dropped the noisy per-call "Plex/Jellyfin/Navidrome config -
base_url: ..., token: ..." INFO lines that fired on every thumbnail.
Even the truncated `token[:10]` form is enough partial-known-plaintext
to be uncomfortable to leak.
- New `_redact_url_secrets` helper masks the values of X-Plex-Token,
X-Emby-Token, api_key, apikey, Subsonic t / s / p, generic token /
password query params. Regex anchored on `?` or `&` boundary so
short keys like `t` don't false-match inside `format=Jpg`.
- "Fixed URL: ..." log calls moved from INFO to DEBUG so they don't
persist by default, and the URL passed in is run through the
redactor first.
- Per-call "Plex config - ..." / "Jellyfin config - ..." /
"Navidrome config - ..." INFO lines removed entirely. Config
inspection has dedicated UI; per-thumbnail spam belongs to no one.
- Error-path logging (line 149) also routed through the redactor in
case the failing URL had auth params attached.
Users with existing app.log files containing the leaked tokens
should rotate / wipe the log. Plex tokens can be regenerated by
signing out of all devices in Plex settings; Jellyfin api_keys can
be revoked from the dashboard; Navidrome users should rotate the
account password.
Discord report (netti93): downloaded album tracks were tagged with
TRCK = "6/0" instead of "6/13" when source data was incomplete. The
retag tool wrote correct "6/13" because core/tag_writer.py already
handled the case.
Trace: core/metadata/enrichment.py:105 formatted unconditionally as
f"{track_number}/{total_tracks}" and many album-dict construction
sites pass total_tracks: 0 (per types.py, 0 means "unknown" — not a
real count). That 0 propagated straight to disk.
Fix at the consumer boundary so every album-dict constructor stays
unchanged. Lifted to pure helper
core/metadata/track_number_format.py:format_track_number_tag that
drops the /N suffix when total is 0 / None / negative — emits just
"6" instead. Matches retag's behavior + ID3 spec convention (TRCK
can be "N" or "N/M"). MP4 trkn tuple gets the same treatment via
format_track_number_tuple returning (6, 0) per spec's "unknown
total" marker.
Wired into all three format-write sites in enrichment.py: ID3 (TRCK),
Vorbis (tracknumber), MP4 (trkn). When source data has correct
total_tracks (album downloads via the metadata-source pipeline,
retag flow), behavior unchanged — still writes "6/13".
16 boundary tests pin every shape: known total / zero total / none
total / none track / zero track / negative inputs / string coercion
/ unparseable strings / floats truncate.
Full suite: 3113 passed.
Soulseek matched-download contexts populate `original_search_result`
with `artist` (singular string) and no `artists` list — the full
multi-artist array lives on `track_info` (the matched Spotify track
object). `extract_source_metadata` only read `original_search.artists`,
so the Soulseek path always fell through to the single-artist branch
and TPE1 ended up with the primary artist only. Deezer-direct
downloads were unaffected because their context populates
`original_search.artists` as a proper list.
Lifted artist resolution into a pure helper
`core/metadata/artist_resolution.py:resolve_track_artists` that walks
`original_search.artists` → `track_info.artists` → `artist_dict.name`
fallback chain. Normalizes mixed list-item shapes (Spotify-style
dicts, bare strings, anything else stringified) and drops empty
entries.
13 new tests pin the resolution order, fallback chain, mixed-shape
normalization, whitespace stripping, and empty/none handling. The
existing `_artists_list` no-fall-through test in
`test_multi_artist_tag_settings.py` was updated to reflect the new
contract (always populated; multi-value write still gated on
`len > 1`) plus a new regression test for the Soulseek shape.
Composes with the existing Deezer per-track upgrade (still fires when
single-artist + track_id available) and feat_in_title /
artist_separator settings (still drive the joined ARTIST string
downstream).
- legacy duck-typed builder only checked the `album_type` key; deezer
uses `record_type`, tidal uses `type` (uppercase), some flattened
musicbrainz shapes use `primary-type` — all defaulted to album, so
EPs and singles ended up filed under Album/ in user templates that
reference $albumtype
- widen lookup to album_type / record_type / type / primary-type and
route through new pure `_normalize_album_type` helper that
case-folds + validates against the canonical token set
(album / single / ep / compilation), unknown → album
- typed-converter path (spotify / deezer / itunes / discogs / mb /
hydrabase / qobuz) unchanged — those were already correct
Discord report (CAL).
Adds an explicit field to the Album dataclass in core/metadata/types.py
and the client-level Album dataclasses in deezer_client.py,
itunes_client.py, and hydrabase_client.py (the legacy discography path
reads from client objects, not typed dicts).
Deezer extracts explicit_lyrics (int→bool), iTunes extracts
collectionExplicitness ('explicit' string), Hydrabase forwards the
explicit field from the server response. Spotify, Discogs, MusicBrainz,
Qobuz, and Tidal have no explicit signal and stay None.
The flag threads through both builder functions in discography.py and
renders as a small "E" badge next to explicit titles in the discography
download modal and artist-detail page cards.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- new track_already_owned helper wraps db.check_track_exists at
the same confidence threshold the discography backfill repair job
uses (0.7) — name+artist+album, format-agnostic so blasphemy-mode
libraries (flac → mp3 + delete original) match correctly
- endpoint runs the check after the artist + content-type filters and
before add_to_wishlist, so a second discography click on the same
artist no longer re-queues every track that already downloaded
- per-album response carries a new tracks_skipped_owned counter
alongside the existing artist/content/wishlist skip categories
Discord report (Skowl).
- drop tracks where the requested artist isn't named in track.artists
(keeps features, drops compilation / appears_on contamination)
- honor watchlist.global_include_live/remixes/acoustic/instrumentals
the same way the discography backfill repair job already does
- surface per-album skip counts in the ndjson stream (artist mismatch
+ content filter) so the ui can show what was filtered
Closes#559.
Two follow-ups to the multi-artist tag settings PR:
1. Deezer contributors upgrade — closes the "known limitation"
flagged in the prior commit. Deezer's `/search` endpoint only
returns the primary artist for each track; the full contributors
array (feat., remix collaborators, producers credited as artists)
lives on `/track/<id>` and gets parsed by `_build_enhanced_track`.
Without the upgrade Deezer-sourced tracks never got multi-artist
tags even with the right settings on.
Fix in `core/metadata/source.py`: when source==deezer AND the
search response had a single artist AND a track_id is available,
fetch full track details via `get_deezer_client().get_track_details`
and replace `all_artists` with the upgraded list.
- One extra API call per affected Deezer track
- Skipped when search already returned multiple (no-op fast path)
- Skipped for non-Deezer sources (Spotify/Tidal/iTunes search
responses already include all artists)
- Skipped when no track_id is available
- Defensive try/except: on /track/<id> failure (network error,
deezer client unavailable), fall through to the search-result
list — never lose the data we already had
2. Double-append guard hardened with a word-boundary regex.
Prior commit checked for `"feat." not in title.lower() and "(ft."
not in title.lower()` — too narrow. Source platforms produce
wildly different feat-marker conventions: "(feat. X)", "(Feat X)",
"(FEAT X)", "(Featuring X)", "[feat. X]", "ft. X" (no parens),
"FT. X", etc. Any of these as the SOURCE title would cause a
double-append: `"Track (Feat X) (feat. Y)"`.
Replaced with `re.search(r'\b(?:feat|feat\.|featuring|ft|ft\.)\b',
title, IGNORECASE)`. Word-boundary regex catches every common
variant. Substring matches like "Aftermath" containing `ft`
correctly fall through to the append path (pinned by a regression
test).
16 new tests (29 total in the file):
- 9 parametrized variants of the double-append guard
- 1 substring guard ("Aftermath")
- 6 Deezer upgrade scenarios (fires when expected, doesn't fire
for non-Deezer / multi-artist search / no track_id, defensive
fall-through on failure, no false-positive when /track/<id>
confirms single artist)
Full pytest 2727 passed.
Three settings on Settings → Metadata → Tags were partially or
completely unimplemented. Reporter (Netti93) traced each one.
(1) `write_multi_artist` only "worked" because of a never-populated
`_artists_list` field. `core/metadata/source.py` built
`metadata["artist"]` as a hardcoded ", "-joined string but never
assigned `metadata["_artists_list"]`. `core/metadata/enrichment.py`
line 107 reads that field and gates the multi-value tag write
on `len(_artists_list) > 1` — always saw an empty list, silently
no-op'd the write.
(2) `artist_separator` (default ", ") was referenced in the UI +
settings.js save path but ZERO Python code read the value. Every
multi-artist track ended up with hardcoded ", " regardless of
what the user picked.
(3) `feat_in_title` (when true: pull featured artists into the title
as " (feat. X, Y)" and leave only primary in the ARTIST tag —
Picard convention) had no implementation at all.
Fix in source.py:
* Populate `_artists_list` from the search response's artists array
* Read `feat_in_title` and `artist_separator` configs
* When `feat_in_title=True` and >1 artist: ARTIST = primary only,
append "(feat. X, Y)" to title with double-append guard
* Else: ARTIST = artists joined with `artist_separator`
* Single-artist case unaffected by either setting
Double-append guard uses a word-boundary regex catching all common
"feat" variants source platforms produce — `feat`, `feat.`,
`featuring`, `ft`, `ft.` — case-insensitive. Substring matches
(e.g. "Aftermath" containing "ft") correctly fall through to the
append path.
Fix in enrichment.py ID3 branch:
* TPE1 stays as the display string (with separator or primary-only
per the user's settings)
* Multi-value list goes to a separate `TXXX:Artists` frame (Picard
convention) when `write_multi_artist` is on
* Pre-fix the ID3 path wrote TPE1 twice — single-string then list
— and the second `add` overwrote the first, clobbering both the
configured separator AND the feat_in_title semantics. Vorbis path
was already correct (separate "artist" + "artists" keys).
Known limitation (flagged in WHATS_NEW): Deezer's `/search` endpoint
only returns the primary artist. The full contributors array lives
on `/track/<id>`. Enrichment uses search-result data so Deezer-
sourced tracks may still get only the primary artist until a follow-
up commit wires the per-track contributors fetch into the enrichment
flow. Spotify, Tidal, and iTunes search responses include all
artists so they work now.
23 new tests in `tests/metadata/test_multi_artist_tag_settings.py`:
* `_artists_list` populated for multi/single/no-artist cases
* `artist_separator` drives ARTIST string (default ", " + custom
";" + custom "; " + " & ")
* Single-artist case unaffected by either setting
* `feat_in_title=True` pulls featured to title, leaves primary in
ARTIST
* `feat_in_title` no-op for single artist
* Double-append guard recognizes 9 source-title variants ("(feat.
X)", "(Feat. X)", "(FEAT X)", "(feat X)", "(Featuring X)",
"[feat. X]", "ft. X", "(ft X)", "FT. X")
* Substring guard test pins "Aftermath" doesn't false-positive
* Combined-settings precedence: feat_in_title wins ARTIST string
but `_artists_list` carries everyone for multi-value tag
Full pytest 2711 passed.
Defensive followup. If Deezer CDN ever refuses the upgraded
1900×1900 URL for a specific album (rare — empirically tested 4
albums and none hit it), pre-fix would have succeeded with the
1000×1000 URL and post-fix would have failed entirely.
Both download sites now retry with the original URL when the
upgraded URL fails:
- `core/metadata/artwork.py::download_cover_art` — auto post-process
flow. Resolves the original URL from album_info / context the same
way the existing path does.
- `core/tag_writer.py::download_cover_art` — captures the original
URL before upgrade so the retry has it without a second context
lookup.
Strictly non-regressive: worst plausible post-fix case is now
identical to pre-fix (cover at 1000×1000 succeeds). Fallback only
fires on the rare CDN-refusal edge.
Tests added (2):
- `test_tag_writer_retries_with_original_on_failure` — upgraded URL
raises, original succeeds, both attempts logged in call order
- `test_tag_writer_no_fallback_for_non_dzcdn_url` — non-Deezer URLs
go through unchanged, no fallback path triggered (single attempt)
Verification:
- 18/18 helper + integration tests pass
- 2561 full suite passes
- Ruff clean
Discord report (Tim): downloaded cover art via Deezer metadata
source came out visibly blurry in Navidrome / on phones — large
displays exposed the limited resolution.
# Cause
Deezer's API returns `cover_xl` URLs at 1000×1000. The underlying
CDN actually serves up to 1900×1900 by rewriting the size segment
in the URL path (same trick the iTunes mzstatic + Spotify scdn
upgrades already use). SoulSync wasn't doing the rewrite — every
Deezer-sourced cover got embedded at 1000×1000 regardless of how
much higher resolution the CDN had available.
# Verified empirically
```
$ for size in 1000 1400 1800 1900 2000; do curl -I "...{size}x{size}-..."; done
1000: 200 OK 106 KB
1400: 200 OK 198 KB
1800: 200 OK 331 KB
1900: 200 OK 371 KB
2000: 403 Forbidden
```
1900 is the safe ceiling. Above that the CDN returns 403. CDN
serves source-native bytes when source < target (smaller-source
albums get same bytes whether we ask for 1000 or 1900), so asking
for 1900 universally is safe.
# Fix
New `_upgrade_deezer_cover_url(url, target_size=1900)` helper in
`core/deezer_client.py`. Pure function, mirrors the
`_upgrade_spotify_image_url` pattern that already lives in
`core/spotify_client.py`. Defensive on every input shape:
- Empty / None → returned as-is
- Non-Deezer URL (no `dzcdn`) → returned as-is
- No size segment in URL → returned as-is
- Already at/above target → returned as-is (idempotent, never
downgrades)
Applied at both cover-download sites:
- `core/metadata/artwork.py::download_cover_art` — auto post-process
flow. Mirrors the existing iTunes mzstatic upgrade right above it.
- `core/tag_writer.py::download_cover_art` — enhanced library view's
"Write Tags to File" feature.
# Scope discipline
- Helper applied at the DOWNLOAD boundary, not the source extraction
point in `deezer_client.py`. Means cached entries in the metadata
cache + DB row `image_url` columns keep the original 1000×1000 URL
Deezer's API returned. Future CDN behavior changes only affect the
download path, not stored data.
- Pre-existing `prefer_caa_art` toggle (Settings → Library →
Post-Processing) untouched — orthogonal workaround for users who
want even higher quality (MusicBrainz Cover Art Archive, often
3000×3000+).
- iTunes / Spotify upgrade paths untouched — they already worked.
# Tests added (16)
`tests/metadata/test_deezer_cover_url_upgrade.py`:
- Standard upgrade: default target 1900 on cover URL, alternate
dzcdn host (`e-cdns-images.dzcdn.net` vs `cdn-images.dzcdn.net`),
artist picture URLs (same path pattern), 500×500 source upgrades
too
- Custom target size: smaller target = no-op (never downgrade),
larger target works
- Idempotent: already at/above target returned unchanged
- Defensive on non-Deezer URLs: parametrised across 5 hosts
(Spotify scdn, iTunes mzstatic, MB CAA, Last.fm, random) — all
returned untouched
- Defensive on malformed Deezer URL (no size segment) → returned
as-is
- Empty / None handling
# Verification
- 16/16 helper tests pass
- 560/560 metadata + imports tests pass (no regression)
- 2559 full suite passes
- Ruff clean
Live-API verification revealed advanced-syntax queries hurt more
than they help on this endpoint. Switching the import-modal Deezer
search back to free-text + local rerank.
# What live testing showed
Hit Deezer's public API with both query forms for the issue #534
case (`Dirty White Boy` + `Foreigner`):
**Free-text (`q=Dirty White Boy Foreigner`):**
- Returns 21 results
- Real Foreigner Head Games studio cut at #1
- Live versions at #2-10
- Karaoke / cover variants at #11-15
**Advanced (`q=track:"Dirty White Boy" artist:"Foreigner"`):**
- Returns 12 results
- "(2008 Remaster)" at #1 — canonical Head Games cut MISSING from
top 8 entirely
- Live + alt-album versions follow
Advanced syntax DOES filter karaoke at the API level (none in the
12-result set vs. 5 at positions 11-15 in free-text), but it has
its own ranking bias that surfaces remasters / "Best Of" cuts
ahead of the canonical recording. Net regression for the user-
facing goal.
# Fix
1. Endpoint reverts to free-text query with local rerank applied.
2. Local rerank gains "remaster" / "remastered" / "reissue"
patterns under VARIANT_TAG_PATTERNS (soft 0.4× penalty — user
may want them but they shouldn't outrank the original).
3. Client kwarg support (`track=` / `artist=` / `album=`) preserved
for future opt-in callers (e.g. exact-match flows where API-
level filtering matters more than ranking).
# Verified end-to-end against live Deezer API
Re-ran the exact #534 case through the live API + new rerank.
Top 15 results post-rerank:
1. Dirty White Boy — Foreigner — Head Games ← REAL CUT AT TOP
2-10. Various Live versions
11-15. Karaoke / cover / tribute variants ← BURIED
Real Foreigner Head Games studio cut at #1, exactly the user's
ask.
# Tests
- `test_relevance.py` — variant tag patterns extended; existing
tests still pass (50 tests).
- `test_search_match_endpoints.py::test_joins_track_and_artist_into_free_text_query`
— replaces `test_passes_track_and_artist_as_kwargs`; verifies
endpoint sends free-text join, NOT field-scoped kwargs (the
prior test asserted the wrong direction now).
- Karaoke-burying assertion at the endpoint still pins the
user-visible behaviour.
- Client kwarg path tests untouched (still pin advanced-syntax
construction for future opt-in callers).
# Verification
- 75 relevance + endpoint + query tests pass
- 2445 full suite passes
- Ruff clean
- Live Deezer API shows real cut at #1 post-rerank
# Background
User reported (#534) that the import-modal "Search for Match" dialog
returned irrelevant results when Deezer was the metadata source.
Searching `Dirty White Boy` + `Foreigner` returned 5+ karaoke /
"originally performed by" / "in the style of" / "re-recorded" /
tribute-band results ranked above the actual Foreigner studio cut
from Head Games. User had to scroll past the junk every time, or
fall back to iTunes search which is much slower.
# Root cause — two layers
1. **Endpoint joined `track + artist` into free-text query.**
`/api/deezer/search_tracks` was passing `q=Dirty White Boy Foreigner`
to Deezer's `/search/track` API. Deezer fuzzy-matches that
string across title / lyrics / artist / album / contributors and
orders by global popularity — anything that appears across many
compilations outranks the canonical recording.
2. **No local rerank.** None of the search-modal endpoints applied
any post-filtering. Deezer's API order shipped straight to the
user.
# Fix — same architectural shape Cin would build
## Layer 1: field-scoped query at the client boundary
`core/deezer_client.py::search_tracks()` now accepts optional
`track`, `artist`, `album` kwargs. When provided, builds Deezer's
advanced search syntax: `q=track:"X" artist:"Y" album:"Z"`. Massive
relevance improvement because each term matches the right field
instead of fuzzy-matching everywhere.
Backward compat preserved: legacy free-text `query=` callers still
work unchanged. Field-scoped path takes precedence when both are
provided. Empty input fast-fails without an API call. Embedded
double-quotes stripped (Deezer's syntax has no escape mechanism).
## Layer 2: provider-neutral relevance reranker
New `core/metadata/relevance.py` module — pure-function rerank over
the canonical `Track` dataclass. Composable scoring:
- **Cover/karaoke patterns** (multiplier 0.05, effectively buries):
matches "karaoke", "originally performed by", "in the style of",
"made famous by", "tribute", "vocal version", "backing track",
"cover version", "re-recorded", "cover by", etc. across title,
album, AND artist fields. Catches the screenshot's exact junk:
artist credits like "Pop Music Workshop" / "The Karaoke Channel"
/ "Foreigner Tribute Band".
- **Variant tags** (multiplier 0.4): live / acoustic / demo /
instrumental / remix / radio edit / club mix etc. — softer
penalty since the user MAY want them. Skipped entirely when the
expected_title contains the same tag (so searching
"Track (Live)" still ranks Live versions first).
- **Exact artist boost** (multiplier 1.5): primary artist exactly
matches expected_artist after normalisation. Single strongest
signal for "this is the canonical recording".
- **Title + artist similarity** via SequenceMatcher (parentheticals
+ punctuation stripped before comparison).
- **Album-type weighting**: album=1.0 > single/ep=0.85 > compilation=0.7.
Compilations are more likely tribute / karaoke repackages.
Each component is a standalone function so tests pin them
individually without standing up the full pipeline.
## Wired at three search-modal endpoints
- `/api/deezer/search_tracks` — uses both layers (field-scoped
query + rerank).
- `/api/itunes/search_tracks` — uses rerank only (iTunes API has
no advanced-syntax search, but karaoke / cover variants still
leak through and need the local penalty).
- `/api/spotify/search_tracks` — already builds field-scoped
`track:X artist:Y` query; rerank added as the consistency safety
net so all three sources behave the same from the user's
perspective.
Other Deezer call sites (matching engine, watchlist scanner,
auto-import single-track ID) deliberately not touched in this PR
— they have their own elaborate scoring pipelines tuned to their
specific contexts and aren't surfacing the user-reported issue.
Per Cin: "don't refactor beyond what the task requires."
# Tests
71 new tests across 3 files:
- `tests/metadata/test_relevance.py` (50 tests) — every scoring
component pinned individually + the issue #534 screenshot
reproduced as a regression test (real Foreigner cut wins after
rerank, karaoke variants drop to bottom).
- `tests/metadata/test_deezer_search_query.py` (14 tests) —
advanced-syntax query construction, field-scoped wiring at the
client boundary, free-text path unchanged, kwargs win when
ambiguous, limit clamping, cache key consistency.
- `tests/imports/test_search_match_endpoints.py` (7 tests) —
end-to-end through Flask test client: Deezer endpoint passes
kwargs not joined query; karaoke buried at bottom for all three
sources; legacy query param still works without rerank.
# Verification
- 2441 full suite passes (+71 from baseline 2370)
- 0 failures (the prior watchdog flake fix held)
- Ruff clean across all changed files
- JS parses clean (`node -c webui/static/helper.js`)
# Architectural standards followed
- **Logic at the right boundary.** Query construction lives in the
client (every caller benefits from one change). Rerank lives in
a neutral module (`core/metadata/relevance.py`) over the
canonical `Track` dataclass — works for any source, not Deezer-
specific.
- **Explicit > implicit.** Every scoring rule has its own named
function. Pattern tables are module-level constants tests can
introspect.
- **Scope discipline.** Audited every Deezer search call site;
fixed the user-reported one + the consistent siblings. Did NOT
speculatively normalise every Deezer call across the codebase.
- **Backward compat.** Free-text `query=` callers untouched. Kwargs
added to existing client method signature with safe defaults.
- **Tests pin contract at correct boundary.** Pure-function rerank
tests don't mock anything; client-query tests stub at `_api_get`;
endpoint tests run through the real Flask app.
Brings the auto-import matcher to picard / beets / roon parity by
reaching for the existing AcoustID-grade infrastructure (typed Album
foundation, integrity check thresholds) and layering id-based exact
matches on top of the fuzzy scorer. Picard-tagged libraries now land
every track with full confidence on the first pass.
Three layered phases in `core/imports/album_matching.match_files_to_tracks`:
1. **MBID exact match** — file has `musicbrainz_trackid` tag, source
returns the same id → instant pair, full confidence, no fuzzy
scoring. Picard's primary identifier; per-recording.
2. **ISRC exact match** — file has `isrc` tag, source returns the same
id → same fast-path, slightly lower priority than mbid (isrc can
be shared across remasters). Both ids normalised before compare
(uppercase + strip dashes/spaces for isrc, lowercase for mbid).
3. **Duration sanity gate** — files in the fuzzy phase whose audio
length differs from the candidate track's duration by more than
`DURATION_TOLERANCE_MS` (3s, matching the post-download integrity
check) are rejected before scoring runs. Defends against the
cross-disc / cross-release / wrong-edit problem the integrity
check used to catch only AFTER the file had already been moved +
tagged + db-inserted.
Tag reader (`_read_file_tags`) extended:
- Reads `isrc` (uppercased, strip / / spaces normalisation deferred
to matcher)
- Reads `musicbrainz_trackid` as `mbid` (lowercased)
- Reads `audio.info.length` and converts to `duration_ms` to match
the metadata-source convention
Metadata-source layer (`_build_album_track_entry`) extended:
- Propagates `isrc` from top-level OR `external_ids.isrc` (spotify
shape — would otherwise be stripped before reaching the matcher)
- Propagates `musicbrainz_id` from top-level OR `external_ids.mbid`
/ `external_ids.musicbrainz`
- Without this layer, fast paths would silently never fire in
production even though unit tests pass — pinned by
`test_album_track_entry_propagates_isrc_and_mbid_from_source`
18 new tests in `tests/imports/test_album_matching_exact_id.py`:
- Direct: `find_exact_id_matches` with mbid, isrc, isrc normalisation,
mbid > isrc priority, spotify-shape `external_ids.isrc`, no-id
empty result, file-used-at-most-once
- Direct: `duration_sanity_ok` within / outside tolerance, missing
durations defer
- End-to-end via `match_files_to_tracks`: mbid match short-circuits
fuzzy scoring, id-matched files excluded from fuzzy phase, duration
gate rejects wrong-disc collisions in fuzzy phase, normal matches
pass through the gate, missing durations fall through, deezer
seconds-vs-ms conversion, full picard-tagged 10-track album via
mbid only
- Production-shape: `_build_album_track_entry` propagates isrc + mbid
from spotify-shape (`external_ids.isrc`) AND itunes-shape (top-
level `isrc`)
Verification:
- 35 album-matching tests pass total (17 helper + 18 fast-path)
- 23 multi-disc tests still pass after the extension (additive)
- Full suite: 2311 passed (+18 new), 1 pre-existing flaky timing test
failure (`test_watchdog_warns_about_stuck_workers` — passes in
isolation, fails only in full-suite runs, unrelated to this PR)
- Ruff clean
For users:
- Picard / Beets / Mp3Tag-tagged libraries (anyone who's organised
their music) get instant perfect-confidence matches every time.
- Soulseek-tagged downloads (which usually carry isrc when sourced
via metadata-aware soulseekers) get the fast path too.
- Naively-named files with no useful tags fall through to the
improved fuzzy + duration-gated path — same correctness as before
for the common case, much harder for the matcher to confidently
pair the wrong file.
- One step closer to standalone-DB feature parity with plex /
jellyfin / navidrome scanners. Acoustid fingerprint fallback
(for files with NO useful tags AND no MBID/ISRC) is the next
followup PR.
Catches the silent excepts the awk-based earlier sweeps missed:
- Bare `except:` followed by `pass` (also swallows KeyboardInterrupt
and SystemExit — actively wrong). Upgraded to `except Exception as
e: logger.debug("...: %s", e)`. ~14 sites across connection_detect,
soulseek_client, listenbrainz_manager, watchlist_scanner,
youtube_client, navidrome_client, jellyfin_client, web_server.
- `except Exception:` + pass that the awk pattern missed (e.g.
multi-line or unusual whitespace). ~31 sites across automation_engine,
database_update_worker, music_database, spotify_client, web_server,
others.
- 14 legitimate cleanup sites left silent with explicit `# noqa: S110`
+ comment explaining why (atexit handlers, finally-block conn.close
calls). Logging during shutdown can itself crash because file handles
get torn down before the handler fires.
Also enables `S110` rule in pyproject.toml so this pattern fails CI
going forward — drift fails at PR review instead of at runtime against
a wedged worker thread. Tests path keeps S110 ignored (test fixtures
legitimately use try-except-pass for cleanup).
Adds a WHATS_NEW entry to helper.js summarizing the full #369 sweep.
Verified: `python -m ruff check .` → All checks passed.
Verified: `python -m pytest tests/` → 2188 passed.
Closes#369
Two fixes.
(1) Discography endpoint now does server-side per-source ID resolution.
When the user clicked Download Discography on a library artist, the
endpoint received whichever artist ID the frontend happened to pick
(spotify_artist_id || itunes_artist_id || deezer_id || library_db_id)
and dispatched it as-is to whichever source it queried. If the picked
ID didn't match the queried source's ID format, the lookup returned
wrong-artist results (numeric ID collisions) or fell back to a fuzzy
name search that picked a wrong artist.
Two reproducible cases:
- 50 Cent's library row had DB id 194687 — coincidentally a real
Deezer artist ID for "Young Hot Rod". When the frontend's
/enhanced fetch silently fell back to the DB id, the backend
sent 194687 to Deezer, and Deezer returned Young Hot Rod's
50 albums in 50 Cent's discography modal.
- Weird Al's library row had a stored Spotify ID. The frontend
sent that to Deezer, which rejected the alphanumeric ID and
fell back to fuzzy name search — which picked The Beatles
somehow, returning 45 Beatles albums.
The mechanism for per-source ID dispatch already exists in
``MetadataLookupOptions.artist_source_ids``, and the watchlist scanner
already uses it; the on-demand discography endpoint just wasn't wired
to it. Fix: when the URL artist_id matches a library row by ANY stored
ID (DB id, spotify_artist_id, itunes_artist_id, deezer_id, or
musicbrainz_id), pull every stored provider ID and pass them as
``artist_source_ids``. Each source gets its OWN stored ID regardless
of which one the URL carries. When the URL ID is a non-library
source-native ID and the row lookup misses entirely, behavior is
identical to before (single-ID dispatch fallback).
Logged the resolved per-source ID dict at INFO so future "wrong artist
showed up" diagnostics are immediately legible in app.log.
(2) Logger namespace fix in core/artists/quality.py and
core/metadata/multi_source_search.py.
Both modules used ``logging.getLogger(__name__)`` which resolves to
``core.artists.quality`` / ``core.metadata.multi_source_search`` —
neither under the ``soulsync`` namespace where the file handler is
wired. Result: every [Enhance], [MultiSourceSearch], and direct-lookup
INFO line was being written to a logger with no handlers and silently
dropped. App log showed the slow-request warning but no diagnostic
detail. Switched both to ``get_logger()`` from utils.logging_config so
the soulsync.* namespace picks them up. Same content, now actually
lands in app.log. Confirmed working in live test:
``[Enhance] Direct lookup matched: deezer ID 1476162252 → 'Desastre'``
No behavior change in any other caller. Empty ``artist_source_ids``
(no library row matched) reaches lookup as ``None`` → identical to
current single-ID dispatch path. Logger fix is pure routing — no
content change.
Track Redownload had been doing parallel multi-source metadata search
across every configured source the whole time; Enhance Quality was
running a single-source primary fallback that returned junk matches
with empty fields when the primary was iTunes (Discord report:
"unknown artist - unknown album - unknown track" wishlist entries
for users with neither Spotify nor Deezer connected).
Lift the redownload search into core/metadata/multi_source_search.py
and point both flows at it. Same scoring, same per-source query
optimization (Deezer's structured artist:/track: form), same
current-match flagging via stored source IDs.
ArtistQualityDeps now takes get_metadata_search_sources (returns
[(name, client), ...] for every configured source) instead of the
single-primary get_metadata_fallback_client + get_metadata_fallback_source.
Spotify direct-lookup stays as a fast-path optimization (only Spotify
exposes get_track_details(id) returning rich raw payload); when it
doesn't fire, the multi-source parallel search picks the cross-source
best match. Empty-field matches still rejected before wishlist add.
Tests: _build_deps helper updated to accept the new search_sources
contract while preserving fallback_client/fallback_source ergonomics.
Reframed tests for the new semantics — direct-lookup is no longer
gated on Spotify being the active primary; failure reason now lists
every searched source. Added a test pinning the no-sources-configured
prompt. 17/17 quality tests green, 2128/2128 full suite green.
Three more album-shape consumers now route through
Album.from_<source>_dict() when caller passes a known source:
- _build_discography_release_dict (artist discography cards)
- _build_artist_detail_release_card (artist detail release cards)
- _normalize_track_album (quality scanner result normalization)
Legacy duck-typing stays as fallback for unknown source,
non-dict input, or converter errors. Pure additive — existing
callers without source kwarg unchanged.
Steps 2+3 of typed metadata migration. Two album-info builders now
route through Album.from_<source>_dict() when caller passes a
known source:
- _build_album_info (album-tracks lookups)
- _build_single_import_context_payload (single-track import context)
Legacy duck-typing stays as fallback for unknown source, non-dict
input, or converter errors. Pure additive — existing callers
without source kwarg unchanged.
Audit caught two missing providers from the foundation pr. Both
return album-shaped data via their clients (search + download
flows). Tidal uses tidalapi objects rather than dicts so the
converter is from_tidal_object, not _dict.
Enrichment-only providers (lastfm/genius/acoustid/listenbrainz/
audiodb) intentionally have no album converter — they enrich
existing rows, never return album shapes.
Tests: +8 cases. 40 total now.
New core/metadata/types.py with canonical dataclasses + classmethod
converters for spotify/itunes/deezer/discogs/musicbrainz/hydrabase.
Each converter is the single place that knows that provider's wire
shape — addresses the duck-typing pattern Cin flagged.
Pure additive: no consumer code changed. Follow-up PRs migrate
consumers one at a time. Migration plan at
docs/metadata-types-migration.md.
Tests: 32 cases pin per-provider semantics + cross-provider
invariants. Also stabilized a flaky discogs test that depended on
local config state.
Discord report (Samuel [KC]): tracks of the same album sometimes carry
different MUSICBRAINZ_ALBUMID tags, which causes Navidrome (and other
media servers grouping by album MBID) to split the album into multiple
entries. Two-part fix — one for existing libraries, one for the root
cause that lets new imports drift.
Part 1 — Detector + fix action (catches existing dissenters):
`core/repair_jobs/mbid_mismatch_detector.py`:
- New helpers: `_read_album_mbid_from_file` and
`_write_album_mbid_to_file` use the Picard-standard tag conventions
(`TXXX:MusicBrainz Album Id` for MP3, `MUSICBRAINZ_ALBUMID` for
FLAC/OGG, `----:com.apple.iTunes:MusicBrainz Album Id` for MP4).
- New scan phase `_scan_album_mbid_consistency` runs after the
existing track-MBID scan: groups tracks by DB `album_id`, reads
each track's embedded album MBID, finds the consensus
(most-common) MBID via `Counter`, flags dissenters. Tracks without
an album MBID at all are skipped (they don't break Navidrome —
only an explicit MBID disagreement does). Albums where MBIDs are
perfectly tied (no clear consensus) are skipped too — surface as
a manual decision instead of fixing toward a 1/N tie.
- New finding type `album_mbid_mismatch` carries `consensus_mbid`,
`wrong_mbid`, `consensus_count`, `total_tracks_with_mbid`, and a
human-readable reason string.
`core/repair_worker.py`:
- Added `'album_mbid_mismatch': self._fix_album_mbid_mismatch` to the
fix dispatch dict and to the `fixable_types` tuple so auto-fix +
bulk-fix paths pick it up.
- New `_fix_album_mbid_mismatch` method reads `consensus_mbid` from
finding details, resolves the dissenter's file path via the shared
library resolver, calls `_write_album_mbid_to_file` to rewrite the
tag in place. Doesn't touch the album's other tracks (they're
already in agreement).
Part 2 — Root cause fix (prevents new SoulSync imports from drifting):
The original in-memory `mb_release_cache` in `core/metadata/source.py`
maps `(normalized_album, artist) -> release_mbid` so per-track
enrichment of the same album hits the cache and writes the same
MUSICBRAINZ_ALBUMID to every track. That cache is bounded (4096
entries) and in-process — so cache eviction (when other albums are
processed in between) and server restart can BOTH cause
inconsistency. Per-track album-name variation (e.g. some tracks
tagged `"Album"`, others tagged `"Album (Deluxe)"`) and per-track
artist variation (features) make it worse.
`core/metadata/album_mbid_cache.py` (new module):
- DB-backed `lookup(normalized_album, artist) -> release_mbid` and
`record(...)` functions. Same key shape as the in-memory cache.
- Strict additive design: every public function is wrapped in
try/except and degrades to None / no-op on ANY database error.
The existing in-memory cache + MusicBrainz lookup remains the
authoritative fallback. If this module breaks, downloads continue
exactly as they would today.
`database/music_database.py`:
- New `mb_album_release_cache` table with composite primary key
`(normalized_album_key, artist_key)`. Reverse-lookup index on
`release_mbid` for future debug tooling. Created via the existing
`CREATE TABLE IF NOT EXISTS` migration pattern — idempotent, no
schema version bump needed.
`core/metadata/source.py`:
- Surgical change inside the existing `embed_source_ids`
in-memory-cache-miss branch: BEFORE calling MusicBrainz, consult
the persistent cache. If a previous SoulSync run already resolved
this album's release MBID, reuse it. After a successful MB lookup,
store in BOTH caches. Both calls wrapped in defensive try/except
so any failure falls through to existing logic.
Tests:
- `tests/metadata/test_album_mbid_cache.py` — 16 cache tests:
round-trip, idempotent re-record, overwrite semantics, clear_all,
album+artist independence (no Greatest Hits collisions),
defensive None-on-empty-input, graceful degradation when the DB
is unavailable / connection raises / commit fails, schema sanity
(table + index exist after init).
- `tests/test_album_mbid_consistency.py` — 13 detector tests:
tag read/write round-trip on real FLAC files, Picard-standard tag
descriptors, defensive paths (unreadable file, empty input),
detector behavior (agreement → no flags, lone dissenter → flag,
ties → no flag, single-track albums → skipped, no-MBID tracks →
skipped, unresolvable file paths → skipped).
- `tests/metadata/test_metadata_enrichment.py` — added autouse
fixture monkeypatching the persistent cache to no-op for tests in
this file. The existing tests pin per-call MB counts and
in-memory cache state; without the fixture, persistent rows from
earlier tests would bypass the MB call. Persistent layer has its
own dedicated tests.
Verified: 1782 tests pass (29 new), ruff clean, smoke test confirms
end-to-end cache round-trip works.
WHATS_NEW entry under '2.4.2' dev cycle.
Followup to fix/watchlist-external-id-match. The companion PR closed
the demand side — the watchlist scanner asks for tracks by external IDs
before falling back to fuzzy. But for users on Plex / Jellyfin /
Navidrome the supply side was still broken: tracks.spotify_track_id
(and the other ID columns) only got populated by the asynchronous
enrichment workers, sometimes hours after the file was actually
written. During that window the ID match fell through to fuzzy and
the bug returned.
We were already collecting every ID during post-processing — they
live in the `pp` dict in core/metadata/source.py:embed_source_ids and
get embedded into file tags. We just dropped the in-memory copy
afterwards.
This PR persists them and uses them:
- Schema migration adds spotify_track_id / itunes_track_id /
deezer_track_id / tidal_track_id / qobuz_track_id /
musicbrainz_recording_id / audiodb_id / soul_id / isrc columns +
indexes to the existing track_downloads table (already keyed by
file_path).
- core/metadata/source.py:embed_source_ids exposes pp["id_tags"] and
the resolved ISRC back to the import context as _embedded_id_tags
/ _isrc.
- core/imports/side_effects.py:record_download_provenance reads those
context fields and passes them to db.record_track_download, which
now accepts the new ID kwargs and persists them.
- New db.get_provenance_by_file_path with exact + basename-suffix
fallback (handles container mount-root differences between
download-time path and media-server-reported path).
- New db.backfill_track_external_ids_from_provenance copies IDs
from track_downloads onto a tracks row idempotently — COALESCE on
every column preserves any value the enrichment worker already
wrote (enrichment is more authoritative for late binding).
- database/music_database.py:insert_or_update_media_track (the
single insertion point used by every Plex / Jellyfin / Navidrome
sync) calls the backfill immediately after each INSERT/UPDATE.
- New core/library/track_identity.py:find_provenance_by_external_id
used as a second-tier fallback in watchlist_scanner.is_track_missing
_from_library — catches the window between download and media-server
sync. Caller checks os.path.exists on the provenance file_path
before treating it as "already in library" so a deleted file
doesn't prevent re-download.
Effect: freshly downloaded files become ID-recognizable to the
watchlist on the very next scan, no enrichment-wait window.
19 regression tests in tests/test_provenance_id_persistence.py:
- Schema migration adds expected columns + indexes
- record_track_download persists every ID kwarg
- record_track_download backward-compat (old kwargs still work)
- get_provenance_by_file_path: exact match, basename fallback for
mount-root differences, multi-record latest-wins, defensive None
- backfill: copies all IDs, preserves existing via COALESCE,
no-op when no provenance exists
- find_provenance_by_external_id: per-ID lookup, ISRC cross-bridge,
OR semantics, latest-wins on multiple matches
Out of scope: backfilling provenance for files downloaded BEFORE
this PR (their track_downloads rows don't carry the new IDs). Those
continue to wait for enrichment. Acceptable — only affects historical
files; new downloads benefit immediately.
Full pytest 1625 passed; ruff clean.
- keep existing /api/image-proxy URLs from being wrapped again
- reuse the shared metadata package instead of duplicating URL logic in web_server.py
- add regression coverage for proxy passthrough and internal URL normalization
- move Spotify status publishing onto auth, disconnect, and rate-limit transitions
- keep dashboard and debug consumers on the shared cached snapshot
- leave only the initial snapshot seed as a fallback probe
- move metadata-source and Spotify status caching out of web_server.py
- keep the public /status payload unchanged while shrinking server-side glue
- centralize invalidation and TTL handling in core/metadata/status.py
- Keep the primary metadata provider snapshot generic and move Spotify auth/rate-limit details into a separate status object.
- Update the websocket fixture and dashboard/settings consumers to read the two buckets independently.
- Flatten the Spotify service-status rendering so it shows rate-limit and recovery states explicitly, while otherwise displaying the active metadata provider directly.
- Keep the Spotify auth controls and metadata-source picker aligned with the real session state after authenticate and disconnect flows.
- Return "Unmapped" for unknown metadata source labels instead of implying iTunes.
- Update the metadata registry tests to cover the new label fallback.
- Send Spotify auth completion back to the opener so the settings page refreshes immediately
- Make the local auth flow go straight through to Spotify instead of showing the temporary instruction page
- Keep the remote/docker instruction page available for manual callback setups
- Sync Spotify status, connect/disconnect buttons, and metadata source selection after auth and disconnect
- Keep the disconnect behavior aligned with the active primary metadata source
- Hide the auth button when a Spotify session is active
- Treat disconnect as a session change, not a provider swap
- Share metadata source labels in the registry
- Tighten rate-limit copy around Spotify-specific behavior
- let core.metadata.registry own per-profile Spotify client caching
- register the DB-backed profile credentials provider from web_server.py
- invalidate only the affected profile cache entry on save, delete, and auth
- make web_server.py read and refresh Spotify from core.metadata.registry
- add single-key metadata cache eviction for Spotify reauth
- export the new cache helper through the metadata package shims