SoulSync

Commit Graph

Author	SHA1	Message	Date
Broque Thomas	4ca3f70bf3	Show MusicBrainz release variants in import Expand matched MusicBrainz release groups into concrete releases for specific album searches so import users can choose the correct edition by track count, format, country, and disambiguation. Preserve distinct MusicBrainz release IDs instead of deduping same-title variants, carry release metadata through import matching, and surface those details on album result cards. Add coverage for variant preservation and release-group expansion.	2 days ago
Broque Thomas	b9af4ef4ef	Handle transient SQLite IO during maintenance Keep full refresh moving when post-clear VACUUM hits a transient disk I/O error, and retry clear_server_data once when the clear step itself sees the same transient SQLite failure. Retry metadata cache maintenance writes once on transient disk I/O errors so first-attempt cache jobs do not fail when an immediate retry would succeed. Tests cover best-effort VACUUM, clear retry behavior, and cache maintenance retry behavior.	5 days ago
Broque Thomas	136d665c8a	feat(webui): cache artwork images on disk Add a disk-backed image cache with hashed browser URLs, SQLite metadata, size/type validation, stale fallback, and per-image fetch locking. Route normalized artwork through /api/image-cache while keeping /api/image-proxy as a compatibility shim, and align browser max-age with the image cache TTL. Add focused tests for cache behavior and image URL normalization.	6 days ago
Broque Thomas	987409508b	fix(metadata): surface MusicBrainz 'Other' release-groups in discography (#650) S-Bryce reported that for some artists (Vocaloid producers, JP indie acts, niche Western indie) the artist detail page was missing whole release-groups visible on musicbrainz.org. Downloaded tracks from those release-groups appeared in artist track counts but were not bound to any visible album / single card — orphan "ghost" tracks the user couldn't browse to. Two duplicated bugs fed each other: 1. `core/musicbrainz_search.py` browsed MB release-groups with `release_types=['album', 'ep', 'single']`. MB's primary-type vocabulary is {Album, Single, EP, Broadcast, Other} — music videos, one-off web releases, and broadcast singles use Other. Pre-fix the filter dropped them at the API layer. 2. Three sites duplicated the same "raw primary-type → internal album_type" mapping with slightly different vocabularies and all silently defaulted unknown values (including 'Other') to 'album': core/musicbrainz_search.py `_map_release_type` core/metadata/types.py inline `{single:single, ep:ep}.get(...)` core/metadata/cache.py Deezer-specific record_type guard Letting Other through the filter without a real mapper would have placed music videos in the Albums view alongside LPs — visually misleading. Fix shape: - New `core/metadata/release_type.py` — single canonical mapper consumed by every provider's raw→Album projection. Knows the full MB vocabulary including 'other' and 'broadcast'; routes both into the singles bucket since they're functionally single-track releases. Compilation secondary-type override preserved (MB's canonical Greatest-Hits pattern is `primary=Album, secondary=[Compilation]`). - `core/musicbrainz_search.py` `_map_release_type` becomes a thin alias for the new helper so the six internal call sites stay intact. API filter gains 'other'. - `core/metadata/types.py` Album projection drops its inline mini-mapper and calls the canonical helper.
Now also handles the compilation secondary-type override it was previously missing. - The Deezer-specific cache.py guard stays as-is — Deezer's record_type vocabulary is closed (album|single|ep), not affected by this issue. Verified end-to-end against MB for S-Bryce's artist (`46196b9c-affa-4616-b53b-e967c8bd70e0`, inabakumori): pre-fix returned 22 release-groups; post-fix returns 27, with the 5 extra all landing in the Singles section with album_type='single' as intended. 23 new unit tests pin the mapper contract (case-insensitive primary types, compilation secondary override, Other/Broadcast → single, unknown → album default preserved, defensive empty/None inputs). 2 new tests in test_musicbrainz_search pin the API filter inclusion of 'other' and the round-trip into the Singles bucket. All 516 existing metadata tests still green — refactor leaves historical behaviour for {album, ep, single, compilation} unchanged.	7 days ago
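A minimal sketch of the mapper contract described in commit 987409508b. The function name `map_release_type` is hypothetical (the commit only names the module), and applying the compilation override regardless of primary type is an assumption.

```python
from typing import Iterable, Optional

# Primary types that land in the singles bucket, per the commit message.
_SINGLE_LIKE = {"single", "other", "broadcast"}


def map_release_type(primary: Optional[str],
                     secondary: Optional[Iterable[str]] = None) -> str:
    """Map a raw MusicBrainz primary/secondary type to an internal album_type.

    Hypothetical helper name; the real signature is not shown in the log.
    """
    secondary_types = {str(s).strip().lower() for s in (secondary or []) if s}
    # Assumption: the compilation override wins whatever the primary type is.
    if "compilation" in secondary_types:
        return "compilation"
    value = (primary or "").strip().lower()
    if value in _SINGLE_LIKE:
        return "single"
    if value == "ep":
        return "ep"
    # Album, unknown values, and empty/None input keep the historical default.
    return "album"
```

Under these assumptions, `map_release_type("Other")` and `map_release_type("Broadcast")` both yield `"single"`, while an unrecognised primary type still falls back to `"album"`.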
Broque Thomas	daf9a527d9	feat(fix-popup): include MusicBrainz in the auto-search cascade The Fix Track Match modal's auto-search was hardcoded to query only Spotify -> Deezer -> iTunes, ignoring MusicBrainz entirely — even for users with MB set as their primary metadata source. MB-niche recordings (canonical entries with diacritics, fringe / non-mainstream tracks that the commercial catalogues don't carry) had no chance. Wiring: - New `MusicBrainzSearchClient.search_tracks_with_artist(track, artist, limit)` for surfaces that already have title + artist split. Uses MB's bare-query mode (strict=False) — diacritic-folded, alias/sortname indexed — same recall rationale as the earlier MBID-paste endpoint. - New route `GET /api/musicbrainz/search_tracks` mirrors the existing /api/{spotify,itunes,deezer}/search_tracks endpoints exactly: accepts `track`+`artist` (or legacy `query`) + `limit`, returns `{tracks: [{id, name, artists, album, duration_ms, image_url, source}]}`. Applies the same `core.metadata.relevance.rerank_tracks` pass Deezer / iTunes use, which is critical because MB's free-text scoring weighs title-text matches heavily and would otherwise rank cover / tribute recordings above the canonical version. - `_search_tracks_text` gains a `min_score` parameter. The cascade path passes 20 (vs the enhanced-search-tab default of 80) so MB recordings whose title doesn't literally contain the artist name still enter the candidate pool — without that, "Army of Me" + "Bjork" only surfaces the HIRS Collective cover (score 100) and drops Björk's canonical recording (score 28). The rerank pass then surfaces Björk by artist match. Verified against real MB API: pre-fix returned only the cover; post-fix top 5 are all Björk. - Fix popup `allSources` array (wishlist-tools.js) gets MB appended. The existing `activeIdx` reorder logic moves MB to the front when it's the active primary; otherwise MB sits last (1 req/sec rate limit makes it the slowest source). 
7 new unit tests on the adapter: bare-query mode is used, missing artist falls back to None (drops AND-clause), empty inputs short-circuit, low-score candidates are kept for rerank to handle, default strict + default min_score behaviour preserved for the existing search-tab path, client errors are swallowed so the cascade falls through to the next source. Discogs intentionally absent — Discogs has no track-level search API (see core/discogs_client.py:575 — returns []). Adding a Flask endpoint that always returns empty would be a permanent no-op.	1 week ago
Broque Thomas	036faff8b1	feat(fix-popup): paste MusicBrainz URL/MBID to match directly Power-user escape hatch on the Discovery Fix Track Match modal — when fuzzy auto-search ranks the wrong recording among many same-title versions (10 remasters, live cuts, alt sessions), paste the MusicBrainz recording URL or bare UUID into the new field and resolve straight to that record. Layout: - Shape adapter `get_recording_flat(mbid)` lives in `core/musicbrainz_search.py` next to existing `get_track_details`. Returns the flat Fix-popup track shape (artists as `string[]`, album as string, single `image_url`) — distinct from the Spotify-shaped nested dict `get_track_details` returns. - New route `GET /api/musicbrainz/recording/<mbid>` is a thin wrapper: validates MBID format with an anchored UUID regex, calls the adapter, returns 400 / 404 / 200 with no inline shape massaging. - Frontend `parseMusicBrainzMbid()` lives in `shared-helpers.js` — pure URL/UUID parser, reusable from other surfaces (failed-MB cache, manual match) without duplication. - Fix modal HTML gets one new input row + button; existing search row and result render pipeline are untouched. New `lookupDiscoveryFixByMbid()` fetches the endpoint and feeds the single result through the existing `renderDiscoveryFixResults` -> confirm-dialog -> match pipeline, so MB-paste matches go through the exact same selection flow as auto-search results. - Enter-key bound on the MBID input via a separate handler ref so its lifecycle matches the search-input handlers without conflating the two submit targets. 7 unit tests cover the adapter: happy path, empty/None MBID, MB returns None, recording-without-release (empty album), multi-artist credits, includes-list contract, and client-error swallow. Out of scope: the Fix popup's fuzzy cascade is still hardcoded to spotify/deezer/itunes regardless of which primary source the user has configured. Adding MB to that cascade (when MB is the active primary) is a separate concern.	1 week ago
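The URL/MBID parsing and anchored UUID validation from commit 036faff8b1 could look roughly like this. It is a Python sketch only: the real parser is the JavaScript `parseMusicBrainzMbid()`, and `parse_recording_mbid` plus both regexes are illustrative assumptions.

```python
import re
from typing import Optional

_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
# Anchored on both ends so trailing junk after a bare UUID is rejected.
_BARE_MBID = re.compile(rf"^{_UUID}$", re.IGNORECASE)
# Assumed URL shape: .../recording/<uuid>, optionally followed by /, ? or #.
_RECORDING_URL = re.compile(
    rf"musicbrainz\.org/recording/({_UUID})(?:[/?#]|$)", re.IGNORECASE
)


def parse_recording_mbid(text: Optional[str]) -> Optional[str]:
    """Return a lower-cased recording MBID from a bare UUID or a URL, else None."""
    if not text:
        return None
    candidate = text.strip()
    if _BARE_MBID.match(candidate):
        return candidate.lower()
    match = _RECORDING_URL.search(candidate)
    return match.group(1).lower() if match else None
```

A route handler could return 400 when this yields `None` and only then call the adapter, which matches the 400 / 404 / 200 split described in the commit.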
Broque Thomas	43ed30b4d2	fix(musicbrainz): user-facing search recall + album-detail 404 Two bugs surfacing on the Fix popup and enhanced-search MB tab: 1. Strict Lucene phrase queries (`recording:"X" AND artist:"Y"`) killed recall on user-facing manual search — diacritics ("Bjork" vs canonical "Björk"), bracketed suffixes like "(Live)", and any AND-clause mismatch returned zero results. Added `strict: bool = True` param to `search_release` / `search_recording`; when False, sends a bare query joining title + artist so MB hits alias/sortname indexes with diacritic folding. `/api/musicbrainz/search` (Fix popup) and `core/library/service_search.py` (service tabs) now pass strict=False. Enrichment workers stay on strict mode — precision matters there because they auto-accept the top hit above a confidence threshold. 2. Every MB album click was silently 404-ing — `_render_release_as_album` passed `cover-art-archive` as an MB `inc` param, but it's not a valid include for the /release resource (MB rejects with 400). The CAA flags come back on every release response by default, so dropping the bad include preserves the image-scope picker logic intact.	1 week ago
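To make the strict versus bare query difference in commit 43ed30b4d2 concrete, here is a small sketch. `build_recording_query` is a hypothetical name, and the backslash escaping of embedded quotes is an assumption rather than something the commit states.

```python
def build_recording_query(title: str, artist: str = "", strict: bool = True) -> str:
    """Build a MusicBrainz search query string.

    strict=True  -> Lucene phrase query, precise but brittle on diacritics.
    strict=False -> bare title + artist join, leaving matching to MB's indexes.
    """
    title = (title or "").strip()
    artist = (artist or "").strip()
    if not title:
        return ""

    if strict:
        def quote(value: str) -> str:
            # Assumption: escape backslashes and quotes inside the phrase.
            return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

        query = f"recording:{quote(title)}"
        if artist:
            query += f" AND artist:{quote(artist)}"
        return query

    # Bare mode: no field prefixes, no AND clause.
    return " ".join(part for part in (title, artist) if part)
```

With `strict=False` the query for "Army of Me" by "Bjork" is simply `Army of Me Bjork`, which is the shape the commit says lets MB apply diacritic folding and alias lookups.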
Broque Thomas	e0e31079e6	Update test: get_release includes cover-art-archive	1 week ago
Broque Thomas	5bc5fbb662	Add MusicBrainz as a metadata source Register MusicBrainz as a first-class metadata source alongside Deezer, iTunes, Spotify, Discogs, and Hydrabase. Expose the shared client through metadata services, add the settings option, and expand the MusicBrainz search adapter with source-compatible artist, album, track, and detail methods. Carry MusicBrainz IDs through similar-artist discovery, recommended artists, artist map serialization, and personalized playlist selection. Update DB migrations and lookup filters so similar_artist_musicbrainz_id is preserved on older schemas and used for source requirements and library exclusion. Normalize MusicBrainz album adapter output for import context and add regression coverage for registry mapping, typed album conversion, and similar-artist filtering. Verified by user with 120 focused tests passing.	1 week ago
Broque Thomas	3a4017ea2b	feat: artist-detail deep linking — /artist-detail/:source/:id Artist detail pages previously always pushed /artist-detail to the URL, so refreshing the page or sharing a link would drop users on a broken empty page with no artist loaded. URL format is now /artist-detail/:source/:id (e.g. /artist-detail/spotify/4tZwfgrHOc3mvqsCAfo4LT or /artist-detail/library/42). The source segment lets the backend synthesize a response from the right metadata client without a DB hit. Changes: Client routing (legacy shell + TanStack bridge) - buildArtistDetailPath / _getDeepLinkArtistDetail added to init.js; parse both new :source/:id and legacy bare :id formats so old bookmarks still work - navigateToPage passes artistId + artistSource through to the router bridge, which builds the dynamic href instead of hardcoding route.path - resolveShellPageFromPath / resolveLegacyShellPageFromPath use a prefix match so /artist-detail/* resolves to artist-detail page-id - globals.d.ts typed for artistId / artistSource options - activateLegacyPath and syncActivePageFromLocation (popstate) both restore artist from URL using skipRouteChange:true to avoid a re-navigation loop back to /artist-detail - loadInitialData restores artist from URL on page load (router not yet mounted at DOMContentLoaded so legacy path runs unconditionally) - Same-artist guard in navigateToArtistDetail prevents double-fetch when the router fires activateLegacyPath after the initial navigation Server - artist_source_detail.build_source_only_artist_detail now resolves artist name from the source API when none is supplied, so deep-link restores with an empty name string still render correctly Tests - test_spa_deep_linking: /artist-detail/42 and /artist-detail/spotify/ID both serve index.html - bridge.test.ts: source-aware URL building and library fallback - route-manifest.test.ts: prefix path resolution - artist_source_detail: name resolved from source when input is empty	1 week ago
Broque Thomas	54dbd150cb	Preserve full release dates in audio tags	1 week ago
Broque Thomas	025007b97f	Tighten artist discography soundtrack matching	1 week ago
Broque Thomas	121651da2c	Add amazon_id column to artists table for full source parity Schema: ALTER TABLE artists ADD COLUMN amazon_id TEXT with index, added via _add_amazon_columns migration called after Discogs in _run_migrations. SOURCE_ID_FIELD: add "amazon" -> "amazon_id" entry. find_library_artist_for_source now looks up Amazon artists by slug before falling back to name match, same as every other source. artist_source_detail already stamps artist_info[source_id_field] = artist_id so the amazon_id is set on source-only payloads. Tests: add "amazon": "amazon_id" to EXPECTED_SOURCE_ID_FIELD; revert test assertion back to strict equality (SOURCE_ONLY_ARTIST_SOURCES == SOURCE_ID_FIELD.keys() holds again now that amazon has a column).	1 week ago
Broque Thomas	265fe5233e	Fix Amazon artist detail: library upgrade lookup and artist images Library upgrade: find_library_artist_for_source returned None immediately for Amazon because SOURCE_ID_FIELD has no 'amazon' entry (no DB column for Amazon artist IDs). The name-based fallback was unreachable. Fix: only skip the column query when column is None, not the whole function — name lookup now runs for any source when artist_name + active_server are provided. Artist images: add AmazonClient._get_artist_image_from_albums so the standard _get_artist_image_from_source path in metadata/artist_image.py can call it as a fallback (same hook iTunes/Deezer/Discogs expose). Searches by unslugified artist name, matches primary artist, fetches album cover from album_metadata. Test: updated test_source_only_set_matches_mapping_keys → _contains_all_mapped_sources to assert subset (not equality) — SOURCE_ONLY_ARTIST_SOURCES intentionally includes sources without a DB column that rely on name-only lookup.	1 week ago
Broque Thomas	30f017d1f0	Stop writing TRCK as "6/0" when album total_tracks is unknown Discord report (netti93): downloaded album tracks were tagged with TRCK = "6/0" instead of "6/13" when source data was incomplete. The retag tool wrote correct "6/13" because core/tag_writer.py already handled the case. Trace: core/metadata/enrichment.py:105 formatted unconditionally as f"{track_number}/{total_tracks}" and many album-dict construction sites pass total_tracks: 0 (per types.py, 0 means "unknown" — not a real count). That 0 propagated straight to disk. Fix at the consumer boundary so every album-dict constructor stays unchanged. Lifted to pure helper core/metadata/track_number_format.py:format_track_number_tag that drops the /N suffix when total is 0 / None / negative — emits just "6" instead. Matches retag's behavior + ID3 spec convention (TRCK can be "N" or "N/M"). MP4 trkn tuple gets the same treatment via format_track_number_tuple returning (6, 0) per spec's "unknown total" marker. Wired into all three format-write sites in enrichment.py: ID3 (TRCK), Vorbis (tracknumber), MP4 (trkn). When source data has correct total_tracks (album downloads via the metadata-source pipeline, retag flow), behavior unchanged — still writes "6/13". 16 boundary tests pin every shape: known total / zero total / none total / none track / zero track / negative inputs / string coercion / unparseable strings / floats truncate. Full suite: 3113 passed.	2 weeks ago
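A sketch of the two formatting helpers named in commit 30f017d1f0. The function names come from the commit; the signatures, the coercion helper, and returning `None` for an unusable track number are assumptions.

```python
from typing import Optional, Tuple


def _coerce_int(value) -> Optional[int]:
    """Best-effort int conversion: strings are parsed, floats truncate."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def format_track_number_tag(track_number, total_tracks) -> Optional[str]:
    """Return "N/M" when the total is known, otherwise just "N"."""
    track = _coerce_int(track_number)
    if track is None or track <= 0:
        return None  # assumption: nothing sensible to write
    total = _coerce_int(total_tracks)
    if total is None or total <= 0:
        return str(track)  # 0 / None / negative total means "unknown"
    return f"{track}/{total}"


def format_track_number_tuple(track_number, total_tracks) -> Optional[Tuple[int, int]]:
    """MP4 trkn variant: (N, 0) marks an unknown total."""
    track = _coerce_int(track_number)
    if track is None or track <= 0:
        return None
    total = _coerce_int(total_tracks)
    return (track, total if total is not None and total > 0 else 0)
```

Handling the unknown total at this consumer boundary is what lets the album-dict constructors keep passing `total_tracks: 0` unchanged.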
Broque Thomas	c9d4b02a02	Fix Deezer contributors tagging silently dropping for cache-polluted tracks Closes #588. Contributing-artist tagging worked for some tracks but silently dropped them for others — most reproducibly when the album had been fetched before the per-track post-process ran. Trace: get_track_details cache check used `track_position in cached` as the "full payload" sentinel. Both `/track/<id>` AND `/album/<id>/tracks` set track_position. Only `/track/<id>` sets the `contributors` array. When album-tracks data hit the cache first, get_track_details returned the partial record → _build_enhanced_track found no contributors → metadata-source contributors-upgrade silently fell back to single-artist. Reporter's case (Andrea Botez - Sacrifice): the album fetch logged "Retrieved 4 tracks for album 673558211" before the post-process, which cached all 4 tracks as partial records. The contributors-upgrade then hit the partial cache and the upgrade log line never fired because len(upgraded) was never > 1. Lifted cache-validity to a pure helper `_is_full_track_payload` that requires BOTH `track_position` AND `contributors` key presence. Empty list `[]` is valid — single-artist tracks fetched via `/track/<id>` carry it explicitly. Partial cache hits fall through to a fresh `/track/<id>` fetch, which writes the full payload back to cache. 11 boundary tests pin every shape: full payload, single-artist with empty contributors list, partial album-tracks shape, search-result shape, none/non-dict, and the cache-hit/cache-miss/api-failure paths on get_track_details (including the exact reporter-scenario regression). Full suite: 3021 passed.	2 weeks ago
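The cache-validity rule from commit c9d4b02a02 can be sketched as follows. The dict-backed cache and the injected `fetch_track` callable are stand-ins for illustration, not the project's real cache or client API.

```python
from typing import Any, Callable, Dict, Optional


def is_full_track_payload(payload: Any) -> bool:
    """A payload is 'full' only if BOTH keys are present.

    Key presence is what matters: an empty contributors list is valid.
    """
    if not isinstance(payload, dict):
        return False
    return "track_position" in payload and "contributors" in payload


def get_track_details(track_id: int,
                      cache: Dict[int, dict],
                      fetch_track: Callable[[int], Optional[dict]]) -> Optional[dict]:
    """Serve full payloads from cache; refetch when the cached record is partial."""
    cached = cache.get(track_id)
    if is_full_track_payload(cached):
        return cached
    fresh = fetch_track(track_id)  # stand-in for a /track/<id> request
    if is_full_track_payload(fresh):
        cache[track_id] = fresh  # overwrite the partial record
    return fresh
```

In the reporter's scenario the album fetch seeds the cache with records that have `track_position` but no `contributors`, so the first check fails and the per-track fetch runs.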
Broque Thomas	0769fcd5cc	Fix Soulseek downloads losing collab artist tags Soulseek matched-download contexts populate `original_search_result` with `artist` (singular string) and no `artists` list — the full multi-artist array lives on `track_info` (the matched Spotify track object). `extract_source_metadata` only read `original_search.artists`, so the Soulseek path always fell through to the single-artist branch and TPE1 ended up with the primary artist only. Deezer-direct downloads were unaffected because their context populates `original_search.artists` as a proper list. Lifted artist resolution into a pure helper `core/metadata/artist_resolution.py:resolve_track_artists` that walks `original_search.artists` → `track_info.artists` → `artist_dict.name` fallback chain. Normalizes mixed list-item shapes (Spotify-style dicts, bare strings, anything else stringified) and drops empty entries. 13 new tests pin the resolution order, fallback chain, mixed-shape normalization, whitespace stripping, and empty/none handling. The existing `_artists_list` no-fall-through test in `test_multi_artist_tag_settings.py` was updated to reflect the new contract (always populated; multi-value write still gated on `len > 1`) plus a new regression test for the Soulseek shape. Composes with the existing Deezer per-track upgrade (still fires when single-artist + track_id available) and feat_in_title / artist_separator settings (still drive the joined ARTIST string downstream).	2 weeks ago
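A rough sketch of the fallback chain in commit 0769fcd5cc. The function name follows the commit; the parameter names and the exact dict shapes are assumptions drawn from the description.

```python
from typing import Any, Iterable, List, Optional


def _normalize(items: Optional[Iterable[Any]]) -> List[str]:
    """Flatten mixed list items (dicts with 'name', bare strings, other) to names."""
    names = []
    for item in items or []:
        raw = item.get("name") if isinstance(item, dict) else item
        text = str(raw).strip() if raw is not None else ""
        if text:
            names.append(text)  # empty entries are dropped
    return names


def resolve_track_artists(original_search: Optional[dict],
                          track_info: Optional[dict],
                          artist_dict: Optional[dict]) -> List[str]:
    """Walk original_search.artists -> track_info.artists -> artist_dict.name."""
    for source in (original_search, track_info):
        artists = (source or {}).get("artists")
        if isinstance(artists, (list, tuple)):
            names = _normalize(artists)
            if names:
                return names
    name = str((artist_dict or {}).get("name") or "").strip()
    return [name] if name else []
```

For the Soulseek shape, `original_search` carries only a singular `artist` string, so the first step yields nothing and the list on `track_info` is used instead.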
Broque Thomas	46206b3240	Pin type='track' / type='artist' collision case for album-type normalizer	2 weeks ago
Broque Thomas	5eae24b8bb	Fix $albumtype defaulting to album for non-Spotify sources - legacy duck-typed builder only checked the `album_type` key; deezer uses `record_type`, tidal uses `type` (uppercase), some flattened musicbrainz shapes use `primary-type` — all defaulted to album, so EPs and singles ended up filed under Album/ in user templates that reference $albumtype - widen lookup to album_type / record_type / type / primary-type and route through new pure `_normalize_album_type` helper that case-folds + validates against the canonical token set (album / single / ep / compilation), unknown → album - typed-converter path (spotify / deezer / itunes / discogs / mb / hydrabase / qobuz) unchanged — those were already correct Discord report (CAL).	2 weeks ago
Broque Thomas	4892baf8d4	Skip already-owned tracks during download discography - new track_already_owned helper wraps db.check_track_exists at the same confidence threshold the discography backfill repair job uses (0.7) — name+artist+album, format-agnostic so blasphemy-mode libraries (flac → mp3 + delete original) match correctly - endpoint runs the check after the artist + content-type filters and before add_to_wishlist, so a second discography click on the same artist no longer re-queues every track that already downloaded - per-album response carries a new tracks_skipped_owned counter alongside the existing artist/content/wishlist skip categories Discord report (Skowl).	2 weeks ago
Broque Thomas	d4ad5bf57f	Filter cross-artist + content-type tracks during download discography - drop tracks where the requested artist isn't named in track.artists (keeps features, drops compilation / appears_on contamination) - honor watchlist.global_include_live/remixes/acoustic/instrumentals the same way the discography backfill repair job already does - surface per-album skip counts in the ndjson stream (artist mismatch + content filter) so the ui can show what was filtered Closes #559.	2 weeks ago
Broque Thomas	d5de724f9b	Multi-artist Deezer upgrade + double-append guard hardening Two follow-ups to the multi-artist tag settings PR: 1. Deezer contributors upgrade — closes the "known limitation" flagged in the prior commit. Deezer's `/search` endpoint only returns the primary artist for each track; the full contributors array (feat., remix collaborators, producers credited as artists) lives on `/track/<id>` and gets parsed by `_build_enhanced_track`. Without the upgrade Deezer-sourced tracks never got multi-artist tags even with the right settings on. Fix in `core/metadata/source.py`: when source==deezer AND the search response had a single artist AND a track_id is available, fetch full track details via `get_deezer_client().get_track_details` and replace `all_artists` with the upgraded list. - One extra API call per affected Deezer track - Skipped when search already returned multiple (no-op fast path) - Skipped for non-Deezer sources (Spotify/Tidal/iTunes search responses already include all artists) - Skipped when no track_id is available - Defensive try/except: on /track/<id> failure (network error, deezer client unavailable), fall through to the search-result list — never lose the data we already had 2. Double-append guard hardened with a word-boundary regex. Prior commit checked for `"feat." not in title.lower() and "(ft." not in title.lower()` — too narrow. Source platforms produce wildly different feat-marker conventions: "(feat. X)", "(Feat X)", "(FEAT X)", "(Featuring X)", "[feat. X]", "ft. X" (no parens), "FT. X", etc. Any of these as the SOURCE title would cause a double-append: `"Track (Feat X) (feat. Y)"`. Replaced with `re.search(r'\b(?:feat|feat\.|featuring|ft|ft\.)\b', title, IGNORECASE)`. Word-boundary regex catches every common variant. Substring matches like "Aftermath" containing `ft` correctly fall through to the append path (pinned by a regression test).
16 new tests (29 total in the file): - 9 parametrized variants of the double-append guard - 1 substring guard ("Aftermath") - 6 Deezer upgrade scenarios (fires when expected, doesn't fire for non-Deezer / multi-artist search / no track_id, defensive fall-through on failure, no false-positive when /track/<id> confirms single artist) Full pytest 2727 passed.	2 weeks ago
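The double-append guard from commit d5de724f9b, shown as a small runnable sketch. The regex is the one quoted in the commit; the wrapper `append_featured` and its output format are illustrative assumptions.

```python
import re
from typing import List

# Word-boundary pattern quoted in the commit message.
_FEAT_MARKER = re.compile(r"\b(?:feat|feat\.|featuring|ft|ft\.)\b", re.IGNORECASE)


def append_featured(title: str, featured: List[str]) -> str:
    """Append '(feat. X, Y)' unless the title already carries a feat marker."""
    if not featured or _FEAT_MARKER.search(title):
        return title
    return f"{title} (feat. {', '.join(featured)})"
```

The leading `\b` is what keeps "Aftermath" from matching: the "ft" inside it is preceded by another word character, so there is no boundary.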
Broque Thomas	c11a5b7eab	Multi-artist tag settings: implement artist_separator + feat_in_title + populate _artists_list Three settings on Settings → Metadata → Tags were partially or completely unimplemented. Reporter (Netti93) traced each one. (1) `write_multi_artist` only "worked" because of a never-populated `_artists_list` field. `core/metadata/source.py` built `metadata["artist"]` as a hardcoded ", "-joined string but never assigned `metadata["_artists_list"]`. `core/metadata/enrichment.py` line 107 reads that field and gates the multi-value tag write on `len(_artists_list) > 1` — always saw an empty list, silently no-op'd the write. (2) `artist_separator` (default ", ") was referenced in the UI + settings.js save path but ZERO Python code read the value. Every multi-artist track ended up with hardcoded ", " regardless of what the user picked. (3) `feat_in_title` (when true: pull featured artists into the title as " (feat. X, Y)" and leave only primary in the ARTIST tag — Picard convention) had no implementation at all. Fix in source.py: * Populate `_artists_list` from the search response's artists array * Read `feat_in_title` and `artist_separator` configs * When `feat_in_title=True` and >1 artist: ARTIST = primary only, append "(feat. X, Y)" to title with double-append guard * Else: ARTIST = artists joined with `artist_separator` * Single-artist case unaffected by either setting Double-append guard uses a word-boundary regex catching all common "feat" variants source platforms produce — `feat`, `feat.`, `featuring`, `ft`, `ft.` — case-insensitive. Substring matches (e.g. "Aftermath" containing "ft") correctly fall through to the append path. 
Fix in enrichment.py ID3 branch: * TPE1 stays as the display string (with separator or primary-only per the user's settings) * Multi-value list goes to a separate `TXXX:Artists` frame (Picard convention) when `write_multi_artist` is on * Pre-fix the ID3 path wrote TPE1 twice — single-string then list — and the second `add` overwrote the first, clobbering both the configured separator AND the feat_in_title semantics. Vorbis path was already correct (separate "artist" + "artists" keys). Known limitation (flagged in WHATS_NEW): Deezer's `/search` endpoint only returns the primary artist. The full contributors array lives on `/track/<id>`. Enrichment uses search-result data so Deezer-sourced tracks may still get only the primary artist until a follow-up commit wires the per-track contributors fetch into the enrichment flow. Spotify, Tidal, and iTunes search responses include all artists so they work now. 23 new tests in `tests/metadata/test_multi_artist_tag_settings.py`: * `_artists_list` populated for multi/single/no-artist cases * `artist_separator` drives ARTIST string (default ", " + custom ";" + custom "; " + " & ") * Single-artist case unaffected by either setting * `feat_in_title=True` pulls featured to title, leaves primary in ARTIST * `feat_in_title` no-op for single artist * Double-append guard recognizes 9 source-title variants ("(feat. X)", "(Feat. X)", "(FEAT X)", "(feat X)", "(Featuring X)", "[feat. X]", "ft. X", "(ft X)", "FT. X") * Substring guard test pins "Aftermath" doesn't false-positive * Combined-settings precedence: feat_in_title wins ARTIST string but `_artists_list` carries everyone for multi-value tag Full pytest 2711 passed.	2 weeks ago
Broque Thomas	8a4c0dc92a	Deezer cover-art download: fallback to original URL on CDN refusal Defensive followup. If Deezer CDN ever refuses the upgraded 1900×1900 URL for a specific album (rare — empirically tested 4 albums and none hit it), pre-fix would have succeeded with the 1000×1000 URL and post-fix would have failed entirely. Both download sites now retry with the original URL when the upgraded URL fails: - `core/metadata/artwork.py::download_cover_art` — auto post-process flow. Resolves the original URL from album_info / context the same way the existing path does. - `core/tag_writer.py::download_cover_art` — captures the original URL before upgrade so the retry has it without a second context lookup. Strictly non-regressive: worst plausible post-fix case is now identical to pre-fix (cover at 1000×1000 succeeds). Fallback only fires on the rare CDN-refusal edge. Tests added (2): - `test_tag_writer_retries_with_original_on_failure` — upgraded URL raises, original succeeds, both attempts logged in call order - `test_tag_writer_no_fallback_for_non_dzcdn_url` — non-Deezer URLs go through unchanged, no fallback path triggered (single attempt) Verification: - 18/18 helper + integration tests pass - 2561 full suite passes - Ruff clean	2 weeks ago
Broque Thomas	80cf16339c	Deezer cover art: upgrade CDN URL to 1900×1900 (was embedding 1000×1000) Discord report (Tim): downloaded cover art via Deezer metadata source came out visibly blurry in Navidrome / on phones — large displays exposed the limited resolution. # Cause Deezer's API returns `cover_xl` URLs at 1000×1000. The underlying CDN actually serves up to 1900×1900 by rewriting the size segment in the URL path (same trick the iTunes mzstatic + Spotify scdn upgrades already use). SoulSync wasn't doing the rewrite — every Deezer-sourced cover got embedded at 1000×1000 regardless of how much higher resolution the CDN had available. # Verified empirically ``` $ for size in 1000 1400 1800 1900 2000; do curl -I "...{size}x{size}-..."; done 1000: 200 OK 106 KB 1400: 200 OK 198 KB 1800: 200 OK 331 KB 1900: 200 OK 371 KB 2000: 403 Forbidden ``` 1900 is the safe ceiling. Above that the CDN returns 403. CDN serves source-native bytes when source < target (smaller-source albums get same bytes whether we ask for 1000 or 1900), so asking for 1900 universally is safe. # Fix New `_upgrade_deezer_cover_url(url, target_size=1900)` helper in `core/deezer_client.py`. Pure function, mirrors the `_upgrade_spotify_image_url` pattern that already lives in `core/spotify_client.py`. Defensive on every input shape: - Empty / None → returned as-is - Non-Deezer URL (no `dzcdn`) → returned as-is - No size segment in URL → returned as-is - Already at/above target → returned as-is (idempotent, never downgrades) Applied at both cover-download sites: - `core/metadata/artwork.py::download_cover_art` — auto post-process flow. Mirrors the existing iTunes mzstatic upgrade right above it. - `core/tag_writer.py::download_cover_art` — enhanced library view's "Write Tags to File" feature. # Scope discipline - Helper applied at the DOWNLOAD boundary, not the source extraction point in `deezer_client.py`. 
Means cached entries in the metadata cache + DB row `image_url` columns keep the original 1000×1000 URL Deezer's API returned. Future CDN behavior changes only affect the download path, not stored data. - Pre-existing `prefer_caa_art` toggle (Settings → Library → Post-Processing) untouched — orthogonal workaround for users who want even higher quality (MusicBrainz Cover Art Archive, often 3000×3000+). - iTunes / Spotify upgrade paths untouched — they already worked. # Tests added (16) `tests/metadata/test_deezer_cover_url_upgrade.py`: - Standard upgrade: default target 1900 on cover URL, alternate dzcdn host (`e-cdns-images.dzcdn.net` vs `cdn-images.dzcdn.net`), artist picture URLs (same path pattern), 500×500 source upgrades too - Custom target size: smaller target = no-op (never downgrade), larger target works - Idempotent: already at/above target returned unchanged - Defensive on non-Deezer URLs: parametrised across 5 hosts (Spotify scdn, iTunes mzstatic, MB CAA, Last.fm, random) — all returned untouched - Defensive on malformed Deezer URL (no size segment) → returned as-is - Empty / None handling # Verification - 16/16 helper tests pass - 560/560 metadata + imports tests pass (no regression) - 2559 full suite passes - Ruff clean	2 weeks ago
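A sketch of the URL upgrade described in commit 80cf16339c. The helper name and default come from the commit; the regex and the assumed path layout (`.../<hash>/1000x1000-000000-80-0-0.jpg`) are illustrative and may not match the real implementation.

```python
import re
from typing import Optional

# Assumed size segment: "/<W>x<H>" immediately followed by "-".
_SIZE_SEGMENT = re.compile(r"/(\d+)x(\d+)(?=-)")


def upgrade_deezer_cover_url(url: Optional[str], target_size: int = 1900) -> Optional[str]:
    """Rewrite the size segment of a Deezer CDN URL, never downgrading."""
    if not url or "dzcdn" not in url:
        return url  # empty, None, or not a Deezer CDN URL
    match = _SIZE_SEGMENT.search(url)
    if not match:
        return url  # no size segment to rewrite
    if int(match.group(1)) >= target_size:
        return url  # already at or above target: idempotent
    return f"{url[:match.start()]}/{target_size}x{target_size}{url[match.end():]}"
```

Because every unrecognised shape is returned untouched, the helper can be applied at the download boundary without first checking which source produced the URL.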
Broque Thomas	59992d42a8	Deezer search: free-text fallback when advanced query returns 0 Defensive followup to the relevance fix. Deezer's advanced search syntax (`artist:"X"`) is documented as substring match, but in practice it's brittle on artist name variants ("Foreigner [US]", "The Foreigner") and on tracks indexed under non-canonical title spellings. When the advanced query returns nothing, we'd previously land at "No matches" — a regression vs. pre-fix behaviour where free-text would have returned a less-relevant but non-empty set. Fix: when the advanced query returns 0 results AND the caller used field-scoped kwargs, fall back to a free-text join of the same kwargs and re-query. Caller-side rerank still tightens whatever the fallback returns, so the worst-case post-fix behaviour is the pre-fix behaviour — never strictly worse. Pulled the cache + parse + store dance into a private helper (`_search_tracks_with_query`) so the orchestration can call it twice (advanced → fallback) without code duplication. Single API call when the advanced query has results — no wasted requests. Diagnostic logger.debug fires when the fallback triggers so we can see in production whether it's happening (and to which queries). # Tests added (4) - `test_falls_back_to_free_text_when_advanced_empty` — advanced query returns 0, free-text returns hits; client returns the free-text hits + both API calls fire. - `test_no_fallback_when_advanced_query_has_results` — single hit on advanced query → no second API call. - `test_no_fallback_when_legacy_free_text_call` — legacy callers already exhausted the only path; empty result is final. - `test_no_fallback_when_query_unchanged` — empty kwargs path doesn't trigger the fallback branch (used_advanced=False). # Existing tests updated The 4 prior `TestSearchTracksQueryWiring` + `TestSearchTracksCacheKey` tests were stubbing `_api_get` to return empty `{'data': []}` and asserting `assert_called_once`. 
With the new fallback, those stubs trigger a second API call and the assertions break — even though the FIRST call construction is what the tests cared about. Updated the stubs to return one fake hit so the fallback doesn't fire, and switched to `call_args_list[0]` for first-call inspection. # Verification - 18/18 deezer query tests pass (14 prior + 4 new) - 2445 full suite passes (+4 from prior commit) - Ruff clean	2 weeks ago
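The advanced-then-free-text orchestration from commit 59992d42a8 can be sketched roughly as below. This is a minimal illustration, not the client's actual code: `build_advanced_query`, the `run_query` callable (standing in for the cache + parse + store helper) and the exact signature are assumptions.

```python
from typing import Callable, Dict, List, Optional


def _clean(value: Optional[str]) -> str:
    # Embedded double quotes are stripped because the field syntax has no escape.
    return (value or "").replace('"', "").strip()


def build_advanced_query(track: Optional[str] = None,
                         artist: Optional[str] = None,
                         album: Optional[str] = None) -> str:
    """Build a field-scoped query such as: track:"X" artist:"Y"."""
    pairs = (("track", track), ("artist", artist), ("album", album))
    return " ".join(f'{field}:"{_clean(v)}"' for field, v in pairs if _clean(v))


def search_tracks(run_query: Callable[[str], List[Dict]],
                  query: Optional[str] = None,
                  track: Optional[str] = None,
                  artist: Optional[str] = None,
                  album: Optional[str] = None) -> List[Dict]:
    """Try the advanced query first; fall back to free text only when it is empty."""
    advanced = build_advanced_query(track, artist, album)
    if not advanced:
        # Legacy free-text callers: an empty result here is final.
        return run_query(query) if query else []
    results = run_query(advanced)
    if results:
        return results  # single call when the advanced query has hits
    # Fallback: free-text join of the same kwargs, then re-query.
    free_text = " ".join(c for c in (_clean(track), _clean(artist), _clean(album)) if c)
    return run_query(free_text)
```

Injecting `run_query` keeps the sketch testable without a network call, in the same spirit as the commit's tests stubbing `_api_get`.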
Broque Thomas	1cc37081a6	Fix Deezer search relevance — issue #534 # Background User reported (#534) that the import-modal "Search for Match" dialog returned irrelevant results when Deezer was the metadata source. Searching `Dirty White Boy` + `Foreigner` returned 5+ karaoke / "originally performed by" / "in the style of" / "re-recorded" / tribute-band results ranked above the actual Foreigner studio cut from Head Games. User had to scroll past the junk every time, or fall back to iTunes search which is much slower. # Root cause — two layers 1. Endpoint joined `track + artist` into free-text query. `/api/deezer/search_tracks` was passing `q=Dirty White Boy Foreigner` to Deezer's `/search/track` API. Deezer fuzzy-matches that string across title / lyrics / artist / album / contributors and orders by global popularity — anything that appears across many compilations outranks the canonical recording. 2. No local rerank. None of the search-modal endpoints applied any post-filtering. Deezer's API order shipped straight to the user. # Fix — same architectural shape Cin would build ## Layer 1: field-scoped query at the client boundary `core/deezer_client.py::search_tracks()` now accepts optional `track`, `artist`, `album` kwargs. When provided, builds Deezer's advanced search syntax: `q=track:"X" artist:"Y" album:"Z"`. Massive relevance improvement because each term matches the right field instead of fuzzy-matching everywhere. Backward compat preserved: legacy free-text `query=` callers still work unchanged. Field-scoped path takes precedence when both are provided. Empty input fast-fails without an API call. Embedded double-quotes stripped (Deezer's syntax has no escape mechanism). ## Layer 2: provider-neutral relevance reranker New `core/metadata/relevance.py` module — pure-function rerank over the canonical `Track` dataclass. 
Composable scoring: - Cover/karaoke patterns (multiplier 0.05, effectively buries): matches "karaoke", "originally performed by", "in the style of", "made famous by", "tribute", "vocal version", "backing track", "cover version", "re-recorded", "cover by", etc. across title, album, AND artist fields. Catches the screenshot's exact junk: artist credits like "Pop Music Workshop" / "The Karaoke Channel" / "Foreigner Tribute Band". - Variant tags (multiplier 0.4): live / acoustic / demo / instrumental / remix / radio edit / club mix etc. — softer penalty since the user MAY want them. Skipped entirely when the expected_title contains the same tag (so searching "Track (Live)" still ranks Live versions first). - Exact artist boost (multiplier 1.5): primary artist exactly matches expected_artist after normalisation. Single strongest signal for "this is the canonical recording". - Title + artist similarity via SequenceMatcher (parentheticals + punctuation stripped before comparison). - Album-type weighting: album=1.0 > single/ep=0.85 > compilation=0.7. Compilations are more likely tribute / karaoke repackages. Each component is a standalone function so tests pin them individually without standing up the full pipeline. ## Wired at three search-modal endpoints - `/api/deezer/search_tracks` — uses both layers (field-scoped query + rerank). - `/api/itunes/search_tracks` — uses rerank only (iTunes API has no advanced-syntax search, but karaoke / cover variants still leak through and need the local penalty). - `/api/spotify/search_tracks` — already builds field-scoped `track:X artist:Y` query; rerank added as the consistency safety net so all three sources behave the same from the user's perspective. Other Deezer call sites (matching engine, watchlist scanner, auto-import single-track ID) deliberately not touched in this PR — they have their own elaborate scoring pipelines tuned to their specific contexts and aren't surfacing the user-reported issue. 
Per Cin: "don't refactor beyond what the task requires." # Tests 71 new tests across 3 files: - `tests/metadata/test_relevance.py` (50 tests) — every scoring component pinned individually + the issue #534 screenshot reproduced as a regression test (real Foreigner cut wins after rerank, karaoke variants drop to bottom). - `tests/metadata/test_deezer_search_query.py` (14 tests) — advanced-syntax query construction, field-scoped wiring at the client boundary, free-text path unchanged, kwargs win when ambiguous, limit clamping, cache key consistency. - `tests/imports/test_search_match_endpoints.py` (7 tests) — end-to-end through Flask test client: Deezer endpoint passes kwargs not joined query; karaoke buried at bottom for all three sources; legacy query param still works without rerank. # Verification - 2441 full suite passes (+71 from baseline 2370) - 0 failures (the prior watchdog flake fix held) - Ruff clean across all changed files - JS parses clean (`node -c webui/static/helper.js`) # Architectural standards followed - Logic at the right boundary. Query construction lives in the client (every caller benefits from one change). Rerank lives in a neutral module (`core/metadata/relevance.py`) over the canonical `Track` dataclass — works for any source, not Deezer- specific. - Explicit > implicit. Every scoring rule has its own named function. Pattern tables are module-level constants tests can introspect. - Scope discipline. Audited every Deezer search call site; fixed the user-reported one + the consistent siblings. Did NOT speculatively normalise every Deezer call across the codebase. - Backward compat. Free-text `query=` callers untouched. Kwargs added to existing client method signature with safe defaults. - Tests pin contract at correct boundary. Pure-function rerank tests don't mock anything; client-query tests stub at `_api_get`; endpoint tests run through the real Flask app.	2 weeks ago
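The composable scoring described in commit 1cc37081a6 can be illustrated with a compact sketch. The multipliers (0.05, 0.4, 1.5) and album-type weights come from the commit message; the simplified `Track` stand-in, the shortened pattern lists, and the way the components are combined (mean similarity times each multiplier) are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List

COVER_PATTERNS = ("karaoke", "originally performed by", "in the style of",
                  "made famous by", "tribute", "backing track", "re-recorded")
VARIANT_TAGS = ("live", "acoustic", "demo", "instrumental", "remix", "radio edit")
ALBUM_TYPE_WEIGHT = {"album": 1.0, "single": 0.85, "ep": 0.85, "compilation": 0.7}


@dataclass
class Track:  # simplified stand-in for the canonical Track dataclass
    title: str
    artist: str
    album: str = ""
    album_type: str = "album"


def _norm(text: str) -> str:
    """Lowercase, then drop parentheticals and punctuation."""
    text = re.sub(r"[\(\[].*?[\)\]]", " ", text.lower())
    return " ".join(re.sub(r"[^\w\s]", " ", text).split())


def _similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, _norm(a), _norm(b)).ratio()


def score(track: Track, expected_title: str, expected_artist: str) -> float:
    value = (_similarity(track.title, expected_title)
             + _similarity(track.artist, expected_artist)) / 2
    haystack = f"{track.title} {track.album} {track.artist}".lower()
    if any(p in haystack for p in COVER_PATTERNS):
        value *= 0.05  # cover / karaoke: effectively buried
    wanted, title = expected_title.lower(), track.title.lower()
    if any(re.search(rf"\b{re.escape(t)}\b", title) and t not in wanted
           for t in VARIANT_TAGS):
        value *= 0.4  # softer penalty, skipped when the user asked for the tag
    if _norm(track.artist) == _norm(expected_artist):
        value *= 1.5  # exact primary-artist match
    return value * ALBUM_TYPE_WEIGHT.get(track.album_type, 1.0)


def rerank(tracks: List[Track], expected_title: str, expected_artist: str) -> List[Track]:
    return sorted(tracks, key=lambda t: score(t, expected_title, expected_artist),
                  reverse=True)
```

Keeping the pattern tables as module-level constants mirrors the commit's stated goal of letting tests introspect them directly.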
Broque Thomas	cf5461f2f1	Fix: maintenance findings badge inflated when scan dedup-skipped `_create_finding` silently dedup-skipped re-discovered issues, but the caller incremented `findings_created` regardless. As a result, a re-scan that found the same issues as a prior scan reported 364 findings in the badge while 0 NEW pending rows hit the DB, leaving the findings tab empty. `_create_finding` now returns bool (True on insert, False on dedup-skip / DB error). All 16 repair jobs now increment `findings_created` only on True. Added a `findings_skipped_dedup` counter surfaced in the scan log: "Done: X scanned, 0 fixed, 0 findings (363 already existed), 0 errors". Also fixed a missing `job_id` kwarg in album_tag_consistency that was silently breaking finding creation for that scan.	3 weeks ago
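The counter fix in commit cf5461f2f1 comes down to the caller honouring a boolean return value. A minimal sketch, assuming an in-memory set in place of the findings table and hypothetical names `create_finding` / `run_scan`:

```python
from typing import Dict, Iterable, Set, Tuple

Finding = Tuple[str, str]  # (finding_type, entity_id); a simplified dedup key


def create_finding(existing: Set[Finding], finding: Finding) -> bool:
    """Return True on insert, False on dedup-skip (stand-in for `_create_finding`)."""
    if finding in existing:
        return False
    existing.add(finding)
    return True


def run_scan(existing: Set[Finding], issues: Iterable[Finding]) -> Dict[str, int]:
    """Count only real inserts as created; track dedup-skips separately."""
    stats = {"findings_created": 0, "findings_skipped_dedup": 0}
    for finding in issues:
        if create_finding(existing, finding):
            stats["findings_created"] += 1
        else:
            stats["findings_skipped_dedup"] += 1
    return stats
```

Running the same scan twice against the same store then reports zero created findings on the second pass, with the skipped count explaining why.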
Broque Thomas	77c54ab7a7	Migrate discography + quality scanner to typed Album path Three more album-shape consumers now route through Album.from_<source>_dict() when caller passes a known source: - _build_discography_release_dict (artist discography cards) - _build_artist_detail_release_card (artist detail release cards) - _normalize_track_album (quality scanner result normalization) Legacy duck-typing stays as fallback for unknown source, non-dict input, or converter errors. Pure additive — existing callers without source kwarg unchanged.	3 weeks ago
Broque Thomas	967c7f7c0a	Migrate album-info builders to typed Album path Steps 2+3 of typed metadata migration. Two album-info builders now route through Album.from_<source>_dict() when caller passes a known source: - _build_album_info (album-tracks lookups) - _build_single_import_context_payload (single-track import context) Legacy duck-typing stays as fallback for unknown source, non-dict input, or converter errors. Pure additive — existing callers without source kwarg unchanged.	3 weeks ago
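The "typed path first, legacy duck-typing as fallback" routing described in the two migration commits above might look roughly like this. The `Album` class here is a trimmed stand-in, `build_album_info` is a hypothetical builder, and the wire fields read (`name` / `total_tracks` for Spotify, `title` / `nb_tracks` for Deezer) are assumptions about those providers' payloads rather than the project's converters.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Album:  # trimmed stand-in for the canonical dataclass
    id: str
    name: str
    total_tracks: int = 0

    @classmethod
    def from_spotify_dict(cls, raw: Dict[str, Any]) -> "Album":
        return cls(str(raw["id"]), raw["name"], int(raw.get("total_tracks", 0)))

    @classmethod
    def from_deezer_dict(cls, raw: Dict[str, Any]) -> "Album":
        return cls(str(raw["id"]), raw["title"], int(raw.get("nb_tracks", 0)))


def build_album_info(raw: Any, source: Optional[str] = None) -> Dict[str, Any]:
    """Use the typed converter for a known source; otherwise duck-type."""
    converter = getattr(Album, f"from_{source}_dict", None) if source else None
    if converter is not None and isinstance(raw, dict):
        try:
            album = converter(raw)
            return {"id": album.id, "name": album.name,
                    "total_tracks": album.total_tracks}
        except (KeyError, TypeError, ValueError):
            pass  # converter error: fall through to the legacy path
    # Legacy duck-typing for unknown sources, non-dict input, or converter errors.
    get = raw.get if isinstance(raw, dict) else (lambda k, d=None: getattr(raw, k, d))
    return {
        "id": str(get("id", "")),
        "name": get("name") or get("title") or "",
        "total_tracks": int(get("total_tracks") or get("nb_tracks") or 0),
    }
```

Callers that never pass `source` take the legacy branch unchanged, which is what makes the migration additive.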
Broque Thomas	eab1297afc	Add Qobuz + Tidal album converters Audit caught two missing providers from the foundation PR. Both return album-shaped data via their clients (search + download flows). Tidal uses tidalapi objects rather than dicts, so the converter is from_tidal_object, not _dict. Enrichment-only providers (lastfm/genius/acoustid/listenbrainz/audiodb) intentionally have no album converter: they enrich existing rows and never return album shapes. Tests: +8 cases. 40 total now.	3 weeks ago
Broque Thomas	529486a2d1	Foundation: typed Album/Track/Artist + per-provider converters New core/metadata/types.py with canonical dataclasses + classmethod converters for spotify/itunes/deezer/discogs/musicbrainz/hydrabase. Each converter is the single place that knows that provider's wire shape — addresses the duck-typing pattern Cin flagged. Pure additive: no consumer code changed. Follow-up PRs migrate consumers one at a time. Migration plan at docs/metadata-types-migration.md. Tests: 32 cases pin per-provider semantics + cross-provider invariants. Also stabilized a flaky discogs test that depended on local config state.	3 weeks ago
Broque Thomas	4b15fe0b75	Fix album MBID inconsistency: detector + persistent release-MBID cache Discord report (Samuel [KC]): tracks of the same album sometimes carry different MUSICBRAINZ_ALBUMID tags, which causes Navidrome (and other media servers grouping by album MBID) to split the album into multiple entries. Two-part fix — one for existing libraries, one for the root cause that lets new imports drift. Part 1 — Detector + fix action (catches existing dissenters): `core/repair_jobs/mbid_mismatch_detector.py`: - New helpers: `_read_album_mbid_from_file` and `_write_album_mbid_to_file` use the Picard-standard tag conventions (`TXXX:MusicBrainz Album Id` for MP3, `MUSICBRAINZ_ALBUMID` for FLAC/OGG, `----:com.apple.iTunes:MusicBrainz Album Id` for MP4). - New scan phase `_scan_album_mbid_consistency` runs after the existing track-MBID scan: groups tracks by DB `album_id`, reads each track's embedded album MBID, finds the consensus (most-common) MBID via `Counter`, flags dissenters. Tracks without an album MBID at all are skipped (they don't break Navidrome — only an explicit MBID disagreement does). Albums where MBIDs are perfectly tied (no clear consensus) are skipped too — surface as a manual decision instead of fixing toward a 1/N tie. - New finding type `album_mbid_mismatch` carries `consensus_mbid`, `wrong_mbid`, `consensus_count`, `total_tracks_with_mbid`, and a human-readable reason string. `core/repair_worker.py`: - Added `'album_mbid_mismatch': self._fix_album_mbid_mismatch` to the fix dispatch dict and to the `fixable_types` tuple so auto-fix + bulk-fix paths pick it up. - New `_fix_album_mbid_mismatch` method reads `consensus_mbid` from finding details, resolves the dissenter's file path via the shared library resolver, calls `_write_album_mbid_to_file` to rewrite the tag in place. Doesn't touch the album's other tracks (they're already in agreement). 
Part 2 — Root cause fix (prevents new SoulSync imports from drifting): The original in-memory `mb_release_cache` in `core/metadata/source.py` maps `(normalized_album, artist) -> release_mbid` so per-track enrichment of the same album hits the cache and writes the same MUSICBRAINZ_ALBUMID to every track. That cache is bounded (4096 entries) and in-process — so cache eviction (when other albums are processed in between) and server restart can BOTH cause inconsistency. Per-track album-name variation (e.g. some tracks tagged `"Album"`, others tagged `"Album (Deluxe)"`) and per-track artist variation (features) make it worse. `core/metadata/album_mbid_cache.py` (new module): - DB-backed `lookup(normalized_album, artist) -> release_mbid` and `record(...)` functions. Same key shape as the in-memory cache. - Strict additive design: every public function is wrapped in try/except and degrades to None / no-op on ANY database error. The existing in-memory cache + MusicBrainz lookup remains the authoritative fallback. If this module breaks, downloads continue exactly as they would today. `database/music_database.py`: - New `mb_album_release_cache` table with composite primary key `(normalized_album_key, artist_key)`. Reverse-lookup index on `release_mbid` for future debug tooling. Created via the existing `CREATE TABLE IF NOT EXISTS` migration pattern — idempotent, no schema version bump needed. `core/metadata/source.py`: - Surgical change inside the existing `embed_source_ids` in-memory-cache-miss branch: BEFORE calling MusicBrainz, consult the persistent cache. If a previous SoulSync run already resolved this album's release MBID, reuse it. After a successful MB lookup, store in BOTH caches. Both calls wrapped in defensive try/except so any failure falls through to existing logic. 
Tests: - `tests/metadata/test_album_mbid_cache.py` — 16 cache tests: round-trip, idempotent re-record, overwrite semantics, clear_all, album+artist independence (no Greatest Hits collisions), defensive None-on-empty-input, graceful degradation when the DB is unavailable / connection raises / commit fails, schema sanity (table + index exist after init). - `tests/test_album_mbid_consistency.py` — 13 detector tests: tag read/write round-trip on real FLAC files, Picard-standard tag descriptors, defensive paths (unreadable file, empty input), detector behavior (agreement → no flags, lone dissenter → flag, ties → no flag, single-track albums → skipped, no-MBID tracks → skipped, unresolvable file paths → skipped). - `tests/metadata/test_metadata_enrichment.py` — added autouse fixture monkeypatching the persistent cache to no-op for tests in this file. The existing tests pin per-call MB counts and in-memory cache state; without the fixture, persistent rows from earlier tests would bypass the MB call. Persistent layer has its own dedicated tests. Verified: 1782 tests pass (29 new), ruff clean, smoke test confirms end-to-end cache round-trip works. WHATS_NEW entry under '2.4.2' dev cycle.	3 weeks ago
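The consensus step of the detector in commit 4b15fe0b75 (most-common MBID via `Counter`, untagged tracks ignored, perfect ties skipped) can be sketched as follows. The function name `find_dissenters` and the input shape (track path mapped to its embedded album MBID) are assumptions; the keys of the returned dicts follow the finding fields named in the commit.

```python
from collections import Counter
from typing import Dict, List, Optional


def find_dissenters(track_mbids: Dict[str, Optional[str]]) -> List[dict]:
    """Flag tracks whose album MBID disagrees with the album's consensus MBID."""
    # Tracks without an album MBID are skipped entirely.
    tagged = {path: mbid for path, mbid in track_mbids.items() if mbid}
    counts = Counter(tagged.values())
    if len(counts) < 2:
        return []  # full agreement, or nothing to compare
    (consensus, top), (_, runner_up) = counts.most_common(2)
    if top == runner_up:
        return []  # tie: no clear consensus, left as a manual decision
    return [
        {
            "path": path,
            "wrong_mbid": mbid,
            "consensus_mbid": consensus,
            "consensus_count": top,
            "total_tracks_with_mbid": len(tagged),
        }
        for path, mbid in tagged.items()
        if mbid != consensus
    ]
```

Comparing only the top two counts is enough to detect a tie at the top, since `most_common` returns entries in descending count order.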
Antti Kettunen	b85a05fb88	Move image URL normalization into metadata helpers - keep existing /api/image-proxy URLs from being wrapped again - reuse the shared metadata package instead of duplicating URL logic in web_server.py - add regression coverage for proxy passthrough and internal URL normalization	3 weeks ago
elmerohueso	f9f47f978e	fix post-download tagging, and enable tagging for hifi	4 weeks ago
Broque Thomas	7e32618f86	Drop old per-service enrichment routes after registry cutover Followup to the enrichment-bubble registry consolidation. The dashboard polling + click handlers all hit /api/enrichment/<service>/{status,pause,resume} now, so the 30 hand-rolled per-service routes in web_server.py have zero callers and can come out: /api/musicbrainz/{status,pause,resume} /api/audiodb/{status,pause,resume} /api/discogs/{status,pause,resume} /api/deezer/{status,pause,resume} /api/spotify-enrichment/{status,pause,resume} /api/itunes-enrichment/{status,pause,resume} /api/lastfm-enrichment/{status,pause,resume} /api/genius-enrichment/{status,pause,resume} /api/tidal-enrichment/{status,pause,resume} /api/qobuz-enrichment/{status,pause,resume} Worker init blocks stay (they still construct the workers + persist pause state). Section comment headers are preserved with a one-line note pointing readers at the new generic blueprint. Test fixtures in tests/conftest.py and tests/metadata/test_enrichment_events.py also updated to use the new URL paths so they reflect production reality. They were synthetic stubs that never depended on the production routes — purely cosmetic alignment. Net: ~510 lines deleted from web_server.py. Full pytest 1541 passed; ruff clean.	4 weeks ago
Antti Kettunen	74e3cc460c	Simplify service status and labels - Flatten the Spotify service-status rendering so it shows rate-limit and recovery states explicitly, while otherwise displaying the active metadata provider directly. - Keep the Spotify auth controls and metadata-source picker aligned with the real session state after authenticate and disconnect flows. - Return "Unmapped" for unknown metadata source labels instead of implying iTunes. - Update the metadata registry tests to cover the new label fallback.	4 weeks ago
Antti Kettunen	55603be14c	Clarify Spotify auth flow and sync UI - Send Spotify auth completion back to the opener so the settings page refreshes immediately - Make the local auth flow go straight through to Spotify instead of showing the temporary instruction page - Keep the remote/docker instruction page available for manual callback setups - Sync Spotify status, connect/disconnect buttons, and metadata source selection after auth and disconnect - Keep the disconnect behavior aligned with the active primary metadata source	4 weeks ago
Antti Kettunen	9646f6ca7f	Clarify Spotify auth actions - Hide the auth button when a Spotify session is active - Treat disconnect as a session change, not a provider swap - Share metadata source labels in the registry - Tighten rate-limit copy around Spotify-specific behavior	4 weeks ago
Antti Kettunen	e6c2bee427	Move profile Spotify cache into registry - let core.metadata.registry own per-profile Spotify client caching - register the DB-backed profile credentials provider from web_server.py - invalidate only the affected profile cache entry on save, delete, and auth	4 weeks ago
Antti Kettunen	50e1ae3a3f	Move metadata helpers into package modules - split metadata lookup logic into core/metadata/* - keep core/metadata_service.py as the legacy barrel - update tests and artist-detail code to patch concrete modules	4 weeks ago
Antti Kettunen	a759f778b6	Move metadata API into package - add package-owned metadata API, cache, registry, and lookup modules - keep legacy metadata_service and metadata_cache paths as explicit shims - update metadata call sites and tests to use package-owned helpers	4 weeks ago
Broque Thomas	c121582557	MusicBrainz genres: fall back to release then artist when recording is empty User report: SoulSync was only pulling MusicBrainz genres from the recording (track-level) endpoint. Most MB recordings don't carry genres at the track level — they live on the release (album) or artist. So the MB tier was contributing nothing to the genre merge for the overwhelming majority of tracks. Fix: - Added `'genres'` to the release-detail `includes` (was missing). - After release-detail processing, if pp['mb_genres'] is still empty, populate from release_detail['genres'] (sorted by count desc). - If still empty AND artist_mbid is set, fetch artist with `includes=['genres']` and use those. No extra API call when the recording (or release) already had genres — the artist fetch only fires when both upstream tiers came back empty. The downstream genre merge in _embed_metadata_genres is unchanged; this just makes the MB feed into it richer. Tests: 4 new (recording present, recording empty → release, recording + release empty → artist, all empty → []). Full suite 873 passing. Ruff clean. Reported by @kcaoyef421 in Discord.	4 weeks ago
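The three-tier genre fallback in commit c121582557 (recording, then release, then artist) can be sketched as below. `resolve_mb_genres` and the injected `fetch_artist` callable are hypothetical names; the sketch assumes each tier exposes a `genres` list of `{"name", "count"}` dicts and, as a simplification, applies the count-descending sort to every tier rather than only the release tier.

```python
from typing import Any, Callable, Dict, List, Optional


def _names_by_count(genres: Optional[List[Dict[str, Any]]]) -> List[str]:
    """Genre names ordered by vote count, highest first."""
    ordered = sorted(genres or [], key=lambda g: g.get("count", 0), reverse=True)
    return [g["name"] for g in ordered]


def resolve_mb_genres(recording: Dict[str, Any],
                      release_detail: Dict[str, Any],
                      artist_mbid: Optional[str],
                      fetch_artist: Callable[[str], Dict[str, Any]]) -> List[str]:
    """Recording genres first, then release, then artist (one extra call at most)."""
    genres = _names_by_count(recording.get("genres"))
    if not genres:
        genres = _names_by_count(release_detail.get("genres"))
    if not genres and artist_mbid:
        # The artist fetch only fires when both upstream tiers came back empty.
        genres = _names_by_count(fetch_artist(artist_mbid).get("genres"))
    return genres
```

Passing the artist lookup in as a callable keeps the extra API request lazy, so albums whose recording or release already carries genres cost nothing more.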
Antti Kettunen	02305096a3	Tighten metadata and import safety - Normalize album import track display handling so queue labels and match rows stay consistent - Bound MusicBrainz caches and avoid caching transient lookup failures - Stop swallowing programmer errors in source enrichment helpers - Restore import config test seams without reintroducing lazy imports - Guard task completion calls and fix the Windows path test expectation - Keep file lock tracking from growing without bound	4 weeks ago
Antti Kettunen	9315e74bea	Broaden import and metadata test coverage - Cover search_result fallback normalization and ambiguous album detection. - Add staging metadata, multi-disc path, and MusicBrainz enrichment cases. - Move the single-track context test next to the imports code it exercises.	4 weeks ago
Antti Kettunen	4c819681a1	Move single-track resolver; fix wishlist cleanup - keep single-track import lookup in imports/resolution.py - normalize simple-download search_result data before wishlist matching - run wishlist cleanup for simple-download post-processing - keep source-only artist detail on resolved names and MB short-circuit	4 weeks ago
Antti Kettunen	9b2b6d856f	Split runtime builders into owning modules - Move the import pipeline runtime factory into core.imports.pipeline - Move the metadata runtime factory into core.metadata.enrichment - Keep the web server wiring thin and drop the shared glue module - Add contract tests that keep the two runtime bundles separate	4 weeks ago
Antti Kettunen	bcab54095e	Group metadata tests under tests/metadata - Move the metadata and MusicBrainz-related tests into a dedicated tests/metadata subfolder. - Keep the rest of the suite flat for now. - Preserve the existing test filenames so the change stays organizational rather than behavioral.	4 weeks ago

48 Commits (dev)