mirror of https://github.com/Nezreka/SoulSync.git
dev
main
fix/quarantine-source-dedup
release/2.5.3
fix/disable-beatport-features
johnbaumb-discover-redesign
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
2.1
2.2
2.3
2.4.0
2.4.1
2.4.2
2.5.0
2.5.1
2.5.2
2.5.3
2.5.4
2.5.5
2.5.6
2.5.7
2.5.9
2.6.0
2.6.1
v0.65
${ noResults }
23 Commits (dev)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
43ed30b4d2 |
fix(musicbrainz): user-facing search recall + album-detail 404
Two bugs surfacing on the Fix popup and enhanced-search MB tab:
1. Strict Lucene phrase queries (`recording:"X" AND artist:"Y"`) killed
recall on user-facing manual search — diacritics ("Bjork" vs canonical
"Björk"), bracketed suffixes like "(Live)", and any AND-clause
mismatch returned zero results. Added `strict: bool = True` param to
`search_release` / `search_recording`; when False, sends a bare query
joining title + artist so MB hits alias/sortname indexes with
diacritic folding. `/api/musicbrainz/search` (Fix popup) and
`core/library/service_search.py` (service tabs) now pass strict=False.
Enrichment workers stay on strict mode — precision matters there
because they auto-accept the top hit above a confidence threshold.
2. Every MB album click was silently 404-ing — `_render_release_as_album`
passed `cover-art-archive` as an MB `inc` param, but it's not a valid
include for the /release resource (MB rejects with 400). The CAA flags
come back on every release response by default, so dropping the bad
include preserves the image-scope picker logic intact.
|
5 days ago |
|
|
aaf312cd34 |
Honor manual library matches across source labels
Manual matches can be created from sync history as mirrored while wishlist and download flows later see the same track as wishlist or a provider source. Add a shared track-level lookup that falls back from exact source/id to source_track_id and title/artist, then use it for wishlist adds, cleanup, and download analysis so mapped tracks are not re-added or redownloaded. Add coverage for mirrored-source matches being honored by wishlist cleanup and download batches, including the internal wishlist force-download path. |
6 days ago |
|
|
0345478361 |
Skip wishlist adds for manual library matches
|
1 week ago |
|
|
42f4aa5eac |
Add manual library track matching
|
1 week ago |
|
|
f3d5ef6528 |
Test missing-track existing file imports
Add service-level coverage for the Enhanced Library I Have This flow: copying an existing source file, writing the target album DB row, preserving source audio, inheriting album identity tags, and migrating older track tables that lack disc_number. |
1 week ago |
|
|
f9ae0e8d58 |
Extract missing-track import service
Move the existing-file missing-track import workflow out of web_server.py and into core/library/missing_track_import.py. Keep the Flask route focused on request wiring and response formatting while the service handles staging copy, post-processing, album identity tag inheritance, DB upsert, and media-server sync. |
1 week ago |
|
|
42a833fcb2 |
Amazon Music: UI badges, enrichment match chips, watchlist linking, metadata cache
- Artist cards, hero section, and enhanced view now show Amazon Music badges
when amazon_id is populated (AMAZON_LOGO_URL constant, orange #FF9900 brand)
- Enhanced view artist and album match status rows include amazon_match_status
chip with click-to-rematch via openManualMatchModal
- getServiceUrl: added amazon (album/track ASIN → music.amazon.com) and fixed
missing discogs entries; serviceLabels adds tidal/qobuz/amazon
- Enhanced view enhanced-artist-id-badges includes amazon_id entry
- DB SELECTs for library artists list and artist detail now return amazon_id;
both response dicts include the field
- watchlist_artists migration adds amazon_artist_id column
- Watchlist config GET: amazon_artist_id in SELECT/WHERE/response (index 18)
- Watchlist artists list response includes amazon_artist_id
- link-provider endpoint: amazon added to valid_providers and col_map
- _populateLinkedProviderSection: amazonId param + Amazon Music source row
- Watchlist card source badges render Amazon pill (watchlist-source-amazon CSS)
- _openSourceSearch labels map includes amazon
- service_search: amazon_worker injected via init(); _search_service amazon branch
uses search_artists/albums/tracks, same {id,name,image,extra} return shape
- _SERVICE_ID_COLUMNS: amazon → amazon_id for artist/album/track
- _init_service_search call passes amazon_worker_obj
- amazon_client._fetch_album_metas: 5-minute TTL cache per ASIN — cached hits
skip _rate_limit() and HTTP call entirely; fixes ~10s artist detail load
- registry.py: removed amazon from METADATA_SOURCE_PRIORITY and
METADATA_SOURCE_LABELS — T2Tunes has no discography API, cannot serve as a
primary metadata source; Amazon remains a download source + ASIN enricher
- Settings metadata source dropdown and help text updated accordingly
|
1 week ago |
|
|
b05ba5d498 |
Reorganize: optional embedded-tag mode (closes #592)
Adds an opt-in alternative metadata source for reorganize. The existing API path (query Spotify / iTunes / Deezer / Discogs / Hydrabase for the canonical tracklist) stays the default and is unchanged. The new tag mode reads each file's embedded tags as the source of truth instead -- useful for well-enriched libraries where API drift can produce inconsistent renames, and avoids API calls entirely. - New pure helper `core/library/reorganize_tag_source.py` adapts the output of `read_embedded_tags` (the same mutagen path the audit- trail modal uses) to the `api_album` / `api_track` shapes that `_build_post_process_context` already consumes. Handles ID3-style "5/12" track + disc shapes, multi-value Artists tags, year normalization across 5 date formats, releasetype canonical tokens, multi-artist string splits across 9 separators. - `plan_album_reorganize` accepts `metadata_source: 'api' | 'tags'` (default 'api') and `resolve_file_path_fn`. Tag mode branches into a new `_plan_from_tags` that reads each track's file and produces per-item `api_album` + `api_track` instead of a shared one. - `_run_post_process_for_track` accepts a per-item `api_album` override so each file's own album metadata flows through post- process (not a single shared dict). - `total_discs` in tag mode honors the `totaldiscs` tag and the trailing `/N` of an ID3 `discnumber = "1/2"`. Partial-album reorganize still routes into the correct `Disc N/` subfolder when the tag knows the total even if not all discs are present locally. - Bare `discnumber = "1"` no longer poisons `total_discs` -- it carries no total signal. - `reorganize_album` surfaces a tag-mode-specific error when no files are readable, instead of the API-mode "run enrichment first" message which would mislead in tag mode. - `QueueItem.metadata_source` field, `enqueue` / `enqueue_many` pass-through, runner injects `item.metadata_source` into `reorganize_album`. - `web_server.py` endpoints accept `mode` body param. Falls back to the `library.reorganize_metadata_source` config setting, then to 'api'. Strict allowlist (api / tags) -- anything else falls back. - Frontend: per-album modal + reorganize-all modal both grow a new "Metadata Mode" dropdown above the source picker. Tag mode hides the source picker (irrelevant). Choice persisted in localStorage. Both preview + execute fetches send `mode` in body. Tests: - 49 boundary tests on the pure helper pin every shape: ID3 "5/12", multi-artist split, year normalization, releasetype validation, total_discs precedence, defensive paths. - 6 planner-level integration tests pin the wiring: tag-mode with good tags, partial-disc with totaldiscs tag, file missing, some-match-some-fail, defensive resolve_file_path_fn=None, API-mode regression guard. - All 3171 tests pass; 52 existing reorganize tests unchanged. |
1 week ago |
|
|
2f284efa57 |
Retag now re-embeds LYRICS tag instead of leaving it empty
Discord report (netti93). The download flow runs `enhance_file_metadata` (clears all tags) then `generate_lrc_file` (writes .lrc sidecar AND embeds USLT). The retag flow only ran the first half — `enhance_file_metadata` cleared USLT and there was no follow-up to restore it. Two coordinated fixes (no new setting per kettui scope discipline — user described it as "might even be an idea," consistency was the load-bearing ask). Fix 1 — retag calls generate_lrc_file after enhance `core/library/retag.py:execute_retag` now invokes `deps.generate_lrc_file` right after the `enhance_file_metadata` call, mirroring the download pipeline. New `generate_lrc_file` field on `RetagDeps`, defaults to None for backward compat with any test caller that builds RetagDeps without it. Web_server's `_build_retag_deps()` factory wires in the real `core.metadata.lyrics.generate_lrc_file`. Placement matters — runs BEFORE `safe_move_file` so the helper sees the audio file at its current path with its existing sidecar (which retag hasn't moved yet). After the embed, the audio file gets moved with USLT now present; the sidecar move step that follows is unaffected. Fix 2 — create_lrc_file re-embeds from existing sidecar `core/lyrics_client.py:create_lrc_file` used to early-return True when an .lrc / .txt sidecar already existed (skipping the LRClib fetch). For the retag case the sidecar is already there, so the shortcut hit and USLT was never re-written. Now the helper reads the existing sidecar and calls `_embed_lyrics` with its content before returning. Empty / unreadable sidecars short-circuit silently — defensive, no crash. Download flow unaffected because no sidecar exists at fetch time. 7 boundary tests pin: existing .lrc triggers re-embed, existing .txt triggers re-embed, empty sidecar skips embed, unreadable sidecar swallows error, no sidecar falls through to LRClib (download path regression guard), RetagDeps.generate_lrc_file field accepted, field optional for backward compat. Full suite: 3120 passed. |
1 week ago |
|
|
89246a7304 |
Write artist.jpg to artist folder so Navidrome shows real photos
Closes #572 (rhwc). Navidrome has no API for setting an artist image — it reads `artist.jpg` (or `folder.jpg`) from the artist folder during library scans. SoulSync's `update_artist_poster` for Navidrome was a no-op, so users only ever saw album-art-derived thumbnails as the artist photo. - new "Write Artist Image" button on artist detail page - POST /api/artist/<id>/write-image-to-disk derives the artist folder from any track's resolved file_path (reuses _resolve_library_file_path so docker mount translation + library.music_paths probes from #558 apply), fetches the photo from the configured metadata source priority chain, downloads with content-type validation, writes atomically via `<filename>.tmp + os.replace` - when active server is Navidrome, triggers a library scan immediately so the file is picked up - respects existing artist.jpg (frontend prompts before overwriting) so user-supplied photos aren't clobbered - works for plex / jellyfin too as a fallback layer — both servers also read artist.jpg from disk 26 tests pin the pure helpers in core/library/artist_image.py: folder derivation (trailing sep / empty / non-string), URL picking (missing attr / whitespace / non-string), download (non-image content-type / 404 / timeout / empty body), atomic write (replace / temp-cleanup-on-failure / overwrite guard / missing folder). |
2 weeks ago |
|
|
6ce185491d |
Add per-download Audit Trail modal to Library History
- new "Audit" button on each download row in the library history modal opens a second modal visualizing the download lifecycle as an interactive horizontal stepper (request → source → match → verify → process → place) with click-to-expand detail cards - hero header with album art + track title + meta line + status pills (source / quality / acoustid result) - three tabs: Lifecycle / Tags / Lyrics - Tags tab reads the audio file live via mutagen at audit-open time via new GET /api/library/history/<id>/file-tags endpoint; file is the single source of truth so background enrichment writes (audiodb / lastfm / genius / replaygain / lyrics fetch) show up too. flat key/value rows stacked vertically (label-above- value) so long MBIDs / URLs / joined genre lists wrap cleanly. source IDs grouped per-service into 2-col sub-card grid. - Lyrics tab renders the full transcript with dimmed timecodes. - post-processing step infers observable changes from source-vs- final state (format conversion, file rename via tag template, folder template). - "Download History" button also added to the Downloads page batch panel header so it's reachable outside the dashboard. - mobile responsive: tabs + stepper scroll horizontally, modal goes full-screen, hero stacks below 480px. 19 helper tests pin the mutagen reader: id3 (TIT2/TPE1/TALB + TXXX + USLT + APIC), vorbis (FLAC dict + _id/_url passthrough), file metadata (format / bitrate / duration), defensive paths (empty / missing file / mutagen returns None / mutagen raises), stringify edge cases (list / tuple / int / frame-with-text / whitespace). |
2 weeks ago |
|
|
56ae10693b |
Album Completeness: surface diagnostic when resolver can't find album folder
GitHub issue #558: clicking Auto-Fill / Fix Selected on the Album Completeness findings page returned a flat "Could not determine album folder from existing tracks" error with no diagnostic. Reporter is on Navidrome on Docker — the path resolver in `core/library/path_resolver.py` couldn't find any of the album's tracks on disk because Navidrome's Subsonic API doesn't expose filesystem library paths the way Plex's API does (probed in #476). Default settings → `library.music_paths` empty → no base directories to probe → silent None. User had no signal about what to configure. Not a regression of #476 — that fix targeted Plex auto-discovery and worked correctly for it. Navidrome was never covered because the protocol gives the resolver nothing to probe. Fix scoped to the diagnostic surface, not auto-magic discovery: - Added `resolve_library_file_path_with_diagnostic` returning `(resolved, ResolveAttempt)`. ResolveAttempt records what the resolver tried — `raw_path_existed`, `base_dirs_tried`, `had_config_manager`, `had_plex_client`. Pure data, no rendering opinions. - Legacy `resolve_library_file_path` becomes a thin wrapper that drops the attempt; every existing call site is unchanged. - `RepairWorker._fix_incomplete_album` now uses the diagnostic helper and renders a multi-part error via `_build_unresolvable_album_folder_error`: names the active media server, shows one sample DB-recorded path, lists every base directory the resolver actually probed, and points the user at Settings → Library → Music Paths as the actionable fix. - Distinguishes empty-base-dirs vs tried-and-failed cases so the user knows whether to add a mount or fix the existing one. - No auto-probing of common Docker conventions (`/music`, `/media`, etc). Speculative — could resolve to wrong dirs on the suffix-walk if a conventional path happens to contain a partial collision. User stays in control. 12 new tests: - 7 in `tests/library/test_path_resolver.py`: tuple-shape contract, raw-path-existed short-circuit, base-dirs listed even on walk failure, had-flags reflect caller inputs, no-base-dirs returns None with empty attempt, legacy `resolve_library_file_path` delegates correctly across happy / suffix-walk / failure paths. - 8 in `tests/test_repair_worker_unresolvable_folder_error.py`: active server name in error, sample DB path verbatim, base dirs listed, empty-base-dirs phrased differently, Settings hint always present, defensive against None attempt / missing sample / missing config_manager. Full pytest sweep: 2774 passed. |
2 weeks ago |
|
|
9602d1827c |
Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites
Catches the silent excepts the awk-based earlier sweeps missed:
- Bare `except:` followed by `pass` (also swallows KeyboardInterrupt
and SystemExit — actively wrong). Upgraded to `except Exception as
e: logger.debug("...: %s", e)`. ~14 sites across connection_detect,
soulseek_client, listenbrainz_manager, watchlist_scanner,
youtube_client, navidrome_client, jellyfin_client, web_server.
- `except Exception:` + pass that the awk pattern missed (e.g.
multi-line or unusual whitespace). ~31 sites across automation_engine,
database_update_worker, music_database, spotify_client, web_server,
others.
- 14 legitimate cleanup sites left silent with explicit `# noqa: S110`
+ comment explaining why (atexit handlers, finally-block conn.close
calls). Logging during shutdown can itself crash because file handles
get torn down before the handler fires.
Also enables `S110` rule in pyproject.toml so this pattern fails CI
going forward — drift fails at PR review instead of at runtime against
a wedged worker thread. Tests path keeps S110 ignored (test fixtures
legitimately use try-except-pass for cleanup).
Adds a WHATS_NEW entry to helper.js summarizing the full #369 sweep.
Verified: `python -m ruff check .` → All checks passed.
Verified: `python -m pytest tests/` → 2188 passed.
Closes #369
|
2 weeks ago |
|
|
aa54bed818 |
Surface silent exceptions across remaining modules — ~70 sites
Final sweep. Covers: - Downloads: candidates / lifecycle / master / monitor / wishlist_failed - Metadata: source / registry / cache / common / artwork (+ plex_client) - Imports: pipeline / resolution / file_ops / paths / guards - Library: path_resolver / retag / duplicate_cleaner - Stats / playlists / wishlist / discovery / automation / enrichment - Misc: hydrabase_client, soulsync_client, tag_writer, debug_info, api_call_tracker, album_consistency, beatport_unified_scraper, reorganize_runner, seasonal_discovery, lidarr_download_client, services/sync_service.py, automation_engine, automation/progress Two `_e` renames in imports/file_ops.py (outer scope binding `e`). A few finally-block sites in metadata/album_mbid_cache.py, library/track_identity.py, listening_stats_worker.py, watchlist/ auto_scan.py left silent — same reason as the rest of the sweep (logger calls during cleanup paths can themselves raise). Refs #369 |
2 weeks ago |
|
|
d17365296a |
Lift shared download dataclasses + boot via singleton factory
Two architectural cleanups on top of the download engine refactor. (1) Shared dataclasses move to neutral plugin package. TrackResult, AlbumResult, DownloadStatus, SearchResult lived in core/soulseek_client.py for historical reasons — every other plugin imported them from the soulseek module just to satisfy the contract, coupling 8 clients to a sibling source for type imports only. Moved them to the new core/download_plugins/types.py module and updated all 14 import sites across the deezer/hifi/lidarr/qobuz/soundcloud/tidal/ youtube clients, the engine, matching engine, redownload helper, and tests. Clean break, no backward-compat re-export. (2) web_server.py boots the orchestrator via the singleton factory. After construction it now calls set_download_orchestrator(...) so get_download_orchestrator() returns the same instance the global handle points at instead of lazily building a separate orchestrator. Matches the get_metadata_engine() pattern. |
3 weeks ago |
|
|
d8437c87c6 |
Fix Album Completeness Auto-Fill on Docker / shared-library setups (#476)
GitHub issue #476 (gabistek, Docker on Arch host): "Auto-Fill" / "Fix Selected" on the Album Completeness findings page returned "Could not determine album folder from existing tracks" for every album. Reproduces on any setup where the media-server library lives outside the SoulSync transfer/download folders — Docker is the headline case but native installs that point Plex at a NAS via SMB hit it too. Root cause: `core/repair_worker.py:_resolve_file_path` only probed the transfer + download folders. Docker users have their Plex/Jellyfin library bind-mounted at /music (or similar) — neither configured in SoulSync. Every existing track got silently treated as missing, so `album_folder` stayed None and the fix workflow bailed. The same incomplete logic was duplicated four more times in the repair_jobs/ modules, all with the same bug. Album Completeness was just the most user-visible — the same setups were also producing false "missing file" findings from Dead File Cleaner, silent skips in MBID Mismatch Detector, etc. The web server already had the correct logic at `web_server.py:_resolve_library_file_path` (probes transfer + download + Plex-reported library locations + user-configured library.music_paths). The repair workers had never been updated to match. Fix: - New `core/library/path_resolver.py` extracts the union logic into a single shared function `resolve_library_file_path()`. Probes (in order, deduped): explicit transfer/download kwargs, config-derived soulseek.transfer_path/download_path, Plex-reported library locations (when a plex_client is passed), user-configured library.music_paths. Each defensive: malformed config or a flaky Plex client degrades to the dirs that did succeed. - `core/repair_worker.py:_resolve_file_path` becomes a delegating wrapper preserving the legacy signature, with a new `config_manager` kwarg. All 15 in-tree call sites updated to thread `self._config_manager` through. - `core/repair_jobs/dead_file_cleaner.py`, `mbid_mismatch_detector.py`, and `lossy_converter.py` get the same treatment: duplicate function replaced with a thin wrapper, call sites pass `context.config_manager`. - `core/repair_jobs/acoustid_scanner.py` and `unknown_artist_fixer.py` (which used to import from repair_worker) now call the shared resolver directly with `context.config_manager`. Side benefit: every other repair job (Dead File Cleaner, MBID Mismatch Detector, Lossy Converter, AcoustID Scanner, Unknown Artist Fixer) also stops missing files in the media-server library mount. Single fix unblocks five user-visible features. Tests: `tests/library/test_path_resolver.py` — 20 cases covering all four base-dir sources, suffix-walk algorithm, dedup, defensive paths (None plex client, malformed config entries, raising config_manager.get, broken plex attribute access), Docker path translation. Full suite 1677 passed locally. WHATS_NEW entry under '2.4.2' dev cycle. |
3 weeks ago |
|
|
24c2d75c6d |
Make extract_external_ids recognize all source-tagging conventions
Smoke-testing the just-merged provenance PR against live logs revealed the new ID-match block was silently no-opping: no [ExtID Match] / [Provenance Match] log lines despite the code path being live. Tracing revealed two related gaps in extract_external_ids' source detection: 1. **Underscore-prefixed key.** Deezer / Discogs / Hydrabase clients tag normalized track dicts with ``_source`` (underscore prefix — convention used in 8+ places across core/). The extractor only looked for ``provider`` and ``source``, so Deezer-sourced tracks silently returned no IDs. 2. **No provider field at all.** Spotify and iTunes raw API responses carry ``id`` but no provider/source key of any kind. The extractor couldn't disambiguate the native ``id``, so Spotify-primary scans would have hit the same silent miss once the user switched primary sources. Two-part fix: - ``extract_external_ids`` now recognizes ``_source`` as another candidate provider field. - New optional ``source_hint`` parameter lets the caller supply the configured primary source as a fallback when the track dict has no provider field of its own. Track-side provider field still wins when present (defensive against a wrong hint). Watchlist scanner now passes ``get_primary_source()`` as the hint so both naming conventions (Deezer-style _source, Spotify-style no-tag) get handled uniformly. 6 new regression tests cover: - _source recognized for Deezer - _source recognized for Hydrabase (cross-provider mapping) - _source recognized for Discogs (no library column — verifies graceful no-crash) - source_hint disambiguates raw tracks for spotify/itunes/deezer - track-side provider takes precedence over hint - None hint defaults safely Full pytest 1630 passed; ruff clean. After this lands and the server restarts, watchlist scans should produce [ExtID Match] / [Provenance Match] log lines for tracks already on disk regardless of which metadata source the user has configured as primary. |
3 weeks ago |
|
|
34ba26f5c8 |
Persist source IDs at download time + backfill onto tracks on sync
Followup to fix/watchlist-external-id-match. The companion PR closed the demand side — the watchlist scanner asks for tracks by external IDs before falling back to fuzzy. But for users on Plex / Jellyfin / Navidrome the supply side was still broken: tracks.spotify_track_id (and the other ID columns) only got populated by the asynchronous enrichment workers, sometimes hours after the file was actually written. During that window the ID match fell through to fuzzy and the bug returned. We were already collecting every ID during post-processing — they live in the `pp` dict in core/metadata/source.py:embed_source_ids and get embedded into file tags. We just dropped the in-memory copy afterwards. This PR persists them and uses them: - Schema migration adds spotify_track_id / itunes_track_id / deezer_track_id / tidal_track_id / qobuz_track_id / musicbrainz_recording_id / audiodb_id / soul_id / isrc columns + indexes to the existing track_downloads table (already keyed by file_path). - core/metadata/source.py:embed_source_ids exposes pp["id_tags"] and the resolved ISRC back to the import context as _embedded_id_tags / _isrc. - core/imports/side_effects.py:record_download_provenance reads those context fields and passes them to db.record_track_download, which now accepts the new ID kwargs and persists them. - New db.get_provenance_by_file_path with exact + basename-suffix fallback (handles container mount-root differences between download-time path and media-server-reported path). - New db.backfill_track_external_ids_from_provenance copies IDs from track_downloads onto a tracks row idempotently — COALESCE on every column preserves any value the enrichment worker already wrote (enrichment is more authoritative for late binding). - database/music_database.py:insert_or_update_media_track (the single insertion point used by every Plex / Jellyfin / Navidrome sync) calls the backfill immediately after each INSERT/UPDATE. - New core/library/track_identity.py:find_provenance_by_external_id used as a second-tier fallback in watchlist_scanner.is_track_missing _from_library — catches the window between download and media-server sync. Caller checks os.path.exists on the provenance file_path before treating it as "already in library" so a deleted file doesn't prevent re-download. Effect: freshly downloaded files become ID-recognizable to the watchlist on the very next scan, no enrichment-wait window. 19 regression tests in tests/test_provenance_id_persistence.py: - Schema migration adds expected columns + indexes - record_track_download persists every ID kwarg - record_track_download backward-compat (old kwargs still work) - get_provenance_by_file_path: exact match, basename fallback for mount-root differences, multi-record latest-wins, defensive None - backfill: copies all IDs, preserves existing via COALESCE, no-op when no provenance exists - find_provenance_by_external_id: per-ID lookup, ISRC cross-bridge, OR semantics, latest-wins on multiple matches Out of scope: backfilling provenance for files downloaded BEFORE this PR (their track_downloads rows don't carry the new IDs). Those continue to wait for enrichment. Acceptable — only affects historical files; new downloads benefit immediately. Full pytest 1625 passed; ruff clean. |
3 weeks ago |
|
|
ecb8939c80 |
Match library tracks by external IDs before fuzzy in watchlist scan
Reported case (CAL): a track already on disk got re-downloaded by the
watchlist scanner on every scan. Library DB had stale album metadata
for the file (track tagged on album "Left Alone") while the metadata
source reported it on a different album ("NPC" single). The
title+artist+album fuzzy block correctly said the album names didn't
match and declared the track missing — but the file's stable external
IDs (Spotify ID, ISRC, etc.) unambiguously identified it as the same
recording.
The earlier compilation-album fix (PR #461) handled qualifier drift
("OST" vs "Music From The Motion Picture"). This case is two
genuinely different album names referring to the same song.
Fix: provider-neutral external-ID short-circuit before the fuzzy
block in `is_track_missing_from_library`. Pulls every recognized ID
off the source track (Spotify / iTunes / Deezer / Tidal / Qobuz /
MusicBrainz / AudioDB / Hydrabase / ISRC), runs a single SELECT
against the indexed external-ID columns on the `tracks` table, and
treats any hit as "track exists in library — don't re-download".
If no IDs are available (older imports without enrichment, library
scans that didn't populate external IDs), falls through to the
existing fuzzy logic so the safety net stays intact.
New `core/library/track_identity.py` module with two helpers:
- `extract_external_ids(track)`: handles dict and object-style track
shapes, direct-field aliases (spotify_id / spotify_track_id /
SPOTIFY_TRACK_ID), and provider-disambiguated native `id` fields
(when track has `provider='deezer'` and `id='X'`, treats X as a
Deezer ID).
- `find_library_track_by_external_id(db, external_ids,
server_source)`: builds an OR of indexed column matches with
IS NOT NULL guards, optional server_source filter that also
passes legacy NULL rows, single-row LIMIT.
ISRC bridges across providers — a library track imported via Deezer
can be matched against a Spotify scan when both sides carry the
same ISRC.
43 regression tests in `tests/test_library_track_identity.py`:
- 9 ID-extraction tests for direct fields (Spotify / iTunes / Deezer /
ISRC / MBID / AudioDB / Hydrabase)
- 8 ID-extraction tests via the provider field (8 providers + source
alias + missing-provider-ignored)
- 7 mixed/defensive tests (multiple IDs, object-style, empty strings,
None track, numeric coercion)
- 8 lookup tests (per-provider + ISRC cross-bridge)
- 3 OR-semantics tests
- 4 server_source filter tests
- 2 ID-column-map sanity tests
Full pytest 1606 passed; ruff clean.
|
3 weeks ago |
|
|
b395e33820 |
Lift redownload_start to core/library/redownload.py
Body byte-identical to the original. Spotify proxy via registry, iTunes/Deezer client shims wrap registry helpers, _resolve_library_file_path, _attempt_download_with_candidates, and missing_download_executor are injected via init() right after _init_wishlist_failed where all three deps are already defined. web_server.py: 35239 → 35063 (-176 lines). |
3 weeks ago |
|
|
8299dc211e |
Lift _run_duplicate_cleaner to core/library/duplicate_cleaner.py
Body byte-identical to the original. The shared state dict, lock, docker_resolve_path helper, and automation engine are injected via init() at the lift point, where all four originals are already defined. web_server.py: 37015 → 36833 (-182 lines). |
4 weeks ago |
|
|
dae7f21265 |
Lift _search_service to core/library/service_search.py
Lifts _search_service and its _detect_provider helper. Both bodies are byte-identical to the originals. The nine enrichment worker handles (spotify/itunes/mb/lastfm/genius/tidal/qobuz/discogs/audiodb) are injected via init() right after qobuz is constructed, which is the last worker to come up — and well before Flask starts accepting requests, so the route handlers never see unbound workers. web_server.py: 37245 → 37015 (-230 lines). |
4 weeks ago |
|
|
3a6597561a |
Lift _execute_retag to core/library/retag.py
Pulls the 258-line retag worker out of `web_server.py` into a new
`core/library/` package. Pure 1:1 lift — wrapper keeps the original
entry-point name so the retag-trigger endpoint continues to work
without changes.
What `execute_retag` does:
1. Fetch album + track metadata for the new `album_id` (Spotify or
iTunes — the Spotify client transparently falls back).
2. Load existing files in the retag group from the DB.
3. Match each existing track to a new Spotify track:
- Priority 1: same disc + track number.
- Priority 2: title similarity >= 0.6 (SequenceMatcher).
4. For each matched pair:
- Re-write metadata tags via `_enhance_file_metadata`.
- Compute the new path via `_build_final_path_for_track` and move
the audio file (plus .lrc / .txt sidecars) if the path changes.
- Drop an orphaned cover.jpg if it's left in an empty directory.
- Clean up empty parent directories left behind.
- Download the new cover art into the new album dir.
5. Update the retag group record with new artist / album / image /
total_tracks / release_date and the appropriate Spotify-or-iTunes
album ID (numeric → iTunes, alphanumeric → Spotify).
6. Mark the retag state 'finished' (or 'error' on exception).
Strict 1:1 byte parity:
The original mutated `retag_state` as a module global (the function
declared `global retag_state` even though it only mutates in place).
Here `retag_state` is exposed through the `RetagDeps` proxy as a Python
property so the lifted body keeps `name[key] = value` /
`name.update(...)` syntax. The property setter rebinds the
web_server.py reference if the function ever reassigns it (currently
it doesn't, but the setter is wired for parity with the watchlist lift).
Diff vs original after `deps.X` → global X normalization is **zero
differences** apart from the dropped `global retag_state` decl and the
inline `from database.music_database import get_database` (replaced by
deps.get_database()). 258 lines orig = 258 lines lifted, byte-identical
body otherwise.
Dependencies injected via `RetagDeps` (13 fields) — config_manager,
retag_lock, spotify_client, plus 8 callable helpers
(get_audio_quality_string, enhance_file_metadata,
build_final_path_for_track, safe_move_file, cleanup_empty_directories,
download_cover_art, docker_resolve_path, get_database) and 2 property
delegates (_get_retag_state / _set_retag_state).
Tests: 11 new under tests/library/test_retag.py covering setup error
paths (no album data, no album tracks, no existing tracks),
track-number priority match, title-similarity fallback, no-match skip,
missing file skip, file move when path changes, group record update
(spotify vs iTunes ID branching by alphanumeric vs numeric album_id),
multi-disc total_discs computation.
Full suite: 1330 passing (was 1319). Ruff clean.
|
4 weeks ago |