SoulSync

Commit Graph

Author	SHA1	Message	Date
Broque Thomas	e95452b465	Surface silent exceptions in workers + repair jobs — ~30 sites Across all background workers (Spotify/Tidal/Deezer/Qobuz/iTunes/Discogs/Genius/AudioDB/MusicBrainz/Last.fm/SoulID + the metadata-update worker) and the repair-job scanners. All converted to `logger.debug("...: %s", e)`. Two `_e` renames in genius_worker and soulid_worker where outer scope was already binding `e`. Two finally-block sites in repair_jobs/library_reorganize.py left silent (conn.close on shutdown path). Refs #369	1 week ago
Broque Thomas	cceffbd8ec	Honor manually-matched source IDs in per-source enrichment workers GitHub issue #501 (@Tacobell444). After manually matching an album to a specific source ID via the match-chip UI, clicking "Enrich" on that album would fuzzy-search by name and overwrite the manual match with whatever the search returned — or revert the match status to ``not_found`` if name search missed. Reorganize then read the now-wrong ID and moved files to the wrong destination. Root cause was in the per-source enrichment workers' ``_process_*_individual`` methods. Several workers (Spotify, iTunes) ran search-by-name unconditionally with no check for an existing stored ID. Others (Deezer, Tidal, Qobuz) skipped on existing-ID but without refreshing metadata — preserved the ID but didn't actually honor the user's intent of "use this match to pull fresh data". Cin-shape lift: same fix needed in 5 workers, so extracted the shared behavior into ``core/enrichment/manual_match_honoring.py``: honor_stored_match(db, entity_table, entity_id, id_column, client_fetch_fn, on_match_fn, log_prefix) -> bool Per-worker variability (DB column name, client fetch method, response shape) plugs in via callbacks. Workers call the helper at the top of ``_process_album_individual`` / ``_process_track_individual``; if it returns True, the manual match was honored and the search-by-name fallback is skipped. If False (no stored ID, fetch failed, or empty response), the worker's existing search-by-name flow runs as before. Workers wired: - spotify_worker — album + track (was overwriting; now honors) - itunes_worker — album + track (was overwriting; now honors) - deezer_worker — album + track (was skip-on-id; now refreshes) - tidal_worker — album + track (was skip-on-id; now refreshes) - qobuz_worker — album + track (was skip-on-id; now refreshes) Workers left alone (already correct): - discogs_worker — already had inline stored-ID fast path that refreshes metadata. Same behavior, just inline; refactoring to use the shared helper would be churn for zero behavior change. - audiodb_worker — same — inline fast path with full metadata refresh. - musicbrainz_worker — preserves existing MBID and marks status, which is the correct behavior for MB (the MBID itself is the match payload — no separate metadata fetch). - lastfm_worker / genius_worker — name-based services with no source-specific IDs to honor. Inherent re-search per call. Reorganize fixed indirectly — it always honored stored IDs correctly via ``library_reorganize._extract_source_ids``. The "Reorganize broken" symptom was downstream of broken Enrich corrupting the stored ID. Tests: - ``tests/enrichment/test_manual_match_honoring.py`` — 11 tests pinning the shared helper contract: stored-ID fast path, no-ID fallthrough, empty-string treated as no ID, missing row, fetch exception caught and falls through, fetch returns None falls through, callback exceptions propagate, configurable table + column, defensive table-name whitelist. - Per-worker wiring NOT tested individually — the workers depend on live DB / client objects that are heavy to mock. The shared helper's contract is pinned; per-worker call sites are short enough to verify by code review. 2173/2173 full suite green. Closes #501.	2 weeks ago
Broque Thomas	1d5f1e2047	fix: pause Spotify worker on non-Spotify primary + cut daily budget to 500 The Spotify enrichment worker was auto-starting unconditionally at boot, hammering /v1/search to match every track in the library against the Spotify catalog regardless of which metadata source the user had actually chosen as their primary. Users on Deezer, iTunes, Discogs, or Hydrabase saw multi-hour 429 bans (typically 14400s) on Spotify even though they never wanted Spotify-driven enrichment in the first place — the worker generated dead API traffic the user neither asked for nor benefited from. Compounded by Spotify's February 2026 API tightening: - /v1/search max limit cut from 50 to 10 per request, default from 20 to 5 — every track now needs more pagination, more requests. - Sustained-rate detection more aggressive — repeated calls over hours trigger automated long-form bans even when each individual 30-second window is well under the rolling limit. Result: a user on Deezer would see their Spotify connection get banned for 4 hours after about 30 tracks of enrichment activity, with no recourse other than manually pausing the worker each session. Two-part fix: 1. Boot gate (web_server.py): only auto-start the worker when `get_primary_source() == 'spotify'`. Otherwise initialize in the paused state with an explanatory log line. The settings UI manual unpause control remains functional for users who explicitly want background Spotify enrichment regardless of primary source. Boot logic: - User manually paused (existing config) → stays paused (preserved). - Primary = 'spotify' → starts running (preserved). - Primary != 'spotify' → starts paused with log line. 2. Daily budget reduction (core/spotify_worker.py): drop from 3000 to 500 items per calendar day. The 3000 cap was set when /v1/search returned 50 results per call; now that it caps at 10, each track needs roughly 5x the API load to find a confident match. 500/day keeps the worker productive without crossing Spotify's hidden sustained-rate detection threshold. The runtime side of the boot gate — auto-pausing when the user switches primary source mid-session — is out of scope. The settings UI already exposes the manual toggle, and primary-source switches are infrequent enough that requiring a manual unpause after the fact is acceptable. Full suite: 1355 passing. Ruff clean.	3 weeks ago
Broque Thomas	a60546929e	Fix Album Completeness job reporting zero findings for every album Reported by sassmastawillis: the Album Completeness maintenance job scans 3127 albums in 0.1 seconds and reports 0 findings — for every user, regardless of whether their library is actually complete. Restoring an older DB surfaced 7 correct findings, so the code logic works; the DB state is what's making everything look complete. Root cause: `albums.track_count` is only ever written by server-sync paths — Plex's `leafCount`/`childCount` and SoulSync standalone's `len(tracks)`. It's the OBSERVED count of tracks SoulSync has indexed, which is always exactly what `COUNT(tracks)` returns for that album. The completeness job treated it as the EXPECTED total and compared it against the observed count. They're equal by construction, so `actual >= expected` is always true: skip, 0.1s scan, 0 findings. Fix: new `api_track_count INTEGER` column on `albums`, written only by metadata-source code paths. Populated in two places so the scan is fast and the fallback is robust. 1. Enrichment workers — shared helper `set_album_api_track_count` in `core/worker_utils.py`. Called by each worker's existing `_update_album` method alongside its other album-column UPDATEs: - spotify_worker: `album_obj.total_tracks` from the Spotify Album dataclass (already in hand, zero new API calls) - itunes_worker: same, from the iTunes Album dataclass - deezer_worker: `nb_tracks` from full_data, falling back to search_data when the full lookup didn't run - discogs_worker: count of tracklist rows where `type_=='track'` (Discogs tracklists interleave heading and index rows that shouldn't count as songs) Helper skips the write on zero/None/negative/non-numeric inputs so a source lacking track info can't clobber a good value a different source already wrote. Caller owns the transaction — helper just queues an UPDATE on the caller's cursor without committing, so it batches cleanly with each worker's existing multi-UPDATE pattern. Hydrabase worker deliberately not touched — it's a P2P mirror that doesn't write album metadata to the local DB. Hydrabase-primary users hit the fallback path below. 2. Album Completeness repair job — new `al.api_track_count` column in the SELECT, read first in the scan loop. On miss (album never enriched, or enrichment workers haven't run yet on a fresh install), falls through to the existing `_get_expected_total()` API lookup and persists the result via the same shared helper (wrapped in connection/commit management since the repair job runs outside a worker's batched transaction). Also removed `al.track_count` from the scan's SELECT — now unused since the observed count was the whole source of this bug, and leaving a dead SELECT would invite a future engineer to re-introduce the same comparison. Help text on the job card was reworded so it honestly describes current behavior ("counts cached during normal enrichment are used when available; otherwise the job queries a metadata source directly") rather than the old "active provider first, then others as fallback" phrasing, which doesn't match how the cache actually fills — any enrichment worker that runs can populate it, and the last writer wins. Document-only follow-up if this edge case ever bites in practice: add a `api_track_count_source` column so the scan can prefer the configured primary source's count over others (e.g. deluxe vs. standard edition mismatches). Not worth the complexity today. For existing users, the first completeness scan after upgrade is fast to the extent their library is already enriched: the workers already ran and populated `api_track_count` on their normal schedule. For brand-new installs, the scan's fallback path handles the cold start — slower, but correct, and subsequent scans are fast. Does NOT affect: - Download / post-processing / wishlist / sync code paths — none of them read `track_count` for completeness semantics. - Plex / Jellyfin / Navidrome / standalone sync — still write `track_count` exactly as before; `api_track_count` is a separate column they never touch. - Other repair jobs. - Any UI path — same finding schema, just correct counts now. Files: - database/music_database.py — idempotent migration adding `api_track_count INTEGER DEFAULT NULL` to the existing album-column check block. - core/worker_utils.py — new `set_album_api_track_count` helper with the documented skip-on-bad-input contract. - core/spotify_worker.py, itunes_worker.py, deezer_worker.py, discogs_worker.py — one-liner call from each `_update_album`. - core/repair_jobs/album_completeness.py — scan uses the cache; fallback path persists API-lookup results via the shared helper; help text updated to match actual behavior. - tests/test_worker_utils_album_track_count.py — 9 tests covering the helper's write/skip contract + no-commit invariant. - tests/test_album_completeness_job.py — 2 tests for the repair job's fallback-path wrapper. - webui/static/helper.js — WHATS_NEW entry. Credit: sassmastawillis spotted the bug; the "restored older DB finds 7 albums" signal pinpointed DB state over code logic and made the diagnosis tractable.	3 weeks ago
Broque Thomas	288776a7f3	Add genre whitelist for filtering junk tags during enrichment New core/genre_filter.py with ~180 curated default genres. When strict mode is enabled in Settings → Library Preferences → Genre Whitelist, only whitelisted genres pass through during enrichment. Junk tags from Last.fm (artist names, radio shows, playlist names) are silently dropped. Applied at all 10 genre write points: Spotify, Last.fm, AudioDB, Deezer, Discogs, iTunes, Qobuz enrichment workers + post-processing genre merge + initial download artist/album creation. Strict mode is OFF by default — zero behavior change for existing users. First enable auto-populates the whitelist with defaults. Users can add, remove, search, and reset genres via the Settings UI.	4 weeks ago
Broque Thomas	9d77c403cc	Fix Spotify enrichment worker infinite loop on pre-matched artists Artists with an existing spotify_artist_id but NULL spotify_match_status were fetched by the priority queue every ~3 seconds. _process_artist returned early (preserving the ID) without marking the status, so the same artist was re-queued indefinitely — burning CPU and inflating API call counters. Now marks the artist as 'matched' on the early-return path.	1 month ago
Antti Kettunen	aec3047216	Improve graceful shutdown and rollback safety - Add interruptible stop events to background workers so shutdown wakes out of long sleeps instead of waiting on fixed delays. - Stop scan managers, repair worker, executors, and cleanup helpers deterministically so process exit does not leave background threads alive. - Add startup warnings for stale SQLite WAL/SHM sidecars so unclean shutdowns are easier to spot before init/migration errors cascade. - Prevent forced kills from leaving SQLite sidecars behind, which made rollbacks to older branches fail with malformed database errors.	1 month ago
Broque Thomas	e674a79c88	Persist API call history, record rate limit events, fix Spotify re-auth issues API Call Tracker: - Save/load 24h minute-bucketed history + events to database/api_call_history.json - Persists across server restarts via atexit + signal handler hooks - New record_event() for rate limit bans (called from _set_global_rate_limit) - New get_debug_summary() for Copy Debug Info — 24h totals, peak cpm with timestamp, per-endpoint breakdown, and last 20 rate limit events - Fixed race condition: events iteration now inside lock during save Spotify Rate Limit Mitigation: - Enrichment worker: max_pages=5 on get_artist_albums (was unlimited — artist with 217 albums caused 22 paginated API calls, now capped at 5) - Enrichment worker: inter_item_sleep raised from 0.5s to 1.5s Spotify Re-Auth Fix: - Both OAuth callbacks (port 8008 + 8888) now clear rate limit ban AND post-ban cooldown after successful re-auth — Spotify usable immediately instead of stuck on Deezer fallback for 5 minutes - Auth cache invalidated on both global client and enrichment worker client	1 month ago
Broque Thomas	1a0fd8b95e	Apply manual match protection to all enrichment workers (#226) The original #221 fix only covered Genius and AudioDB. All other workers (Spotify, iTunes, Last.fm, MusicBrainz, Deezer, Tidal, Qobuz) had the same bug: enrichment overwrites manual match status to not_found when name search fails. Each worker now checks for an existing service ID before searching by name and returns early if one exists, preserving the manual match.	2 months ago
Broque Thomas	7133595e0d	Fix enrichment widget showing Running when rate limited (#217) The tooltip only checked paused/authenticated/idle/running states. When Spotify was rate limited or daily budget exhausted, the worker thread was still alive (sleeping in guards) so it showed "Running" with no current item and stale 0% progress. Now checks rate_limited and daily_budget.exhausted before running: - Rate limited: "Rate Limited — Waiting Xm for rate limit to clear" - Budget exhausted: "Daily Limit Reached — Resets in Xh Xm" - No current item: "Waiting for next item..." instead of blank Also adds rate_limit info object to get_stats() response for the countdown display.	2 months ago
Broque Thomas	20452859c5	Improve enrichment matching to pick best result instead of first (#210) Artist/album/track matching previously took the first Spotify search result above the 0.80 similarity threshold. If Spotify returned a near-match before the exact match (e.g. "Brother's Keeper" before "Brothers Keepers"), the wrong entity would be selected. Now scores all candidates and picks the highest, so an exact match (1.0) always wins over a near-match (0.94). No change to threshold or batch matching logic — strictly better or equal results.	2 months ago
Broque Thomas	3f866ebf5e	Add daily budget to Spotify enrichment worker to prevent rate limit bans The background enrichment worker now caps itself at 3,000 processed items per calendar day. Counter resets at midnight automatically. When exhausted, the worker sleeps and checks every 5 minutes for a new day. This is scoped entirely to the enrichment worker — user-initiated Spotify API calls (searches, playlist ops, album lookups, etc.) are completely unaffected. Budget status is exposed in the worker's get_stats() response for the dashboard widget.	2 months ago
Broque Thomas	c1287f0ec0	Helper V2 complete + enrichment worker fixes Helper system phases 2-7: - Setup Progress: onboarding checklist with progress ring, auto-detection via /status, /api/settings, /api/library, /api/watchlist, /api/automations - Quick Actions: accent pill buttons in popovers (service cards get "Open Settings" and "View Docs" actions) - Keyboard Shortcuts: full-screen overlay with key cap styling, grouped by scope (Global, Player, Helper, Forms) - Search: fuzzy search across 200+ help entries, 11 tours, and shortcuts with cross-page navigation via _guessPageFromSelector() - What's New: version-tagged highlights with "Show me" navigation, red badge on ? button for unseen versions, older version cycling - Troubleshoot: scans dashboard service cards for disconnected/error states, shows fix steps with action buttons, "All Clear" when healthy - Contextual menu: page-aware tour suggestion at top of menu - Ctrl+K / Cmd+K opens helper search globally - First-launch welcome tooltip with pulsing ? button - Redesigned floating button (48px, accent gradient, glass effect) - Redesigned menu (unified card panel, accent left-stripe on contextual) Enrichment worker fixes: - AcoustID: individual recording matches downgraded INFO→DEBUG to reduce log noise (14 lines for one track → 1 summary line) - Name normalization: strip " - Suffix" dash format (Spotify) same as "(Suffix)" parens format across all 8 workers. Fixes false mismatch on tracks like "Electric Eyes (Studio Brussels Remix)" vs "Electric Eyes - Studio Brussels Remix" (was 0.54, now matches)	2 months ago
Broque Thomas	d4a57ae654	Start Spotify enrichment worker unpaused by default like other workers	2 months ago
Broque Thomas	429306c7f3	Fix enrichment retry loops, cover art finding dupes, and Spotify rate limit during art scan - All 9 enrichment workers: stop auto-retrying 'error' status items (was infinite loop) Only 'not_found' items retry after configured days; errors require manual full refresh - Cover art dedup: check both 'pending' AND 'resolved' findings to prevent recreation - Cover art scanner: top-level Spotify rate limit check skips Spotify entirely when banned, falls back to iTunes/Deezer only, logs once instead of spamming 429s	2 months ago
Broque Thomas	3c51f27e97	Fix Spotify enrichment worker rejecting every track via fallback Worker checked self.client.sp (non-None even without Spotify auth due to fallback) instead of is_spotify_authenticated(). Searched via iTunes/Deezer fallback, got numeric IDs, rejected them all with warnings. Now sleeps when Spotify isn't authenticated instead of making pointless fallback searches.	2 months ago
Broque Thomas	be77397132	Fix enrichment workers never showing idle/complete status Pending count queries included NULL-ID rows that _get_next_item filters out, so pending stayed > 0 even when no processable items remained. Workers reported running instead of idle, UI never turned green. Added AND id IS NOT NULL to _count_pending_items across all 9 workers to match the _get_next_item filter.	2 months ago
Broque Thomas	e0533215da	Fix enrichment workers looping on tracks with NULL IDs Workers would endlessly match the same track because UPDATE WHERE id = NULL matches 0 rows in SQL. Added AND id IS NOT NULL to all enrichment queries (individual, batch EXISTS, and batch fetch) across all 9 workers. Also added process-level guard for belt-and-suspenders safety. Fix Deezer get_track → get_track_details method name mismatch.	2 months ago
Broque Thomas	66e9457d0e	Stop unnecessary Spotify API call every 60s from enrichment status polling	2 months ago
Broque Thomas	bc22bdca07	Fix infinite Spotify rate limit loop from unguarded auth probes and swallowed errors	2 months ago
Broque Thomas	bde2be1cfa	Spotify rate limit re-trigger loop caused by periodic auth probes	2 months ago
Broque Thomas	eac97a6c2b	Smart Spotify rate limit detection with global ban, auto-suppression, and frontend modal	2 months ago
Broque Thomas	2d6c55e294	Fix chromaprint crash on surround audio and Spotify worker status display	2 months ago
Broque Thomas	2ab0b387d6	Update spotify_worker.py	3 months ago
Broque Thomas	24bfc2462d	Add Spotify & iTunes workers; update repair worker Add full-featured SpotifyWorker and iTunesWorker background workers to enrich artists, albums, and tracks with external metadata using batch cascading searches, fuzzy name matching, ID validation, and DB backfills. Update RepairWorker to re-read the transfer path from the database each scan, resolve host paths when running in Docker, and trigger immediate rescans when the transfer path changes; remove the static config_manager dependency. Also include supporting changes to the database layer and web UI/server (stats, controls, and styles) to integrate the new workers and reflect updated worker status.	3 months ago
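
The NULL-ID loop described in the e0533215da entry follows from standard SQL semantics: `id = NULL` never evaluates to true, so the UPDATE changes no rows and the item is never marked as processed. The sketch below reproduces that behavior with Python's built-in `sqlite3` module. The `tracks` table, its columns, and both helper functions are hypothetical stand-ins for illustration, not SoulSync's actual schema or code.

```python
import sqlite3

# Hypothetical stand-in schema; SoulSync's real tables are not shown in this log.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id TEXT, title TEXT, match_status TEXT)")
conn.executemany(
    "INSERT INTO tracks (id, title, match_status) VALUES (?, ?, ?)",
    [("t1", "Song A", None), (None, "Song B", None)],
)


def mark_matched(track_id):
    """Mark one track as matched and return how many rows were changed."""
    cur = conn.execute(
        "UPDATE tracks SET match_status = 'matched' WHERE id = ?", (track_id,)
    )
    return cur.rowcount


def count_pending():
    """Count rows that still need processing, skipping rows with no usable ID."""
    row = conn.execute(
        "SELECT COUNT(*) FROM tracks "
        "WHERE match_status IS NULL AND id IS NOT NULL"
    ).fetchone()
    return row[0]


updated_real = mark_matched("t1")  # ordinary row: the UPDATE finds it
updated_null = mark_matched(None)  # binds NULL: id = NULL matches nothing
pending_after = count_pending()    # the NULL-ID row is filtered out of the queue
```

Without the `AND id IS NOT NULL` guard, the second row would stay pending indefinitely, because no UPDATE keyed on its ID can ever reach it.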

25 Commits (main)
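
The difference between "first result above the threshold" and "best-scoring result", as described in the 20452859c5 entry, can be sketched as follows. This is a minimal illustration that assumes `difflib.SequenceMatcher` as the similarity measure and treats the threshold as inclusive; the scoring function SoulSync actually uses is not shown in this log, and `similarity`, `pick_first`, and `pick_best` are hypothetical names.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.80  # similarity threshold quoted in the commit message


def similarity(a, b):
    """Case-insensitive similarity in [0, 1]; difflib is an assumed stand-in."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_first(query, candidates):
    """Old behavior: return the first candidate that clears the threshold."""
    for name in candidates:
        if similarity(query, name) >= THRESHOLD:
            return name
    return None


def pick_best(query, candidates):
    """New behavior: score every candidate and return the highest scorer."""
    scored = [(similarity(query, name), name) for name in candidates]
    scored = [pair for pair in scored if pair[0] >= THRESHOLD]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


results = ["Brother's Keeper", "Brothers Keepers"]  # near-match listed first
first = pick_first("Brothers Keepers", results)
best = pick_best("Brothers Keepers", results)
```

With the near-match listed first, `first` is "Brother's Keeper" while `best` is the exact title "Brothers Keepers". Both functions share the same threshold, so `pick_best` can only return a candidate that scores at least as high as the one `pick_first` would return.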