SoulSync

Commit Graph

Author	SHA1	Message	Date
Broque Thomas	19307630d1	Fix missing album art for non-Spotify sources + animate Downloads nav icon - watchlist_scanner: fall back to album.image_url when album object has no images list (affects MusicBrainz CAA URLs, iTunes, Deezer — all use image_url on the Album dataclass, not the Spotify-style images array) - Pulse Downloads nav icon while active downloads are in progress, same pattern as watchlist scan animation	3 weeks ago
Broque Thomas	f3ad65de34	Complete MusicBrainz watchlist source parity Add MusicBrainz watchlist artist ID storage, badges, linked-provider editing, and per-artist preferred source support. Backfill watchlist MusicBrainz matches from already-enriched library artists so existing MusicBrainz worker matches appear in watchlist cards and settings. Extend bulk watchlist add, liked artist matching, artist map source picking, and service status labels to recognize MusicBrainz, with regression tests for watchlist ID persistence and backfill.	3 weeks ago
Broque Thomas	5bc5fbb662	Add MusicBrainz as a metadata source Register MusicBrainz as a first-class metadata source alongside Deezer, iTunes, Spotify, Discogs, and Hydrabase. Expose the shared client through metadata services, add the settings option, and expand the MusicBrainz search adapter with source-compatible artist, album, track, and detail methods. Carry MusicBrainz IDs through similar-artist discovery, recommended artists, artist map serialization, and personalized playlist selection. Update DB migrations and lookup filters so similar_artist_musicbrainz_id is preserved on older schemas and used for source requirements and library exclusion. Normalize MusicBrainz album adapter output for import context and add regression coverage for registry mapping, typed album conversion, and similar-artist filtering. Verified by user with 120 focused tests passing.	3 weeks ago
Broque Thomas	877d0e7d81	Personalized pipeline: auto-refresh stale snapshots after watchlist scan Snapshots now track when their source data changes. Watchlist scan emits stale flags on the playlists whose underlying pool just got refreshed; the next pipeline run sees the flag and regenerates the snapshot before syncing, so the server playlist never lags the source. Schema: - new `is_stale INTEGER NOT NULL DEFAULT 0` column on `personalized_playlists`, plus an idempotent ADD COLUMN migration in `ensure_personalized_schema` for installs created before this PR. - `PlaylistRecord.is_stale: bool = False` exposed on the dataclass so callers can branch on freshness without re-querying. Manager: - new `mark_kinds_stale(kinds, profile_id=None)` flips the flag in bulk for a list of kinds (used by upstream data refreshers). - `_persist_snapshot` clears `is_stale = 0` on successful refresh. - SELECT statements + `_row_to_record` updated to read the column (with tuple-form length guard for safety). Pipeline: - `_build_payloads_for_kinds` now branches: refresh_first=True OR `existing.is_stale` -> refresh_playlist, else read existing snapshot. So the auto-refresh kicks in without needing the user to toggle the refresh-each-run option. Watchlist scanner emits stale flags at three sites: - after `update_discovery_pool_timestamp` -> marks pool-fed kinds stale: hidden_gems, discovery_shuffle, popular_picks, time_machine, genre_playlist, daily_mix. - after release_radar `save_curated_playlist` -> marks `fresh_tape`. - after discovery_weekly `save_curated_playlist` -> marks `archives`. All three calls go through a module-level `_mark_personalized_kinds_stale` helper that builds a PersonalizedPlaylistManager with `deps=None` (only DB access is needed for the flag update — no generator dispatch). Each call is wrapped in try/except so a flag failure can never abort the scan itself. 
Tests: - new `TestStaleFlag` class in `test_personalized_manager.py` (6 tests): default-false, single-kind flip, multi-kind, profile scoping, refresh-clears, empty-list noop. - two new pipeline tests pin the auto-refresh dispatch: `test_stale_snapshot_auto_refreshes_even_without_refresh_first` and `test_non_stale_snapshot_skips_refresh`. - existing stub-manager `SimpleNamespace` returns gained `is_stale=False` so the new attribute read doesn't AttributeError. Full suite: 3391 pass. User-facing WHATS_NEW entry added under 2.5.2 (above the prior pipeline auto-sync entry) describing the auto-refresh behavior.	4 weeks ago
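Note on 877d0e7d81: the idempotent ADD COLUMN migration and the bulk stale-flag update described above could look roughly like the sketch below. It assumes a SQLite backend and a `personalized_playlists` table with `kind` and `profile_id` columns; the function bodies are illustrative and not the project's actual code.

```python
import sqlite3


def ensure_is_stale_column(conn: sqlite3.Connection) -> None:
    """Add is_stale to personalized_playlists if it is missing (safe to re-run)."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(personalized_playlists)")}
    if "is_stale" not in columns:
        conn.execute(
            "ALTER TABLE personalized_playlists "
            "ADD COLUMN is_stale INTEGER NOT NULL DEFAULT 0"
        )
        conn.commit()


def mark_kinds_stale(conn: sqlite3.Connection, kinds, profile_id=None) -> int:
    """Flip is_stale to 1 for the given kinds; returns the number of rows updated."""
    if not kinds:
        return 0  # empty list is a no-op
    placeholders = ",".join("?" for _ in kinds)
    sql = f"UPDATE personalized_playlists SET is_stale = 1 WHERE kind IN ({placeholders})"
    params = list(kinds)
    if profile_id is not None:
        # Assumed scoping column; the real schema may differ.
        sql += " AND profile_id = ?"
        params.append(profile_id)
    cursor = conn.execute(sql, params)
    conn.commit()
    return cursor.rowcount
```

Checking the column list before altering is what makes the migration safe to run on both old and new installs.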
Broque Thomas	9602d1827c	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites Catches the silent excepts the awk-based earlier sweeps missed: - Bare `except:` followed by `pass` (also swallows KeyboardInterrupt and SystemExit — actively wrong). Upgraded to `except Exception as e: logger.debug("...: %s", e)`. ~14 sites across connection_detect, soulseek_client, listenbrainz_manager, watchlist_scanner, youtube_client, navidrome_client, jellyfin_client, web_server. - `except Exception:` + pass that the awk pattern missed (e.g. multi-line or unusual whitespace). ~31 sites across automation_engine, database_update_worker, music_database, spotify_client, web_server, others. - 14 legitimate cleanup sites left silent with explicit `# noqa: S110` + comment explaining why (atexit handlers, finally-block conn.close calls). Logging during shutdown can itself crash because file handles get torn down before the handler fires. Also enables `S110` rule in pyproject.toml so this pattern fails CI going forward — drift fails at PR review instead of at runtime against a wedged worker thread. Tests path keeps S110 ignored (test fixtures legitimately use try-except-pass for cleanup). Adds a WHATS_NEW entry to helper.js summarizing the full #369 sweep. Verified: `python -m ruff check .` → All checks passed. Verified: `python -m pytest tests/` → 2188 passed. Closes #369	1 month ago
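Note on 9602d1827c: a minimal illustration of why the bare `except:` sites were upgraded. The helper name `close_quietly` is hypothetical; only the `except Exception as e: logger.debug(...)` shape comes from the commit message.

```python
import logging

logger = logging.getLogger(__name__)


def close_quietly(resource) -> None:
    """Close a resource, logging ordinary failures instead of hiding them."""
    try:
        resource.close()
    except Exception as e:
        # Unlike a bare `except:`, this does not catch KeyboardInterrupt or
        # SystemExit (they derive from BaseException), so Ctrl-C still works.
        logger.debug("close failed: %s", e)
```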
Broque Thomas	8dc9f79f97	Surface silent exceptions in watchlist + discovery + reorganize — 18 sites - watchlist_scanner.py: 6 sites - discovery/playlist.py: 5 sites - discovery/sync.py: 4 sites - watchlist/auto_scan.py: 1 site (1 left silent — finally-block scanner cleanup) - library_reorganize.py: 2 sites (4 left silent — all in finally blocks: conn.close, staging rmtree, sidecar delete, cleanup_empty_dir) All non-finally sites converted to `logger.debug("...: %s", e)`. Finally-block sites kept silent because logger calls during cleanup (after exception was already raised) can themselves raise. Refs #369	1 month ago
Broque Thomas	24c2d75c6d	Make extract_external_ids recognize all source-tagging conventions Smoke-testing the just-merged provenance PR against live logs revealed the new ID-match block was silently no-opping: no [ExtID Match] / [Provenance Match] log lines despite the code path being live. Tracing revealed two related gaps in extract_external_ids' source detection: 1. Underscore-prefixed key. Deezer / Discogs / Hydrabase clients tag normalized track dicts with ``_source`` (underscore prefix — convention used in 8+ places across core/). The extractor only looked for ``provider`` and ``source``, so Deezer-sourced tracks silently returned no IDs. 2. No provider field at all. Spotify and iTunes raw API responses carry ``id`` but no provider/source key of any kind. The extractor couldn't disambiguate the native ``id``, so Spotify-primary scans would have hit the same silent miss once the user switched primary sources. Two-part fix: - ``extract_external_ids`` now recognizes ``_source`` as another candidate provider field. - New optional ``source_hint`` parameter lets the caller supply the configured primary source as a fallback when the track dict has no provider field of its own. Track-side provider field still wins when present (defensive against a wrong hint). Watchlist scanner now passes ``get_primary_source()`` as the hint so both naming conventions (Deezer-style _source, Spotify-style no-tag) get handled uniformly. 6 new regression tests cover: - _source recognized for Deezer - _source recognized for Hydrabase (cross-provider mapping) - _source recognized for Discogs (no library column — verifies graceful no-crash) - source_hint disambiguates raw tracks for spotify/itunes/deezer - track-side provider takes precedence over hint - None hint defaults safely Full pytest 1630 passed; ruff clean. 
After this lands and the server restarts, watchlist scans should produce [ExtID Match] / [Provenance Match] log lines for tracks already on disk regardless of which metadata source the user has configured as primary.	1 month ago
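Note on 24c2d75c6d: the provider-detection order described above (track-side `provider` / `source` / `_source` first, caller-supplied hint last) might be sketched as follows. `resolve_provider` and `extract_native_id` are hypothetical names, and the sketch returns the provider name as the key rather than the real library column mapping.

```python
PROVIDER_KEYS = ("provider", "source", "_source")


def resolve_provider(track: dict, source_hint=None):
    """Return the provider named on the track, else the caller's hint, else None."""
    for key in PROVIDER_KEYS:
        value = track.get(key)
        if value:
            return str(value).lower()  # track-side field wins over the hint
    return source_hint.lower() if source_hint else None


def extract_native_id(track: dict, source_hint=None) -> dict:
    """Attribute the track's native `id` to a provider, if one can be determined."""
    provider = resolve_provider(track, source_hint)
    native_id = track.get("id")
    if provider is None or native_id in (None, ""):
        return {}
    return {provider: str(native_id)}
```

With this shape, a Deezer-style dict (`_source` set) and a raw Spotify response (no tag, hint supplied by the caller) both yield an ID, while an untagged dict with no hint yields nothing.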
Broque Thomas	34ba26f5c8	Persist source IDs at download time + backfill onto tracks on sync Followup to fix/watchlist-external-id-match. The companion PR closed the demand side — the watchlist scanner asks for tracks by external IDs before falling back to fuzzy. But for users on Plex / Jellyfin / Navidrome the supply side was still broken: tracks.spotify_track_id (and the other ID columns) only got populated by the asynchronous enrichment workers, sometimes hours after the file was actually written. During that window the ID match fell through to fuzzy and the bug returned. We were already collecting every ID during post-processing — they live in the `pp` dict in core/metadata/source.py:embed_source_ids and get embedded into file tags. We just dropped the in-memory copy afterwards. This PR persists them and uses them: - Schema migration adds spotify_track_id / itunes_track_id / deezer_track_id / tidal_track_id / qobuz_track_id / musicbrainz_recording_id / audiodb_id / soul_id / isrc columns + indexes to the existing track_downloads table (already keyed by file_path). - core/metadata/source.py:embed_source_ids exposes pp["id_tags"] and the resolved ISRC back to the import context as _embedded_id_tags / _isrc. - core/imports/side_effects.py:record_download_provenance reads those context fields and passes them to db.record_track_download, which now accepts the new ID kwargs and persists them. - New db.get_provenance_by_file_path with exact + basename-suffix fallback (handles container mount-root differences between download-time path and media-server-reported path). - New db.backfill_track_external_ids_from_provenance copies IDs from track_downloads onto a tracks row idempotently — COALESCE on every column preserves any value the enrichment worker already wrote (enrichment is more authoritative for late binding). 
- database/music_database.py:insert_or_update_media_track (the single insertion point used by every Plex / Jellyfin / Navidrome sync) calls the backfill immediately after each INSERT/UPDATE. - New core/library/track_identity.py:find_provenance_by_external_id used as a second-tier fallback in watchlist_scanner.is_track_missing _from_library — catches the window between download and media-server sync. Caller checks os.path.exists on the provenance file_path before treating it as "already in library" so a deleted file doesn't prevent re-download. Effect: freshly downloaded files become ID-recognizable to the watchlist on the very next scan, no enrichment-wait window. 19 regression tests in tests/test_provenance_id_persistence.py: - Schema migration adds expected columns + indexes - record_track_download persists every ID kwarg - record_track_download backward-compat (old kwargs still work) - get_provenance_by_file_path: exact match, basename fallback for mount-root differences, multi-record latest-wins, defensive None - backfill: copies all IDs, preserves existing via COALESCE, no-op when no provenance exists - find_provenance_by_external_id: per-ID lookup, ISRC cross-bridge, OR semantics, latest-wins on multiple matches Out of scope: backfilling provenance for files downloaded BEFORE this PR (their track_downloads rows don't carry the new IDs). Those continue to wait for enrichment. Acceptable — only affects historical files; new downloads benefit immediately. Full pytest 1625 passed; ruff clean.	1 month ago
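Note on 34ba26f5c8: the COALESCE-based backfill can be illustrated with the sketch below. It assumes SQLite, a `tracks` table keyed by `id`, and only three of the ID columns named above; `backfill_ids_from_provenance` is a hypothetical stand-in for the real helper.

```python
import sqlite3

# Hypothetical subset of the ID columns listed in the commit message.
ID_COLUMNS = ("spotify_track_id", "deezer_track_id", "isrc")


def backfill_ids_from_provenance(conn: sqlite3.Connection, track_id: int, file_path: str) -> bool:
    """Copy download-time IDs onto a tracks row without overwriting existing values."""
    cols = ", ".join(ID_COLUMNS)
    prov = conn.execute(
        f"SELECT {cols} FROM track_downloads WHERE file_path = ? "
        "ORDER BY rowid DESC LIMIT 1",  # latest record wins
        (file_path,),
    ).fetchone()
    if prov is None:
        return False  # no provenance recorded for this file
    # COALESCE(col, ?) keeps the current value when it is not NULL.
    assignments = ", ".join(f"{c} = COALESCE({c}, ?)" for c in ID_COLUMNS)
    conn.execute(f"UPDATE tracks SET {assignments} WHERE id = ?", (*prov, track_id))
    conn.commit()
    return True
```

Because existing values always win, running the backfill repeatedly leaves the row unchanged after the first pass.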
Broque Thomas	ecb8939c80	Match library tracks by external IDs before fuzzy in watchlist scan Reported case (CAL): a track already on disk got re-downloaded by the watchlist scanner on every scan. Library DB had stale album metadata for the file (track tagged on album "Left Alone") while the metadata source reported it on a different album ("NPC" single). The title+artist+album fuzzy block correctly said the album names didn't match and declared the track missing — but the file's stable external IDs (Spotify ID, ISRC, etc.) unambiguously identified it as the same recording. The earlier compilation-album fix (PR #461) handled qualifier drift ("OST" vs "Music From The Motion Picture"). This case is two genuinely different album names referring to the same song. Fix: provider-neutral external-ID short-circuit before the fuzzy block in `is_track_missing_from_library`. Pulls every recognized ID off the source track (Spotify / iTunes / Deezer / Tidal / Qobuz / MusicBrainz / AudioDB / Hydrabase / ISRC), runs a single SELECT against the indexed external-ID columns on the `tracks` table, and treats any hit as "track exists in library — don't re-download". If no IDs are available (older imports without enrichment, library scans that didn't populate external IDs), falls through to the existing fuzzy logic so the safety net stays intact. New `core/library/track_identity.py` module with two helpers: - `extract_external_ids(track)`: handles dict and object-style track shapes, direct-field aliases (spotify_id / spotify_track_id / SPOTIFY_TRACK_ID), and provider-disambiguated native `id` fields (when track has `provider='deezer'` and `id='X'`, treats X as a Deezer ID). - `find_library_track_by_external_id(db, external_ids, server_source)`: builds an OR of indexed column matches with IS NOT NULL guards, optional server_source filter that also passes legacy NULL rows, single-row LIMIT. 
ISRC bridges across providers — a library track imported via Deezer can be matched against a Spotify scan when both sides carry the same ISRC. 43 regression tests in `tests/test_library_track_identity.py`: - 9 ID-extraction tests for direct fields (Spotify / iTunes / Deezer / ISRC / MBID / AudioDB / Hydrabase) - 8 ID-extraction tests via the provider field (8 providers + source alias + missing-provider-ignored) - 7 mixed/defensive tests (multiple IDs, object-style, empty strings, None track, numeric coercion) - 8 lookup tests (per-provider + ISRC cross-bridge) - 3 OR-semantics tests - 4 server_source filter tests - 2 ID-column-map sanity tests Full pytest 1606 passed; ruff clean.	1 month ago
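Note on ecb8939c80: the lookup described above (an OR of indexed column matches with IS NOT NULL guards, an optional server_source filter that also passes legacy NULL rows, and a single-row LIMIT) could be sketched like this. The column map is a hypothetical subset and the function name is shortened for illustration.

```python
import sqlite3

# Hypothetical mapping from ID kind to tracks column.
ID_COLUMN_MAP = {
    "spotify": "spotify_track_id",
    "deezer": "deezer_track_id",
    "isrc": "isrc",
}


def find_track_by_external_id(conn: sqlite3.Connection, external_ids: dict, server_source=None):
    """Return the id of any library track sharing at least one external ID, else None."""
    clauses, params = [], []
    for kind, value in external_ids.items():
        column = ID_COLUMN_MAP.get(kind)
        if column and value:
            clauses.append(f"({column} IS NOT NULL AND {column} = ?)")
            params.append(str(value))
    if not clauses:
        return None  # nothing to match on; caller falls back to fuzzy matching
    sql = f"SELECT id FROM tracks WHERE ({' OR '.join(clauses)})"
    if server_source is not None:
        # Legacy rows with no server_source recorded still pass the filter.
        sql += " AND (server_source = ? OR server_source IS NULL)"
        params.append(server_source)
    sql += " LIMIT 1"
    row = conn.execute(sql, params).fetchone()
    return row[0] if row else None
```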
Broque Thomas	6e61890551	Stop watchlist re-downloading compilation tracks; catch slskd dedup orphans Two related bugs reported on Discord by Mushy. 1. The watchlist re-downloaded the same OST track up to 7 times. ``is_track_missing_from_library`` compared Spotify's album name and the media-server scan's album name with a raw SequenceMatcher at a strict 0.85 threshold. Compilations and soundtracks routinely fail this — Spotify reports ``"Napoleon Dynamite (Music From The Motion Picture)"`` while the Plex / Navidrome / Jellyfin tag scan saves it as ``"Napoleon Dynamite OST"``. Raw similarity ≈ 0.49, so the scanner declared the track missing on every 30-minute scan and added it back to the wishlist. The wishlist then issued a fresh download. slskd appended ``_<19-digit-ns-timestamp>`` to each new copy because the target file already existed, and the user ended up with seven copies of one song in one folder. Fix: extract two pure helpers — ``_normalize_album_for_match`` strips qualifier parentheticals (Music From X, OST, Deluxe Edition, Remastered, Anniversary, etc.) and trailing dash-clauses; ``_albums_likely_match`` checks equality after normalization, substring containment, and a relaxed 0.6 fuzzy ratio. A volume / part / disc / standalone-trailing-number guard rejects pairs like ``"Greatest Hits Vol. 1"`` vs ``"Greatest Hits Vol. 2"`` so the relaxed threshold doesn't introduce false positives on serialized releases. After this change the Napoleon Dynamite case collapses to ``"napoleon dynamite" == "napoleon dynamite"`` via the equality short-circuit and the redownload loop dies. 2. The duplicate detector found only one of the seven dupe files. The detector buckets tracks by the first 4 chars of their normalized tag title. Files written by slskd directly into a library folder often get inconsistent (or blank) tags from the media-server rescan, so the seven copies were bucketed apart by parsed title and never compared. 
Fix: refactor the per-bucket comparison into ``_scan_bucket``, then add a second pass — ``_build_filename_buckets`` re-buckets leftover tracks by canonical filename stem (slskd dedup tail stripped via ``_strip_slskd_dedup_suffix``, same regex the import-cleanup PR uses) plus extension. Filename agreement is itself strong evidence the files came from the same source download, so the second pass calls ``_scan_bucket`` with ``require_metadata_match=False`` to skip the title / artist / cross-album gates. The same-physical-file guard still runs so bind-mount duplicates aren't flagged. 72 new regression tests across two files cover the album-match helpers (28 tests including the Napoleon Dynamite scenario, 7 volume disagreements, 8 positive/negative pairs, 5 defensive cases) and the new filename-bucket pass (16 tests across bucket construction, scan integration, and existing title-pass behavior). Full pytest 1509 passed; ruff clean. Reported by Mushy in Discord.	1 month ago
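Note on 6e61890551: a rough sketch of the album-matching idea (normalize, volume guard, equality, containment, relaxed 0.6 ratio). The regexes, the qualifier keyword list, and the exact guard behavior are assumptions made for illustration; the real `_normalize_album_for_match` and `_albums_likely_match` may differ in every detail.

```python
import re
from difflib import SequenceMatcher

# Parenthetical or bracketed qualifiers such as "(Music From ...)" or "(Remastered)".
_QUALIFIER = re.compile(
    r"\s*[\(\[][^\)\]]*\b(?:music from|soundtrack|ost|deluxe|remaster(?:ed)?|anniversary)\b[^\)\]]*[\)\]]"
)
_TRAILING_OST = re.compile(r"\s+(?:ost|soundtrack)$")
# "Vol. 2", "Part 3", "Disc 1", or a standalone trailing number.
_VOLUME = re.compile(r"\b(?:vol(?:ume)?|part|pt|disc)\.?\s*(\d+)|(\d+)\s*$")


def normalize_album(name: str) -> str:
    s = name.lower().strip()
    s = _QUALIFIER.sub("", s)
    s = re.sub(r"\s+-\s+.*$", "", s)  # trailing dash-clause
    s = _TRAILING_OST.sub("", s)
    return re.sub(r"\s+", " ", s).strip()


def _volume(name: str):
    m = _VOLUME.search(name.lower())
    return (m.group(1) or m.group(2)) if m else None


def albums_likely_match(a: str, b: str) -> bool:
    if _volume(a) != _volume(b):
        return False  # serialized releases must agree on their number
    na, nb = normalize_album(a), normalize_album(b)
    if not na or not nb:
        return False
    if na == nb or na in nb or nb in na:
        return True
    return SequenceMatcher(None, na, nb).ratio() >= 0.6
```

Under these assumptions the two Napoleon Dynamite album names both normalize to "napoleon dynamite" and match through the equality check, while "Greatest Hits Vol. 1" and "Greatest Hits Vol. 2" are rejected by the volume guard before any fuzzy comparison happens.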
Antti Kettunen	a759f778b6	Move metadata API into package - add package-owned metadata API, cache, registry, and lookup modules - keep legacy metadata_service and metadata_cache paths as explicit shims - update metadata call sites and tests to use package-owned helpers	1 month ago
Antti Kettunen	0bbf44809f	Move the import flows and related post-processing pipelines into separate modules - Extract the import pipeline, album import, staging, path, file ops, guards, runtime state, side effects, and metadata enrichment out of . - Canonicalize the refactored import path around and remove legacy , , , and request shapes from the import endpoints. - Make album and track metadata lookups follow the configured provider priority instead of hard-coding Spotify, while still falling back when needed. - Update the import routes and frontend payloads to use the new core helpers. - Add coverage for the extracted helpers and the refactored import flows. PS. apologies to anyone who might check this commit out - the intention was to start small, but things kinda snowballed out of control at some point since the logic just kept going on and on, and everything kinda had to be changed all at once for it all to make any sense	2 months ago
Broque Thomas	e5d4d61c0e	Fix watchlist content filters: live false positives + auto-scan bypass Two bugs reported in issue #320: 1. Auto-watchlist scan bypassed Global Override settings. scan_watchlist_profile applied _apply_global_watchlist_overrides, but the scheduled auto-scan called scan_watchlist_artists directly — bypassing the override. Users who unchecked "Albums" or "Live" under Watchlist → Global Override still saw full albums and live tracks added during nightly scans (per-artist defaults, which include everything, won). Moved override application into scan_watchlist_artists itself so every entry point respects it. scan_watchlist_profile now forwards the apply_global_overrides flag through to avoid double-application. 2. is_live_version (watchlist + discography backfill) and live_commentary_cleaner's content patterns used bare \blive\b, which matched verb uses like "What We Live For" by American Authors, "Live Forever" by Oasis, "Live and Let Die" by Wings. Tightened the live patterns to require clear recording context: (Live) / [Live Version] / - Live / Live at\|from\|in\|on\|version\| session\|recording\|performance\|album\|show\|tour\|concert\|edit\|cut\|take / In Concert / On Stage / Unplugged / Concert. Locked in 11 regression tests covering the reported false positives (What We Live For, Live Forever, Living on a Prayer, Live and Let Die) and the reported true positives (Dimension - Live at Big Day Out, MTV Unplugged, etc.). Version bumped to 2.37 with changelog entries.	2 months ago
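Note on e5d4d61c0e: the tightened live detection can be approximated with the sketch below, which uses a subset of the recording-context cues listed in the message. The pattern is a reconstruction for illustration, not the project's actual regex.

```python
import re

_LIVE = re.compile(
    r"""
      [\(\[]\s*live\b[^\)\]]*[\)\]]     # (Live), [Live Version], (Live at ...)
    | \s-\s+live\b                      # "Song - Live ..."
    | \blive\s+(?:at|from|in|on|version|session|recording|performance
                 |album|show|tour|concert|edit|cut|take)\b
    | \bin\s+concert\b
    | \bon\s+stage\b
    | \bunplugged\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_live_version(title: str) -> bool:
    """True only when 'live' appears in a clear recording context."""
    return bool(_LIVE.search(title))
```

The bare `\blive\b` check flagged every title containing the word; requiring a bracket, a dash, or a following context word leaves verb uses such as "Live Forever" alone.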
Broque Thomas	d9217237d2	Clean up 286 ruff lint errors to unblock CI and fix 10 latent bugs PR #340 added ruff to the build-and-test.yml CI gate, which surfaced 286 pre-existing lint errors. Left unfixed, every feature branch push fails CI. This commit resolves all of them so CI goes green and contributors can actually land work. Auto-fixes (248 of 286): removed unused f-string prefixes (F541), renamed unused loop control variables with underscore prefix (B007), removed duplicate imports (F811). Manually fixed 10 latent bugs ruff caught (all wrapped in try/except today, silently failing): - music_database.py: _add_discovery_tables() called undefined conn.commit() — would have crashed the iTunes-support migration for existing databases. Now uses cursor.connection.commit(). - web_server.py settings GET: referenced undefined download_orchestrator when it should be soulseek_client. Feature (_source_status on the settings payload) was silently missing for UI auto-disable logic. - web_server.py _process_wishlist_automatically: active_server undefined in track-ownership check. Auto-wishlist was falling through to the error handler and re-downloading owned tracks. - web_server.py start_wishlist_missing_downloads: same active_server bug in the manual wishlist path. - web_server.py _process_failed_tracks_to_wishlist_exact: emitted wishlist_item_added automation event with undefined artist_name and track. Automation event silently never fired correctly. - web_server.py discovery metadata enrichment: referenced cache without calling get_metadata_cache() first. Track enrichment from cached API responses was silently skipped. - web_server.py Beatport discovery worker: wing-it fallback branch used undefined successful_discoveries variable. Wing-it counter never incremented correctly. Now uses state['spotify_matches'] consistently with the rest of the function. 
- web_server.py _run_full_missing_tracks_process: stale import json mid-function shadowed the module-level import, making an earlier json.dumps() call reference an unbound local (F823). - web_server.py discovery loop: platform loop variable shadowed the module-level platform import (F402). - core/watchlist_scanner.py: 7 lambda captures of loop variables (B023 classic Python closure-in-loop bug) now bind at creation. No existing tests had to change. Full suite stays at 263 passed.	2 months ago
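Note on d9217237d2: the B023 closure-in-loop bug mentioned for core/watchlist_scanner.py can be reproduced in isolation. The function names below are hypothetical; binding the loop variable as a default argument is one common way to capture its value at creation time.

```python
def make_callbacks_buggy(names):
    # Each lambda looks up `name` when called, so all see the final value.
    return [lambda: name for name in names]


def make_callbacks_fixed(names):
    # The default argument is evaluated when the lambda is created.
    return [lambda name=name: name for name in names]
```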
Broque Thomas	b17a6e2dd7	Add per-artist metadata source override for watchlist scans Users can now override which metadata provider (Spotify, Deezer, Apple Music, Discogs) is used when scanning a specific watchlist artist for new releases. The selector appears in the artist config modal and only shows sources the artist has enrichment IDs for. Default behavior is unchanged — all artists use the global metadata source unless explicitly overridden.	2 months ago
Antti Kettunen	7d18d4ecb2	Clarify comments	2 months ago
Antti Kettunen	eead0c3dac	Clarify similar-artist freshness and backfill Freshness is now age-only, and scan-time backfill runs separately without Spotify-auth gating or retired iTunes compatibility flags.	2 months ago
Antti Kettunen	8382b8e247	Refactor similar artist backfill Switch similar-artist backfill to the shared provider-priority flow instead of assuming iTunes as the fallback. Reuse the generic metadata search helpers, keep a compatibility alias for the old helper name, and update the scanner tests to cover the new path. Add a regression test that verifies backfill walks each available fallback provider and persists the resolved IDs per source.	2 months ago
Antti Kettunen	47a6c257ad	Refactor MusicMap similar artist matching Shift similar-artist lookup to the shared metadata provider priority flow. Use generic provider clients for search and metadata extraction instead of branching on Spotify/iTunes-specific paths. Add a regression test that verifies MusicMap matching queries the provider priority list and preserves canonical metadata from the best match.	2 months ago
Antti Kettunen	7e1fc13e52	Make watchlist update_discovery_pool_incremental use provider priority Continuation on recent changes	2 months ago
Antti Kettunen	bc83874c6f	Discovery fan-out and playlists follow source priority Make discovery pool population and curated playlists follow the configured metadata source order. Keep Spotify strict where fallback would corrupt source-specific IDs, and trim fan-out with smaller similar-artist samples and page caps. Leave the remaining incremental path for follow-up.	2 months ago
Antti Kettunen	030374c5b0	Tune discovery fan-out and caching Reduce request volume in the discovery helpers while keeping the source-priority model intact. - make cache_discovery_recent_albums source-priority aware - cap Spotify artist-album pagination in the discovery and incremental paths - reduce the similar-artist sample size for the cache-refresh helper - keep Spotify strict where fallback would contaminate source-specific IDs - add regression coverage for source order, strict Spotify lookups, and pagination caps	2 months ago
Antti Kettunen	6f9ea2de56	Remove redundant spotify auth check again	2 months ago
Broque Thomas	09d358ef69	Fix watchlist scan false failures, Spotify backfill, and wishlist remove Watchlist scanner: empty discography (no new releases in lookback) was treated as API failure, causing "Failed to get artist discography" for artists like Kendrick Lamar who simply had no recent releases. Now distinguishes None (API failure → try next source) from [] (success, no new tracks). Spotify backfill now uses the authenticated client instance instead of creating a fresh unauthenticated one. Wishlist nebula: album remove now sends album_name (API updated to accept album_name as fallback alongside album_id). Track remove re-renders the nebula after deletion. Toned down processing pulse animation. Updated test to verify fallback triggers on API failure (None), not on empty results.	2 months ago
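Note on 09d358ef69: the None-versus-empty-list distinction can be shown with a small fallback loop. `first_successful_discography` and the `(name, fetch)` pair shape are assumptions made for this sketch.

```python
def first_successful_discography(fetchers):
    """Try each (source_name, fetch) pair in order; return the first success.

    A fetch returning None means the API call failed, so the next source is
    tried. An empty list is a success with no new releases and ends the search.
    """
    for source_name, fetch in fetchers:
        releases = fetch()
        if releases is None:
            continue
        return source_name, releases
    return None, None
```

Testing with `is None` rather than truthiness is the key point: `if not releases` would treat an artist with no recent releases as a failed lookup.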
Antti Kettunen	e447cf6ab0	Reduce discovery fan-out and pagination Make discovery pool population respect provider priority while keeping Spotify strict, and reduce unnecessary request volume in the hot discovery paths. - keep discovery fan-out source-priority aware - preserve cache use where freshness is not required - cap Spotify artist-album pagination in discovery and cache refresh paths - keep incremental release checks to a single page, since they only need the newest releases - add regression coverage for provider order, strict Spotify handling, and pagination caps	2 months ago
Antti Kettunen	08ac39bc13	Fix watchlist discography lookback handling Route get_artist_discography through the shared client helper so it uses the existing lookback logic instead of referencing an out-of-scope variable.	2 months ago
Antti Kettunen	e657a1d432	Make watchlist Spotify matching strict Resolve Spotify artist matching through the exact Spotify client only, so watchlist ID backfill cannot drift to fallback-provider results. Remove the remaining preemptive provider availability check from the backfill loop.	2 months ago
Antti Kettunen	7b3a32ccc5	Remove dead watchlist source helpers Drop the old active-provider artist lookup helpers from watchlist_scanner now that the web scan flow resolves sources through the shared metadata priority. Keep the Spotify-specific feature toggles in place for discovery and sync paths that still use them.	2 months ago
Antti Kettunen	38b907097d	Make watchlist scanning source-aware Move the web watchlist scan core onto the shared metadata source priority so primary provider settings are respected during artist, album, and image resolution. Add coverage for primary-source-first discography lookup and fallback to later providers when the primary source has no albums.	2 months ago
Antti Kettunen	9d73b8b561	Restore placeholder filtering and shared image backfill Bring placeholder tracklist skipping back into the shared watchlist scan path, and centralize the DB-only artist image backfill helper so both web scan entrypoints reuse the same logic.	2 months ago
Antti Kettunen	40fa139804	Remove dead watchlist scan paths Drop the legacy watchlist scan entrypoints that are no longer used by the web scan flow, and keep the live refresh path pointed at the shared scanner helper.	2 months ago
Antti Kettunen	657d86cace	Consolidate web watchlist scanning Move the shared watchlist scan loop into core/watchlist_scanner.py so web_server.py only handles triggers, locks, progress, and post-scan orchestration. Manual and scheduled watchlist scans now share the same scanner-side core, while the web entrypoints keep profile selection and automation progress updates.	2 months ago
Broque Thomas	fe399636b2	Fix Spotify API calls leaking when Deezer/iTunes is primary source Spotify was being called for album/artist data fetching across multiple background workers and the Artists page search even when the user had Deezer or iTunes set as their primary metadata source. Being authenticated for playlist sync was treated as permission to use Spotify for everything. - watchlist_scanner: add _spotify_is_primary_source() that checks both auth and primary source config; use it for all album/artist data fetching (discovery pool, recent album caching, playlist curation, similar artist ID matching, proactive ID backfill). _spotify_available_for_run() is kept for sync_spotify_library_cache which must run regardless of primary source - repair_jobs/metadata_gap_filler: gate Spotify ISRC lookup on primary source being 'spotify'; MusicBrainz lookup unaffected - repair_jobs/unknown_artist_fixer: replace hardcoded spotify_client with source-aware client selection — primary source ID tried first, each ID matched to its correct client (fixes latent bug passing Deezer IDs to Spotify) - web_server.py /api/match/search: Artists page search was hardcoded to spotify_client.search_artists(); now uses _get_metadata_fallback_client() so results come from the configured primary source	2 months ago
Broque Thomas	251c27e006	Add Last.fm Track Radio to Discover page Adds a new Last.fm Radio section to the Discover page that lets users search a track on Last.fm, generate a similar-tracks playlist, and run it through the existing discovery/download/sync pipeline. Also generates playlists automatically from top listening history during watchlist scans (max once per week). - core/lastfm_client.py: Add get_similar_tracks() using track.getsimilar - core/listenbrainz_manager.py: Add save_lastfm_radio_playlist() with deterministic MBID (MD5 seed), cleanup limit of 5 for lastfm_radio type - web_server.py: Add /api/lastfm/configured, /api/lastfm/search/tracks, /api/lastfm/radio/generate, /api/discover/listenbrainz/lastfm-radio; fix playlist['name'] KeyError in discovery worker that was resetting phase back to 'fresh' after completion - core/watchlist_scanner.py: Add _generate_lastfm_radio_playlists() with weekly throttle, called at end of scan_all_watchlist_artists() - webui/index.html: Add #lastfm-radio-section above ListenBrainz section, hidden unless Last.fm API key is configured - webui/static/script.js: Search/generation/card-load functions; fix discovery modal labels (Last.fm Radio vs ListenBrainz), description update on completion, belt-and-suspenders completion handling inside updateYouTubeDiscoveryModal; fix album/duration display for tracks without metadata; music note SVG placeholder for missing art - webui/static/style.css: Styles for search bar, dropdown, result rows	2 months ago
Broque Thomas	0edd8f5c81	Raise artist discography limit from 50 to 200 with Deezer pagination Deezer and iTunes defaulted to 50 albums max, silently truncating large discographies. Deezer now paginates (100 per page) up to 200. iTunes raised to 200 (single call). All callers in web_server.py updated to use the new defaults instead of hardcoding limit=50. Also adds diagnostic logging for allow_duplicates album comparison to help debug inconsistent singles behavior.	2 months ago
Broque Thomas	3a7c25f20f	Fix allow_duplicates not working for singles in watchlist scanner Singles like "idol" weren't added when the same song existed on a different album because check_track_exists used album-aware matching that found the track via album name. Now skips the album hint when allow_duplicates is on so matching is title+artist only, then compares album names ourselves with a strict 0.85 threshold. Only affects users with allow_duplicates enabled.	2 months ago
Broque Thomas	e65f73abe2	Fix allow_duplicates setting not working in watchlist scanner The setting only affected wishlist dedup but the watchlist scanner's library check still skipped tracks by title+artist regardless. Now when allow_duplicates is enabled, the scanner compares album names and only skips if the same album matches. Same song on a different album is allowed through to the wishlist.	2 months ago
Broque Thomas	a7877e6e0b	Skip albums with placeholder track names in watchlist scanner Spotify lists unreleased albums with placeholder names like "Track 1", "Track 2" before the real tracklist is revealed. The scanner was trying to download these, searching Soulseek for "Track 1" by artist which matches random files. Now skips any album where more than half the tracks match the placeholder pattern. Covers both the watchlist scan and discovery pool paths.	2 months ago
Broque Thomas	71e4df65e3	Remove emojis from all Python log and print statements Stripped 4,200+ emoji characters from print(), logger calls across 39 Python files. Logs are now clean text — easier to grep, more professional, no encoding issues on terminals without Unicode support. Seasonal config icons preserved for UI display.	2 months ago
BoulderBadgeDad	8977120ba8	Merge pull request #274 from kettui/feat/metadata-client-caching Add caching for metadata clients	2 months ago
BoulderBadgeDad	6d0ffae5fb	Merge pull request #275 from kettui/fix/spotify-ratelimited-search Add missing rate-limit handling for Spotify search requests	2 months ago
Antti Kettunen	4946ff0d03	Remove redundant repetition of lookback period change during watchlist scan Unnecessary noise on the logs	2 months ago
Antti Kettunen	fd6335a66e	Add / improve metadata client caching Clients are for the most part being initialized per-request, which leads to a lot of redundant client initialization, as well as noise on the logs, since each client initialization emits a row on the logs, eg. 'Deezer client initialized'	2 months ago
Antti Kettunen	1b979193eb	Skip Spotify requests for the rest of the watchlist scan if rate-limited State is stored per-scan	2 months ago
Broque Thomas	498c22e7c3	Centralize metadata source selection in core/metadata_service.py All metadata source decisions now flow through get_primary_source() and get_primary_client() in core/metadata_service.py. Previously 6 different files reimplemented this logic with inconsistent defaults ('itunes' vs 'deezer') and auth checks, causing bugs when any one was missed. Changes: - metadata_service.py: Added canonical get_primary_source/get_primary_client - web_server.py: _get_metadata_fallback_source() and _get_active_discovery_source() are now thin wrappers delegating to metadata_service - seasonal_discovery.py: _get_source() delegates to metadata_service - personalized_playlists.py: _get_active_source() delegates to metadata_service - spotify_client.py: Fixed _fallback_source default from 'itunes' to 'deezer' - watchlist_scanner.py: _get_fallback_metadata_client() delegates to metadata_service Future changes to source selection only need to update one file.	2 months ago
Broque Thomas	06e32d84c3	Skip future/unreleased albums in watchlist scanner Albums announced but not yet released have no real audio available, causing Soulseek to match random tracks with similar names. Both discography methods (Spotify and generic client) now filter out albums with release dates in the future. Skipped albums are not marked as processed — they will be picked up on the first scan after their release date passes.	2 months ago
Broque Thomas	4e4f258d25	Reduce watchlist Spotify API calls ~90% + configurable rate interval Addresses all three points from community rate-limiting report: 1. Watchlist scans fetched ALL albums then filtered — 262 albums = 27 API calls per artist. Now determines upfront if full discography is needed: subsequent scans and time-bounded lookbacks use max_pages=1 (1 API call). Only "full discography" global setting fetches all. 2. MIN_API_INTERVAL (350ms) now configurable via spotify.min_api_interval setting. Users who get rate-limited frequently can increase the delay. Floor at 100ms to prevent abuse. 3. Retry-After header extraction improved: added diagnostic logging when headers exist but lack Retry-After key, plus regex fallback to parse the value from the error message string.	2 months ago
Broque Thomas	82f9b84e5b	Add Discogs to watchlist — column, backfill, matching - Add discogs_artist_id column to watchlist_artists table (migration) - Add discogs_artist_id to WatchlistArtist dataclass - Add to get_watchlist_artists optional_columns and constructor - Add update_watchlist_discogs_id DB method - Backfill loop includes Discogs when token is configured - Add _match_to_discogs for cross-provider artist matching - Backfill maps updated: id_attr, match_fn, update_fn all include discogs	2 months ago
Broque Thomas	f6b0bd30e3	Backfill all metadata source IDs at start of every watchlist scan - Was only backfilling the active provider — artists added via Deezer never got Spotify/iTunes IDs, and vice versa - Now backfills iTunes (always), Deezer (always), and Spotify (if authenticated) at the start of every scan - Added _match_to_deezer() and update_watchlist_deezer_id() for Deezer cross-provider matching - Generalized backfill with provider→attribute/function maps	2 months ago
Broque Thomas	e42fe995d3	Throttle Spotify pagination and harden watchlist scanner against rate limits - Add rate limiting to all 4 Spotify pagination loops (get_artist_albums, get_user_playlists, get_playlist_tracks, get_album_tracks) — these called sp.next() bypassing the rate_limited decorator entirely, causing unthrottled API calls that triggered 429 bans - Track pagination calls in API rate monitor (separate endpoint names) - Increase DELAY_BETWEEN_ARTISTS from 2s to 4s in watchlist scanner - Abort watchlist scan immediately if Spotify rate limit detected mid-scan instead of continuing to hammer the API	2 months ago
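
The pagination throttling described in commit e42fe995d3 can be illustrated with a minimal sketch. It is not the project's actual code: `Throttle`, `iter_pages`, and the page dictionaries with a `next` key are hypothetical names chosen for this example, and the 0.35 s interval only mirrors the 350 ms figure mentioned in commit 4e4f258d25. The idea is that every follow-up page request passes through the same minimum-interval gate, so a pagination loop cannot bypass rate limiting.

```python
import time
from typing import Callable, Iterator, Optional


class Throttle:
    """Enforces a minimum gap between successive calls (hypothetical helper)."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous call: sleep for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def iter_pages(first_page: dict,
               fetch_next: Callable[[dict], dict],
               throttle: Throttle) -> Iterator[dict]:
    """Yield each page, throttling before every follow-up request."""
    page = first_page
    while page is not None:
        yield page
        if not page.get("next"):
            break  # no further pages
        throttle.wait()  # assumption: all pagination calls share one throttle
        page = fetch_next(page)
```

In a real client, `fetch_next` would wrap whatever call retrieves the next page; here it is left as a parameter so the sketch stays independent of any particular library.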


98 Commits (e0e31079e6606e4e4dbdbe9b2da75e0ab14dbf21)
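
The placeholder-tracklist rule from commit a7877e6e0b (skip an album when more than half of its tracks look like "Track 1", "Track 2", and so on) can be sketched as follows. The regular expression and the function name `has_placeholder_tracklist` are assumptions made for illustration; the commit message does not state the exact pattern the scanner uses.

```python
import re
from typing import Sequence

# Assumed pattern: "Track" followed by a number, ignoring case and whitespace.
PLACEHOLDER_RE = re.compile(r"^\s*track\s*\d+\s*$", re.IGNORECASE)


def has_placeholder_tracklist(track_names: Sequence[str]) -> bool:
    """Return True when more than half of the names match the placeholder pattern."""
    if not track_names:
        return False
    placeholders = sum(1 for name in track_names if PLACEHOLDER_RE.match(name))
    return placeholders > len(track_names) / 2
```

With this rule, an album listing "Track 1", "Track 2", "Intro" would be skipped (two of three names match), while one listing "Track 1" and "Song" would not, since exactly half is not more than half.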