SoulSync

Broque Thomas d75ae48981 Discover: sharpen track selection (diversity, source-aware popularity, library dedup, SQL genre)

Four selection-quality fixes on the SoulSync-made discover playlists. None change public method signatures; all are tightenings on what's already there.

(1) Diversity for Hidden Gems + Discovery Shuffle

Both used to be `RANDOM() LIMIT N` with no diversity. Could return 50 tracks from one artist or 20 from one album if the discovery pool happened to be skewed. Both now over-fetch 3x and run the existing `_apply_diversity_filter`:

- Hidden Gems: max 2 per album, 3 per artist
- Discovery Shuffle: max 2 per album, 2 per artist (tighter: shuffle should feel maximally varied)

(2) Source-aware popularity thresholds

`popularity >= 60` for "Popular Picks" and `popularity < 40` for "Hidden Gems" was Spotify-shaped (0-100 scale). Deezer writes its `rank` value into that column (often six-digit integers); iTunes writes nothing meaningful. For Deezer-primary users:

- Popular Picks pulled essentially everything (rank >= 60 = all)
- Hidden Gems pulled essentially nothing (rank < 40 = none)

New `_get_popularity_thresholds(source)` helper returns per-source values:

- Spotify: (60, 40), the existing 0-100 scale
- Deezer: (500_000, 100_000), ballpark from real rank values
- iTunes / unknown: (None, None), skip the popularity filter entirely, fall back to random + diversity

`get_popular_picks` and `get_hidden_gems` now consult the helper. When threshold is None they skip the popularity SQL filter. Diversity + ID gate still apply.

(3) Push genre keyword filter into SQL

`get_genre_playlist` used to fetch `limit=1_000_000` rows into Python then run a substring keyword filter on `artist_genres`. Bad on big discovery pools. Now the keyword OR chain is generated as SQL placeholders:

    AND (artist_genres LIKE ? OR artist_genres LIKE ? OR ...)

Each placeholder gets `f'%{keyword.lower()}%'` via `extra_params`. `fetch_limit` drops back to `limit * 10`. `_genre_matches` Python helper deleted (only intra-file caller; verified via grep). Parent-genre expansion via `GENRE_MAPPING` preserved; the keywords list feeds the LIKE chain unchanged.

(4) Filter out tracks already in library

Discovery pool can include tracks the user already owns. Hidden Gems / Shuffle / Popular Picks shouldn't surface those. `_select_discovery_tracks` gained `exclude_owned: bool = True` parameter. When True, adds a correlated NOT EXISTS subquery against the `tracks` table covering all 3 source IDs:

    AND NOT EXISTS (
        SELECT 1 FROM tracks t
        WHERE (t.spotify_track_id IS NOT NULL
               AND t.spotify_track_id = discovery_pool.spotify_track_id)
           OR (t.itunes_track_id IS NOT NULL
               AND t.itunes_track_id = discovery_pool.itunes_track_id)
           OR (t.deezer_id IS NOT NULL
               AND t.deezer_id = discovery_pool.deezer_track_id)
    )

Note column-name asymmetry: tracks.deezer_id vs discovery_pool.deezer_track_id. Inline comment marks the trap.

All 5 public discovery methods automatically benefit (default True). Seasonal Playlist doesn't go through the helper so it's unaffected (curated content, dedup is wrong intent there).

Tests

12 new tests in `tests/test_personalized_playlists_id_gate.py` (27 total in the file):

- Hidden Gems + Discovery Shuffle apply diversity (cap proven by inserting 10 same-artist + same-album rows and asserting return count ≤ per-album cap)
- Popularity thresholds: Spotify (60, 40), Deezer larger scale, iTunes None / None
- Popular Picks skips threshold filter when None
- Genre playlist pushes filter to SQL (parent + child genre expansion)
- Owned-track exclusion: filtered when match, kept when no match, opt-out flag works
- Deezer column-name asymmetry pinned (regression footgun)

Test fixture re-added the minimal `tracks` table (4 columns: id, spotify_track_id, itunes_track_id, deezer_id), only what the new NOT EXISTS subquery needs to join. Plus `insert_library_track` helper.
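
The two SQL changes in (3) and (4) can be illustrated with a small, self-contained sketch against an in-memory SQLite database. This is not the code from `core/personalized_playlists.py`: the function names `genre_clause`, `select_tracks`, and `demo`, the `track_name` column, and the deterministic `ORDER BY` (the commit describes random ordering) are assumptions made for illustration. Only the LIKE chain, the NOT EXISTS subquery, and the ID column names come from the commit message.

```python
import sqlite3

# Correlated subquery from the commit message. Note the asymmetry:
# tracks.deezer_id vs discovery_pool.deezer_track_id.
OWNED_CLAUSE = """
AND NOT EXISTS (
    SELECT 1 FROM tracks t
    WHERE (t.spotify_track_id IS NOT NULL
           AND t.spotify_track_id = discovery_pool.spotify_track_id)
       OR (t.itunes_track_id IS NOT NULL
           AND t.itunes_track_id = discovery_pool.itunes_track_id)
       OR (t.deezer_id IS NOT NULL
           AND t.deezer_id = discovery_pool.deezer_track_id)
)
"""


def genre_clause(keywords):
    """Build the OR chain of LIKE placeholders plus its parameters."""
    if not keywords:
        return "", []
    chain = " OR ".join("artist_genres LIKE ?" for _ in keywords)
    return f"AND ({chain})", [f"%{kw.lower()}%" for kw in keywords]


def select_tracks(conn, keywords, limit, exclude_owned=True):
    """Hypothetical stand-in for the selection helper (illustration only)."""
    clause, params = genre_clause(keywords)
    sql = "SELECT track_name FROM discovery_pool WHERE 1=1 " + clause
    if exclude_owned:
        sql += OWNED_CLAUSE
    # Assumption: ordered by name here so the result is reproducible.
    sql += " ORDER BY track_name LIMIT ?"
    return [row[0] for row in conn.execute(sql, params + [limit])]


def demo():
    """Create a tiny pool and a library that owns one Deezer track."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE discovery_pool (
            id INTEGER PRIMARY KEY, track_name TEXT, artist_genres TEXT,
            spotify_track_id TEXT, itunes_track_id TEXT, deezer_track_id TEXT);
        CREATE TABLE tracks (
            id INTEGER PRIMARY KEY, spotify_track_id TEXT,
            itunes_track_id TEXT, deezer_id TEXT);
    """)
    conn.executemany(
        "INSERT INTO discovery_pool (track_name, artist_genres, "
        "spotify_track_id, itunes_track_id, deezer_track_id) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("A", "Indie Rock", "sp1", None, None),
            ("B", "hard rock", None, None, "dz2"),
            ("C", "jazz", "sp3", None, None),
            ("D", "rock", None, None, None),
        ],
    )
    conn.execute("INSERT INTO tracks (deezer_id) VALUES ('dz2')")
    return conn
```

In this sketch, track B matches the genre keyword but is dropped because the library already holds its Deezer ID, while track D (no IDs at all) is kept: a NULL comparison never satisfies the subquery. Passing `exclude_owned=False` brings B back.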
Verification

- 27/27 in this test file pass (15 prior + 12 new)
- 2232/2232 full suite green
- ruff clean

LOC delta:

- core/personalized_playlists.py: 1030 → 1101 (+71)
- tests/test_personalized_playlists_id_gate.py: 352 → 616 (+264)

3 weeks ago
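
The over-fetch-then-cap approach from (1) and the per-source thresholds from (2) can be sketched roughly as follows. The real `_apply_diversity_filter` and `_get_popularity_thresholds` are not shown in the commit message, so the bodies below, the dict-shaped track rows, and the `hidden_gems_sketch` wrapper are assumptions. Only the caps, the 3x over-fetch, and the threshold values are taken from the message.

```python
from collections import Counter


def get_popularity_thresholds(source):
    """Return (popular_min, hidden_max) for a source, or (None, None).

    Values are the ones quoted in the commit message; the lookup-table
    shape is an assumption.
    """
    table = {"spotify": (60, 40), "deezer": (500_000, 100_000)}
    return table.get((source or "").lower(), (None, None))


def apply_diversity_filter(tracks, max_per_album, max_per_artist, limit):
    """Keep tracks in order, skipping any that exceed a per-album or
    per-artist cap, until `limit` tracks are collected."""
    per_album, per_artist = Counter(), Counter()
    kept = []
    for track in tracks:
        if per_album[track["album"]] >= max_per_album:
            continue
        if per_artist[track["artist"]] >= max_per_artist:
            continue
        per_album[track["album"]] += 1
        per_artist[track["artist"]] += 1
        kept.append(track)
        if len(kept) == limit:
            break
    return kept


def hidden_gems_sketch(shuffled_pool, limit):
    """Hypothetical wrapper: over-fetch 3x, then apply the Hidden Gems
    caps (max 2 per album, 3 per artist)."""
    # The slice stands in for `ORDER BY RANDOM() LIMIT limit * 3`.
    candidates = shuffled_pool[: limit * 3]
    return apply_diversity_filter(candidates, 2, 3, limit)
```

With a pool of ten rows from one artist and album plus one row from another artist, `hidden_gems_sketch(pool, 5)` returns three tracks: two from the dominant album (the album cap binds first) and the single outlier. A `(None, None)` threshold pair signals the caller to skip the popularity filter and rely on random selection plus these caps.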
artists	Surface silent exceptions in core/artists — 23 sites	3 weeks ago
automation	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
discovery	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
download_engine	Add module logger + surface silent exceptions in 7 logger-less files — 12 sites	3 weeks ago
download_plugins	Address Copilot doc-drift review	3 weeks ago
downloads	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
enrichment	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
imports	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
library	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
media_server	Final review-pass nits — class docstring, dead branch, dead imports, boot resilience	3 weeks ago
metadata	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
playlists	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
repair_jobs	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
search	Cin-6: Rename soulseek_client global → download_orchestrator	3 weeks ago
stats	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
streaming	Cin-6: Rename soulseek_client global → download_orchestrator	3 weeks ago
watchlist	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
wishlist	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
workers	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
acoustid_client.py	Clean up 286 ruff lint errors to unblock CI and fix 10 latent bugs	1 month ago
acoustid_verification.py	Reject AcoustID matches whose version disagrees with the expected track	3 weeks ago
album_consistency.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
api_call_tracker.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
artist_source_detail.py	Move metadata helpers into package modules	4 weeks ago
artist_source_lookup.py	Make artist_name Optional in find_library_artist_for_source	1 month ago
audiodb_client.py	Add API Rate Monitor dashboard with real-time speedometer gauges	2 months ago
audiodb_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
auto_import_worker.py	Surface silent exceptions in import pipeline — 11 sites	3 weeks ago
automation_engine.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
connection_detect.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
connection_test.py	Cin-6: Rename soulseek_client global → download_orchestrator	3 weeks ago
database_update_worker.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
debug_info.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
deezer_client.py	Add download buttons + bulk action to artist top-tracks sidebar	3 weeks ago
deezer_download_client.py	Surface silent exceptions in metadata clients — 37 sites	3 weeks ago
deezer_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
discogs_client.py	Surface silent exceptions in metadata clients — 37 sites	3 weeks ago
discogs_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
download_orchestrator.py	Lift shared download dataclasses + boot via singleton factory	3 weeks ago
genius_client.py	Add API Rate Monitor dashboard with real-time speedometer gauges	2 months ago
genius_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
genre_filter.py	Expand default genre whitelist from 223 to 272 genres	1 month ago
hifi_client.py	Surface engine-not-wired errors + exclude soulseek from monitor aggregation	3 weeks ago
hydrabase_client.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
hydrabase_worker.py	Improve graceful shutdown and rollback safety	1 month ago
itunes_client.py	Surface silent exceptions in metadata clients — 37 sites	3 weeks ago
itunes_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
jellyfin_client.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
lastfm_client.py	Add Last.fm Track Radio to Discover page	1 month ago
lastfm_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
library_reorganize.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
lidarr_download_client.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
listenbrainz_client.py	Remove emojis from all Python log and print statements	1 month ago
listenbrainz_manager.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
listening_stats_worker.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
lyrics_client.py	Remove emojis from all Python log and print statements	1 month ago
matching_engine.py	Merge remote-tracking branch 'origin/dev' into refactor/media-server-engine	3 weeks ago
metadata_service.py	Move profile Spotify cache into registry	4 weeks ago
musicbrainz_client.py	MusicBrainz: Resolve release-group MBIDs to a release on album click	1 month ago
musicbrainz_search.py	MusicBrainz: Dedupe same-named homonyms in artist search results	1 month ago
musicbrainz_service.py	Add metadata cache maintenance and health monitoring	2 months ago
musicbrainz_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
navidrome_client.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
personalized_playlists.py	Discover: sharpen track selection (diversity, source-aware popularity, library dedup, SQL genre)	3 weeks ago
plex_client.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
qobuz_client.py	Address Copilot doc-drift review	3 weeks ago
qobuz_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
reorganize_queue.py	Reorganize queue: race + dedupe fixes from kettui review	4 weeks ago
reorganize_runner.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
repair_worker.py	Surface silent exceptions in repair_worker — 16 sites	3 weeks ago
replaygain.py	Add module logger + surface silent exceptions in 7 logger-less files — 12 sites	3 weeks ago
runtime_state.py	Add module logger + surface silent exceptions in 7 logger-less files — 12 sites	3 weeks ago
seasonal_discovery.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
socketio_cors.py	Socket.IO CORS: handle self-review nits	4 weeks ago
soulid_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
soulseek_client.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
soulsync_client.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
soundcloud_client.py	Surface engine-not-wired errors + exclude soulseek from monitor aggregation	3 weeks ago
spotify_client.py	Add download buttons + bulk action to artist top-tracks sidebar	3 weeks ago
spotify_public_scraper.py	Add Spotify Link tab for public playlist/album scraping without API credentials	2 months ago
spotify_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
tag_writer.py	Surface silent exceptions across remaining modules — ~70 sites	3 weeks ago
tidal_client.py	Surface silent exceptions in metadata clients — 37 sites	3 weeks ago
tidal_download_client.py	Surface engine-not-wired errors + exclude soulseek from monitor aggregation	3 weeks ago
tidal_worker.py	Surface silent exceptions in workers + repair jobs — ~30 sites	3 weeks ago
watchlist_scanner.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago
web_scan_manager.py	MS Cin-5: Drop per-server globals — engine owns the clients	3 weeks ago
wishlist_service.py	Extract wishlist logic into dedicated package	4 weeks ago
worker_utils.py	Fix Album Completeness job reporting zero findings for every album	1 month ago
youtube_client.py	Final silent-exception sweep + ruff S110 lint guardrail — ~45 sites	3 weeks ago