Four selection-quality fixes on the SoulSync-made discover playlists.
None change public method signatures; all are tightenings on what's
already there.
(1) Diversity for Hidden Gems + Discovery Shuffle
Both used to be `RANDOM() LIMIT N` with no diversity. Could return
50 tracks from one artist or 20 from one album if the discovery
pool happened to be skewed. Both now over-fetch 3x and run the
existing `_apply_diversity_filter`:
- Hidden Gems: max 2 per album, 3 per artist
- Discovery Shuffle: max 2 per album, 2 per artist (tighter — shuffle
should feel maximally varied)
(2) Source-aware popularity thresholds
`popularity >= 60` for "Popular Picks" and `popularity < 40` for
"Hidden Gems" was Spotify-shaped (0-100 scale). Deezer writes its
`rank` value into that column (often six-digit integers); iTunes
writes nothing meaningful. For Deezer-primary users:
- Popular Picks pulled essentially everything (rank >= 60 = all)
- Hidden Gems pulled essentially nothing (rank < 40 = none)
New `_get_popularity_thresholds(source)` helper returns per-source
values:
- Spotify: (60, 40) — the existing 0-100 scale
- Deezer: (500_000, 100_000) — ballpark from real rank values
- iTunes / unknown: (None, None) — skip the popularity filter
entirely, fall back to random + diversity
`get_popular_picks` and `get_hidden_gems` now consult the helper.
When threshold is None they skip the popularity SQL filter. Diversity
+ ID gate still apply.
(3) Push genre keyword filter into SQL
`get_genre_playlist` used to fetch `limit=1_000_000` rows into Python
then run a substring keyword filter on `artist_genres`. Bad on big
discovery pools.
Now the keyword OR chain is generated as SQL placeholders:
AND (artist_genres LIKE ? OR artist_genres LIKE ? OR ...)
Each placeholder gets `f'%{keyword.lower()}%'` via `extra_params`.
`fetch_limit` drops back to `limit * 10`. `_genre_matches` Python
helper deleted (only intra-file caller; verified via grep).
Parent-genre expansion via `GENRE_MAPPING` preserved — keywords list
feeds the LIKE chain unchanged.
(4) Filter out tracks already in library
Discovery pool can include tracks the user already owns. Hidden Gems
/ Shuffle / Popular Picks shouldn't surface those.
`_select_discovery_tracks` gained `exclude_owned: bool = True`
parameter. When True, adds a correlated NOT EXISTS subquery against
the `tracks` table covering all 3 source IDs:
AND NOT EXISTS (
SELECT 1 FROM tracks t WHERE
(t.spotify_track_id IS NOT NULL AND t.spotify_track_id = discovery_pool.spotify_track_id)
OR (t.itunes_track_id IS NOT NULL AND t.itunes_track_id = discovery_pool.itunes_track_id)
OR (t.deezer_id IS NOT NULL AND t.deezer_id = discovery_pool.deezer_track_id)
)
Note column-name asymmetry: tracks.deezer_id vs
discovery_pool.deezer_track_id. Inline comment marks the trap. All
5 public discovery methods automatically benefit (default True).
Seasonal Playlist doesn't go through the helper so it's unaffected
(curated content, dedup is wrong intent there).
Tests
12 new tests in `tests/test_personalized_playlists_id_gate.py` (27
total in the file):
- Hidden Gems + Discovery Shuffle apply diversity (cap proven by
inserting 10 same-artist + same-album rows and asserting return
count ≤ per-album cap)
- Popularity thresholds: Spotify (60, 40), Deezer larger scale,
iTunes None / None
- Popular Picks skips threshold filter when None
- Genre playlist pushes filter to SQL (parent + child genre expansion)
- Owned-track exclusion: filtered when match, kept when no match,
opt-out flag works
- Deezer column-name asymmetry pinned (regression footgun)
Test fixture re-added the minimal `tracks` table (4 columns: id,
spotify_track_id, itunes_track_id, deezer_id) — only what the new
NOT EXISTS subquery needs to join. Plus `insert_library_track`
helper.
Verification
- 27/27 in this test file pass (15 prior + 12 new)
- 2232/2232 full suite green
- ruff clean
LOC delta:
- core/personalized_playlists.py: 1030 → 1101 (+71)
- tests/test_personalized_playlists_id_gate.py: 352 → 616 (+264)
// --- post-2.4.2 dev work — entries hidden by _getLatestWhatsNewVersion until the build version bumps ---
{date:'Unreleased — 2.4.3 dev cycle'},
{title:'Discover: Sharper Track Selection (Diversity, Source-Aware Popularity, Library Dedup, SQL Genre Filter)',desc:'four selection-quality fixes on the soulsync-made discover playlists. (1) hidden gems and discovery shuffle had no diversity caps — they could return 50 tracks from the same artist or 20 from one album. now both apply the existing `_apply_diversity_filter` (over-fetch 3x then enforce per-album/per-artist caps; shuffle uses tighter caps because it should feel maximally varied). (2) `popularity` thresholds were spotify-shaped (0-100 scale, popular >= 60 / hidden < 40), but deezer writes its rank value into that column (often six-digit integers) and itunes writes nothing meaningful. for deezer-primary users this meant popular picks pulled essentially everything and hidden gems pulled nothing. new `_get_popularity_thresholds(source)` returns per-source values: spotify (60, 40), deezer (500_000, 100_000) ballpark, itunes/other (None, None) which skips the popularity filter entirely and falls back to random + diversity. (3) `get_genre_playlist` used to load up to 1M discovery_pool rows into python and run a substring keyword filter on the json column. now the keyword OR chain pushes down into sql via `(artist_genres LIKE ? OR ...)` placeholders, fetch_limit drops to limit*10. parent-genre expansion via GENRE_MAPPING preserved. (4) discovery selectors now exclude tracks the user already owns — `_select_discovery_tracks` gained `exclude_owned: bool = True` (default on) which adds a `NOT EXISTS (SELECT FROM tracks WHERE source_id matches)` correlated subquery covering the spotify/itunes/deezer-id columns (with the deezer-column-name asymmetry handled inline: discovery_pool.deezer_track_id vs tracks.deezer_id). hidden gems / shuffle / popular picks / decade / genre browser all benefit automatically. 12 new tests (27 total in the file): diversity caps, source-aware threshold values, threshold-skip behavior, sql-pushed genre filter, parent-genre expansion, owned-track exclusion, opt-out flag, and the deezer-column-name asymmetry trap. 2232/2232 full suite green.',page:'discover'},
{title:'Discover: Stop Showing Undownloadable Tracks (+Lift +Cleanup)',desc:'audit found multiple discover-page sections (hidden gems / discovery shuffle / popular picks / decade browser / genre browser) had no `WHERE (spotify_track_id IS NOT NULL OR itunes_track_id IS NOT NULL OR ...)` gate on their selection sql. tracks with no source ids in the discovery pool were getting displayed, the user would click download, and the download would silently fail because there was nothing to look up. fix: lifted all five discovery_pool selection methods into shared private helpers (`_select_discovery_tracks`, `_apply_diversity_filter`, `_compute_adaptive_diversity_limits`) on `PersonalizedPlaylistsService`. mandatory id-validity gate is hard-coded into the selector — no opt-out flag, every public method inherits it for free. behavior preserved: same diversity tiers, same over-fetch multipliers, same popularity thresholds, same blacklist filter. ~314 lines of repeated select/diversity boilerplate collapsed across the 5 methods (-55% on those methods\' business logic). also deleted four sections that had been stubbed returning [] for ages (recently added / top tracks / forgotten favorites / familiar favorites) — frontend, backend endpoints, html blocks, helper docs, all gone. on the frontend, lifted the duplicated decade-browser + genre-browser tab management (~314 lines of identical fetch-tabs / render-tabstrip / fetch-content / render-tracklist / wire-sync-button code) into one shared `createTabbedBrowserSection(config)` helper. each browser is now a thin wrapper: ~3 lines per public function. 14 new tests pin the gate (every selector filters null-id rows), the diversity caps, the adaptive limit tiers, the source filter, and the blacklist filter.',page:'discover'},
{title:'Internal: Discover Controller — Cin Pre-Review Polish',desc:'tightened the controller before opening the PR. (1) dropped the magic `extractItems` defaults — controller used to auto-pull `data.items` / `data.albums` / `data.artists` / `data.tracks` / `data.results` if no extractor was provided. removed the fallback chain. each section now MUST supply its own `extractItems(data) => array` callback. cin standard: explicit > implicit; the auto-fallback could silently grab the wrong key on endpoints that return multiple arrays. validated at register-time so misuse fails immediately. all 10 existing call sites already had explicit extractors so no migration churn. (2) replaced the `renderItems` returning null convention (used by Your Albums + manualDom-style sections) with an explicit `manualDom: true` config flag. clearer intent at the call site, less likely to be confused with a renderer error. (3) added a minimal node `--test` JS test file at `tests/static/test_discover_section_controller.mjs` — 32 tests pin the lifecycle contract: config validation (every required field), happy-path fetch+render, empty/stale/error states, no-fetch `data:` mode, manualDom mode, callable `fetchUrl`, load coalescing, refresh bypass, hook error containment, error toasts. runs via `node --test tests/static/` directly, OR via the regular pytest sweep (`tests/test_discover_section_controller_js.py` shells out to node and asserts a clean exit). skipped gracefully when node isn\'t available or is < 22. closes the "controller is a contract, pin it at the test boundary" gap that cin would have flagged. 2205/2205 full suite green (was 2204 + 1 new pytest wrapper); 32/32 node --test pass; ruff clean; js parses clean.',page:'discover'},
{title:'Internal: Discover Cleanup Round — Toast Errors, Stale State, Skipped Sections',desc:'follow-up to the controller migration. extended `createDiscoverSectionController` with the hooks the per-section migrations surfaced as needed: callable `fetchUrl` (resolves the seasonal-playlist recreate-on-key-change hack), no-fetch `data:` mode (lets render-only sections like seasonal albums use the controller without inventing a fake endpoint), `beforeLoad` hook (lets dynamically-inserted sections like because-you-listen-to ensure their container exists before the spinner shows), `onSuccess(data)` hook (cleaner home for sibling header / subtitle / button updates than folding them into renderItems), and an `isStale` / `onStale` / `renderStale` triple for the third render state (data is empty BUT upstream is still discovering — show updating UI + start a poller, instead of the bare empty-state copy). turned on `showErrorToast: true` for every migrated section — section load failures now surface a global toast instead of silently spinning forever or swallowing into console.debug. that\'s the JohnBaumb #369 pattern applied at the UI layer. migrated the two sections that didn\'t fit the original controller contract: `loadYourAlbums` (uses isStale/onStale for stale-fetch UI + onSuccess for subtitle/filters/download-button side-effects + renderItems returning null since it delegates to the existing grid renderer) and `loadSeasonalAlbums` (uses no-fetch data mode since the parent `loadSeasonalContent` already fetched the season payload). also lifted the duplicated decade-tab + genre-tab sync-status block (✓/⏳/✗/percentage) into a `_renderSyncStatusBlock(idPrefix)` helper — two call sites now share one implementation. listenbrainz playlists keep their own block because the semantics differ (matching progress vs download progress). audit found the 13 supposedly-dead hidden sections aren\'t dead at all — they\'re gated on user data (discovery pool, library content, metadata cache) and self-surface when their data exists. removed one orphaned `loadPersonalizedDailyMixes()` call from `blockDiscoveryArtist` — daily mixes is intentionally paused, refreshing it from there was a no-op.',page:'discover'},