Followup to the enrichment-bubble registry consolidation. The
dashboard polling + click handlers all hit
/api/enrichment/<service>/{status,pause,resume} now, so the 30
hand-rolled per-service routes in web_server.py have zero callers
and can come out:
/api/musicbrainz/{status,pause,resume}
/api/audiodb/{status,pause,resume}
/api/discogs/{status,pause,resume}
/api/deezer/{status,pause,resume}
/api/spotify-enrichment/{status,pause,resume}
/api/itunes-enrichment/{status,pause,resume}
/api/lastfm-enrichment/{status,pause,resume}
/api/genius-enrichment/{status,pause,resume}
/api/tidal-enrichment/{status,pause,resume}
/api/qobuz-enrichment/{status,pause,resume}
Worker init blocks stay (they still construct the workers + persist
pause state). Section comment headers are preserved with a one-line
note pointing readers at the new generic blueprint.
Test fixtures in tests/conftest.py and
tests/metadata/test_enrichment_events.py also updated to use the
new URL paths so they reflect production reality. They were
synthetic stubs that never depended on the production routes —
purely cosmetic alignment.
Net: ~510 lines deleted from web_server.py. Full pytest 1541
passed; ruff clean.
The dashboard's enrichment-status bubbles (MusicBrainz, AudioDB,
Discogs, Deezer, Spotify, iTunes, Last.fm, Genius, Tidal, Qobuz) each
had its own copy-pasted /status, /pause, /resume route in web_server.py
— 30 routes that differed only in the worker reference and a couple
of per-service quirks (Spotify's rate-limit guard, Last.fm/Genius
yield-override behavior, Tidal/Qobuz extra status fields).
Replace them with a registry-driven blueprint:
- core/enrichment/services.py declares an EnrichmentService dataclass
with worker_getter, config_paused_key, pre_resume_check,
auto_pause_token, and extra_status_defaults — all variation captured
as data, no branching on service id.
- core/enrichment/api.py exposes a Flask blueprint with three routes
(/api/enrichment/<service>/{status,pause,resume}). Per-service
quirks are honored via the descriptor: Spotify's rate-limit ban
still returns 429 with `rate_limited: true`, Last.fm/Genius still
drop the auto-pause token and add the yield override, Tidal/Qobuz
still merge `authenticated: false` into the fallback payload.
- web_server.py registers all 10 services after their workers
initialize, wires the host-side hooks (config_manager.set,
_download_auto_paused.discard, _download_yield_override.add), and
registers the blueprint.
- webui/static/enrichment.js polling + click handlers now hit the
generic endpoints. The per-service `update<Service>StatusFromData`
functions are unchanged — they still process the same payload.
This is the cutover step. Old per-service routes are intentionally
left in place as a fallback during the soak period — they currently
have zero callers in the codebase and will be deleted in a follow-up
patch once production has run on the new pipeline for a few days.
27 new tests in tests/test_enrichment_services.py cover the registry
behavior + every quirk path through the generic blueprint (rate-limit
guard, auto-pause token cleanup, persisted-pause config keys, extra
default fields, worker-not-initialized fallback, exceptions). Full
suite 1541 passed; ruff clean.
Artist-detail is a "pseudo-page" reachable from Library, the unified
Search page, and the global search popover. It has no [data-page]
match in the sidebar, so navigateToPage's bulk-active-removal left
every nav button unhighlighted while the user was viewing an artist —
the sidebar offered no visual anchor for where they were.
Now:
- navigateToPage('artist-detail') falls back to highlighting the
Library button when no [data-page] match exists, anchoring the
sidebar to the canonical home for artist detail views.
- A new _updateSidebarLibraryBreadcrumb() helper rewrites the Library
button label between plain "Library" and a "Library / Artist Name"
breadcrumb based on currentPage + artistDetailPageState. Long names
(>14 chars) truncate with an ellipsis; the full name shows on hover
via the title attribute.
- Called from navigateToPage (entering / leaving the page) and from
loadArtistDetailData (covers same-page artist switches in the
similar-artist chain where currentPage stays 'artist-detail').
CSS adds .nav-text-root / .nav-text-sep / .nav-text-context selectors
so the "Library" anchor word stays visually dominant while the
separator and artist name dim to a secondary tier — readable but not
competing for attention.
Pure visual change. No backend touched. No new tests (DOM-only).
Patch release wrapping up the 2.4.1 dev cycle. Highlights:
- Watchlist no longer re-downloads compilation/soundtrack tracks
(#458 dedup orphan cleanup + the album-match fix work in tandem
to stop the loop).
- Duplicate detector catches slskd dedup orphans via a second
filename-bucket pass.
- Beatport tab hidden temporarily — Cloudflare Turnstile blocks the
scraper and the official OAuth API is closed to public devs.
- Service worker for cover art + installable PWA manifest.
- Browser caching for static assets (1y) and discover pages (5min).
- Socket.IO same-origin default + admin-only /api/settings.
Files updated:
- web_server.py: _SOULSYNC_BASE_VERSION 2.4.0 -> 2.4.1
- webui/index.html: sidebar version button + modal subtitle
- webui/static/helper.js: WHATS_NEW dev-cycle marker -> release date,
fallback version in _getLatestWhatsNewVersion, 8 new
VERSION_MODAL_SECTIONS entries promoted from this cycle
- .github/workflows/docker-publish.yml: workflow_dispatch default
version_tag updated to 2.4.1
Self-review of the previous commit found a real false-positive risk in
the new filename-bucket pass: two unrelated songs that happen to share
a canonical filename (e.g. ``Yellow.mp3`` by Coldplay vs by some other
artist) would be grouped because all metadata gates were dropped.
The filename pass now layers a safety net under ``require_metadata_match=False``:
- If both rows carry a duration: must agree within 3 seconds. Same
source download = identical duration; a 3+ second gap means
different recordings.
- Else if both rows carry an artist: relaxed 0.6 similarity check —
catches dedup orphans that share an artist tag while rejecting
strangers-with-same-filename.
- Else (no duration AND at least one artist blank): skip — too little
signal to safely group.
5 additional regression tests cover the false-positive prevention
paths plus the genuine dedup-orphan scenarios that must still be
caught after the safety net.
Two related bugs reported on Discord by Mushy.
1. The watchlist re-downloaded the same OST track up to 7 times.
``is_track_missing_from_library`` compared Spotify's album name and
the media-server scan's album name with a raw SequenceMatcher at a
strict 0.85 threshold. Compilations and soundtracks routinely fail
this — Spotify reports
``"Napoleon Dynamite (Music From The Motion Picture)"`` while the
Plex / Navidrome / Jellyfin tag scan saves it as
``"Napoleon Dynamite OST"``. Raw similarity ≈ 0.49, so the scanner
declared the track missing on every 30-minute scan and added it back
to the wishlist. The wishlist then issued a fresh download. slskd
appended ``_<19-digit-ns-timestamp>`` to each new copy because the
target file already existed, and the user ended up with seven copies
of one song in one folder.
Fix: extract two pure helpers — ``_normalize_album_for_match``
strips qualifier parentheticals (Music From X, OST, Deluxe Edition,
Remastered, Anniversary, etc.) and trailing dash-clauses;
``_albums_likely_match`` checks equality after normalization,
substring containment, and a relaxed 0.6 fuzzy ratio. A volume /
part / disc / standalone-trailing-number guard rejects pairs like
``"Greatest Hits Vol. 1"`` vs ``"Greatest Hits Vol. 2"`` so the
relaxed threshold doesn't introduce false positives on serialized
releases. After this change the Napoleon Dynamite case collapses
to ``"napoleon dynamite" == "napoleon dynamite"`` via the equality
short-circuit and the redownload loop dies.
2. The duplicate detector found only one of the seven dupe files.
The detector buckets tracks by the first 4 chars of their normalized
tag title. Files written by slskd directly into a library folder
often get inconsistent (or blank) tags from the media-server rescan,
so the seven copies were bucketed apart by parsed title and never
compared.
Fix: refactor the per-bucket comparison into ``_scan_bucket``, then
add a second pass — ``_build_filename_buckets`` re-buckets leftover
tracks by canonical filename stem (slskd dedup tail stripped via
``_strip_slskd_dedup_suffix``, same regex the import-cleanup PR uses)
plus extension. Filename agreement is itself strong evidence the
files came from the same source download, so the second pass calls
``_scan_bucket`` with ``require_metadata_match=False`` to skip the
title / artist / cross-album gates. The same-physical-file guard
still runs so bind-mount duplicates aren't flagged.
72 new regression tests across two files cover the album-match
helpers (28 tests including the Napoleon Dynamite scenario, 7 volume
disagreements, 8 positive/negative pairs, 5 defensive cases) and the
new filename-bucket pass (16 tests across bucket construction, scan
integration, and existing title-pass behavior). Full pytest 1509
passed; ruff clean.
Reported by Mushy in Discord.
Beatport added Cloudflare Turnstile to every public page on
beatport.com. The unified scraper now receives bot-challenge HTML
instead of real content, so all /api/beatport/* endpoints return
500 with "Could not fetch Beatport homepage".
The official Beatport v4 API is locked behind OAuth application
registration that isn't open to the public — confirmed via the
docs at api.beatport.com/v4/docs and community projects
(beets-beatport4). The public docs SPA client_id only accepts
browser-based flows (post-message redirect URI), which can't be
driven server-side.
Hide the Beatport tab on the Sync page so users stop hitting the
broken endpoints. Backend routes and beatport_unified_scraper.py
stay in code — revival is a one-attribute HTML change once
Cloudflare relaxes or a workaround is found.
Reported via the homepage 500 spam in user logs.
slskd appends "_<19-digit unix-nanosecond timestamp>" to a downloaded
filename when the destination already contains a same-named file
(concurrent downloads of the same track, partial-file retries after a
connection drop, cancelled-then-redownloaded files, the same track
surfacing in multiple synced playlists). The file-finder code already
recognized the suffix when matching a download to its source — but
after the canonical file moved into the library, the leftover
"_<timestamp>" siblings sat orphaned in the downloads folder forever.
Reported on Discord by Shdjfgatdif.
cleanup_slskd_dedup_siblings() runs at the end of each successful
import (3 safe_move_file sites in pipeline.py) and prunes any
remaining siblings that strip down to the canonical stem with the
same extension. Conservative match (>= 18 trailing digits) keeps
legitimate filenames like "Track 5" and "Album 1995" untouched. Per-
file unlink failures are swallowed so a single locked file doesn't
block the rest.
17 regression tests cover the suffix-strip primitive, orphan removal,
no-op cases, mismatched extensions, subdirectories, and partial-failure
recovery.
Importing web_server fires utils.logging_config.setup_logging at
module-init, which clears + re-installs handlers on the shared
'soulsync' logger and pins its level to the user's configured
value. That mutation leaks across tests in the same pytest process.
This file runs alphabetically before test_library_reorganize_orchestrator,
so the leak broke test_watchdog_warns_about_stuck_workers downstream
— it relies on caplog capturing soulsync.library_reorganize warnings
via root-logger propagation, and the reconfigured logger's new
handler chain swallowed those records before they reached caplog
(caplog.records came back empty even though pytest's live-log
capture clearly showed the warning fired).
Adds an autouse fixture that snapshots the soulsync logger's
handlers, level, and propagate flag before each test in this file
and restores them afterwards. Pollution stays scoped to this file.
tests/test_tidal_auth_instructions.py also imports web_server but
runs alphabetically AFTER test_library_reorganize_orchestrator so
it never tripped this — fix is scoped here, not a project-wide
conftest, so we don't change behaviour for unrelated test files.
- Show Discogs with a lock icon until a personal access token is present.
- Prevent selecting locked Discogs and steer users to the Discogs settings section.
- Keep metadata-source availability and selection state synced as the token changes.
- Show Spotify with a lock icon when it is not currently selectable.
- Keep the explanation in the hover title instead of cluttering the dropdown label.
- Redirect users to the Spotify settings section when they try to pick a locked source.
- Drive the Spotify settings accordion from live auth state instead of treating it as configured/healthy when the session is missing.
- Reuse the existing yellow missing-state styling so unauthenticated Spotify is visually distinct from active Spotify.
- Keep the shared status refresh path updating the settings view immediately after auth changes.
- Return a distinct post-auth warning page when Spotify OAuth completes but the client still does not report an authenticated session.
- Send the completion signal back to the opener so the settings UI can refresh and show the warning state immediately.
- Keep the standalone callback server and the main Flask callback path aligned on the same result-page helper.
- Make the Spotify auth completion popup notify the opener across callback origins.
- Refresh service status in the settings UI after auth completes so the button flips to Disconnect immediately.
- Keep the standalone callback instruction page and the main app flow working with the same completion signal.
- Flatten the Spotify service-status rendering so it shows rate-limit and recovery states explicitly, while otherwise displaying the active metadata provider directly.
- Keep the Spotify auth controls and metadata-source picker aligned with the real session state after authenticate and disconnect flows.
- Return "Unmapped" for unknown metadata source labels instead of implying iTunes.
- Update the metadata registry tests to cover the new label fallback.
- Send Spotify auth completion back to the opener so the settings page refreshes immediately
- Make the local auth flow go straight through to Spotify instead of showing the temporary instruction page
- Keep the remote/docker instruction page available for manual callback setups
- Sync Spotify status, connect/disconnect buttons, and metadata source selection after auth and disconnect
- Keep the disconnect behavior aligned with the active primary metadata source
- Hide the auth button when a Spotify session is active
- Treat disconnect as a session change, not a provider swap
- Share metadata source labels in the registry
- Tighten rate-limit copy around Spotify-specific behavior
Pytest tears down its log file handles before atexit runs. Every
"Shutting down ..." line a worker emits while stopping then crashes
Python's logger with "I/O operation on closed file" and floods CI
stderr with --- Logging error --- traceback blocks. The CI sanity
check workflow noticed once tests started importing web_server (the
Tidal-auth integration test PR + this parallel-imports PR are the
first two test files that boot the full module).
Adds a tiny atexit handler that flips ``logging.raiseExceptions =
False`` BEFORE the other shutdown handlers run. atexit's LIFO order
makes "registered last" mean "runs first", so this fires ahead of
cleanup_monitor / _atexit_shutdown / _atexit_save_history and any
log calls those make can't bubble the closed-stream traceback.
The shutdown messages themselves are best-effort debug
breadcrumbs, not data we need to preserve at process exit, so
silencing the internal handler errors costs nothing.
Discord-reported (fresh.dumbledore + maintainer ack): the
/api/import/singles/process route iterated staging files through a
plain Python for loop. Per-file work is dominated by metadata
search round-trips (Spotify/iTunes/Deezer/Discogs), so a multi-
track manual import on a typical home network was painfully slow.
Adds a dedicated import_singles_executor (3 workers) alongside the
existing executor pool, and refactors the route to submit every
file at once and aggregate results via as_completed. Worker count
balances throughput against any single provider's per-source rate
limits — the same shape used by missing_download_executor.
Extracts the per-file pipeline into _process_single_import_file
which returns a typed (status, payload) outcome:
- ("ok", final_title) on success
- ("error", message) for missing/malformed input or pipeline failure
The worker wraps its own exceptions so a single bad file can't
crash the batch; the route adds a belt-and-suspenders try/except
around future.result() for any worker-level surprises.
Pipeline thread-safety verified: post_process_matched_download
already serializes per-file via post_process_locks (one lock per
context_key — and each import gets a unique UUID context_key), DB
writes serialize through SQLite's WAL + busy_timeout, metadata
registry uses RLocks, no bare module-level mutable state.
Adds 9 regression tests:
- 4 worker-contract tests (missing file, malformed match, pipeline
exception wrapping, happy-path return shape)
- 2 executor-config tests (worker count, thread name prefix)
- 1 integration test that proves the route actually parallelizes
by checking wall-clock duration is well under sequential cost
- 1 mixed-outcome aggregation test
- 1 worker-crash recovery test
Doesn't address the related "stops on tab close" complaint —
that's a separate request-lifecycle issue that needs job_id +
polling, not just parallelism.
Discord-reported (winecountrygames + fresh.dumbledore): "Import only
makes Albums folder no singles or eps". Users with a
${albumtype}s/$albumartist/... album_path template saw an "Albums"
folder fill up correctly but never any "Singles" or "EPs" folder.
build_import_album_info detected an album using
``total_tracks > 1`` AND ``album_name != track_title``. Spotify
singles fail both — total_tracks is 1 and the album is usually
named after the song. The result was that staging/auto-import
routed singles through single_path, which doesn't honour
$albumtype, so the user's per-type folder layout never applied.
Now also treats the metadata source's explicit release-type
classification ("single", "ep", "compilation") as evidence that
this is an album-shaped release, so it routes through album_path
and the user's $albumtype substitution runs. The default fallback
value "album" is deliberately excluded from this check so
single-track downloads with no real metadata behave exactly as
before.
Adds 10 regression tests covering the reported scenario, EP and
compilation explicit types, and three guards: normal multi-track
albums still detected, default 'album' type falls through, and
empty/unknown types fall through.
Discord-reported: clicking the Tidal "Authenticate" button on a
Docker setup landed users on a remote-access instructions page that
told them their callback URL would look like
http://127.0.0.1:8888/tidal/callback?code=... — Spotify's port,
hardcoded into the Tidal instructions. Users who followed those
instructions literally saved 8888 into their tidal.redirect_uri
setting; that mismatched their Tidal Developer App's registered
:8889 redirect URI and Tidal returned error 1002 (invalid redirect
URI) on every auth attempt.
Pull the port from the actual TidalClient.redirect_uri the OAuth
URL was just built with (urlparse), with the SOULSYNC_TIDAL_CALLBACK_PORT
env var as fallback when the URI can't be parsed. Both the Step 2
example and the Step 3 highlighted URL now reflect whatever Tidal
port the user is actually configured to use.
Adds 3 regression tests covering the reported scenario, custom
callback ports via SOULSYNC_TIDAL_CALLBACK_PORT, and a defensive
fallback when redirect_uri is unparseable. Tests hit the real
/auth/tidal route through Flask's test client and assert the
rendered HTML, so future hardcoded ports get caught immediately.
The /api/library/watchlist-all-unwatched endpoint required the
user's currently active metadata source's ID column on each library
artist. A Spotify-primary user with library artists only matched
against iTunes or Deezer saw them silently skipped — surfacing on
Discord as "Library and Watchlist not syncing correctly". The per-
artist Enhanced View sync sometimes "fixed" them because it triggered
metadata enrichment that occasionally populated the missing Spotify
ID, but couldn't help artists Spotify simply doesn't carry.
Extracts the picker as a standalone helper so it can be tested
directly:
core/watchlist/source_picker.py:pick_artist_id_for_watchlist
Picks the active source first when available, then falls back through
spotify -> itunes -> deezer -> discogs in registration order. Empty
strings count as missing. Numeric IDs are coerced to str so SQLite's
TEXT columns store them in the same form library code reads back.
Returns (None, None) only when the artist has zero source IDs — the
only legitimate skip reason now.
Adds 10 regression tests covering active-source priority for each
supported primary, fallback ordering through every secondary, the
zero-IDs base case, unrecognized active source (e.g. hydrabase still
falls through), empty-string handling, and numeric coercion.
Discord-reported scenario: a single "Super Single" by Artist1 feat.
Artist2 is also on Artist1's "Super Album". When the album is fully
owned, Artist1's discography correctly shows the single as complete,
but Artist2's discography (where the same track also appears as a
single) shows it as missing.
Two layers needed for the fix:
Scanner: the Jellyfin/Emby path was keeping only ArtistItems[0],
which is almost always equal to the album artist — so the
distinguishing per-track credit was silently suppressed. Now joins
every ArtistItems entry with "; " and stores the value when there
are multiple credits OR when the single credit differs from the
album artist. Plex's originalTitle already carries the full multi-
artist tag, so Plex users benefit without needing the scanner change.
Scorer: _calculate_track_confidence now splits track_artist on the
common multi-artist delimiters real-world tags use (",", ";", "&",
"feat.", "ft.", "featuring", "vs.", "x") and scores each piece
independently against the search artist, taking the max along with
the whole-string similarity as the floor. Never reduces a score —
purely additive matching for previously-missed featured-artist
credits.
Adds 12 regression tests covering the reported scenario, primary-
artist back-compat, every delimiter variant (parametrized), no-
regression on exact match, and the scanner storing every ArtistItem.
Existing Jellyfin-scanned rows persist their old single-artist value
until the next library scan rewrites them; Plex rows benefit
immediately on next match without needing a rescan.
Re-enabling the previously-dead album-aware fallback could in
theory leak false positives if its 0.8 album-title floor were
ineffective. Pin the floor with a clearly-mismatched album hint
("Disney Hits" against "Ray of Light") and assert the search
returns no match. Distinct artist names with no shared words so
the main path actually fails through to the fallback (the prior
draft used "Different Artist" / "Real Artist" which both contain
"Artist" and scored above the main path's threshold, never
reaching the fallback at all).
Two bugs surfacing the same user-reported symptom: a Vaiana OST
track ("Where You Are" by Christopher Jackson) wouldn't match against
a Plex/Emby library because the album sits under the album artist
(Lin-Manuel Miranda).
Bug 1: the data was already there but scoring ignored it. The DB
schema has a tracks.track_artist column, the scanner populates it
from Plex's originalTitle and Jellyfin's ArtistItems[0], and the SQL
WHERE clause already searches it — but _rows_to_tracks dropped the
column on its way to the Python object, and _calculate_track_confidence
only scored against the album-artist JOIN. Candidates whose track-
artist matched got returned by the search and then immediately
filtered out by the low confidence score.
Fix: _rows_to_tracks now propagates row['track_artist'] onto the
returned object, and _calculate_track_confidence takes the better of
(album-artist similarity, track-artist similarity) so soundtracks
match through whichever credit the search query carries.
Bug 2: the album-aware fallback path constructed DatabaseTrack with
kwargs the dataclass doesn't accept (artist_name, album_title,
server_source). Every row TypeError'd, the outer except swallowed it
silently, and the fallback never matched anything since the column
was added — invisible because nothing logged it.
Fix: build DatabaseTrack with valid fields and attach the joined
columns afterwards, the same pattern _rows_to_tracks uses.
Adds 6 regression tests covering: track-artist match (the OST case),
album-artist still matches, scorer takes the better of the two,
defensive handling for tracks without track_artist, search-path
attribute propagation, and the previously-dead album-aware fallback.
The "Clean Search History" automation card kept showing a stale
'DownloadOrchestrator' object has no attribute 'base_url' error
even after the underlying handler bug was fixed in 77d20e9. Root
cause is in the engine, not that handler: AutomationEngine only
captured uncaught exceptions into last_error. Handlers that
report failure by RETURNING {'status': 'error', ...} were treated
as successful from the engine's perspective, so subsequent
gracefully-failing runs never updated the row to reflect the
current state.
Both the timer (run_automation) and event (_handle_event_trigger)
paths now extract the error string from a result whose status is
'error', falling through 'error' -> 'reason' -> 'message' -> a
placeholder so last_error is never None on actual failures
regardless of which key the handler chose. Existing behaviour for
raised exceptions and successful runs is preserved.
Also normalizes _auto_clean_search_history's return key from
'reason' to 'error' so older deployed engines that only check
the canonical key still see the failure.
Adds 7 regression tests covering every result shape the engine
might receive.