When a file failed AcoustID verification and got quarantined, the next
auto-wishlist cycle would search for the same track, the deterministic
quality picker would re-select the same (uploader, filename) source,
re-download it, and re-quarantine it. Users woke up to hundreds of
duplicate .quarantined entries from a single bad upload — same source
URL repeatedly, byte-for-byte identical files.
Root cause: `SoulseekClient.filter_results_by_quality_preference` ranks
candidates by quality + bitrate density only. Quarantine history wasn't
consulted, so a high-bitrate FLAC upload with a wrong-track AcoustID
fingerprint kept winning the picker against every other candidate.
Fix shape:
- New helper `core/imports/quarantine.py::get_quarantined_source_keys`
reads every quarantine sidecar's `context.original_search_result`
and returns the set of `(username, filename)` tuples for O(1)
membership checks. Sidecars missing the context field (legacy thin
sidecars written pre-Feb 2026, or orphaned files) and corrupt JSON
are skipped silently — defensive against transient FS / encoding
issues.
- `SoulseekClient._drop_quarantined_sources` runs the membership
filter against incoming TrackResults, drops matches, logs a single
INFO line with the skip count. Called first inside
`filter_results_by_quality_preference` so all four callers
(search-and-download, master worker, validation, orchestrator)
benefit transparently.
- Approving or deleting a quarantine entry removes its sidecar, so
the dedup key disappears from the set on the next search — gives
the user a way to opt back in to a previously-quarantined source
without restarting the app.
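The helper and the filter described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `*.json` sidecar naming, the `username` / `filename` field names inside `original_search_result`, and the dict-shaped results are assumptions.

```python
import json
import tempfile
from pathlib import Path


def get_quarantined_source_keys(quarantine_dir):
    """Collect (username, filename) keys from quarantine sidecars."""
    keys = set()
    root = Path(quarantine_dir)
    if not root.is_dir():
        return keys
    # Assumption: sidecars are *.json files; the real naming may differ.
    for sidecar in root.glob("*.json"):
        try:
            data = json.loads(sidecar.read_text(encoding="utf-8"))
            source = data["context"]["original_search_result"]
            username, filename = source["username"], source["filename"]
        except (OSError, ValueError, KeyError, TypeError):
            continue  # legacy, orphaned, or corrupt sidecar: skip silently
        # Never emit empty or non-string keys
        if isinstance(username, str) and isinstance(filename, str) and username and filename:
            keys.add((username, filename))
    return keys


def drop_quarantined_sources(results, quarantine_dir):
    """Drop results whose (username, filename) is already quarantined."""
    bad = get_quarantined_source_keys(quarantine_dir)
    return [r for r in results if (r["username"], r["filename"]) not in bad]


# Demo with a throwaway directory
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "good.json").write_text(json.dumps(
    {"context": {"original_search_result": {"username": "u1", "filename": "a.flac"}}}))
(demo_dir / "corrupt.json").write_text("{not json")
(demo_dir / "legacy.json").write_text(json.dumps({"reason": "acoustid"}))

candidates = [{"username": "u1", "filename": "a.flac"},
              {"username": "u2", "filename": "b.flac"}]
kept = drop_quarantined_sources(candidates, demo_dir)
```

Because the set is rebuilt on every call, deleting a sidecar makes its source eligible again on the next search.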
7 helper tests cover: missing dir, empty dir, well-formed sidecars
collected as tuples, legacy sidecars skipped, empty source fields
skipped (so empty-string keys can't accidentally drop unrelated
results), corrupt JSON tolerated, duplicate quarantines collapse.
5 integration tests pin: clean candidates pass, known-bad candidates
drop, missing quarantine dir returns input unchanged, filesystem
errors swallowed (defensive), full `filter_results_by_quality_preference`
runs the dedup BEFORE the quality picker — so a high-quality
quarantined source can't win on bitrate.
692 existing download + import tests still green. The fix has no
visible surface: the UX is unchanged when no quarantine entries exist,
and the filter only kicks in once a sidecar has been written.
Out of scope: bulk-select / multi-delete UI for the quarantine tab —
S-Bryce mentioned this as a separate pain point in the issue, but
it's its own UX work, not a one-commit drive-by.
S-Bryce reported that for some artists (Vocaloid producers, JP indie
acts, niche Western indie) the artist detail page was missing whole
release-groups visible on musicbrainz.org. Downloaded tracks from
those release-groups appeared in artist track counts but were not
bound to any visible album / single card — orphan "ghost" tracks the
user couldn't browse to.
Two related bugs fed each other:
1. `core/musicbrainz_search.py` browsed MB release-groups with
`release_types=['album', 'ep', 'single']`. MB's primary-type
vocabulary is {Album, Single, EP, Broadcast, Other} — music
videos, one-off web releases, and broadcast singles use Other.
Pre-fix the filter dropped them at the API layer.
2. Three sites duplicated the same "raw primary-type → internal
album_type" mapping with slightly different vocabularies and all
silently defaulted unknown values (including 'Other') to 'album':
   - core/musicbrainz_search.py: `_map_release_type`
   - core/metadata/types.py: inline `{single:single, ep:ep}.get(...)`
   - core/metadata/cache.py: Deezer-specific record_type guard
Letting Other through the filter without a real mapper would have
placed music videos in the Albums view alongside LPs — visually
misleading.
Fix shape:
- New `core/metadata/release_type.py` — single canonical mapper
consumed by every provider's raw→Album projection. Knows the full
MB vocabulary including 'other' and 'broadcast'; routes both into
the singles bucket since they're functionally single-track
releases. Compilation secondary-type override preserved (MB's
canonical Greatest-Hits pattern is `primary=Album,
secondary=[Compilation]`).
- `core/musicbrainz_search.py` `_map_release_type` becomes a thin
alias for the new helper so the six internal call sites stay
intact. API filter gains 'other'.
- `core/metadata/types.py` Album projection drops its inline mini-
mapper and calls the canonical helper. Now also handles the
compilation secondary-type override it was previously missing.
- The Deezer-specific cache.py guard stays as-is — Deezer's
record_type vocabulary is closed (album|single|ep), not affected
by this issue.
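A canonical mapper with the contract described above might look like the sketch below. The function name and signature are hypothetical, and treating the compilation override as applying to any primary type is an assumption; the real helper in `core/metadata/release_type.py` may differ.

```python
def map_release_type(primary, secondary=None):
    """Map a raw primary type (plus secondary types) to an internal album_type."""
    secondary_types = [s.lower() for s in (secondary or []) if s]
    # Assumption: the compilation override wins regardless of primary type.
    if "compilation" in secondary_types:
        return "compilation"
    key = (primary or "").strip().lower()
    mapping = {
        "album": "album",
        "ep": "ep",
        "single": "single",
        "other": "single",      # music videos, one-off web releases
        "broadcast": "single",  # broadcast singles
    }
    # Unknown or empty values keep the historical 'album' default
    return mapping.get(key, "album")
```

Keeping the table in one place means a new primary type only needs one edit, instead of one per provider projection.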
Verified end-to-end against MB for S-Bryce's artist (inabakumori,
`46196b9c-affa-4616-b53b-e967c8bd70e0`): pre-fix returned 22
release-groups; post-fix returns 27, with the 5 extra all landing in
the Singles section with album_type='single' as intended.
23 new unit tests pin the mapper contract (case-insensitive primary
types, compilation secondary override, Other/Broadcast → single,
unknown → album default preserved, defensive empty/None inputs).
2 new tests in test_musicbrainz_search pin the API filter inclusion
of 'other' and the round-trip into the Singles bucket. All 516
existing metadata tests still green — refactor leaves historical
behaviour for {album, ep, single, compilation} unchanged.
When slskd_url is configured but the host is unreachable (slskd not
running, wrong port, host.docker.internal not resolving), the frontend's
/api/downloads/status polling fanned out to every download plugin
including Soulseek. soulseek_client._make_request hit a DNS / connect
failure on each poll and logged it at ERROR. Result: one
"Cannot connect to host host.docker.internal:5030" log line every
~2-3 seconds for the entire duration of any download — visible spam
even when the user wasn't using Soulseek at all.
Fix: catch aiohttp.ClientConnectorError explicitly in both _make_request
and _make_direct_request. The first failure emits one WARNING with
actionable context (start slskd, or clear soulseek.slskd_url if you
don't use Soulseek). Subsequent failures demote to DEBUG. The
_last_unreachable_logged flag resets on any successful (200/201/204)
response so a later outage warns again — suppression is per-outage,
not per-process-lifetime. Same shape as the existing _last_401_logged
suppression for auth failures.
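The per-outage suppression can be sketched as below. This is an illustration only: `UnreachableLogGate` is a hypothetical name, and the built-in `ConnectionError` stands in for `aiohttp.ClientConnectorError` so the example runs with the standard library alone.

```python
import logging

logger = logging.getLogger("slskd_sketch")


class UnreachableLogGate:
    """Warn once per outage, then demote repeats to DEBUG."""

    def __init__(self):
        self._last_unreachable_logged = False

    def on_connect_error(self, exc):
        level = logging.DEBUG if self._last_unreachable_logged else logging.WARNING
        logger.log(level, "Cannot reach slskd (%s). Start slskd, or clear "
                          "soulseek.slskd_url if you don't use Soulseek.", exc)
        self._last_unreachable_logged = True
        return None  # callers get None instead of an exception

    def on_response(self, status):
        if status in (200, 201, 204):
            self._last_unreachable_logged = False  # next outage warns again


# Demo: record the level of each emitted log line
levels = []


class _Collect(logging.Handler):
    def emit(self, record):
        levels.append(record.levelname)


logger.setLevel(logging.DEBUG)
logger.addHandler(_Collect())
logger.propagate = False

gate = UnreachableLogGate()
gate.on_connect_error(ConnectionError("host.docker.internal:5030"))
gate.on_connect_error(ConnectionError("host.docker.internal:5030"))
gate.on_response(200)
gate.on_connect_error(ConnectionError("host.docker.internal:5030"))
```

In the demo the two failures before the successful response produce one WARNING and one DEBUG line, and the failure after it warns again.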
The architectural gap (status polling fans out to soulseek even when
the user has soulseek disabled in their active download sources) is
intentionally left for a follow-up. The plugin-iteration code lives
in core/download_engine/engine.py and core/download_orchestrator.py;
threading a "skip-when-not-active" gate through every caller is a
bigger refactor than this user-facing log cleanup warrants. The
WARNING-once message tells the user what to do in the meantime.
5 new pinning tests cover the suppression contract: connection error
returns None (not raises), first failure WARNs + sets flag, repeats
stay quiet, successful response resets the flag, _make_direct_request
follows the same pattern, and non-connection exceptions still log at
ERROR so real bugs aren't hidden behind the new suppression.
The Fix Track Match modal's auto-search was hardcoded to query only
Spotify -> Deezer -> iTunes, ignoring MusicBrainz entirely — even for
users with MB set as their primary metadata source. MB-niche recordings
(canonical entries with diacritics, fringe / non-mainstream tracks that
the commercial catalogues don't carry) had no chance.
Wiring:
- New `MusicBrainzSearchClient.search_tracks_with_artist(track, artist,
limit)` for surfaces that already have title + artist split. Uses MB's
bare-query mode (strict=False) — diacritic-folded, alias/sortname
indexed — same recall rationale as the earlier MBID-paste endpoint.
- New route `GET /api/musicbrainz/search_tracks` mirrors the existing
/api/{spotify,itunes,deezer}/search_tracks endpoints exactly: accepts
`track`+`artist` (or legacy `query`) + `limit`, returns
`{tracks: [{id, name, artists, album, duration_ms, image_url, source}]}`.
Applies the same `core.metadata.relevance.rerank_tracks` pass Deezer /
iTunes use, which is critical because MB's free-text scoring weighs
title-text matches heavily and would otherwise rank cover / tribute
recordings above the canonical version.
- `_search_tracks_text` gains a `min_score` parameter. The cascade path
passes 20 (vs the enhanced-search-tab default of 80) so MB recordings
whose title doesn't literally contain the artist name still enter the
candidate pool — without that, "Army of Me" + "Bjork" only surfaces
the HIRS Collective cover (score 100) and drops Björk's canonical
recording (score 28). The rerank pass then surfaces Björk by artist
match. Verified against real MB API: pre-fix returned only the cover;
post-fix top 5 are all Björk.
- Fix popup `allSources` array (wishlist-tools.js) gets MB appended.
The existing `activeIdx` reorder logic moves MB to the front when
it's the active primary; otherwise MB sits last (1 req/sec rate
limit makes it the slowest source).
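The interaction between `min_score` and the rerank pass can be illustrated with a small sketch. The artist-match sort here is a simplified stand-in for `core.metadata.relevance.rerank_tracks`, whose actual logic is not shown in this log; `fold` and `pick_candidates` are hypothetical names, and the scores are the ones quoted above.

```python
import unicodedata


def fold(text):
    """Lowercase and strip diacritics so 'Bjork' matches 'Björk'."""
    decomposed = unicodedata.normalize("NFKD", text or "")
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def pick_candidates(candidates, artist, min_score=80):
    """Keep candidates at or above min_score, artist matches first."""
    pool = [c for c in candidates if c["score"] >= min_score]
    wanted = fold(artist)
    # False sorts before True, so matching artists lead; ties break on score
    return sorted(pool, key=lambda c: (fold(c["artist"]) != wanted, -c["score"]))


candidates = [
    {"artist": "HIRS Collective", "score": 100},  # cover
    {"artist": "Björk", "score": 28},             # canonical recording
]
search_tab = pick_candidates(candidates, "Bjork", min_score=80)
cascade = pick_candidates(candidates, "Bjork", min_score=20)
```

With the default threshold only the cover survives; with the lower threshold the canonical recording enters the pool and the artist match moves it to the front.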
7 new unit tests on the adapter: bare-query mode is used, missing
artist falls back to None (drops AND-clause), empty inputs short-circuit,
low-score candidates are kept for rerank to handle, default strict +
default min_score behaviour preserved for the existing search-tab path,
client errors are swallowed so the cascade falls through to the next
source.
Discogs intentionally absent — Discogs has no track-level search API
(see core/discogs_client.py:575 — returns []). Adding a Flask endpoint
that always returns empty would be a permanent no-op.
Commit 478bcc5d (`fix(amazon): search albums/artists and track numbers
for t2tunes`) switched `search_albums` to query `types=track` and derive
Album objects from the album metadata on each track hit — Amazon's
album-type query is broken upstream. The matching test was left asserting
the old "filter out track hits → return []" behavior and has been failing
in CI ever since.
Rewrote the test to assert the current intended behavior: track hits
yield distinct albums by album ASIN, with the artist credit + name
preserved. No production code change.
Power-user escape hatch on the Discovery Fix Track Match modal — when
fuzzy auto-search ranks the wrong recording among many same-title
versions (10 remasters, live cuts, alt sessions), paste the MusicBrainz
recording URL or bare UUID into the new field and resolve straight to
that record.
Layout:
- Shape adapter `get_recording_flat(mbid)` lives in
`core/musicbrainz_search.py` next to existing `get_track_details`.
Returns the flat Fix-popup track shape (artists as `string[]`,
album as string, single `image_url`) — distinct from the
Spotify-shaped nested dict `get_track_details` returns.
- New route `GET /api/musicbrainz/recording/<mbid>` is a thin wrapper:
validates MBID format with an anchored UUID regex, calls the adapter,
returns 400 / 404 / 200 with no inline shape massaging.
- Frontend `parseMusicBrainzMbid()` lives in `shared-helpers.js` —
pure URL/UUID parser, reusable from other surfaces (failed-MB cache,
manual match) without duplication.
- Fix modal HTML gets one new input row + button; existing search row
and result render pipeline are untouched. New `lookupDiscoveryFixByMbid()`
fetches the endpoint and feeds the single result through the existing
`renderDiscoveryFixResults` -> confirm-dialog -> match pipeline, so MB-
paste matches go through the exact same selection flow as auto-search
results.
- Enter-key bound on the MBID input via a separate handler ref so its
lifecycle matches the search-input handlers without conflating the
two submit targets.
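The URL/UUID parsing and the anchored validation can be sketched in Python as below. The real parser is the JavaScript `parseMusicBrainzMbid()` and the real route has its own regex; the function name, the accepted URL shapes, and the lowercasing here are assumptions for illustration.

```python
import re

_UUID = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
_BARE = re.compile(_UUID)
# Assumption: only musicbrainz.org recording URLs are accepted
_URL = re.compile(
    rf"https?://(?:www\.)?musicbrainz\.org/recording/({_UUID})(?:[/?#].*)?"
)


def parse_musicbrainz_mbid(text):
    """Return the recording MBID from a bare UUID or recording URL, else None."""
    text = (text or "").strip()
    # fullmatch anchors both ends, so trailing junk cannot slip through
    if _BARE.fullmatch(text):
        return text.lower()
    match = _URL.fullmatch(text)
    return match.group(1).lower() if match else None
```

Anchoring matters on the server side: an unanchored pattern would accept any string that merely contains a UUID.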
7 unit tests cover the adapter: happy path, empty/None MBID, MB returns
None, recording-without-release (empty album), multi-artist credits,
includes-list contract, and client-error swallow.
Out of scope: the Fix popup's fuzzy cascade is still hardcoded to
spotify/deezer/itunes regardless of which primary source the user has
configured. Adding MB to that cascade (when MB is the active primary)
is a separate concern.
Two bugs surfacing on the Fix popup and enhanced-search MB tab:
1. Strict Lucene phrase queries (`recording:"X" AND artist:"Y"`) killed
recall on user-facing manual search — diacritics ("Bjork" vs canonical
"Björk"), bracketed suffixes like "(Live)", and any AND-clause
mismatch returned zero results. Added `strict: bool = True` param to
`search_release` / `search_recording`; when False, sends a bare query
joining title + artist so MB hits alias/sortname indexes with
diacritic folding. `/api/musicbrainz/search` (Fix popup) and
`core/library/service_search.py` (service tabs) now pass strict=False.
Enrichment workers stay on strict mode — precision matters there
because they auto-accept the top hit above a confidence threshold.
2. Every MB album click was silently 404-ing — `_render_release_as_album`
passed `cover-art-archive` as an MB `inc` param, but it's not a valid
include for the /release resource (MB rejects with 400). The CAA flags
come back on every release response by default, so dropping the bad
include preserves the image-scope picker logic intact.
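The difference between the two query modes in item 1 can be sketched as follows. `build_recording_query` is a hypothetical helper, not the signature of `search_recording`, and the quote escaping is a simplification of full Lucene escaping.

```python
def build_recording_query(title, artist=None, strict=True):
    """Build an MB search query string in strict or bare mode."""
    def quote(value):
        # Escape backslashes and double quotes inside a phrase
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    if strict:
        clauses = [f"recording:{quote(title)}"]
        if artist:
            clauses.append(f"artist:{quote(artist)}")
        return " AND ".join(clauses)
    # Bare mode: plain terms, leaving matching to MB's own indexes
    return " ".join(part for part in (title, artist) if part)
```

Strict mode is the precise form enrichment workers want; bare mode trades precision for recall on manual search.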
Add MusicBrainz watchlist artist ID storage, badges, linked-provider editing, and per-artist preferred source support.
Backfill watchlist MusicBrainz matches from already-enriched library artists so existing MusicBrainz worker matches appear in watchlist cards and settings.
Extend bulk watchlist add, liked artist matching, artist map source picking, and service status labels to recognize MusicBrainz, with regression tests for watchlist ID persistence and backfill.
Register MusicBrainz as a first-class metadata source alongside Deezer, iTunes, Spotify, Discogs, and Hydrabase. Expose the shared client through metadata services, add the settings option, and expand the MusicBrainz search adapter with source-compatible artist, album, track, and detail methods.
Carry MusicBrainz IDs through similar-artist discovery, recommended artists, artist map serialization, and personalized playlist selection. Update DB migrations and lookup filters so similar_artist_musicbrainz_id is preserved on older schemas and used for source requirements and library exclusion.
Normalize MusicBrainz album adapter output for import context and add regression coverage for registry mapping, typed album conversion, and similar-artist filtering. Verified by user with 120 focused tests passing.
Manual matches created from sync history are stored with the mirrored source, while wishlist and download flows later see the same track under the wishlist or a provider source. Add a shared track-level lookup that falls back from exact source/id to source_track_id and title/artist, then use it for wishlist adds, cleanup, and download analysis so mapped tracks are not re-added or redownloaded.
Add coverage for mirrored-source matches being honored by wishlist cleanup and download batches, including the internal wishlist force-download path.
Ensure the Amazon enrichment worker verifies its required columns before querying pending work or progress, preventing upgraded installs from spamming no-such-column errors when amazon_match_status is missing.
Add regression coverage for legacy databases without Amazon enrichment columns.
Artist detail pages previously always pushed /artist-detail to the URL,
so refreshing the page or sharing a link would drop users on a broken
empty page with no artist loaded.
URL format is now /artist-detail/:source/:id (e.g.
/artist-detail/spotify/4tZwfgrHOc3mvqsCAfo4LT or
/artist-detail/library/42). The source segment lets the backend
synthesize a response from the right metadata client without a DB hit.
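The two accepted URL shapes can be illustrated with a short Python sketch. The real helpers are the JavaScript `buildArtistDetailPath` / `_getDeepLinkArtistDetail`; the function names below are hypothetical, and mapping a legacy bare id to the `library` source is an assumption based on the library fallback mentioned in the tests.

```python
def build_artist_detail_path(artist_id, source="library"):
    """Build the /artist-detail/:source/:id deep link."""
    return f"/artist-detail/{source}/{artist_id}"


def parse_artist_detail_path(path):
    """Parse the new :source/:id form and the legacy bare :id form."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "artist-detail":
        return None
    if len(parts) == 3:
        return {"source": parts[1], "id": parts[2]}
    if len(parts) == 2:
        # Legacy bookmark: assume a library artist id
        return {"source": "library", "id": parts[1]}
    return None  # bare /artist-detail carries no artist to restore
```

URL-encoding of ids is left out of the sketch for brevity.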
Changes:
Client routing (legacy shell + TanStack bridge)
- buildArtistDetailPath / _getDeepLinkArtistDetail added to init.js;
parse both new :source/:id and legacy bare :id formats so old
bookmarks still work
- navigateToPage passes artistId + artistSource through to the router
bridge, which builds the dynamic href instead of hardcoding route.path
- resolveShellPageFromPath / resolveLegacyShellPageFromPath use a prefix
match so /artist-detail/* resolves to artist-detail page-id
- globals.d.ts typed for artistId / artistSource options
- activateLegacyPath and syncActivePageFromLocation (popstate) both
restore artist from URL using skipRouteChange:true to avoid a
re-navigation loop back to /artist-detail
- loadInitialData restores artist from URL on page load (router not yet
mounted at DOMContentLoaded so legacy path runs unconditionally)
- Same-artist guard in navigateToArtistDetail prevents double-fetch
when the router fires activateLegacyPath after the initial navigation
Server
- artist_source_detail.build_source_only_artist_detail now resolves
artist name from the source API when none is supplied, so deep-link
restores with an empty name string still render correctly
Tests
- test_spa_deep_linking: /artist-detail/42 and /artist-detail/spotify/ID
both serve index.html
- bridge.test.ts: source-aware URL building and library fallback
- route-manifest.test.ts: prefix path resolution
- artist_source_detail: name resolved from source when input is empty
Add service-level coverage for the Enhanced Library "I Have This" flow: copying an existing source file, writing the target album DB row, preserving source audio, inheriting album identity tags, and migrating older track tables that lack disc_number.
Add a conservative Soulseek album preflight scorer so album downloads choose a coherent slskd folder before per-track enqueue. The scorer compares album title, artist, year, track count, tracklist coverage, and peer quality, and penalizes unexpected deluxe/remix/live-style folders.
Preserve hybrid source priority by only running Soulseek album preflight when Soulseek is the selected source or first in the hybrid order. If Soulseek is only a fallback behind another source, the normal hybrid flow is left alone.
Reuse the richest wishlist album context across tracks in the same album group so release date, artwork, album type, and album artist stay consistent for path generation. Also preserve peer-quality tie breakers when attempting equal-confidence candidates.
Tests cover correct-folder selection over larger wrong editions, Soulseek primary vs fallback hybrid behavior, shared wishlist album context, and peer-quality candidate ordering.
Schema: ALTER TABLE artists ADD COLUMN amazon_id TEXT with index, added via
_add_amazon_columns migration called after Discogs in _run_migrations.
SOURCE_ID_FIELD: add "amazon" -> "amazon_id" entry.
find_library_artist_for_source now looks up Amazon artists by slug before
falling back to name match, same as every other source.
artist_source_detail already stamps artist_info[source_id_field] = artist_id,
so the amazon_id is set on source-only payloads.
Tests: add "amazon": "amazon_id" to EXPECTED_SOURCE_ID_FIELD; revert test
assertion back to strict equality
(SOURCE_ONLY_ARTIST_SOURCES == SOURCE_ID_FIELD.keys() holds again now that
amazon has a column).
Library upgrade: find_library_artist_for_source returned None immediately for
Amazon because SOURCE_ID_FIELD has no 'amazon' entry (no DB column for Amazon
artist IDs). The name-based fallback was unreachable. Fix: only skip the column
query when column is None, not the whole function — name lookup now runs for
any source when artist_name + active_server are provided.
Artist images: add AmazonClient._get_artist_image_from_albums so the standard
_get_artist_image_from_source path in metadata/artist_image.py can call it as
a fallback (same hook iTunes/Deezer/Discogs expose). Searches by unslugified
artist name, matches primary artist, fetches album cover from album_metadata.
Test: updated test_source_only_set_matches_mapping_keys →
_contains_all_mapped_sources to assert subset (not equality), since
SOURCE_ONLY_ARTIST_SOURCES intentionally includes sources without a DB
column that rely on name-only lookup.
T2Tunes albumList entries may not include a release_date field, leaving the
$year path template empty. get_album() now falls back to the first track's
release_date (populated from the FLAC date tag via get_album_tracks) when
album metadata has none. Also try camelCase releaseDate key at all albumList
read sites (Album.from_metadata, get_album, _fetch_album_metas consumers).
1 new test: release_date backfilled from stream date tag when absent from
album metadata. date tag "2024-11-22" added to MEDIA_RESPONSE_FLAC fixture.
media_from_asin returns no duration data. get_album_tracks now does one
search_raw call using the album name + primary artist from stream tags,
filters hits by albumAsin == requested asin, and builds a duration_map
(track asin → duration_ms). Search failures are swallowed — duration_ms
falls back to 0 so the existing behaviour is preserved on error.
2 new tests: duration populated when search returns matching hit; duration
stays 0 when search endpoint returns an error.
- All search_raw calls switched from single-type to types="track,album" — T2Tunes only
returns results when both types are requested together
- _fetch_album_metas: parallel fetch (up to 5 workers) of album cover art via
album_metadata(asin) — T2Tunes search results carry no image URLs
- search_tracks: populates image_url, release_date, total_tracks from album meta
- search_artists: strips feat. credits via _primary_artist() so "Artist feat. X" and
"Artist ft. Y" collapse to one "Artist" entry; uses album cover as artist image
stand-in (same approach as iTunes — T2Tunes has no artist images)
- search_albums: name-based dedup (display_name + artist key) instead of ASIN-based;
populates image_url, release_date, total_tracks from album meta (cap 10 ASIN fetches)
- _strip_edition(): strips [Explicit]/(Explicit) from track/album names — explicit is
the default version; Clean/Edited/Censored labels kept as-is so they stay distinct
- get_album(): applies _strip_edition to name and _primary_artist to artist so
MusicBrainz preflight matching doesn't fail on "[Explicit]" album names
- get_album_tracks(): populates track_number and disc_number from T2TunesStreamInfo
instead of hardcoding None — fixes track ordering in multi-track album downloads
- get_artist() / get_artist_albums(): _unslugify() converts slug artist IDs back to
search names; _primary_artist() in comparison handles feat-annotated results
- SOURCE_ONLY_ARTIST_SOURCES: added "amazon" so artist detail page doesn't 404
- build_source_only_artist_detail: added amazon_client param + dispatch branch
- web_server.py: resolve amazon_client in _build_source_only_artist_detail wrapper;
add source_override=="amazon" branch in get_spotify_album_tracks endpoint
- 77 tests covering all above paths; all pass
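The two name normalizers in the list above could be sketched like this. The regexes are illustrative guesses at the behaviour described (strip Explicit markers, keep Clean/Edited/Censored, collapse feat./ft. credits); the real `_strip_edition` and `_primary_artist` may handle more cases.

```python
import re

# Matches "[Explicit]" or "(Explicit)" with any leading whitespace
_EXPLICIT = re.compile(r"\s*[\[(]explicit[\])]", re.IGNORECASE)
# Matches " feat. X", " ft. Y", " featuring Z" through end of string
_FEAT = re.compile(r"\s+(?:feat\.?|ft\.?|featuring)\s+.*$", re.IGNORECASE)


def strip_edition(name):
    """Drop the Explicit marker; other edition labels stay distinct."""
    return _EXPLICIT.sub("", name or "").strip()


def primary_artist(artist):
    """Collapse 'Artist feat. X' style credits to the primary artist."""
    return _FEAT.sub("", artist or "").strip()
```

Both functions return an empty string for `None`, which keeps downstream dedup keys well-defined.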
get_event_loop() raises RuntimeError on Python 3.11+ Linux when no loop
exists. asyncio.run() creates its own loop per call — no deprecation warning,
works across all supported Python versions.
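A minimal sketch of calling a coroutine from synchronous code with `asyncio.run()`; `fetch_value` is a placeholder coroutine, not one of the project's functions.

```python
import asyncio


async def fetch_value():
    await asyncio.sleep(0)  # stand-in for real async I/O
    return 42


def fetch_value_sync():
    # asyncio.run creates a fresh loop, runs the coroutine, then closes the loop
    return asyncio.run(fetch_value())


first = fetch_value_sync()
second = fetch_value_sync()  # each call gets its own loop
```

One caveat: `asyncio.run()` cannot be called from a thread that already has a running event loop, so this pattern suits sync entry points only.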
The download monitor blocks post-processing with a bytes-incomplete guard:
    if size > 0 and transferred < size: continue
_stream_to_file throttles engine updates to every 0.5s. The last tick before
the file finishes typically leaves transferred slightly below the Content-Length
size in the engine record. Other streaming clients (YouTube, Tidal, HiFi, etc.)
use their own download threads and don't track bytes at all, so size stays 0
and the guard is always skipped. Amazon was the only client hitting it.
Fix: just before returning the file path from _download_sync, write a final
engine record update setting size == transferred == out_path.stat().st_size
(the decrypted output size). The bytes-incomplete guard then sees
transferred == size and falls through to trigger post-processing normally.
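The guard and the final update can be illustrated with plain dicts. The record shape and both function names are assumptions; only the `size` / `transferred` comparison comes from the guard quoted above.

```python
def is_blocked_by_guard(record):
    """Mirror of the monitor's bytes-incomplete guard."""
    size = record.get("size", 0)
    transferred = record.get("transferred", 0)
    return size > 0 and transferred < size


def finalize_record(record, final_size):
    """Final update: make size and transferred agree on the output size."""
    record.update(size=final_size, transferred=final_size)
    return record


# Last throttled tick left transferred just short of Content-Length
record = {"size": 1000, "transferred": 990}
blocked_before = is_blocked_by_guard(record)
finalize_record(record, final_size=985)  # decrypted output size
blocked_after = is_blocked_by_guard(record)
untracked = is_blocked_by_guard({"size": 0, "transferred": 0})
```

The last line shows why other streaming clients never hit the guard: with `size` left at 0 the first condition is false.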
`get_all_downloads` was calling `engine.get_all_records()` — a method that
doesn't exist on DownloadEngine. Same story for `cancel_record` and
`clear_completed`. The engine exposes `iter_records_for_source`, `get_record`,
`update_record`, and `remove_record` — matching what every other streaming
client (Deezer, HiFi, Qobuz, SoundCloud, Tidal, YouTube) already uses.
With `get_all_downloads` silently returning `[]` on every call (the missing
method raised, `except Exception: return []` swallowed it), the download monitor
never saw Amazon records as complete — tasks stayed stuck at 0% even after the
file had fully downloaded.
Changes:
- `get_all_downloads` → `iter_records_for_source('amazon')`
- `get_download_status` → `get_record('amazon', id)`, no try/except
- `cancel_download` → `get_record` check + `update_record` (Cancelled) +
optional `remove_record` — same pattern as deezer/hifi/etc
- `clear_all_completed_downloads` → iterate + `remove_record` for terminal
states; returns True on no-engine (nothing to clear = success)
- `_record_to_status` drops the `download_id` argument; reads `rec['id']`
instead (worker stores `'id'` in every record — `iter_records_for_source`
returns the full record dict)
Tests updated to match: `iter_records_for_source` mock replaces
`get_all_records`, cancel test verifies `update_record`+`remove_record`,
clear test verifies only terminal-state records are removed, graceful-error
test replaced with no-records boundary test (exception propagation is handled
at the engine aggregator layer, not per-plugin).
The engine worker stores the encoded filename under the key 'filename'
(see worker.py dispatch). _record_to_status was reading 'original_filename',
which always returns "" — so every DownloadStatus emitted by
get_all_downloads/get_download_status had an empty filename string.
The download monitor builds lookup keys as
_make_context_key(download.username, download.filename). With filename=""
the key was always "amazon::" which never matched the task's
"amazon::B0B1234||Artist - Title" key. Monitor never detected Amazon
download completions, so tasks sat stuck at Downloading 0% forever even
though the files had actually downloaded.
Also fixes tests that had the same wrong key.
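The key mismatch is easy to reproduce in isolation. The `username::filename` key format is inferred from the example keys above; `make_context_key` and the record shape are otherwise hypothetical.

```python
def make_context_key(username, filename):
    return f"{username}::{filename}"


# What the engine worker stores (encoded filename under 'filename')
record = {"id": "dl-1", "filename": "B0B1234||Artist - Title"}
task_key = make_context_key("amazon", "B0B1234||Artist - Title")

# Before: wrong key, so the lookup always yields an empty string
buggy_key = make_context_key("amazon", record.get("original_filename", ""))
# After: read the key the worker actually writes
fixed_key = make_context_key("amazon", record.get("filename", ""))
```

Only `fixed_key` equals `task_key`, which is what lets the monitor pair a finished download with its task.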
AmazonDownloadClient was missing set_engine() and set_shutdown_check().
The download engine auto-wires plugins by calling set_engine(self) at
registration time if the method exists (engine.py:136). Without it,
_engine stayed None forever, causing every download() call to raise
RuntimeError("_engine is not set") — silently failing and marking all
tracks not found.
All other streaming clients (Deezer, Qobuz, Tidal, HiFi, SoundCloud)
expose set_engine(); Amazon now matches the pattern.
Tests added: set_engine wires _engine, set_shutdown_check wires callback,
set_engine unblocks download dispatch (the exact live failure mode).
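The auto-wiring contract can be sketched as below. `Engine.register` and `AmazonLikeClient` are simplified stand-ins, not the project's classes; the only behaviour taken from the description is "call `set_engine(self)` at registration if the method exists".

```python
class Engine:
    def __init__(self):
        self.plugins = {}

    def register(self, name, plugin):
        self.plugins[name] = plugin
        # Auto-wire only plugins that expose set_engine
        setter = getattr(plugin, "set_engine", None)
        if callable(setter):
            setter(self)


class AmazonLikeClient:
    def __init__(self):
        self._engine = None
        self._shutdown_check = None

    def set_engine(self, engine):
        self._engine = engine

    def set_shutdown_check(self, check):
        self._shutdown_check = check

    def download(self, track_id):
        if self._engine is None:
            raise RuntimeError("_engine is not set")
        return f"queued:{track_id}"


engine = Engine()
client = AmazonLikeClient()
engine.register("amazon", client)
result = client.download("B0B1234")
```

A plugin without `set_engine` registers without error but stays unwired, which is the failure mode described above.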
`validation.py` had amazon absent from `_streaming_sources`, causing
Amazon TrackResult objects (bitrate=None, size=0) to fall through to
the Soulseek P2P code path and get rejected by
`filter_results_by_quality_preference`. Every album track was marked
not found.
Fix: add 'amazon' to every streaming-source guard tuple/set that was
previously missing it:
- core/downloads/validation.py — primary bug fix (quality-filter bypass)
- core/downloads/status.py — _STREAMING_SOURCE_NAMES frozenset
- core/downloads/task_worker.py — hybrid fallback client map
- core/imports/side_effects.py — || filename→stream-id extraction
- web_server.py — is_streaming_source, transfer list display,
candidate source label, _try_source_reuse, _store_batch_source
- tests/test_download_plugin_conformance.py — registry count + parametrize
Also updates the 2.5.3 What's New entry to drop the stale
"not yet wired" disclaimer.
core/amazon_client.py — T2Tunes-backed metadata client following the
DeezerClient/iTunesClient contract. Exposes search_tracks, search_artists,
search_albums, get_track_details, get_album, get_album_tracks, get_artist,
get_artist_albums, get_track_features. T2TunesStreamInfo dataclass captures
the hex decryption key returned by the proxy (CENC/AES-128). Handles the
"stremeable" API typo. 0.5 s rate-limit guard + api_call_tracker.
core/amazon_download_client.py — DownloadSourcePlugin backed by the above
client. Codec waterfall: FLAC → Opus → EAC3. Downloads the encrypted MP4
container, decrypts with ffmpeg -decryption_key, yields the native audio
file (.flac / .opus / .eac3). Not yet wired into the app source registry —
validated in isolation only; see tests/tools/.
tools/t2tunes_probe.py + tools/t2tunes_media_plan.py — standalone CLI tools
used for live API exploration during development.
tests/tools/test_amazon_client.py — 72 unit tests (all mocked).
tests/tools/test_amazon_download_client.py — 52 unit tests (all mocked).
124 tests pass.
Reproduced: selecting Fresh Tape (or any kind never generated before)
and running the pipeline silently skipped — UI showed
"No tracks in Fresh Tape — skipping sync" with no clue why.
Root cause: ensure_playlist auto-creates the playlist row on first
access with `track_count=0` and `last_generated_at=NULL`, but
`is_stale=0` by default (the column default — fresh rows aren't
"stale", they're "never generated"). Pipeline only refreshed when
`is_stale=True` OR `refresh_first=True`, so first-run rows fell
through both branches → read the empty snapshot → skip.
Fix: pipeline now also refreshes when `existing.last_generated_at is
None`. Same control flow, one extra condition:
    if refresh_first OR is_stale OR last_generated_at is None:
        refresh
    else:
        read existing snapshot
This is the right signal: "has the generator ever run for this row"
is exactly what `last_generated_at` tracks (the column is set in
`_persist_snapshot` after every successful refresh).
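The decision can be expressed as a small predicate. `should_refresh` is a hypothetical name, and the `SimpleNamespace` rows mimic the test stubs rather than the real `PlaylistRecord`.

```python
from types import SimpleNamespace


def should_refresh(existing, refresh_first=False):
    """Refresh when forced, when stale, or when never generated."""
    return bool(
        refresh_first
        or existing.is_stale
        or existing.last_generated_at is None
    )


never_generated = SimpleNamespace(is_stale=False, last_generated_at=None)
fresh = SimpleNamespace(is_stale=False, last_generated_at="2026-01-01T00:00:00")
stale = SimpleNamespace(is_stale=True, last_generated_at="2026-01-01T00:00:00")
```

The first-run row is the case the old two-condition check missed: not stale, not forced, but also never generated.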
Stubs in test_handlers_personalized_pipeline.py updated to expose
`last_generated_at` on their SimpleNamespace returns so the new
attribute read doesn't AttributeError. Fresh stubs get a non-None
timestamp so they're treated as already-generated; the new test
`test_never_generated_snapshot_triggers_first_refresh` pins the
first-run-forces-refresh behavior with `last_generated_at=None`.
Snapshots now track when their source data changes. Watchlist scan
emits stale flags on the playlists whose underlying pool just got
refreshed; the next pipeline run sees the flag and regenerates the
snapshot before syncing, so the server playlist never lags the source.
Schema:
- new `is_stale INTEGER NOT NULL DEFAULT 0` column on
`personalized_playlists`, plus an idempotent ADD COLUMN migration
in `ensure_personalized_schema` for installs created before this PR.
- `PlaylistRecord.is_stale: bool = False` exposed on the dataclass so
callers can branch on freshness without re-querying.
Manager:
- new `mark_kinds_stale(kinds, profile_id=None)` flips the flag in
bulk for a list of kinds (used by upstream data refreshers).
- `_persist_snapshot` clears `is_stale = 0` on successful refresh.
- SELECT statements + `_row_to_record` updated to read the column
(with tuple-form length guard for safety).
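The idempotent migration and the bulk flag flip can be sketched with `sqlite3`. The `kind` and `profile_id` column names and the free-function form are assumptions; the real logic lives on the manager and in `ensure_personalized_schema`.

```python
import sqlite3


def ensure_is_stale_column(conn):
    """Add is_stale only if it is missing, so re-running is harmless."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(personalized_playlists)")}
    if "is_stale" not in cols:
        conn.execute(
            "ALTER TABLE personalized_playlists "
            "ADD COLUMN is_stale INTEGER NOT NULL DEFAULT 0"
        )


def mark_kinds_stale(conn, kinds, profile_id=None):
    """Flip is_stale for the given kinds; returns the number of rows updated."""
    if not kinds:
        return 0  # empty list is a no-op
    placeholders = ",".join("?" for _ in kinds)
    sql = f"UPDATE personalized_playlists SET is_stale = 1 WHERE kind IN ({placeholders})"
    params = list(kinds)
    if profile_id is not None:
        sql += " AND profile_id = ?"
        params.append(profile_id)
    return conn.execute(sql, params).rowcount


# Demo on a pre-migration table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personalized_playlists (kind TEXT, profile_id INTEGER)")
conn.executemany(
    "INSERT INTO personalized_playlists VALUES (?, ?)",
    [("fresh_tape", 1), ("archives", 1), ("fresh_tape", 2)],
)
ensure_is_stale_column(conn)
ensure_is_stale_column(conn)  # second run does nothing
updated = mark_kinds_stale(conn, ["fresh_tape"], profile_id=1)
rows = conn.execute(
    "SELECT kind, profile_id, is_stale FROM personalized_playlists "
    "ORDER BY kind, profile_id"
).fetchall()
```

Checking `PRAGMA table_info` before `ALTER TABLE` is what makes the migration safe to run on every startup.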
Pipeline:
- `_build_payloads_for_kinds` now branches: refresh_first=True OR
`existing.is_stale` -> refresh_playlist, else read existing
snapshot. So the auto-refresh kicks in without needing the user to
toggle the refresh-each-run option.
Watchlist scanner emits stale flags at three sites:
- after `update_discovery_pool_timestamp` -> marks pool-fed kinds
stale: hidden_gems, discovery_shuffle, popular_picks, time_machine,
genre_playlist, daily_mix.
- after release_radar `save_curated_playlist` -> marks `fresh_tape`.
- after discovery_weekly `save_curated_playlist` -> marks `archives`.
All three calls go through a module-level `_mark_personalized_kinds_stale`
helper that builds a PersonalizedPlaylistManager with `deps=None` (only
DB access is needed for the flag update — no generator dispatch). Each
call is wrapped in try/except so a flag failure can never abort the
scan itself.
Tests:
- new `TestStaleFlag` class in `test_personalized_manager.py` (6
tests): default-false, single-kind flip, multi-kind, profile
scoping, refresh-clears, empty-list noop.
- two new pipeline tests pin the auto-refresh dispatch:
`test_stale_snapshot_auto_refreshes_even_without_refresh_first`
and `test_non_stale_snapshot_skips_refresh`.
- existing stub-manager `SimpleNamespace` returns gained
`is_stale=False` so the new attribute read doesn't AttributeError.
Full suite: 3391 pass.
User-facing WHATS_NEW entry added under 2.5.2 (above the prior
pipeline auto-sync entry) describing the auto-refresh behavior.
The action was registered + the block declared, but the automation
builder's per-action config renderer didn't have a case for
`personalized_pipeline` so users only saw the bare card with the
generic delay-minutes input — no way to select which playlists to
sync. This commit adds the multi-select picker.
Backend:
- `core/personalized/api.list_kinds(manager=...)` now optionally
takes a manager and includes the resolved variant list per kind
(calls each spec's variant_resolver(deps) when present). Singleton
kinds get an empty `variants` list. Variant-bearing kinds
(time_machine / genre_playlist / daily_mix / seasonal_mix) get
their full enumerated set.
- `web_server.py` `/api/personalized/kinds` route now passes a built
manager so the variants list lands in the response.
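A simplified sketch of the variant enumeration described above.
`KindSpecSketch` and `list_kinds_sketch` are hypothetical stand-ins,
and passing `deps` directly (rather than a manager) is a shortcut taken
for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class KindSpecSketch:
    kind: str
    # Present only on variant-bearing kinds; receives the deps bundle.
    variant_resolver: Optional[Callable[[Any], List[str]]] = None


def list_kinds_sketch(specs: List[KindSpecSketch], deps: Any = None) -> List[Dict[str, Any]]:
    """Return one dict per kind; singletons get an empty variants list."""
    result = []
    for spec in specs:
        variants: List[str] = []
        if deps is not None and spec.variant_resolver is not None:
            variants = list(spec.variant_resolver(deps))
        result.append({"kind": spec.kind, "variants": variants})
    return result


specs = [
    KindSpecSketch("hidden_gems"),
    KindSpecSketch("time_machine", lambda deps: ["1980s", "1990s"]),
]
with_deps = list_kinds_sketch(specs, deps=object())
without_deps = list_kinds_sketch(specs)
```

Without deps the resolver is never called, so the older "kinds only"
response shape stays available to callers that don't build a manager.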
Frontend:
- `webui/static/stats-automations.js` `_renderBlockConfigFields`
gains a `personalized_pipeline` branch that renders a scrollable
multi-select picker:
- Singletons (Hidden Gems, Discovery Shuffle, Popular Picks,
Fresh Tape, The Archives) = one checkbox row per kind
- Variant kinds = a section header + one checkbox row per variant
(e.g. Time Machine: 1960s/1970s/.../2020s; Seasonal: halloween/
christmas/valentines/summer/spring/autumn)
- Pre-checks rows that match the existing `kinds` config on edit
- New `_autoLoadPersonalizedKinds(slotKey)` fetches `/api/personalized/kinds`
(cached after first load), renders the picker DOM, and pre-checks
saved selections via `data-kind` / `data-variant` attributes on
the checkboxes.
- `_renderBuilderCanvas` calls the loader for any `cfg-*-kinds-picker`
it finds in the freshly-rendered slots.
- The save-time `_collectActionConfig` walks the picker's checked
inputs (matched by `data-kind` attribute) and emits
`{kinds: [{kind, variant?}, ...], refresh_first, skip_wishlist}`
in the same shape the handler expects.
Tests:
- `tests/automation/test_automation_blocks.py::_FIELD_TYPES` adds
'personalized_playlist_select' so the block-shape regression test
accepts the new field type. (Test was failing because it whitelists
every field type used across all blocks.)
- 189 automation + personalized API tests pass; full suite intact.
Follow-up to the personalized-playlists standardization PR. New
`personalized_pipeline` automation action syncs selected discover-
page playlists (Hidden Gems / Discovery Shuffle / Time Machine /
Genre / Daily Mix / Fresh Tape / The Archives / Seasonal Mix) to
the active media server + queues missing tracks for download.
Same pattern as the existing mirrored `playlist_pipeline`, but it
skips two of the four mirrored phases: no REFRESH (no external source
to re-pull) and no DISCOVER (manager-backed snapshots are already
metadata-matched). Pipeline shape:
SNAPSHOT → SYNC → WISHLIST
Where SNAPSHOT either reads the persisted track list from
`PersonalizedPlaylistManager` (default) or refreshes it first when
`refresh_first=true` (cron use case: regenerate Hidden Gems nightly
and sync the fresh set).
Shared helper extraction:
PHASE 3 (SYNC loop) + PHASE 4 (WISHLIST tail) lifted out of mirrored
`playlist_pipeline` into `core/automation/handlers/_pipeline_shared.py`
as `run_sync_and_wishlist(deps, automation_id, playlists, sync_one_fn,
sync_id_for_fn, ...)`. Both pipelines call it. Mirrored injects
`auto_sync_playlist` as the per-playlist sync function; personalized
injects a thin wrapper that launches `_run_sync_task` directly with
a pre-built tracks_json. Same sync-state polling / progress emission
/ status counting / wishlist trigger logic — 0 duplication.
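The injection pattern can be sketched roughly as below. This is not
the real `run_sync_and_wishlist`: the trimmed signature, the status
strings, the `sync_states` dict layout, and the count keys are all
assumptions made to keep the example self-contained.

```python
from typing import Any, Callable, Dict, List


def run_sync_and_wishlist_sketch(
    playlists: List[Dict[str, Any]],
    sync_one_fn: Callable[[Dict[str, Any]], Dict[str, Any]],
    sync_id_for_fn: Callable[[Dict[str, Any]], str],
    sync_states: Dict[str, Dict[str, Any]],
    trigger_wishlist: Callable[[], None],
    skip_wishlist: bool = False,
    max_polls: int = 50,
    sleep_fn: Callable[[], None] = lambda: None,
) -> Dict[str, int]:
    counts = {"synced": 0, "errors": 0, "skipped": 0}
    for playlist in playlists:
        # The per-playlist sync function is the only pipeline-specific part.
        if sync_one_fn(playlist).get("status") != "started":
            counts["skipped"] += 1
            continue
        sync_id = sync_id_for_fn(playlist)
        final = "error"  # a sync that never reaches a terminal state counts as an error
        for _ in range(max_polls):
            status = sync_states.get(sync_id, {}).get("status")
            if status in ("finished", "error"):
                final = status
                break
            sleep_fn()
        counts["synced" if final == "finished" else "errors"] += 1
    if not skip_wishlist:
        trigger_wishlist()
    return counts


sync_states: Dict[str, Dict[str, Any]] = {}
wishlist_calls: List[str] = []


def fake_sync_one(playlist: Dict[str, Any]) -> Dict[str, Any]:
    if not playlist["tracks"]:
        return {"status": "skipped"}
    # A real sync would finish in a background thread; here it is instant.
    sync_states[f"auto_{playlist['id']}"] = {"status": "finished"}
    return {"status": "started"}


counts = run_sync_and_wishlist_sketch(
    playlists=[{"id": 1, "tracks": ["a"]}, {"id": 2, "tracks": []}],
    sync_one_fn=fake_sync_one,
    sync_id_for_fn=lambda p: f"auto_{p['id']}",
    sync_states=sync_states,
    trigger_wishlist=lambda: wishlist_calls.append("run"),
)
```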
Files added:
- core/automation/handlers/_pipeline_shared.py
- core/automation/handlers/personalized_pipeline.py
- tests/automation/test_handlers_personalized_pipeline.py
Files changed:
- core/automation/handlers/playlist_pipeline.py: PHASE 3+4 replaced
with shared helper call (~100 lines deleted, 1 helper invocation
added; behavior identical).
- core/automation/deps.py: new `build_personalized_manager` field
(lazy builder so the pipeline gets a fresh PersonalizedPlaylistManager
per run).
- core/automation/handlers/__init__.py + registration.py: register
`personalized_pipeline` action with the shared `pipeline_running`
guard so it can't overlap mirrored.
- core/automation/blocks.py: new `personalized_pipeline` block
declaration with config_fields (kinds multi-select, refresh_first,
skip_wishlist).
- web_server.py: thread `_build_personalized_manager` into
AutomationDeps construction.
- All 5 automation test fixtures: `_build_deps` adds
`build_personalized_manager=lambda: None` stub.
- tests/automation/test_handler_registration.py:
EXPECTED_ACTION_NAMES + EXPECTED_GUARDED_ACTIONS gain
`personalized_pipeline`.
Trigger schema:
{
"_automation_id": "...",
"kinds": [
{"kind": "hidden_gems"},
{"kind": "time_machine", "variant": "1980s"},
{"kind": "seasonal_mix", "variant": "halloween"}
],
"refresh_first": false,
"skip_wishlist": false
}
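One possible way to validate the `kinds` list from the schema above.
`normalize_kinds` is a hypothetical helper; the handler's real
validation may be stricter or shaped differently. The empty-string
variant for singletons mirrors the storage convention described in the
schema commit.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def normalize_kinds(config: Dict[str, Any]) -> Optional[List[Tuple[str, str]]]:
    """Return (kind, variant) pairs, or None when the config is unusable."""
    kinds = config.get("kinds")
    if not isinstance(kinds, list) or not kinds:
        return None  # caller turns this into an error result
    pairs = []
    for entry in kinds:
        if not isinstance(entry, dict):
            continue  # skip invalid entries instead of failing the whole run
        kind = entry.get("kind")
        if not isinstance(kind, str) or not kind:
            continue
        pairs.append((kind, entry.get("variant") or ""))
    return pairs


trigger = json.loads("""
{
  "kinds": [
    {"kind": "hidden_gems"},
    {"kind": "time_machine", "variant": "1980s"},
    {"kind": "seasonal_mix", "variant": "halloween"}
  ],
  "refresh_first": false,
  "skip_wishlist": false
}
""")
pairs = normalize_kinds(trigger)
```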
Tests (14 new, 178 automation total):
- _track_to_sync_shape: basic shape, source ID fallback chain,
no-id returns empty string
- empty config / non-list kinds / empty kinds list all return
error + clear pipeline_running flag
- _build_payloads_for_kinds: skips invalid entries, skips kinds
with no tracks, refresh_first vs ensure dispatch, payload shape
+ sync_id format, manager exception swallowed continues
- _sync_personalized_playlist: launches background thread + returns
status='started'
- happy path: stubbed sync_states drives helper to completion, flag
cleaned up
Full suite: 3383 passed.
Note: the trigger UI block declares config_fields but the frontend
doesn't yet render the `personalized_playlist_select` multi-select
type — usable today via API; polished UI ships in a follow-up
frontend PR.
Adds the first quality feature on top of the manager: when
`config.exclude_recent_days > 0`, the manager drops any track from
the generator's output whose primary id was served by this kind
for this profile in the last N days.
Lives at the manager layer, not in each generator, so:
- generators stay focused on selection logic
- staleness behavior stays uniform across every kind
- enabling/disabling per playlist is just a config patch
Implementation:
- New `PersonalizedPlaylistManager._apply_quality_filters` runs after
generator returns, before `_persist_snapshot`.
- Reads recent ids via existing `recent_track_ids` accessor.
- Tracks without a primary id pass through unchanged (nothing to
dedupe on -- happens for sourceless tracks during edge cases).
- Returns a new list (never mutates input).
Default `exclude_recent_days = 0` preserves pre-overhaul behavior.
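The filter's behavior can be sketched in a few lines. Tracks are plain
dicts here and `primary_id` is an assumed key name; the real method
works on `Track` objects and reads ids through `recent_track_ids`.

```python
from typing import Any, Dict, List, Set


def apply_recent_filter(
    tracks: List[Dict[str, Any]],
    recent_ids: Set[str],
    exclude_recent_days: int,
) -> List[Dict[str, Any]]:
    """Drop tracks served recently; always returns a new list."""
    if exclude_recent_days <= 0:
        return list(tracks)  # filter disabled, default behavior
    kept = []
    for track in tracks:
        primary_id = track.get("primary_id")
        # No primary id means nothing to dedupe on, so the track passes.
        if not primary_id or primary_id not in recent_ids:
            kept.append(track)
    return kept


tracks = [
    {"primary_id": "t1", "name": "served last week"},
    {"primary_id": "t2", "name": "new"},
    {"primary_id": None, "name": "sourceless"},
]
unfiltered = apply_recent_filter(tracks, {"t1"}, 0)
filtered = apply_recent_filter(tracks, {"t1"}, 7)
```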
Per-playlist override via `PUT /api/personalized/playlist/<kind>/config`
with `{"exclude_recent_days": N}`. Recommended values:
- Discovery Shuffle: 1-3 days (high churn desired)
- Hidden Gems: 7-14 days (avoid same gems weekly)
- Time Machine / Genre: 30+ days (slow rotation OK, stable view preferred)
4 new boundary tests:
- Zero days = no filter (default behavior preserved)
- Positive days drops tracks served in window
- Filter preserves new tracks alongside dropped ones
- Tracks without primary id pass through unchanged
3369 tests pass total.
Note: listening-history cross-ref + seeded shuffle are deferred to
a future PR. Each requires deeper integration -- listening history
needs a play-events table the discovery pool can query against;
seeded shuffle needs the legacy generators to accept a seed param
without breaking their existing diversity / popularity logic.
Wraps the manager + generator dispatch behind one HTTP surface so
the UI can drop the patchwork `/api/discover/personalized/*` calls
in favor of a single REST shape. Legacy endpoints stay alive for
backward compat during the UI migration window.
New endpoints:
- GET /api/personalized/kinds — list every registered kind + metadata
- GET /api/personalized/playlists — list every persisted playlist for the active profile
- GET /api/personalized/playlist/<kind> — fetch singleton + tracks
- GET /api/personalized/playlist/<kind>/<variant> — fetch variant + tracks
- POST /api/personalized/playlist/<kind>/refresh — regenerate singleton
- POST /api/personalized/playlist/<kind>/<variant>/refresh — regenerate variant
- PUT /api/personalized/playlist/<kind>/config — patch singleton config
- PUT /api/personalized/playlist/<kind>/<variant>/config — patch variant config
Per-call manager construction wires the deps each generator needs:
- database (MusicDatabase singleton)
- service (PersonalizedPlaylistsService for legacy generator calls)
- seasonal_service (SeasonalDiscoveryService for seasonal_mix)
- get_current_profile_id (active profile accessor)
- get_active_discovery_source (source dispatcher)
API handlers themselves live as pure functions in
`core/personalized/api.py` so they're testable without Flask. The
Flask layer in `web_server.py` is a thin parse-body / call-handler /
jsonify wrapper.
11 new boundary tests (122 personalized total):
- list_kinds enumerates registry, exposes default config + tags
- list_playlists returns empty list when none exist, serializes
PlaylistRecord shape correctly
- get_playlist_with_tracks auto-creates on first access, returns
persisted tracks, raises ValueError on unknown kind
- refresh_playlist runs generator and returns track snapshot,
forwards config_overrides to the generator
- update_config patches stored config
3365 tests pass total. Manager construction triggers generator
registration via `from core.personalized import generators` import
side-effect.
Begins the standardization of the personalized-playlist subsystem.
Pre-existing state was a patchwork: Group A (Fresh Tape / Archives /
Seasonal Mix) lived in `discovery_curated_playlists` and
`curated_seasonal_playlists` with inconsistent shapes; Group B
(Hidden Gems / Discovery Shuffle / Time Machine / Popular Picks /
Genre / Daily Mixes) was computed on-demand by
`PersonalizedPlaylistsService` with no persistence -- every call
reran the generator with `ORDER BY RANDOM()` so results rotated.
Post-overhaul (this PR) every personalized playlist lands in one
unified storage layer with stable identity, persistent track lists,
explicit refresh, and per-playlist user-tweakable config.
Foundation in this commit (no behavior change yet):
- `database/personalized_schema.py`: 3 tables created idempotently
at app startup (wired into `MusicDatabase._initialize_database`).
- `personalized_playlists`: one row per (profile, kind, variant)
with config_json, track_count, last_generated_at,
last_synced_at, last_generation_source, last_generation_error.
Variant '' (empty string) for singletons; non-empty for
time_machine / seasonal_mix / genre_playlist / daily_mix.
- `personalized_playlist_tracks`: current snapshot per playlist.
Atomically replaced on refresh.
- `personalized_track_history`: append-only log powering the
`exclude_recent_days` config knob.
- `core/personalized/types.py`: `Track`, `PlaylistConfig`,
`PlaylistRecord` dataclasses. `PlaylistConfig.merged()` for
partial-update PATCH semantics; `Track.from_dict()` accepts the
legacy generator output shape unchanged.
- `core/personalized/specs.py`: `PlaylistKindSpec` (kind,
name_template, default_config, generator, variant_resolver) and a
module-level registry. Generators register at import time;
manager dispatches by kind.
- `core/personalized/manager.py`: `PersonalizedPlaylistManager` --
the only thing that touches the new tables. Owns:
- ensure_playlist (auto-create row from kind defaults)
- get_playlist / list_playlists
- refresh_playlist (atomic snapshot replace; generator exception
preserves previous good snapshot + records error on row)
- get_playlist_tracks
- update_config (deep-merge with stored config, including extra dict)
- recent_track_ids (staleness lookup for generators)
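A sketch of the "atomic replace, keep the last good snapshot on
failure" behavior, using `sqlite3` transactions. The two-table schema
below is a cut-down, hypothetical version of the real tables, and
`refresh_snapshot` is an illustrative name.

```python
import sqlite3
from typing import Callable, Iterable, List


def refresh_snapshot(conn: sqlite3.Connection, playlist_id: int,
                     generator: Callable[[], Iterable[str]]) -> bool:
    try:
        track_ids: List[str] = list(generator())
    except Exception as exc:
        # Generator failed: record the error, leave the old snapshot alone.
        conn.execute(
            "UPDATE playlists SET last_generation_error = ? WHERE id = ?",
            (str(exc), playlist_id),
        )
        conn.commit()
        return False
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM playlist_tracks WHERE playlist_id = ?", (playlist_id,))
        conn.executemany(
            "INSERT INTO playlist_tracks (playlist_id, position, track_id) VALUES (?, ?, ?)",
            [(playlist_id, pos, tid) for pos, tid in enumerate(track_ids)],
        )
        conn.execute(
            "UPDATE playlists SET track_count = ?, last_generation_error = NULL WHERE id = ?",
            (len(track_ids), playlist_id),
        )
    return True


def failing_generator() -> Iterable[str]:
    raise RuntimeError("source offline")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playlists (id INTEGER PRIMARY KEY, track_count INTEGER, last_generation_error TEXT)")
conn.execute("CREATE TABLE playlist_tracks (playlist_id INTEGER, position INTEGER, track_id TEXT)")
conn.execute("INSERT INTO playlists (id, track_count) VALUES (1, 0)")
conn.commit()

first_ok = refresh_snapshot(conn, 1, lambda: ["a", "b"])
second_ok = refresh_snapshot(conn, 1, failing_generator)
snapshot = [r[0] for r in conn.execute(
    "SELECT track_id FROM playlist_tracks WHERE playlist_id = 1 ORDER BY position")]
recorded_error = conn.execute(
    "SELECT last_generation_error FROM playlists WHERE id = 1").fetchone()[0]
```

Running the generator before opening the transaction is what keeps the
previous snapshot intact: nothing is deleted until a full replacement
list exists.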
35 boundary tests in `tests/test_personalized_manager.py` pin every
shape: config round-trip / merge semantics / extra deep-merge /
defaults; Track.from_dict tolerance + primary_id fallback chain;
registry dedup / display_name with+without variant; manager
ensure_playlist auto-create + idempotency, variant separation,
required-variant enforcement, unknown-kind error; refresh persists
+ replaces atomically + survives generator exception with previous
snapshot intact + records source from first track + round-trips
nested track_data_json; update_config patch semantics; list_playlists
profile scoping; staleness history scoped to (profile, kind, days).
3304 tests pass total. Generators ship in subsequent commits on this
branch -- each kind migrated one at a time with its own per-kind
boundary tests.
Per-handler boundary tests pin each handler's body in isolation.
This commit adds engine-boundary tests that pin the REGISTRATION layer:
- every expected action name registered, no drops, no extras
- guarded actions register a guard, unguarded ones don't
- every registered handler is callable
- every guard returns a bool
- all four progress callbacks registered in the right slots
- progress_init / progress_finish / record_history / on_library_scan_completed
are invocable through the engine's stored callable shape (not just
the bare extracted function)
- finish callback respects _manages_own_progress flag at the engine
boundary too
- library_scan_completed wiring registers a callback on the scan
manager and that callback fires engine.emit when invoked
- every handler returns a `{'status': ...}` dict on a minimal config
trigger -- proves no handler raises into the engine, even when its
guard / short-circuit / error path is the one taken
Uses a minimal _RecordingEngine that captures registrations + a
_RecordingScanMgr that captures completion callbacks. No real
AutomationEngine, no real Flask app, no real DB. The kettui standard
for refactor PRs: don't ship a "behavior preserved" claim that's only
validated at the function boundary -- exercise the engine seam too.
EXPECTED_ACTION_NAMES + EXPECTED_GUARDED_ACTIONS frozen sets at the
top: any future drift (rename / drop / add a handler / change which
ones are guarded) fails this test immediately so refactor PRs can't
quietly mutate the registration shape.
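The frozen-set drift check could look something like the sketch below.
The engine method name `register_action_handler`, the three-action
subset, and the choice of `run_script` as the unguarded example are
assumptions for illustration only.

```python
from typing import Callable, Dict, FrozenSet, Optional

# Small illustrative subset; the real sets list every registered action.
EXPECTED_ACTION_NAMES = frozenset({"playlist_pipeline", "personalized_pipeline", "run_script"})
EXPECTED_GUARDED_ACTIONS = frozenset({"playlist_pipeline", "personalized_pipeline"})


class RecordingEngine:
    """Captures registrations instead of running anything."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable] = {}
        self.guards: Dict[str, Callable[[], bool]] = {}

    # Hypothetical method name; the real engine API may differ.
    def register_action_handler(self, name: str, handler: Callable,
                                guard: Optional[Callable[[], bool]] = None) -> None:
        self.handlers[name] = handler
        if guard is not None:
            self.guards[name] = guard


def register_all_sketch(engine: RecordingEngine) -> None:
    pipeline_running = {"value": False}
    guard = lambda: pipeline_running["value"]  # shared by both pipelines
    handler = lambda config: {"status": "completed"}
    engine.register_action_handler("playlist_pipeline", handler, guard)
    engine.register_action_handler("personalized_pipeline", handler, guard)
    engine.register_action_handler("run_script", handler)


def registration_drift(engine: RecordingEngine) -> Dict[str, FrozenSet[str]]:
    """Empty sets everywhere means the registration shape is unchanged."""
    registered = frozenset(engine.handlers)
    return {
        "missing": EXPECTED_ACTION_NAMES - registered,
        "extra": registered - EXPECTED_ACTION_NAMES,
        "guard_mismatch": frozenset(engine.guards) ^ EXPECTED_GUARDED_ACTIONS,
    }


engine = RecordingEngine()
register_all_sketch(engine)
clean = registration_drift(engine)
engine.register_action_handler("surprise_action", lambda config: {"status": "completed"})
drifted = registration_drift(engine)
```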
13 new tests, 164 automation tests pass total.
Cleans up the four remaining inline callbacks at the bottom of
`web_server._register_automation_handlers` so the function is now
purely deps-construction + register_all + a logger.info line.
Lifted:
- `_progress_init`, `_progress_finish`, `_record_automation_history`,
and `_on_library_scan_completed` -> core/automation/handlers/progress_callbacks.py
Each is a top-level function that takes deps as a parameter; the
engine sees thin lambdas through `register_progress_callbacks` /
`register_library_scan_completed_emitter` (called from `register_all`).
Two new deps fields:
- `init_automation_progress` (delegates into the live progress tracker)
- `record_progress_history` (delegates into _auto_progress.record_history)
12 new boundary tests in tests/automation/test_progress_callbacks.py
pin every shape:
- progress_init forwards to init_automation_progress
- progress_finish skips when handler manages its own progress
(prevents double-emit of finished status)
- progress_finish: completed -> finished/Complete/success;
error -> error/Error/error; msg falls through error -> reason ->
status -> 'done'
- record_history threads the live db into the recorder
- on_library_scan_completed: no engine = noop, server type taken
from web_scan_manager._current_server_type, defaults to 'unknown'
- register_library_scan_completed_emitter: no scan manager = noop,
registered callback emits the right event when invoked
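One reading of the progress_finish mapping, as a sketch. The output
keys (`status`, `phase`, `log_type`, `message`), the location of the
`_manages_own_progress` flag on the result dict, and the treatment of
every non-error status as "finished" are assumptions; only the
value triples and the message fall-through order come from the notes
above.

```python
from typing import Any, Dict, Optional


def progress_finish_update(result: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Build the final progress update, or None when the handler emits its own."""
    if result.get("_manages_own_progress"):
        return None  # avoid a second "finished" emit
    # Message falls through: error -> reason -> status -> 'done'.
    message = result.get("error") or result.get("reason") or result.get("status") or "done"
    if result.get("status") == "error":
        return {"status": "error", "phase": "Error", "log_type": "error", "message": message}
    return {"status": "finished", "phase": "Complete", "log_type": "success", "message": message}


completed = progress_finish_update({"status": "completed"})
errored = progress_finish_update({"status": "error", "error": "disk full"})
self_managed = progress_finish_update({"status": "completed", "_manages_own_progress": True})
empty = progress_finish_update({})
```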
3256 tests pass, no regression.
Final state of `_register_automation_handlers`:
- Was: 1530 lines, 21 nested closures + 4 progress callbacks
- Now: ~50 lines, builds AutomationDeps and calls register_all
web_server.py: 34,220 -> 34,187 lines (-33 net, -1,406 across the
whole branch).
Final commit of the automation-handler refactor. With this commit
every closure that used to live in
`web_server._register_automation_handlers` is now a top-level
function in `core/automation/handlers/`.
Handlers extracted in this commit:
- start_database_update + deep_scan_library
-> core/automation/handlers/database_update.py
Both share the db_update_state monitoring pattern (poll until
status flips, stall detection emits warning at 10 min, 2-hour
outer timeout). Lifted into a shared `_run_with_progress` helper
inside the module so the per-handler bodies stay tiny.
- run_duplicate_cleaner -> core/automation/handlers/duplicate_cleaner.py
- start_quality_scan -> core/automation/handlers/quality_scanner.py
- clear_quarantine, cleanup_wishlist, update_discovery_pool,
backup_database, refresh_beatport_cache
-> core/automation/handlers/maintenance.py
Grouped because each body is short (~20-50 lines) and they share
no state — splitting into per-handler files would just add import
noise.
- clean_search_history, clean_completed_downloads, full_cleanup
-> core/automation/handlers/download_cleanup.py
Grouped because all three reach the download orchestrator,
tasks_lock, and download_batches/download_tasks accessors. The
full_cleanup multi-step orchestration shares phase-detection
logic with clean_completed_downloads.
- run_script -> core/automation/handlers/run_script.py
- search_and_download -> core/automation/handlers/search_and_download.py
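The poll / stall-warning / outer-timeout pattern shared by the two
database_update handlers can be sketched with an injectable clock. The
function name, the `"running"` status string, and the `progress` key
are assumptions; only the 10-minute and 2-hour figures come from the
notes above.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_with_progress_sketch(
    get_state: Callable[[], Dict[str, Any]],
    now: Callable[[], float],
    sleep: Callable[[float], None],
    poll_interval: float = 5.0,
    stall_after: float = 600.0,   # warn after 10 min without progress
    timeout: float = 7200.0,      # give up after 2 hours
) -> Tuple[str, List[str]]:
    start = now()
    last_progress: Any = object()  # sentinel: first poll always counts as a change
    last_change = start
    stall_warned = False
    warnings: List[str] = []
    while True:
        state = get_state()
        if state.get("status") != "running":
            return state.get("status", "unknown"), warnings
        current = now()
        if current - start >= timeout:
            return "timeout", warnings
        if state.get("progress") != last_progress:
            last_progress = state.get("progress")
            last_change = current
            stall_warned = False
        elif not stall_warned and current - last_change >= stall_after:
            warnings.append(f"no progress for {int(current - last_change)}s")
            stall_warned = True
        sleep(poll_interval)


class FakeClock:
    """Deterministic clock so the example runs instantly."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


clock = FakeClock()
# Stuck at 10% until the fake clock reaches 700s, then the status flips.
stalled_status, stalled_warnings = run_with_progress_sketch(
    lambda: {"status": "running", "progress": 10} if clock.t < 700 else {"status": "finished"},
    clock.now, clock.sleep, poll_interval=60.0,
)

clock2 = FakeClock()
# Progress keeps moving but the job never ends: the outer timeout fires.
timeout_status, timeout_warnings = run_with_progress_sketch(
    lambda: {"status": "running", "progress": clock2.t},
    clock2.now, clock2.sleep, poll_interval=60.0, timeout=300.0,
)
```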
`AutomationDeps` grew with the new dependency surface:
- get_db_update_state + db_update_lock + db_update_executor +
run_db_update_task + run_deep_scan_task
- get_duplicate_cleaner_state + duplicate_cleaner_lock +
duplicate_cleaner_executor + run_duplicate_cleaner
- get_quality_scanner_state + quality_scanner_lock +
quality_scanner_executor + run_quality_scanner
- download_orchestrator + run_async + tasks_lock +
get_download_batches + get_download_tasks +
sweep_empty_download_directories + get_staging_path
- docker_resolve_path + get_current_profile_id +
get_watchlist_scanner + get_app + get_beatport_data_cache
- set_db_update_automation_id (writes the legacy global so the live
DB-update progress callbacks still living in web_server.py keep
emitting against the active automation card)
`web_server._register_automation_handlers` is now ~50 lines: build
deps once, call register_all. The 667-line block of remaining
closure definitions and engine register calls is gone.
The final orphan was the `_db_update_automation_id` module global —
the DB-update progress callbacks at line ~14080 still read it
directly, so the extracted database_update handler propagates the
automation id through `deps.set_db_update_automation_id` (a closure
in web_server that writes the global). When the legacy callbacks
get extracted in a future PR the setter goes away.
Tests:
- tests/automation/test_handlers_maintenance.py adds 21 boundary
tests covering every newly-extracted handler shape: guard
short-circuits (already-running returns skipped), deps wiring
(set_db_update_automation_id called with the right id),
exception swallow contract, status returns, path-traversal
blocked in run_script, source-mode skip in clean_search_history,
active-batch skip in clean_completed_downloads, etc.
- 3244 tests pass (was 3223 — 21 new), no regression.
web_server.py: 35,593 -> 34,220 lines (-1,373 net across 3 commits).
Issue #1 from the extraction punch list is now COMPLETE.
Continues the lift from `web_server._register_automation_handlers`.
This commit extracts the four playlist-lifecycle closures:
- `refresh_mirrored` -> core/automation/handlers/refresh_mirrored.py
- `sync_playlist` -> core/automation/handlers/sync_playlist.py
- `discover_playlist` -> core/automation/handlers/discover_playlist.py
- `playlist_pipeline` -> core/automation/handlers/playlist_pipeline.py
The pipeline composes refresh + sync + discover, so all four ship
together. The pipeline imports the other three handler modules
directly (cross-handler call) instead of going through the engine,
preserving the "single trigger from the user's perspective" UX.
`AutomationDeps` grew to cover the new dependency surface:
- run_playlist_discovery_worker, run_sync_task, load_sync_status_file
(pre-existing background-task entry points)
- get_deezer_client, parse_youtube_playlist (per-source clients)
- get_sync_states (live mutable accessor for the sync UI's state dict)
`web_server._register_automation_handlers` now wires those plus the
existing infrastructure into a single `AutomationDeps` and calls
`register_all`. The 669-line block of closure definitions and engine
register calls (lines 959-1627 pre-edit) is gone -- the file shed
743 lines net on this commit.
`tests/automation/test_handlers_playlist.py` adds 17 new boundary
tests:
- discover_playlist: no_id error, specific_id starts worker, all=True
enumerates, no playlists in db
- refresh_mirrored: error path, source filter (file/beatport excluded),
Spotify happy path with auto-discovered marker, per-playlist
exception captured into errors counter
- sync_playlist: no_id, not_found, no_tracks, no-discovered-tracks
skip, discovered-track happy path, unchanged-since-last-sync skip
- playlist_pipeline: no_playlist clears running flag, no-refreshable
clears running flag, exception clears running flag
3223 tests pass. web_server.py: 35,593 -> 34,850 lines (743 removed).
Begins the lift of `web_server._register_automation_handlers` (1530
lines, 20 nested closures) into `core/automation/handlers/`. Each
extracted handler is a top-level function that accepts
`(config, deps)` instead of reaching for module-level globals --
makes them unit-testable in isolation.
Infrastructure:
- `core/automation/deps.py`: `AutomationDeps` (dependency-injection
bundle of clients + callables) and `AutomationState` (mutable flags
shared across handler invocations, with thread-safe accessors).
- `core/automation/handlers/__init__.py` + `registration.py`: one-stop
`register_all(deps)` that wires every extracted handler to the
engine.
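A stripped-down sketch of the `(config, deps)` handler shape with a
lock-protected state flag. `AutomationStateSketch`, `DepsSketch`, the
flag name, and the result keys are hypothetical; the real
`AutomationDeps` / `AutomationState` carry far more fields.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict


class AutomationStateSketch:
    """Mutable flags shared across handler calls, guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._flags: Dict[str, bool] = {}

    def try_set(self, name: str) -> bool:
        """Atomically claim a flag; False if it was already set."""
        with self._lock:
            if self._flags.get(name):
                return False
            self._flags[name] = True
            return True

    def clear(self, name: str) -> None:
        with self._lock:
            self._flags[name] = False

    def is_set(self, name: str) -> bool:
        with self._lock:
            return bool(self._flags.get(name))


@dataclass
class DepsSketch:
    state: AutomationStateSketch
    run_scan: Callable[[], int]  # injected instead of a module-level global


def scan_library_sketch(config: Dict[str, Any], deps: DepsSketch) -> Dict[str, Any]:
    if not deps.state.try_set("scan_running"):
        return {"status": "skipped", "reason": "already running"}
    try:
        return {"status": "completed", "scanned": deps.run_scan()}
    except Exception as exc:
        return {"status": "error", "error": str(exc)}  # never raise into the engine
    finally:
        deps.state.clear("scan_running")  # cleanup even on exceptions


def broken_scan() -> int:
    raise RuntimeError("server offline")


state = AutomationStateSketch()
ok_result = scan_library_sketch({}, DepsSketch(state, lambda: 42))
error_result = scan_library_sketch({}, DepsSketch(state, broken_scan))
flag_after_error = state.is_set("scan_running")
state.try_set("scan_running")  # simulate a scan already in flight
skipped_result = scan_library_sketch({}, DepsSketch(state, lambda: 42))
```

Because every collaborator arrives through `deps`, a test can swap in
a lambda or a stub without importing the web server module.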
First batch of handlers extracted:
- `process_wishlist` -> `core/automation/handlers/process_wishlist.py`
- `scan_watchlist` -> `core/automation/handlers/scan_watchlist.py`
- `scan_library` -> `core/automation/handlers/scan_library.py`
`web_server._register_automation_handlers` now builds the deps once
and calls `register_all(deps)` for the extracted batch. Remaining
17 closures still live below; subsequent commits in this branch
finish the lift.
14 boundary tests in `tests/automation/test_handlers_simple.py` pin
every shape: success path, exception swallow contract, fresh-vs-stale
state detection (scan_watchlist's id() trick), guard short-circuits,
state cleanup on exceptions, AutomationState concurrent-safe accessors.
All 101 automation tests pass; no regression.
Issue #607 (AfonsoG6) -- two AcoustID problems:
1. Live recordings false-quarantining as "Version mismatch: expected
'... (Live at Venue)' (live) but file is '...' (original)" because
MusicBrainz often stores the recording entity with a bare title --
the venue / live annotation lives on the release entity, not the
recording. The audio fingerprint correctly identifies the live
recording, but the title-text comparison flagged it as wrong.
New pure helper `core/matching/version_mismatch.py:is_acceptable_version_mismatch`
accepts the mismatch only when:
- One-sided AND involves 'live': exactly one side is 'live' and
the other is bare 'original'. Two-sided mismatches stay strict.
- Fingerprint score >= 0.85 (stricter than the existing 0.80
minimum -- escape valve only fires when AcoustID is more
confident than its own threshold).
- Bare title similarity >= 0.70.
- Artist similarity >= 0.60.
Other version markers (instrumental, remix, acoustic, demo, etc)
stay strict -- those have distinct fingerprints AND MB always
annotates them in the recording title. The existing
test_acoustid_version_mismatch.py suite passes unchanged.
2. Audio-mismatch failure message reported "identified as '' by ''
(artist=100%)" when AcoustID returned multiple recordings -- prior
code mixed `recordings[0]`'s strings (which can be empty) with
`best_rec`'s scores. Now uses `matched_title` / `matched_artist`
consistently in both the high-confidence-skip path and the final
fail message.
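The acceptance rule from item 1 can be sketched as a pure function.
The parameter names and the function name below are assumptions (the
real signature of `is_acceptable_version_mismatch` is not shown here);
the thresholds are the defaults quoted above.

```python
def acceptable_version_mismatch_sketch(
    expected_version: str,
    actual_version: str,
    fingerprint_score: float,
    title_similarity: float,
    artist_similarity: float,
    min_fingerprint: float = 0.85,
    min_title: float = 0.70,
    min_artist: float = 0.60,
) -> bool:
    if expected_version == actual_version:
        return True  # no mismatch to judge
    # Only the one-sided live case qualifies: one side 'live', the other
    # bare 'original'. Every other pairing stays strict.
    if {expected_version, actual_version} != {"live", "original"}:
        return False
    return (
        fingerprint_score >= min_fingerprint
        and title_similarity >= min_title
        and artist_similarity >= min_artist
    )


live_accepted = acceptable_version_mismatch_sketch("live", "original", 0.90, 0.80, 0.70)
reverse_accepted = acceptable_version_mismatch_sketch("original", "live", 0.90, 0.80, 0.70)
low_fingerprint = acceptable_version_mismatch_sketch("live", "original", 0.84, 0.80, 0.70)
instrumental = acceptable_version_mismatch_sketch("original", "instrumental", 0.99, 1.0, 1.0)
two_sided = acceptable_version_mismatch_sketch("live", "remix", 0.99, 1.0, 1.0)
```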
Issue #608 (AfonsoG6) -- quarantine modal:
3. Approve / Delete buttons silently no-op'd when the filename
contained an apostrophe -- the unescaped quote broke the inline JS
in the onclick handler. Now wraps the id via
`escapeHtml(JSON.stringify(id))`, which round-trips quotes /
backslashes / unicode / newlines safely through the HTML attribute
to JS string boundary.
4. Bonus UX: quarantine entry expanded view now shows source uploader
(username) and original soulseek filename when the sidecar carries
that context -- helps trace which uploader the bad file came from.
Backend exposes `source_username` + `source_filename` fields from
`sidecar.context.original_search_result`. Degrades to '' on legacy
thin sidecars.
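The fix in item 3 is frontend JavaScript, but the same two-layer
escaping idea can be demonstrated in Python with the stdlib: JSON-encode
first (so the value is a valid string literal), then HTML-escape (so
the literal survives inside an attribute). `to_attr` / `from_attr` are
illustrative names, not part of the codebase.

```python
import html
import json


def to_attr(value: str) -> str:
    """JSON-encode, then HTML-escape, mirroring escapeHtml(JSON.stringify(id))."""
    return html.escape(json.dumps(value), quote=True)


def from_attr(attr: str) -> str:
    """What the browser does in reverse: decode entities, then parse the literal."""
    return json.loads(html.unescape(attr))


tricky_id = 'Don\'t "Stop"\\live\n\u00e9.flac'
attr = to_attr(tricky_id)
round_tripped = from_attr(attr)
```

After both layers the attribute text contains no raw single or double
quotes, which is why an apostrophe in the filename can no longer end
the inline handler early.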
Tests:
- 23 new boundary tests in tests/matching/test_version_mismatch.py
pin every shape: equal versions trivial, one-sided live both
directions, threshold floors (each just below default -> reject),
two-sided strict, non-live one-sided strict (covers exact
test_instrumental_returned_for_vocal_request_fails scenario),
custom-threshold overrides.
- 4 existing test_acoustid_version_mismatch.py tests pass unchanged.
- 507 AcoustID / matching / imports tests pass.