Keep full refresh moving when post-clear VACUUM hits a transient disk I/O error, and retry clear_server_data once when the clear step itself sees the same transient SQLite failure.
Retry metadata cache maintenance writes once on transient disk I/O errors so first-attempt cache jobs do not fail when an immediate retry would succeed.
Tests cover best-effort VACUUM, clear retry behavior, and cache maintenance retry behavior.
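A minimal sketch of the retry-once and best-effort behavior described above. The helper names (`is_transient_disk_io_error`, `run_with_one_retry`, `best_effort`) are hypothetical and not the project's actual functions; matching on the "disk I/O error" message text is an assumption about how the transient case is detected.

```python
import sqlite3
from typing import Callable, TypeVar

T = TypeVar("T")


def is_transient_disk_io_error(exc: BaseException) -> bool:
    """Assumption: the transient case surfaces as OperationalError('disk I/O error')."""
    return isinstance(exc, sqlite3.OperationalError) and "disk i/o error" in str(exc).lower()


def run_with_one_retry(operation: Callable[[], T]) -> T:
    """Run operation; retry exactly once if the first attempt hits the transient error."""
    try:
        return operation()
    except sqlite3.OperationalError as exc:
        if not is_transient_disk_io_error(exc):
            raise
        return operation()


def best_effort(operation: Callable[[], object]) -> bool:
    """Run an optional step (e.g. VACUUM); report a transient failure instead of raising."""
    try:
        operation()
        return True
    except sqlite3.OperationalError as exc:
        if is_transient_disk_io_error(exc):
            return False
        raise
```

Non-transient errors still propagate in both helpers, so only the specific failure mode is tolerated.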
Ensure upgraded databases have the tracks.file_size and albums.api_track_count columns after all legacy migrations run. Add defensive repair paths for Jellyfin track imports and album track-count caching so stale schemas self-heal instead of dropping full-refresh track imports.
Tests cover legacy schema repair and api_track_count self-repair.
Add MusicBrainz watchlist artist ID storage, badges, linked-provider editing, and per-artist preferred source support.
Backfill watchlist MusicBrainz matches from already-enriched library artists so existing MusicBrainz worker matches appear in watchlist cards and settings.
Extend bulk watchlist add, liked artist matching, artist map source picking, and service status labels to recognize MusicBrainz, with regression tests for watchlist ID persistence and backfill.
Register MusicBrainz as a first-class metadata source alongside Deezer, iTunes, Spotify, Discogs, and Hydrabase. Expose the shared client through metadata services, add the settings option, and expand the MusicBrainz search adapter with source-compatible artist, album, track, and detail methods.
Carry MusicBrainz IDs through similar-artist discovery, recommended artists, artist map serialization, and personalized playlist selection. Update DB migrations and lookup filters so similar_artist_musicbrainz_id is preserved on older schemas and used for source requirements and library exclusion.
Normalize MusicBrainz album adapter output for import context and add regression coverage for registry mapping, typed album conversion, and similar-artist filtering. Verified by user with 120 focused tests passing.
Manual matches created from sync history are stored with the mirrored source, while wishlist and download flows later see the same track with the wishlist source or a provider source. Add a shared track-level lookup that falls back from exact source/id to source_track_id and then to title/artist, and use it for wishlist adds, cleanup, and download analysis so mapped tracks are not re-added or redownloaded.
Add coverage for mirrored-source matches being honored by wishlist cleanup and download batches, including the internal wishlist force-download path.
Show actionable missing album tracks in the enhanced library from canonical metadata, with a practical Manage flow for Add to Library or I Have This.
Implement I Have This as a non-destructive copy/import path: copy the chosen existing file, run normal post-processing with the missing track context, insert the real library row, and inherit album identity tags from target siblings so Navidrome does not split albums.
Improve the modal with selectable search results, visible import progress, disabled controls during import, and missing-track row styling.
- Artist cards, hero section, and enhanced view now show Amazon Music badges
when amazon_id is populated (AMAZON_LOGO_URL constant, orange #FF9900 brand)
- Enhanced view artist and album match status rows include amazon_match_status
chip with click-to-rematch via openManualMatchModal
- getServiceUrl: added amazon (album/track ASIN → music.amazon.com) and fixed
missing discogs entries; serviceLabels adds tidal/qobuz/amazon
- Enhanced view enhanced-artist-id-badges includes amazon_id entry
- DB SELECTs for library artists list and artist detail now return amazon_id;
both response dicts include the field
- watchlist_artists migration adds amazon_artist_id column
- Watchlist config GET: amazon_artist_id in SELECT/WHERE/response (index 18)
- Watchlist artists list response includes amazon_artist_id
- link-provider endpoint: amazon added to valid_providers and col_map
- _populateLinkedProviderSection: amazonId param + Amazon Music source row
- Watchlist card source badges render Amazon pill (watchlist-source-amazon CSS)
- _openSourceSearch labels map includes amazon
- service_search: amazon_worker injected via init(); _search_service amazon branch
uses search_artists/albums/tracks, same {id,name,image,extra} return shape
- _SERVICE_ID_COLUMNS: amazon → amazon_id for artist/album/track
- _init_service_search call passes amazon_worker_obj
- amazon_client._fetch_album_metas: 5-minute TTL cache per ASIN — cached hits
skip _rate_limit() and HTTP call entirely; fixes ~10s artist detail load
- registry.py: removed amazon from METADATA_SOURCE_PRIORITY and
METADATA_SOURCE_LABELS — T2Tunes has no discography API, cannot serve as a
primary metadata source; Amazon remains a download source + ASIN enricher
- Settings metadata source dropdown and help text updated accordingly
Background worker matching library artists/albums/tracks to Amazon ASINs
via T2Tunes search. Follows the same 6-tier priority queue as Deezer/iTunes/
Spotify/Qobuz/Tidal workers. Backfills artist thumbnails from album cover
stand-ins (T2Tunes exposes no direct artist images).
- core/amazon_worker.py: new AmazonWorker class at full parity with the
existing enrichment workers
- database/music_database.py: expand _add_amazon_columns to cover
amazon_id/amazon_match_status/amazon_last_attempted on artists,
albums, and tracks (was artists-only)
- web_server.py: import, init, register in enrichment panel, add to
scan pause/resume dicts and rate monitor key map
- helper.js: WHATS_NEW 2.5.3 entry for enrichment worker
Schema: ALTER TABLE artists ADD COLUMN amazon_id TEXT with index, added via
_add_amazon_columns migration called after Discogs in _run_migrations.
SOURCE_ID_FIELD: add "amazon" -> "amazon_id" entry.
find_library_artist_for_source now looks up Amazon artists by slug before
falling back to name match, same as every other source. artist_source_detail
already stamps artist_info[source_id_field] = artist_id so the amazon_id is
set on source-only payloads.
Tests: add "amazon": "amazon_id" to EXPECTED_SOURCE_ID_FIELD; revert test
assertion back to strict equality
(SOURCE_ONLY_ARTIST_SOURCES == SOURCE_ID_FIELD.keys() holds again now that
amazon has a column).
Snapshots now track when their source data changes. Watchlist scan
emits stale flags on the playlists whose underlying pool just got
refreshed; the next pipeline run sees the flag and regenerates the
snapshot before syncing, so the server playlist never lags the source.
Schema:
- new `is_stale INTEGER NOT NULL DEFAULT 0` column on
`personalized_playlists`, plus an idempotent ADD COLUMN migration
in `ensure_personalized_schema` for installs created before this PR.
- `PlaylistRecord.is_stale: bool = False` exposed on the dataclass so
callers can branch on freshness without re-querying.
Manager:
- new `mark_kinds_stale(kinds, profile_id=None)` flips the flag in
bulk for a list of kinds (used by upstream data refreshers).
- `_persist_snapshot` resets `is_stale` to 0 on successful refresh.
- SELECT statements + `_row_to_record` updated to read the column
(with tuple-form length guard for safety).
Pipeline:
- `_build_payloads_for_kinds` now branches: refresh_first=True OR
`existing.is_stale` -> refresh_playlist, else read existing
snapshot. So the auto-refresh kicks in without needing the user to
toggle the refresh-each-run option.
Watchlist scanner emits stale flags at three sites:
- after `update_discovery_pool_timestamp` -> marks pool-fed kinds
stale: hidden_gems, discovery_shuffle, popular_picks, time_machine,
genre_playlist, daily_mix.
- after release_radar `save_curated_playlist` -> marks `fresh_tape`.
- after discovery_weekly `save_curated_playlist` -> marks `archives`.
All three calls go through a module-level `_mark_personalized_kinds_stale`
helper that builds a PersonalizedPlaylistManager with `deps=None` (only
DB access is needed for the flag update — no generator dispatch). Each
call is wrapped in try/except so a flag failure can never abort the
scan itself.
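The flag flow above can be sketched as follows. This is an illustration only: the `profile_id` column name, the `should_refresh` helper, and `mark_stale_best_effort` are assumptions, and the real manager goes through `PersonalizedPlaylistManager` rather than a raw connection.

```python
import logging
import sqlite3

logger = logging.getLogger("stale_flag_example")


def mark_kinds_stale(conn, kinds, profile_id=None):
    """Flip is_stale for every playlist of the given kinds; returns rows updated."""
    kinds = list(kinds)
    if not kinds:
        return 0  # empty list is a no-op
    placeholders = ",".join("?" for _ in kinds)
    sql = f"UPDATE personalized_playlists SET is_stale = 1 WHERE kind IN ({placeholders})"
    params = kinds
    if profile_id is not None:
        sql += " AND profile_id = ?"  # assumed column name
        params = kinds + [profile_id]
    return conn.execute(sql, params).rowcount


def mark_stale_best_effort(conn, kinds, profile_id=None):
    """A flag failure must never abort the caller (the scan)."""
    try:
        return mark_kinds_stale(conn, kinds, profile_id)
    except Exception as e:
        logger.debug("marking kinds stale failed: %s", e)
        return 0


def should_refresh(refresh_first, is_stale):
    """Pipeline branch: regenerate when asked to, or when the snapshot is stale."""
    return bool(refresh_first or is_stale)
```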
Tests:
- new `TestStaleFlag` class in `test_personalized_manager.py` (6
tests): default-false, single-kind flip, multi-kind, profile
scoping, refresh-clears, empty-list noop.
- two new pipeline tests pin the auto-refresh dispatch:
`test_stale_snapshot_auto_refreshes_even_without_refresh_first`
and `test_non_stale_snapshot_skips_refresh`.
- existing stub-manager `SimpleNamespace` returns gained
`is_stale=False` so the new attribute read doesn't AttributeError.
Full suite: 3391 pass.
User-facing WHATS_NEW entry added under 2.5.2 (above the prior
pipeline auto-sync entry) describing the auto-refresh behavior.
Begins the standardization of the personalized-playlist subsystem.
Pre-existing state was a patchwork: Group A (Fresh Tape / Archives /
Seasonal Mix) lived in `discovery_curated_playlists` and
`curated_seasonal_playlists` with inconsistent shapes; Group B
(Hidden Gems / Discovery Shuffle / Time Machine / Popular Picks /
Genre / Daily Mixes) was computed on-demand by
`PersonalizedPlaylistsService` with no persistence -- every call
reran the generator with `ORDER BY RANDOM()` so results rotated.
Post-overhaul (this PR) every personalized playlist lands in one
unified storage layer with stable identity, persistent track lists,
explicit refresh, and per-playlist user-tweakable config.
Foundation in this commit (no behavior change yet):
- `database/personalized_schema.py`: 3 tables created idempotently
at app startup (wired into `MusicDatabase._initialize_database`).
- `personalized_playlists`: one row per (profile, kind, variant)
with config_json, track_count, last_generated_at,
last_synced_at, last_generation_source, last_generation_error.
Variant '' (empty string) for singletons; non-empty for
time_machine / seasonal_mix / genre_playlist / daily_mix.
- `personalized_playlist_tracks`: current snapshot per playlist.
Atomically replaced on refresh.
- `personalized_track_history`: append-only log powering the
`exclude_recent_days` config knob.
- `core/personalized/types.py`: `Track`, `PlaylistConfig`,
`PlaylistRecord` dataclasses. `PlaylistConfig.merged()` for
partial-update PATCH semantics; `Track.from_dict()` accepts the
legacy generator output shape unchanged.
- `core/personalized/specs.py`: `PlaylistKindSpec` (kind,
name_template, default_config, generator, variant_resolver) and a
module-level registry. Generators register at import time;
manager dispatches by kind.
- `core/personalized/manager.py`: `PersonalizedPlaylistManager` --
the only thing that touches the new tables. Owns:
- ensure_playlist (auto-create row from kind defaults)
- get_playlist / list_playlists
- refresh_playlist (atomic snapshot replace; generator exception
preserves previous good snapshot + records error on row)
- get_playlist_tracks
- update_config (deep-merge with stored config, including extra dict)
- recent_track_ids (staleness lookup for generators)
35 boundary tests in `tests/test_personalized_manager.py` pin every
shape: config round-trip / merge semantics / extra deep-merge /
defaults; Track.from_dict tolerance + primary_id fallback chain;
registry dedup / display_name with+without variant; manager
ensure_playlist auto-create + idempotency, variant separation,
required-variant enforcement, unknown-kind error; refresh persists
+ replaces atomically + survives generator exception with previous
snapshot intact + records source from first track + round-trips
nested track_data_json; update_config patch semantics; list_playlists
profile scoping; staleness history scoped to (profile, kind, days).
3304 tests pass total. Generators ship in subsequent commits on this
branch -- each kind migrated one at a time with its own per-kind
boundary tests.
Foundation commit for issue #442 — Japanese kanji ↔ romanized name
quarantines and equivalent cross-script mismatches. MusicBrainz
exposes alternate-spelling aliases on every artist record but
SoulSync's matching never consulted them; cross-script comparison
scored 0% on raw similarity and the file got quarantined even when
MusicBrainz knew both names belonged to the same artist.
This commit only adds the column. Subsequent commits in this PR:
- Build a pure alias-aware artist comparison helper
- Wire the MusicBrainz worker to populate aliases on enrichment
- Add a live MB lookup with cache for un-enriched artists
- Wire the helper into the AcoustID verifier where the quarantine
decision actually fires
Schema change is additive (NULL default), gated by the same
`PRAGMA table_info` check the existing `_add_musicbrainz_columns`
helper uses, so re-running on databases that already have the
column is a no-op.
Verified:
- New `artists.aliases` column present in fresh DB init
- JSON round-trip works (mirrors the existing `genres` column pattern)
- No existing tests broken
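A minimal sketch of a `PRAGMA table_info`-gated additive migration plus the JSON round-trip, assuming a simplified `artists` table. `add_column_if_missing` is a hypothetical helper, not the project's `_add_musicbrainz_columns`.

```python
import json
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when PRAGMA table_info does not already list it.

    Table and column names are trusted constants here, never user input.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False  # re-running the migration is a no-op
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT)")
first_run = add_column_if_missing(conn, "artists", "aliases", "TEXT")
second_run = add_column_if_missing(conn, "artists", "aliases", "TEXT")

# Aliases stored as a JSON list in the TEXT column.
conn.execute(
    "INSERT INTO artists (name, aliases) VALUES (?, ?)",
    ("宇多田ヒカル", json.dumps(["Utada Hikaru"], ensure_ascii=False)),
)
aliases = json.loads(conn.execute("SELECT aliases FROM artists").fetchone()[0])
```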
Catches the silent excepts that the earlier awk-based sweeps missed:
- Bare `except:` followed by `pass` (also swallows KeyboardInterrupt
and SystemExit — actively wrong). Upgraded to `except Exception as
e: logger.debug("...: %s", e)`. ~14 sites across connection_detect,
soulseek_client, listenbrainz_manager, watchlist_scanner,
youtube_client, navidrome_client, jellyfin_client, web_server.
- `except Exception:` + pass that the awk pattern missed (e.g.
multi-line or unusual whitespace). ~31 sites across automation_engine,
database_update_worker, music_database, spotify_client, web_server,
others.
- 14 legitimate cleanup sites left silent with explicit `# noqa: S110`
+ comment explaining why (atexit handlers, finally-block conn.close
calls). Logging during shutdown can itself crash because file handles
get torn down before the handler fires.
Also enables `S110` rule in pyproject.toml so this pattern fails CI
going forward — drift fails at PR review instead of at runtime against
a wedged worker thread. Tests path keeps S110 ignored (test fixtures
legitimately use try-except-pass for cleanup).
Adds a WHATS_NEW entry to helper.js summarizing the full #369 sweep.
Verified: `python -m ruff check .` → All checks passed.
Verified: `python -m pytest tests/` → 2188 passed.
Closes #369
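For illustration, the upgraded pattern looks roughly like this. `close_quietly` is a hypothetical example function, not one of the swept call sites.

```python
import logging

logger = logging.getLogger("silent_except_example")


def close_quietly(resource):
    """Best-effort close that logs instead of silently passing."""
    try:
        resource.close()
    except Exception as e:
        # Exception excludes KeyboardInterrupt and SystemExit, so Ctrl-C and
        # interpreter shutdown still propagate, unlike a bare `except:`.
        logger.debug("close failed: %s", e)
```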
Mostly schema-migration ALTER TABLE fallbacks (column-already-exists
is the silent expected case) plus a few cache-purge/notify-migration
spots. Same pattern as the web_server sweep: `except Exception as e:
logger.debug("...: %s", e)`.
Refs #369
GitHub issue #503 (@hadshaw21). Adding a HiFi instance via downloader
settings popped up ``no such table: hifi_instances`` even though
"Test Connection" and "Check All Instances" both worked.
Root cause: ``MusicDatabase._initialize_database`` runs every
``CREATE TABLE`` + every migration step inside one sqlite transaction.
Python's sqlite3 module doesn't autocommit DDL by default, so if any
later migration step throws on a user's specific DB shape (e.g. an
old volume from a prior SoulSync version with quirky schema state),
the WHOLE batch rolls back — including the ``hifi_instances`` CREATE
that ran earlier in the function. The user's next boot retries init,
hits the same migration failure, rolls back again. The ``hifi_instances``
table never lands no matter how many restarts.
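The rollback mechanism can be reproduced in isolation. This sketch uses an explicit BEGIN and a deliberately failing ALTER as a stand-in for whichever migration step breaks; it is not the real init code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("BEGIN")
conn.execute("CREATE TABLE hifi_instances (id INTEGER PRIMARY KEY)")
try:
    # Stand-in for a later migration step that fails on a quirky schema.
    conn.execute("ALTER TABLE missing_table ADD COLUMN x TEXT")
except sqlite3.OperationalError:
    # Rolling back the batch also discards the earlier CREATE TABLE.
    conn.execute("ROLLBACK")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
```

After the rollback `tables` is empty, which is the state the affected users were stuck in on every boot.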
Fix: defensive lazy-create. New ``_ensure_hifi_instances_table(cursor)``
helper runs ``CREATE TABLE IF NOT EXISTS`` on demand, called immediately
before every CRUD operation that touches ``hifi_instances``:
- ``get_hifi_instances`` / ``get_all_hifi_instances`` (read)
- ``add_hifi_instance`` / ``remove_hifi_instance`` (CRUD)
- ``toggle_hifi_instance`` / ``reorder_hifi_instances`` (CRUD)
- ``seed_hifi_instances`` (defaults seed)
Idempotent — costs one no-op CREATE check when the table is already
present, fully recovers from a broken init state. Read methods now
return empty instead of raising when init failed; write methods work
end-to-end.
Doesn't paper over the underlying init issue (still worth tracking
which migration step breaks for which user DB shapes — separate
concern) but makes HiFi instance management self-healing in the
meantime.
Tests:
- 7 obsolete tests that pinned ``raises sqlite3.OperationalError``
removed — that contract is no longer correct
- 7 new tests pin the lazy-create behavior: every CRUD method works
against a DB that's missing the ``hifi_instances`` table, verifying
the table gets created and the operation completes
2162/2162 full suite green. Purely additive: no behavior change for
users with a healthy DB; affected users get back to working HiFi
instance management.
Closes #503.
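A minimal sketch of the lazy-create approach. The column layout below is hypothetical (the real `hifi_instances` schema is not shown in this message), and the functions take a connection directly for brevity.

```python
import sqlite3

# Hypothetical columns, for illustration only.
_CREATE_HIFI_INSTANCES = """
CREATE TABLE IF NOT EXISTS hifi_instances (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE,
    enabled INTEGER NOT NULL DEFAULT 1
)
"""


def _ensure_hifi_instances_table(cursor):
    """Idempotent: a no-op check when the table already exists."""
    cursor.execute(_CREATE_HIFI_INSTANCES)


def get_hifi_instances(conn):
    cursor = conn.cursor()
    _ensure_hifi_instances_table(cursor)
    cursor.execute("SELECT url FROM hifi_instances WHERE enabled = 1 ORDER BY id")
    return [row[0] for row in cursor.fetchall()]


def add_hifi_instance(conn, url):
    cursor = conn.cursor()
    _ensure_hifi_instances_table(cursor)
    cursor.execute("INSERT OR IGNORE INTO hifi_instances (url) VALUES (?)", (url,))
    conn.commit()


conn = sqlite3.connect(":memory:")  # simulates a DB whose init rolled back
before = get_hifi_instances(conn)   # empty list instead of "no such table"
add_hifi_instance(conn, "https://hifi.example")
after = get_hifi_instances(conn)
```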
Discord request: pull user's Discogs collection into the Your Albums
section on Discover, similar to how Spotify Liked Albums works.
Implementation extends the existing 3-source pipeline (Spotify /
Tidal / Deezer) to a 4-source pipeline with click-context dispatch —
Discogs-only albums open with rich Discogs release detail (vinyl/CD
format, year, label, country, tracklist). Mirrors the per-source
dispatch pattern from enhanced/global search.
Discogs client (`core/discogs_client.py`):
- New `get_authenticated_username()` resolves the username for the
configured personal token via Discogs's `/oauth/identity` endpoint.
Cached on the instance so subsequent collection page-fetches don't
re-hit it.
- New `get_user_collection(username=None, folder_id=0, per_page=100,
max_pages=50)` walks all pages of
`/users/{username}/collection/folders/{folder_id}/releases`. Returns
normalized dicts ready for upsert_liked_album. folder_id=0 = Discogs's
"All" folder.
Pagination cap of max_pages*per_page = 5000 releases — bounds
runtime on heavy collections.
- New `get_release(release_id)` thin wrapper for `/releases/{id}` —
returns the raw API response so the album-detail endpoint can
render rich context.
- Both fetch methods (`get_user_collection` and `get_release`) are defensive: missing token → empty list, malformed
responses → skipped, falsy ids → None. Disambiguation suffix
stripping (`Madonna (3)` → `Madonna`) so Discogs artist names
match what Spotify/Tidal/Deezer use.
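The disambiguation suffix stripping could look roughly like this. `strip_discogs_disambiguation` is a hypothetical name, and the regex is an assumption about the rule: it only removes a trailing, purely numeric parenthetical.

```python
import re

# Trailing " (N)" where N is all digits, e.g. "Madonna (3)".
_DISAMBIGUATION_SUFFIX = re.compile(r"\s*\(\d+\)$")


def strip_discogs_disambiguation(name: str) -> str:
    """Drop a Discogs numeric disambiguation suffix from an artist name."""
    return _DISAMBIGUATION_SUFFIX.sub("", name.strip())
```

Non-numeric parentheticals are left untouched, so names that legitimately contain parentheses are not altered.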
Schema (`database/music_database.py`):
- New `discogs_release_id TEXT` column on `liked_albums_pool`.
Migration uses the established `try SELECT, except ALTER TABLE`
pattern. Idempotent; safe on existing installs.
- Added the column to the canonical CREATE TABLE for fresh installs.
- `upsert_liked_album` extended with `'discogs': 'discogs_release_id'`
in BOTH the INSERT and UPDATE id-column maps so Discogs source_id
routes to the new column. INSERT statement column count + value
count updated together.
Backend (`web_server.py`):
- `/api/discover/your-albums/sources` — adds Discogs to the
`connected` list when `discogs.token` config is set.
- `_fetch_liked_albums` — new branch for Discogs. Lazy-imports
DiscogsClient, respects the `enabled_sources` config, walks the
collection, upserts each release. Same try/except shape as the
existing source branches.
- `/api/discover/album/<source>/<album_id>` — new `discogs` branch
fetches the release via DiscogsClient.get_release, normalizes the
Discogs tracklist format, parses Discogs's `MM:SS`/`HH:MM:SS`
duration strings to milliseconds, returns the same response shape
as the Spotify/Deezer/iTunes branches.
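A sketch of the duration parsing for the Discogs branch. `parse_discogs_duration_ms` is a hypothetical helper; returning None for empty or unparseable strings is an assumption about the desired fallback.

```python
def parse_discogs_duration_ms(value):
    """Convert 'MM:SS' or 'HH:MM:SS' to milliseconds; None when unparseable."""
    if not value or not isinstance(value, str):
        return None
    parts = value.strip().split(":")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        return None
    seconds = 0
    for part in parts:
        # Each field is worth 60x the one to its right.
        seconds = seconds * 60 + int(part)
    return seconds * 1000
```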
Frontend (`webui/static/discover.js`):
- `openYourAlbumsSourcesModal` — adds Discogs to `sourceInfo` with
the vinyl emoji icon. Existing toggle/save plumbing handles it.
- `openYourAlbumDownload` — restructured the per-source dispatch:
builds an ordered list of (source, id) tuples, tries each in turn,
breaks on the first successful response. Pure-Discogs albums go
straight to the Discogs detail endpoint → modal opens with Discogs
context. Multi-source albums prefer Spotify/Deezer first since
their tracklists carry proper streaming IDs ready for download.
Tests: `tests/test_discogs_collection_source.py` — 12 cases:
- get_user_collection: empty without token, normalizes response
shape, strips disambiguation suffix, handles missing year, skips
malformed releases, paginates correctly, caps at max_pages,
uses explicit username when provided.
- get_release: passes id through to /releases/{id}, returns None
for invalid ids without API call.
- liked_albums_pool: discogs_release_id round-trips through upsert
+ get; multi-source dedup carries both Spotify and Discogs IDs
on the same row.
Verified: full suite 1825 pass (12 new), ruff clean, smoke test
populating + reading the discogs_release_id column round-trips
correctly via the real DB.
WHATS_NEW entry under '2.4.2' dev cycle.
Discord request (Samuel [KC]): show how much disk space the library
takes on the Stats page. Implementation piggybacks on the existing
deep scan — Plex/Jellyfin/Navidrome all return file size in their
track API responses, so we read it during the deep scan and store
it on the tracks row. Aggregation is then a single SQL query — no
filesystem walk, no extra I/O during the scan, no separate stat
job. SoulSync standalone gets size from os.path.getsize at insert
time (different code path; the file is local when we write the row).
Schema (`database/music_database.py`):
- New `file_size INTEGER` column on `tracks`. Migration uses the
established `try SELECT, except ALTER TABLE ADD COLUMN` pattern.
Idempotent; safe on existing installs. NULL on legacy rows so
they don't contribute to totals until next deep scan refreshes.
- Added the column to the canonical CREATE TABLE so fresh installs
get it without going through the migration path.
Track-object plumbing:
- `core/jellyfin_client.py` — JellyfinTrack reads MediaSources[0].Size
alongside existing Bitrate read. None when 0 / missing.
- `core/navidrome_client.py` — NavidromeTrack reads `size` from
the Subsonic song object (int coercion + None on parse fail).
- `core/soulsync_client.py` — SoulSyncTrack does os.path.getsize
(only "server" where size has to come from disk).
- Plex needs no client-side change: track.media[0].parts[0].size
is read directly inside insert_or_update_media_track.
Persistence — TWO separate insert paths:
(a) `database/music_database.py:insert_or_update_media_track` —
Plex/Jellyfin/Navidrome flows. Reads file_size from Plex's
MediaPart OR `track_obj.file_size` wrapper attribute (defensive
Plex-attr-not-present check + > 0 type guard).
INSERT writes the new column.
UPDATE uses COALESCE(?, file_size) so a None from the server
on a re-sync (rare Jellyfin Size omission) doesn't blank an
existing value. Pinned via test.
(b) `core/imports/side_effects.py:record_soulsync_library_entry` —
SoulSync standalone flow. Completely separate code path: the
standalone deep scan moves files to staging for auto-import
rather than calling insert_or_update_media_track. After the
auto-import processes them, side_effects writes the tracks row
directly. Reads file_size via os.path.getsize(final_path) at
insert time (file is local) and includes it in the INSERT
column list. SoulSync only does INSERT-if-not-exists (no
UPDATE path), so no COALESCE concern.
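The COALESCE-on-UPDATE behavior from path (a) can be shown against a toy `tracks` table. The `update_track` function and its columns are simplified stand-ins for `insert_or_update_media_track`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, file_size INTEGER)")
conn.execute("INSERT INTO tracks (id, title, file_size) VALUES (1, 'Song', 1234)")


def update_track(conn, track_id, title, file_size):
    """A None file_size from the server keeps the previously stored value."""
    conn.execute(
        "UPDATE tracks SET title = ?, file_size = COALESCE(?, file_size) WHERE id = ?",
        (title, file_size, track_id),
    )


def stored_size(conn, track_id):
    return conn.execute("SELECT file_size FROM tracks WHERE id = ?", (track_id,)).fetchone()[0]


update_track(conn, 1, "Song", None)   # re-sync where the server omitted the size
size_after_null = stored_size(conn, 1)
update_track(conn, 1, "Song", 5678)   # re-sync with a real size
size_after_value = stored_size(conn, 1)
```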
Aggregator (`database/music_database.py:get_library_disk_usage`):
- SELECT COALESCE(SUM(file_size), 0), COUNT(file_size),
COUNT(*) - COUNT(file_size) for the totals.
- Per-format breakdown done in Python via os.path.splitext over
(file_path, file_size) rows — sidesteps SQLite's first-vs-last-dot
ambiguity for paths like /music/Kendrick/M.A.A.D City/01.flac.
- Defensive: skips empty paths, paths without extension, and
implausibly long extensions (>6 chars). Returns the full
empty-shape dict (NOT a partial / undefined) when the column
doesn't exist or queries fail, so the UI's `if (!data.has_data)`
branch handles fresh installs cleanly.
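A condensed sketch of the aggregator logic described above. Only `has_data` is named in this message; the other result keys and the simplified two-column `tracks` table are assumptions made for illustration.

```python
import os
import sqlite3
from collections import defaultdict

_EMPTY = {"has_data": False, "total_bytes": 0, "tracks_measured": 0,
          "tracks_pending": 0, "by_format": {}}


def get_library_disk_usage(conn):
    try:
        total, measured, pending = conn.execute(
            "SELECT COALESCE(SUM(file_size), 0), COUNT(file_size), "
            "COUNT(*) - COUNT(file_size) FROM tracks"
        ).fetchone()
        by_format = defaultdict(int)
        rows = conn.execute(
            "SELECT file_path, file_size FROM tracks WHERE file_size IS NOT NULL"
        )
        for path, size in rows:
            if not path:
                continue
            # splitext only looks at the final path component, so dots in
            # directory names like "M.A.A.D City" are ignored.
            ext = os.path.splitext(path)[1].lstrip(".").lower()
            if not ext or len(ext) > 6:
                continue
            by_format[ext] += size
    except sqlite3.Error:
        return dict(_EMPTY, by_format={})  # full empty shape, never partial
    return {"has_data": measured > 0, "total_bytes": total,
            "tracks_measured": measured, "tracks_pending": pending,
            "by_format": dict(by_format)}
```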
API + UI:
- `core/stats/queries.py` — thin pass-through get_library_disk_usage
matching the existing query-helper convention.
- `web_server.py` — new /api/stats/library-disk-usage endpoint
mirroring the /api/stats/db-storage pattern.
- `webui/index.html` — new card in System Statistics above the
Database Storage card.
- `webui/static/stats-automations.js` — _loadLibraryDiskUsage +
_renderLibraryDiskUsage. Empty state: "Run a Deep Scan to
populate (X tracks pending)". Partial: "X measured (+Y pending)".
Full: total + format bars proportional to the largest format.
- `webui/static/style.css` — .stats-disk-* styled to match the
Database Storage card.
Backward compatibility:
- Migration is additive; existing rows get NULL file_size; the
empty-shape return from the aggregator means the UI renders
cleanly without errors before any deep scan runs.
- Old installs upgrading will see "Run a Deep Scan to populate
(N tracks pending)". Running their next deep scan fills sizes —
the existing scan flow doesn't need any changes, just consumes
the new track-wrapper attribute.
Tests:
- `tests/test_library_disk_usage.py` — 13 cases covering schema
migration, NULL defaults on legacy inserts, fresh-install empty
shape, summing with mixed NULL/known sizes, per-format breakdown,
mixed-case extensions, paths with album-name dots, missing
extensions, empty file_path, implausibly long extensions,
JellyfinTrack.file_size persistence via insert_or_update_media_track,
COALESCE preservation on null re-sync.
- `tests/imports/test_import_side_effects.py` — extended the
existing record_soulsync_library_entry test to assert
track_row['file_size'] == os.path.getsize(final_path), pinning
the SoulSync-standalone path. Test fixture's tracks schema also
updated to include the file_size column.
Verified: full suite 1813 pass (13 new, 1 existing-test extension),
ruff clean, smoke test populating + reading the column round-trips
correctly.
WHATS_NEW entry under '2.4.2' dev cycle.
Discord report (Samuel [KC]): tracks of the same album sometimes carry
different MUSICBRAINZ_ALBUMID tags, which causes Navidrome (and other
media servers grouping by album MBID) to split the album into multiple
entries. Two-part fix — one for existing libraries, one for the root
cause that lets new imports drift.
Part 1 — Detector + fix action (catches existing dissenters):
`core/repair_jobs/mbid_mismatch_detector.py`:
- New helpers: `_read_album_mbid_from_file` and
`_write_album_mbid_to_file` use the Picard-standard tag conventions
(`TXXX:MusicBrainz Album Id` for MP3, `MUSICBRAINZ_ALBUMID` for
FLAC/OGG, `----:com.apple.iTunes:MusicBrainz Album Id` for MP4).
- New scan phase `_scan_album_mbid_consistency` runs after the
existing track-MBID scan: groups tracks by DB `album_id`, reads
each track's embedded album MBID, finds the consensus
(most-common) MBID via `Counter`, flags dissenters. Tracks without
an album MBID at all are skipped (they don't break Navidrome —
only an explicit MBID disagreement does). Albums where MBIDs are
perfectly tied (no clear consensus) are skipped too — surface as
a manual decision instead of fixing toward a 1/N tie.
- New finding type `album_mbid_mismatch` carries `consensus_mbid`,
`wrong_mbid`, `consensus_count`, `total_tracks_with_mbid`, and a
human-readable reason string.
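The consensus step can be sketched as below. `find_album_mbid_dissenters` and its input shape (a dict of track id to embedded album MBID) are hypothetical simplifications of `_scan_album_mbid_consistency`.

```python
from collections import Counter


def find_album_mbid_dissenters(track_mbids):
    """Return (consensus_mbid, dissenting_track_ids) for one album.

    Tracks without an album MBID are ignored. A perfect tie between the two
    most common MBIDs yields no consensus and therefore no flags.
    """
    tagged = {tid: mbid for tid, mbid in track_mbids.items() if mbid}
    counts = Counter(tagged.values())
    if len(counts) < 2:
        return None, []  # everyone agrees, or nothing is tagged
    (top, top_n), (_, second_n) = counts.most_common(2)
    if top_n == second_n:
        return None, []  # tie: leave for a manual decision
    return top, sorted(tid for tid, mbid in tagged.items() if mbid != top)
```

For example, three tracks tagged `a` and one tagged `b` flag only the `b` track, while a 1/1 split flags nothing.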
`core/repair_worker.py`:
- Added `'album_mbid_mismatch': self._fix_album_mbid_mismatch` to the
fix dispatch dict and to the `fixable_types` tuple so auto-fix +
bulk-fix paths pick it up.
- New `_fix_album_mbid_mismatch` method reads `consensus_mbid` from
finding details, resolves the dissenter's file path via the shared
library resolver, calls `_write_album_mbid_to_file` to rewrite the
tag in place. Doesn't touch the album's other tracks (they're
already in agreement).
Part 2 — Root cause fix (prevents new SoulSync imports from drifting):
The original in-memory `mb_release_cache` in `core/metadata/source.py`
maps `(normalized_album, artist) -> release_mbid` so per-track
enrichment of the same album hits the cache and writes the same
MUSICBRAINZ_ALBUMID to every track. That cache is bounded (4096
entries) and in-process — so cache eviction (when other albums are
processed in between) and server restart can BOTH cause
inconsistency. Per-track album-name variation (e.g. some tracks
tagged `"Album"`, others tagged `"Album (Deluxe)"`) and per-track
artist variation (features) make it worse.
`core/metadata/album_mbid_cache.py` (new module):
- DB-backed `lookup(normalized_album, artist) -> release_mbid` and
`record(...)` functions. Same key shape as the in-memory cache.
- Strict additive design: every public function is wrapped in
try/except and degrades to None / no-op on ANY database error.
The existing in-memory cache + MusicBrainz lookup remains the
authoritative fallback. If this module breaks, downloads continue
exactly as they would today.
`database/music_database.py`:
- New `mb_album_release_cache` table with composite primary key
`(normalized_album_key, artist_key)`. Reverse-lookup index on
`release_mbid` for future debug tooling. Created via the existing
`CREATE TABLE IF NOT EXISTS` migration pattern — idempotent, no
schema version bump needed.
`core/metadata/source.py`:
- Surgical change inside the existing `embed_source_ids`
in-memory-cache-miss branch: BEFORE calling MusicBrainz, consult
the persistent cache. If a previous SoulSync run already resolved
this album's release MBID, reuse it. After a successful MB lookup,
store in BOTH caches. Both calls wrapped in defensive try/except
so any failure falls through to existing logic.
Tests:
- `tests/metadata/test_album_mbid_cache.py` — 16 cache tests:
round-trip, idempotent re-record, overwrite semantics, clear_all,
album+artist independence (no Greatest Hits collisions),
defensive None-on-empty-input, graceful degradation when the DB
is unavailable / connection raises / commit fails, schema sanity
(table + index exist after init).
- `tests/test_album_mbid_consistency.py` — 13 detector tests:
tag read/write round-trip on real FLAC files, Picard-standard tag
descriptors, defensive paths (unreadable file, empty input),
detector behavior (agreement → no flags, lone dissenter → flag,
ties → no flag, single-track albums → skipped, no-MBID tracks →
skipped, unresolvable file paths → skipped).
- `tests/metadata/test_metadata_enrichment.py` — added autouse
fixture monkeypatching the persistent cache to no-op for tests in
this file. The existing tests pin per-call MB counts and
in-memory cache state; without the fixture, persistent rows from
earlier tests would bypass the MB call. Persistent layer has its
own dedicated tests.
Verified: 1782 tests pass (29 new), ruff clean, smoke test confirms
end-to-end cache round-trip works.
WHATS_NEW entry under '2.4.2' dev cycle.
Followup to fix/watchlist-external-id-match. The companion PR closed
the demand side — the watchlist scanner asks for tracks by external IDs
before falling back to fuzzy. But for users on Plex / Jellyfin /
Navidrome the supply side was still broken: tracks.spotify_track_id
(and the other ID columns) only got populated by the asynchronous
enrichment workers, sometimes hours after the file was actually
written. During that window the ID match fell through to fuzzy and
the bug returned.
We were already collecting every ID during post-processing — they
live in the `pp` dict in core/metadata/source.py:embed_source_ids and
get embedded into file tags. We just dropped the in-memory copy
afterwards.
This PR persists them and uses them:
- Schema migration adds spotify_track_id / itunes_track_id /
deezer_track_id / tidal_track_id / qobuz_track_id /
musicbrainz_recording_id / audiodb_id / soul_id / isrc columns +
indexes to the existing track_downloads table (already keyed by
file_path).
- core/metadata/source.py:embed_source_ids exposes pp["id_tags"] and
the resolved ISRC back to the import context as _embedded_id_tags
/ _isrc.
- core/imports/side_effects.py:record_download_provenance reads those
context fields and passes them to db.record_track_download, which
now accepts the new ID kwargs and persists them.
- New db.get_provenance_by_file_path with exact + basename-suffix
fallback (handles container mount-root differences between
download-time path and media-server-reported path).
- New db.backfill_track_external_ids_from_provenance copies IDs
from track_downloads onto a tracks row idempotently — COALESCE on
every column preserves any value the enrichment worker already
wrote (enrichment is more authoritative for late binding).
- database/music_database.py:insert_or_update_media_track (the
single insertion point used by every Plex / Jellyfin / Navidrome
sync) calls the backfill immediately after each INSERT/UPDATE.
- New core/library/track_identity.py:find_provenance_by_external_id
used as a second-tier fallback in
watchlist_scanner.is_track_missing_from_library. It catches the
window between download and media-server sync. The caller checks
os.path.exists on the provenance file_path
before treating it as "already in library" so a deleted file
doesn't prevent re-download.
Effect: freshly downloaded files become ID-recognizable to the
watchlist on the very next scan, no enrichment-wait window.
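A minimal sketch of the COALESCE-based backfill described above, using an in-memory SQLite database. The two-column schema is trimmed down for illustration and the function name backfill_ids is hypothetical; the real db.backfill_track_external_ids_from_provenance covers every ID column and may differ in shape.

```python
import sqlite3

# Hypothetical, trimmed-down schema: only two of the ID columns are shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracks (
        id INTEGER PRIMARY KEY,
        file_path TEXT,
        spotify_track_id TEXT,
        isrc TEXT
    );
    CREATE TABLE track_downloads (
        id INTEGER PRIMARY KEY,
        file_path TEXT,
        spotify_track_id TEXT,
        isrc TEXT
    );
""")


def backfill_ids(conn, track_id, file_path):
    """Copy IDs from the newest provenance row onto a tracks row.

    COALESCE(existing, new) keeps any value already present, so running
    this twice, or after enrichment, never overwrites data.
    """
    row = conn.execute(
        "SELECT spotify_track_id, isrc FROM track_downloads "
        "WHERE file_path = ? ORDER BY id DESC LIMIT 1",
        (file_path,),
    ).fetchone()
    if row is None:
        return False  # no provenance recorded: nothing to do
    conn.execute(
        "UPDATE tracks SET "
        "spotify_track_id = COALESCE(spotify_track_id, ?), "
        "isrc = COALESCE(isrc, ?) "
        "WHERE id = ?",
        (row[0], row[1], track_id),
    )
    return True


# Provenance knows both IDs; the tracks row already has an enriched Spotify ID.
conn.execute(
    "INSERT INTO track_downloads (file_path, spotify_track_id, isrc) "
    "VALUES ('/music/a.flac', 'sp_from_download', 'ISRC123')"
)
conn.execute(
    "INSERT INTO tracks (id, file_path, spotify_track_id) "
    "VALUES (1, '/music/a.flac', 'sp_enriched')"
)
backfill_ids(conn, 1, "/music/a.flac")
result = conn.execute(
    "SELECT spotify_track_id, isrc FROM tracks WHERE id = 1"
).fetchone()
```

In this sketch the enriched Spotify ID survives and only the missing ISRC is filled in.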
19 regression tests in tests/test_provenance_id_persistence.py:
- Schema migration adds expected columns + indexes
- record_track_download persists every ID kwarg
- record_track_download backward-compat (old kwargs still work)
- get_provenance_by_file_path: exact match, basename fallback for
mount-root differences, multi-record latest-wins, defensive None
- backfill: copies all IDs, preserves existing via COALESCE,
no-op when no provenance exists
- find_provenance_by_external_id: per-ID lookup, ISRC cross-bridge,
OR semantics, latest-wins on multiple matches
Out of scope: backfilling provenance for files downloaded BEFORE
this PR (their track_downloads rows don't carry the new IDs). Those
continue to wait for enrichment. Acceptable — only affects historical
files; new downloads benefit immediately.
Full pytest 1625 passed; ruff clean.
Discord-reported scenario: a single "Super Single" by Artist1 feat.
Artist2 is also on Artist1's "Super Album". When the album is fully
owned, Artist1's discography correctly shows the single as complete,
but Artist2's discography (where the same track also appears as a
single) shows it as missing.
Two layers needed for the fix:
Scanner: the Jellyfin/Emby path was keeping only ArtistItems[0],
which is almost always equal to the album artist — so the
distinguishing per-track credit was silently suppressed. Now joins
every ArtistItems entry with "; " and stores the value when there
are multiple credits OR when the single credit differs from the
album artist. Plex's originalTitle already carries the full multi-
artist tag, so Plex users benefit without needing the scanner change.
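The scanner rule above could look roughly like this. The function name and the dict shape (a list of items with a "Name" key) are assumptions for illustration, not the actual Jellyfin client code.

```python
def track_artist_from_items(artist_items, album_artist):
    """Return the per-track artist credit to store, or None to store nothing.

    Hypothetical helper: joins every credit with "; " and keeps the value
    when there are several credits, or when the only credit differs from
    the album artist.
    """
    names = [item["Name"].strip() for item in artist_items if item.get("Name")]
    if len(names) > 1 or (names and names[0] != album_artist):
        return "; ".join(names)
    return None


joined = track_artist_from_items(
    [{"Name": "Artist1"}, {"Name": "Artist2"}], "Artist1"
)
```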
Scorer: _calculate_track_confidence now splits track_artist on the
common multi-artist delimiters real-world tags use (",", ";", "&",
"feat.", "ft.", "featuring", "vs.", "x") and scores each piece
independently against the search artist, taking the max along with
the whole-string similarity as the floor. Never reduces a score —
purely additive matching for previously-missed featured-artist
credits.
Adds 12 regression tests covering the reported scenario, primary-
artist back-compat, every delimiter variant (parametrized), no-
regression on exact match, and the scanner storing every ArtistItem.
Existing Jellyfin-scanned rows persist their old single-artist value
until the next library scan rewrites them; Plex rows benefit
immediately on next match without needing a rescan.
Two bugs surfacing the same user-reported symptom: a Vaiana OST
track ("Where You Are" by Christopher Jackson) wouldn't match against
a Plex/Emby library because the album sits under the album artist
(Lin-Manuel Miranda).
Bug 1: the data was already there but scoring ignored it. The DB
schema has a tracks.track_artist column, the scanner populates it
from Plex's originalTitle and Jellyfin's ArtistItems[0], and the SQL
WHERE clause already searches it — but _rows_to_tracks dropped the
column on its way to the Python object, and _calculate_track_confidence
only scored against the album-artist JOIN. Candidates whose track-
artist matched got returned by the search and then immediately
filtered out by the low confidence score.
Fix: _rows_to_tracks now propagates row['track_artist'] onto the
returned object, and _calculate_track_confidence takes the better of
(album-artist similarity, track-artist similarity) so soundtracks
match through whichever credit the search query carries.
Bug 2: the album-aware fallback path constructed DatabaseTrack with
kwargs the dataclass doesn't accept (artist_name, album_title,
server_source). Every row TypeError'd, the outer except swallowed it
silently, and the fallback never matched anything since the column
was added — invisible because nothing logged it.
Fix: build DatabaseTrack with valid fields and attach the joined
columns afterwards, the same pattern _rows_to_tracks uses.
Adds 6 regression tests covering: track-artist match (the OST case),
album-artist still matches, scorer takes the better of the two,
defensive handling for tracks without track_artist, search-path
attribute propagation, and the previously-dead album-aware fallback.
- add neutral wishlist payload helpers while keeping legacy Spotify aliases
- route wishlist removal and classification through generic track data
- keep API and service compatibility for existing callers
Five issues kettui flagged on PR #377:
- Worker race (reorganize_queue.py): _next_queued() picked an item and
released the lock, then re-acquired to flip status='running'. A
cancel() landing in that window marked the item cancelled but the
worker still ran it. Replaced with _claim_next_or_wait() that picks
AND flips under one lock acquisition.
- Wakeup race (reorganize_queue.py): _wakeup.clear() after the empty
check could lose an enqueue's _wakeup.set(), parking a freshly-queued
album for up to 60 seconds. Replaced Lock + Event with a single
threading.Condition; cond.wait() releases the lock and blocks in one
atomic step, then re-acquires it on wake-up, so a notify cannot land
between the empty check and the wait.
- Bulk dedupe (reorganize_queue.py:enqueue_many): looped single-item
enqueue, so a duplicate album_id later in the same batch could slip
through if the worker finished the first copy before the loop
reached the second. Now holds the lock for the whole batch and tracks
a per-batch seen set, so intra-batch duplicates dedupe against each
other and not just pre-existing items.
- Preview button stuck disabled (library.js:loadReorganizePreview):
early returns and thrown errors skipped the re-enable line. Moved
state into a canApply flag committed in finally, so any exit path
lands the button correctly.
- DB helpers swallowing failures (music_database.py): get_album_display_meta
and get_artist_albums_for_reorganize used to catch every Exception
and return None / [], so a real DB outage masqueraded as "album not
found" / "no albums". Now lets exceptions bubble; the route layer
already wraps them as 500.
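The three queue fixes above can be illustrated with a small stand-in class. This is a sketch under assumptions: TinyQueue, its item dicts, and the method signatures are hypothetical and much simpler than core/reorganize_queue.py.

```python
import threading
from collections import deque


class TinyQueue:
    """Hypothetical stand-in for the reorganize queue."""

    def __init__(self):
        self._cond = threading.Condition()
        self._items = deque()  # dicts with "album_id" and "status"

    def enqueue_many(self, album_ids):
        added = []
        with self._cond:  # one lock hold for the whole batch
            seen = {
                item["album_id"]
                for item in self._items
                if item["status"] in ("queued", "running")
            }
            for album_id in album_ids:
                if album_id in seen:
                    continue  # duplicate of an existing or same-batch item
                seen.add(album_id)
                self._items.append({"album_id": album_id, "status": "queued"})
                added.append(album_id)
            if added:
                self._cond.notify_all()
        return added

    def cancel(self, album_id):
        with self._cond:
            for item in self._items:
                if item["album_id"] == album_id and item["status"] == "queued":
                    item["status"] = "cancelled"
                    return True
        return False  # running or unknown items are not cancellable

    def claim_next_or_wait(self, timeout=None):
        with self._cond:
            while True:
                for item in self._items:
                    if item["status"] == "queued":
                        # Pick AND flip under the same lock acquisition.
                        item["status"] = "running"
                        return item
                # wait() releases the lock while blocked, so an enqueue that
                # arrives now is not lost.
                if not self._cond.wait(timeout):
                    return None


q = TinyQueue()
added = q.enqueue_many([1, 2, 1])
```

Because cancel() and claim_next_or_wait() both take the same condition lock, an item is either cancelled or claimed, never both.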
Tests:
- test_cancel_and_run_are_mutually_exclusive — hammers enqueue+cancel
pairs and asserts the invariant that no successfully-cancelled item
ever ran (catches regressions to the atomic pick).
- test_enqueue_many_dedupes_batch_internal_duplicates — pins the
intra-batch dedupe.
- test_get_album_display_meta_propagates_db_errors and
test_get_artist_albums_for_reorganize_propagates_db_errors — pin
the bubble-up behavior.
Changelog updated in helper.js and version modal.
Replaces the single-slot "one reorganize at a time, return 409 on collision"
model with a per-user FIFO queue. Buttons stay clickable, "Reorganize All"
is one backend call instead of an N-call JS loop, and a status panel mounted
at the top of the artist actions bar shows live progress (active item,
queued count, recent completions) with per-item cancel buttons.
Backend
- core/reorganize_queue.py: singleton queue + worker thread, dedupe-on-
enqueue, cancel rules (queued cancellable, running not), enqueue_many
for bulk operations, progress fan-out via update_active_progress
- core/reorganize_runner.py: factory builds the worker's runner closure
with injected dependencies. Reads config per-call so changing the
download path in Settings takes effect on the next reorganize without
a server restart
- database/music_database.py: get_album_display_meta and
get_artist_albums_for_reorganize — moves the SQL out of route handlers
- web_server.py: thin enqueue/snapshot/cancel/clear endpoints, runner
registration at module load. Old _reorganize_state globals + status
endpoint deleted. Static-asset cache buster (?v=<server-start>)
added so JS/CSS updates ship live without users clearing cache
Frontend
- webui/static/library.js: status panel mount, polling (1.5s when
active, 8s when idle), expand/collapse, per-item cancel, debounced
enhanced-view reload (one reload per artist batch instead of N).
Per-album reorganize button paints with queued/running indicator
and short-circuits to a toast when the album is already in queue
- webui/static/style.css: panel + button styling matching the existing
glass-UI accents
- webui/static/helper.js + version modal: WHATS_NEW entry
Tests (22 new)
- tests/test_reorganize_queue.py (19 tests): FIFO order, dedupe,
per-item source, cancel rules, continue-on-failure, snapshot
shape, progress propagation, bulk enqueue
- tests/test_reorganize_runner.py (4 tests): per-call config reads,
setup-failure summary, dependency injection, progress fan-out
- tests/test_reorganize_db_methods.py (7 tests): SQL JOIN behavior,
ordering, fallback for blank strings, artist isolation
Full suite 549 passed in 27s.
Reported by kettui on PR #374 review:
> api_track_count is not copied during the ratingKey migration, so
> the cache disappears when an album row is rekeyed. Add it to
> enrichment_cols or the next completeness scan will fall back to
> live API lookups again.
When Plex changes an album's ratingKey (after a library rescan), the
sync code rekeys the album row by inserting a new row at the new ID
and copying enrichment columns from the old row. The list of
columns to copy did not include `api_track_count`, so the cached
authoritative track count was lost on rekey — and the next completeness
scan would hit the fallback path that calls back out to the
metadata source's API. Defeats the cache.
Added `api_track_count` to the album-level `enrichment_cols` at
`music_database.py:4724`. The artist-level lists at lines 4238 and
4554 don't need updating — those are for artist rekeys and don't
carry album-scoped fields.
No new test — existing migration code has no test infrastructure
and writing a Plex-mocked one is larger than this fix. Cin will say
if he wants test coverage in his next review pass.
Credit: kettui — PR #374 review comment that flagged the missing
column in the rekey allowlist.
Reported by sassmastawillis: the Album Completeness maintenance job
scans 3127 albums in 0.1 seconds and reports 0 findings — for every
user, regardless of whether their library is actually complete.
Restoring an older DB surfaced 7 correct findings, so the code logic
works; the DB state is what's making everything look complete.
Root cause: `albums.track_count` is only ever written by server-sync
paths — Plex's `leafCount`/`childCount` and SoulSync standalone's
`len(tracks)`. It's the OBSERVED count of tracks SoulSync has indexed,
which is always exactly what `COUNT(tracks)` returns for that album.
The completeness job treated it as the EXPECTED total and compared it
against the observed count. They're equal by construction, so
`actual >= expected` is always true: skip, 0.1s scan, 0 findings.
Fix: new `api_track_count INTEGER` column on `albums`, written only by
metadata-source code paths. Populated in two places so the scan is
fast and the fallback is robust.
1. Enrichment workers — shared helper `set_album_api_track_count`
in `core/worker_utils.py`. Called by each worker's existing
`_update_album` method alongside its other album-column UPDATEs:
- spotify_worker: `album_obj.total_tracks` from the Spotify Album
dataclass (already in hand, zero new API calls)
- itunes_worker: same, from the iTunes Album dataclass
- deezer_worker: `nb_tracks` from full_data, falling back to
search_data when the full lookup didn't run
- discogs_worker: count of tracklist rows where `type_=='track'`
(Discogs tracklists interleave heading and index rows that
shouldn't count as songs)
Helper skips the write on zero/None/negative/non-numeric inputs
so a source lacking track info can't clobber a good value a
different source already wrote. Caller owns the transaction —
helper just queues an UPDATE on the caller's cursor without
committing, so it batches cleanly with each worker's existing
multi-UPDATE pattern.
Hydrabase worker deliberately not touched — it's a P2P mirror
that doesn't write album metadata to the local DB. Hydrabase-
primary users hit the fallback path below.
2. Album Completeness repair job — new `al.api_track_count` column
in the SELECT, read first in the scan loop. On miss (album never
enriched, or enrichment workers haven't run yet on a fresh
install), falls through to the existing `_get_expected_total()`
API lookup and persists the result via the same shared helper
(wrapped in connection/commit management since the repair job
runs outside a worker's batched transaction).
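A minimal sketch of the helper's skip-on-bad-input, no-commit contract as described above. The signature and return value are assumptions; the real set_album_api_track_count in core/worker_utils.py may differ.

```python
import sqlite3


def set_album_api_track_count(cursor, album_id, count):
    """Queue an UPDATE on the caller's cursor; the caller owns the commit.

    Returns True if an UPDATE was queued, False if the input was skipped.
    """
    # bool is an int subclass, so reject it explicitly.
    if isinstance(count, bool):
        return False
    try:
        value = int(count)
    except (TypeError, ValueError):
        return False  # None or non-numeric input
    if value <= 0:
        return False  # zero or negative would clobber a good value
    cursor.execute(
        "UPDATE albums SET api_track_count = ? WHERE id = ?",
        (value, album_id),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE albums (id INTEGER PRIMARY KEY, api_track_count INTEGER DEFAULT NULL)"
)
conn.execute("INSERT INTO albums (id) VALUES (1)")
cur = conn.cursor()

set_album_api_track_count(cur, 1, 12)      # good value is written
set_album_api_track_count(cur, 1, 0)       # skipped
set_album_api_track_count(cur, 1, None)    # skipped
set_album_api_track_count(cur, 1, "n/a")   # skipped
stored = cur.execute("SELECT api_track_count FROM albums WHERE id = 1").fetchone()[0]
```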
Also removed `al.track_count` from the scan's SELECT — now unused
since the observed count was the whole source of this bug, and
leaving a dead SELECT would invite a future engineer to re-introduce
the same comparison.
Help text on the job card was reworded so it honestly describes
current behavior ("counts cached during normal enrichment are used
when available; otherwise the job queries a metadata source
directly") rather than the old "active provider first, then others
as fallback" phrasing, which doesn't match how the cache actually
fills — any enrichment worker that runs can populate it, and the
last writer wins. Document-only follow-up if this edge case ever
bites in practice: add an `api_track_count_source` column so the
scan can prefer the configured primary source's count over others
(e.g. deluxe vs. standard edition mismatches). Not worth the
complexity today.
For existing users, the first completeness scan after upgrade is
fast to the extent their library is already enriched: the workers
already ran and populated `api_track_count` on their normal schedule.
For brand-new installs, the scan's fallback path handles the cold
start — slower, but correct, and subsequent scans are fast.
Does NOT affect:
- Download / post-processing / wishlist / sync code paths — none
of them read `track_count` for completeness semantics.
- Plex / Jellyfin / Navidrome / standalone sync — still write
`track_count` exactly as before; `api_track_count` is a separate
column they never touch.
- Other repair jobs.
- Any UI path — same finding schema, just correct counts now.
Files:
- database/music_database.py — idempotent migration adding
`api_track_count INTEGER DEFAULT NULL` to the existing album-column
check block.
- core/worker_utils.py — new `set_album_api_track_count` helper with
the documented skip-on-bad-input contract.
- core/spotify_worker.py, itunes_worker.py, deezer_worker.py,
discogs_worker.py — one-liner call from each `_update_album`.
- core/repair_jobs/album_completeness.py — scan uses the cache;
fallback path persists API-lookup results via the shared helper;
help text updated to match actual behavior.
- tests/test_worker_utils_album_track_count.py — 9 tests covering
the helper's write/skip contract + no-commit invariant.
- tests/test_album_completeness_job.py — 2 tests for the repair
job's fallback-path wrapper.
- webui/static/helper.js — WHATS_NEW entry.
Credit: sassmastawillis spotted the bug; the "restored older DB
finds 7 albums" signal pinpointed DB state over code logic and
made the diagnosis tractable.
PR #340 added ruff to the build-and-test.yml CI gate, which surfaced
286 pre-existing lint errors. Left unfixed, every feature branch push
fails CI. This commit resolves all of them so CI goes green and
contributors can actually land work.
Auto-fixes (248 of 286): removed unused f-string prefixes (F541),
renamed unused loop control variables with underscore prefix (B007),
removed duplicate imports (F811).
Manually fixed 10 latent bugs ruff caught (all wrapped in try/except
today, silently failing):
- music_database.py: _add_discovery_tables() called undefined
conn.commit() — would have crashed the iTunes-support migration
for existing databases. Now uses cursor.connection.commit().
- web_server.py settings GET: referenced undefined download_orchestrator
when it should be soulseek_client. Feature (_source_status on the
settings payload) was silently missing for UI auto-disable logic.
- web_server.py _process_wishlist_automatically: active_server
undefined in track-ownership check. Auto-wishlist was falling
through to the error handler and re-downloading owned tracks.
- web_server.py start_wishlist_missing_downloads: same active_server
bug in the manual wishlist path.
- web_server.py _process_failed_tracks_to_wishlist_exact: emitted
wishlist_item_added automation event with undefined artist_name
and track. Automation event silently never fired correctly.
- web_server.py discovery metadata enrichment: referenced cache
without calling get_metadata_cache() first. Track enrichment from
cached API responses was silently skipped.
- web_server.py Beatport discovery worker: wing-it fallback branch
used undefined successful_discoveries variable. Wing-it counter
never incremented correctly. Now uses state['spotify_matches']
consistently with the rest of the function.
- web_server.py _run_full_missing_tracks_process: stale import json
mid-function shadowed the module-level import, making an earlier
json.dumps() call reference an unbound local (F823).
- web_server.py discovery loop: platform loop variable shadowed
the module-level platform import (F402).
- core/watchlist_scanner.py: 7 lambda captures of loop variables
(B023 classic Python closure-in-loop bug) now bind at creation.
No existing tests had to change. Full suite stays at 263 passed.
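For reference, the B023 closure-in-loop bug and the bind-at-creation fix look like this in isolation. This is a generic illustration, not the watchlist_scanner code.

```python
callbacks_late = []
callbacks_bound = []

for name in ["a", "b", "c"]:
    # B023: the lambda looks up `name` when it is called, not when it is made.
    callbacks_late.append(lambda: name)
    # Fix: a default argument captures the current value at creation time.
    callbacks_bound.append(lambda name=name: name)

late = [cb() for cb in callbacks_late]
bound = [cb() for cb in callbacks_bound]
```

Every late-binding callback sees the final loop value, while the bound ones keep the value from their own iteration.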
Artist detail pages ran check_album_exists_with_editions and check_track_exists
per discography item, each firing 5+ title variations times 3 artist variations
of fuzzy LIKE searches plus fallback broad-artist queries. For a 30-album artist
that was ~450 SQL round-trips just to answer "which of these do I own."
Hoist the artist's library albums and tracks into memory once per request via
two new helpers — get_candidate_albums_for_artist and get_candidate_tracks_for_albums —
and thread them through as optional candidate_albums / candidate_tracks params on
check_album_exists_with_editions, check_album_exists_with_completeness,
check_track_exists, check_album_completion, and check_single_completion.
Batched path scores the same _calculate_album_confidence / _calculate_track_confidence
against the in-memory list, preserving Smart Edition Matching and accuracy.
Title-only cross-artist fallback still fires for collaborative-album edge cases.
None on either param preserves legacy per-item SQL behavior for unaffected callers.
Applied to both /api/library/completion-stream (library artist detail page) and
iter_artist_discography_completion_events (Artists search page). Timing logs
added to confirm the pre-fetch cost and loop elapsed time.
On a Kendrick page load, per-album resolution drops from ~8 seconds to under
the 50ms streaming sleep floor. Observed ~100x SQL reduction on the happy path.
Four enrichment workers (Last.fm, MusicBrainz, Tidal, Qobuz) had a
bug where every background loop re-processed the same rows because
the existing-ID short-circuit path never set match_status, and two
workers queried the wrong column when checking for an existing ID.
lastfm_worker._get_existing_id queried a non-existent lastfm_id
column; the real column is lastfm_url. The method now reads
lastfm_url for all three entity types.
musicbrainz_worker._get_existing_id queried musicbrainz_id for all
entity types, but albums use musicbrainz_release_id and tracks use
musicbrainz_recording_id. The method now uses a per-type column map.
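A sketch of the per-type column map. The album and track column names come from the text above; the artist column is inferred from it, and the table names and function signature are assumptions.

```python
import sqlite3

# Column per entity type. Table names below are assumed for illustration.
_MB_ID_COLUMN = {
    "artist": "musicbrainz_id",
    "album": "musicbrainz_release_id",
    "track": "musicbrainz_recording_id",
}
_TABLE = {"artist": "artists", "album": "albums", "track": "tracks"}


def get_existing_id(cursor, entity_type, entity_id):
    """Return the stored MusicBrainz ID for the entity, or None."""
    column = _MB_ID_COLUMN[entity_type]  # KeyError on unknown types
    table = _TABLE[entity_type]
    # Identifiers come from the fixed maps above, never from user input.
    row = cursor.execute(
        f"SELECT {column} FROM {table} WHERE id = ?", (entity_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, musicbrainz_id TEXT);
    CREATE TABLE albums (id INTEGER PRIMARY KEY, musicbrainz_release_id TEXT);
    CREATE TABLE tracks (id INTEGER PRIMARY KEY, musicbrainz_recording_id TEXT);
    INSERT INTO albums VALUES (7, 'release-mbid');
    INSERT INTO tracks VALUES (9, 'recording-mbid');
""")
cur = conn.cursor()
album_mbid = get_existing_id(cur, "album", 7)
```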
All four workers (lastfm, musicbrainz, tidal, qobuz) now write
match_status='matched' when they short-circuit on an already-present
external ID, so these rows are no longer re-selected on the next
worker sweep.
A new migration (_backfill_match_status_for_existing_ids) runs once
on startup to retroactively set match_status='matched' for rows that
already have an external ID but NULL match_status. This covers legacy
data, manual matches, and rows populated from file tags outside the
worker.
The /api/v1/library/tracks endpoint called search_tracks() to get
DatabaseTrack objects, then immediately called api_get_tracks_by_ids()
to re-hydrate full rows for serialization. Two round trips per search.
Added api_search_tracks() that returns dict rows with all track columns
plus artist_name, album_title, and album_thumb_url in a single query.
The basic and fuzzy search helpers were refactored to share raw-row
implementations, so the existing search_tracks() still returns
DatabaseTrack objects for the many internal callers that depend on
that shape (matching pipeline, repair worker, web UI search).
The wishlist list endpoint previously loaded and JSON-decoded the full
wishlist, filtered by category in Python, then sliced in memory. Cost
grew linearly with wishlist size on every page request.
get_wishlist_tracks now accepts offset and category parameters, both
applied in SQL via LIMIT/OFFSET and json_extract. get_wishlist_count
also accepts category so COUNT(*) matches the filtered page. The API
endpoint uses these to return only the requested page.
Backward compatible: other callers (core/wishlist_service) pass no
offset/category and still receive the full list.
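A sketch of pushing the category filter and paging into SQL. The table name, the data column, and the '$.category' JSON path are assumptions for illustration, and json_extract requires an SQLite build with JSON support.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wishlist_tracks (id INTEGER PRIMARY KEY, data TEXT)")
for i, category in enumerate(["album", "single", "album", "album"]):
    conn.execute(
        "INSERT INTO wishlist_tracks (data) VALUES (?)",
        (json.dumps({"name": f"t{i}", "category": category}),),
    )


def get_wishlist_tracks(conn, limit=None, offset=0, category=None):
    """Filter and page in SQL; with no arguments, return the full list."""
    sql = "SELECT data FROM wishlist_tracks"
    params = []
    if category is not None:
        sql += " WHERE json_extract(data, '$.category') = ?"
        params.append(category)
    sql += " ORDER BY id"
    if limit is not None:
        sql += " LIMIT ? OFFSET ?"
        params += [limit, offset]
    return [json.loads(row[0]) for row in conn.execute(sql, params)]


def get_wishlist_count(conn, category=None):
    """COUNT(*) with the same filter, so totals match the filtered page."""
    sql = "SELECT COUNT(*) FROM wishlist_tracks"
    params = []
    if category is not None:
        sql += " WHERE json_extract(data, '$.category') = ?"
        params.append(category)
    return conn.execute(sql, params).fetchone()[0]


page = get_wishlist_tracks(conn, limit=2, offset=1, category="album")
```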
Users can now override which metadata provider (Spotify, Deezer, Apple Music,
Discogs) is used when scanning a specific watchlist artist for new releases.
The selector appears in the artist config modal and only shows sources the
artist has enrichment IDs for. Default behavior is unchanged — all artists
use the global metadata source unless explicitly overridden.
Split Downloads page into main list (left) and batch panel (right).
Each active batch gets a color-coded card with artwork thumbnail,
progress bar, per-track status with download percentages, and
expandable track list. Download rows get matching color indicators.
- Click batch name to open its download/wishlist modal
- Filter icon narrows main list to one batch with clear banner
- Collapsible panel toggle for full-width list view
- Completed batches fade out after 15 seconds
- 7-day batch history with source type color dots
- Artwork fallback shows colored initial when no art available
- Per-track progress: download %, spinner for searching, processing label
- source_page column on sync_history for UI origin tracking
- /api/downloads/all includes batch summaries and per-track progress
- /api/downloads/batch-history endpoint for history queries
- Responsive layout, overflow-x hidden to prevent scroll flicker
Full auto-import pipeline: background worker watches the staging folder,
identifies music using embedded tags → folder name parsing → AcoustID
fingerprinting, matches files to metadata source tracklists, and
processes high-confidence matches through the existing post-processing
pipeline automatically.
Worker: AutoImportWorker with start/stop/pause/resume, configurable
scan interval (default 60s), confidence threshold (default 90%), and
auto-process toggle. Processes one folder per cycle, alphabetical
order. Disc folder detection, stability checking, content hash dedup.
Confidence gate: 90%+ auto-processes silently, 70-90% queued as
pending review with approve/dismiss actions, <70% flagged for manual
identification. Track matching uses weighted algorithm (title 45%,
artist 15%, track number 30%, album tag 10%).
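The weights and thresholds above work out as follows. This is only an arithmetic sketch: the function names are hypothetical, each per-field similarity is assumed to be a number between 0.0 and 1.0, and how the worker aggregates track scores into a folder confidence is not shown here.

```python
# Weights in percent, as listed above.
WEIGHTS = {"title": 45, "artist": 15, "track_number": 30, "album": 10}


def track_match_score(sims):
    """Weighted score in percent; missing fields count as 0."""
    return sum(weight * sims.get(field, 0.0) for field, weight in WEIGHTS.items())


def gate(confidence, threshold=90):
    """Map a confidence percentage to the three outcomes described above."""
    if confidence >= threshold:
        return "auto_process"
    if confidence >= 70:
        return "pending_review"
    return "manual_identification"


# Title, artist and album agree, but the track number does not.
score = track_match_score({"title": 1.0, "artist": 1.0, "album": 1.0})
decision = gate(score)
```

With a mismatched track number the score is 45 + 15 + 10 = 70, which lands in the pending-review band.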
Database: auto_import_history table tracks every scan result with
folder hash, match data JSON, confidence, status, timestamps.
API: 7 endpoints — status, toggle, settings (GET/POST), results
(filtered/paginated), approve, reject.
UI: Auto tab on Import page with enable toggle, confidence slider,
scan interval selector. Live result cards with album art, confidence
bar (green/yellow/red), status badges, match stats. 5-second polling.
Full automation page upgrade with group management and drag-and-drop:
Backend: batch_update_group() and bulk_set_enabled() DB methods, new
PUT /api/automations/group and POST /api/automations/bulk-toggle endpoints.
Group headers: rename (inline edit), delete (choice dialog — keep
automations or delete all), bulk toggle (enable/disable all in group).
Actions appear on hover, styled as small icon buttons.
Drag and drop: non-system cards are draggable between group sections.
Drop zones show dashed accent border feedback. Collapsed sections
auto-expand on 500ms drag-hover. System/Hub sections dimmed during drag.
dragenter counter pattern handles child element bubbling.
Delete group dialog: glass card modal with three options — keep
automations (move to My Automations), delete everything, or cancel.
Two bugs: (1) 'wishlist' was missing from the settings save whitelist,
so the toggle silently reset to ON on every page reload. (2) The
wishlist cleanup function unconditionally removed tracks sharing the
same name+artist regardless of album, ignoring the allow_duplicates
setting. Now when allow_duplicates is on, the dedup key includes the
album name so the same song from different albums can coexist.
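A sketch of the album-aware dedup key. The track dict shape and function names are assumptions; the real cleanup function may normalize names differently.

```python
def dedup_key(track, allow_duplicates):
    """Key used to detect duplicates during wishlist cleanup."""
    name = track["name"].strip().lower()
    artist = track["artist"].strip().lower()
    if allow_duplicates:
        # Include the album so the same song on different albums is kept.
        return (name, artist, track.get("album", "").strip().lower())
    return (name, artist)


def cleanup(tracks, allow_duplicates):
    """Keep the first track seen for each key."""
    seen, kept = set(), []
    for track in tracks:
        key = dedup_key(track, allow_duplicates)
        if key in seen:
            continue
        seen.add(key)
        kept.append(track)
    return kept


wishlist = [
    {"name": "Song", "artist": "Artist1", "album": "Super Single"},
    {"name": "Song", "artist": "Artist1", "album": "Super Album"},
    {"name": "Song", "artist": "Artist1", "album": "Super Album"},
]
kept_strict = cleanup(wishlist, allow_duplicates=False)
kept_loose = cleanup(wishlist, allow_duplicates=True)
```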
Explored status was stored only in frontend memory; on reload the badge
disappeared because the API never returned it. Added explored_at column
to mirrored_playlists (auto-migrated), written when build-tree completes,
and read back via SELECT * so the badge survives page refreshes.