When no cached token exists, spotipy's auth probe starts an interactive
OAuth flow that binds 127.0.0.1:<redirect_port> inside the container.
This either steals Flask's port 8008 (crash loop) or binds loopback-only
on 8888 (unreachable from Docker host — 'connection reset by peer').
Now checks for a cached token before probing. If none exists, returns
False immediately so users authenticate via the SoulSync web UI instead.
No behavior change for already-authenticated users.
Fixes #269
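A minimal sketch of the guarded check, assuming spotipy's default behavior of storing the token as JSON in a `.cache` file; the helper name and cache path are illustrative, not the actual SoulSync identifiers:

```python
import json
import os

def has_cached_token(cache_path: str = ".cache") -> bool:
    # Treat a readable, parseable cache file containing an access token as
    # "already authenticated". Returning False here sends the user to the
    # web UI instead of starting an interactive OAuth flow (which would
    # bind a port inside the container).
    if not os.path.exists(cache_path):
        return False
    try:
        with open(cache_path) as f:
            return "access_token" in json.load(f)
    except (ValueError, OSError):
        return False
```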
New core/genre_filter.py with ~180 curated default genres. When strict
mode is enabled in Settings → Library Preferences → Genre Whitelist,
only whitelisted genres pass through during enrichment. Junk tags from
Last.fm (artist names, radio shows, playlist names) are silently dropped.
Applied at all 10 genre write points: the Spotify, Last.fm, AudioDB,
Deezer, Discogs, iTunes, and Qobuz enrichment workers, the
post-processing genre merge, and initial download artist/album creation.
Strict mode is OFF by default — zero behavior change for existing users.
First enable auto-populates the whitelist with defaults. Users can add,
remove, search, and reset genres via the Settings UI.
The Duplicate Detector repair job had its own ignore_cross_album setting
that was independent of the global allow_duplicate_tracks setting. When
a user enabled 'Allow duplicate tracks across albums', the detector
still flagged same-titled tracks on different albums as duplicates.
Now respects the global setting — if duplicates are allowed, cross-album
matches are always skipped.
Users can now override which metadata provider (Spotify, Deezer, Apple Music,
Discogs) is used when scanning a specific watchlist artist for new releases.
The selector appears in the artist config modal and only shows sources the
artist has enrichment IDs for. Default behavior is unchanged — all artists
use the global metadata source unless explicitly overridden.
The redownload branch had `import json, uuid` locally inside the function,
which made Python treat `uuid` as a local variable for the entire
function scope. When the retag branch ran instead, the import never
executed and referencing `uuid` raised UnboundLocalError. Both modules
are already imported at the top of the file, so the local import is
simply removed.
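A minimal reproduction of the scoping rule behind this bug (illustrative names, not the actual SoulSync code):

```python
import uuid  # module-level import, as in the fixed version

def handle(action: str) -> str:
    if action == "redownload":
        # The buggy version had `import json, uuid` HERE. A function-local
        # import makes `uuid` a local name for the WHOLE function body, so
        # the retag branch below raised UnboundLocalError whenever this
        # branch was skipped.
        return str(uuid.uuid4())
    # retag branch: with the local import removed, the module-level
    # `uuid` is visible on every path
    return "retag-" + uuid.uuid4().hex[:8]
```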
Move Hydrabase availability checks into metadata_service so source resolution owns the policy. Keep web_server delegating to the centralized helper and add tests for the enabled/disabled cases.
Move artist discography resolution into core metadata_service, introduce MetadataLookupOptions, and keep web_server focused on request handling. Add focused tests for the new service boundary and preserve current fallback behavior for now.
New MusicBrainz tab in Enhanced and Global search — finds tracks and
albums on MusicBrainz's community database with Cover Art Archive
images. Covers obscure tracks that Spotify/Deezer/iTunes miss.
- core/musicbrainz_search.py: search adapter with Track/Artist/Album
dataclasses, Cover Art Archive integration, smart query parsing
- Albums deduplicated (keeps best version with date and art)
- No artist results shown (MusicBrainz has no artist images)
- Album detail with full tracklist for download modal
- Smart word-boundary splitting for queries without separators
- Global search results container widened from 620px to 920px
- UI version bumped to 2.32
Files with embedded tags (artist+title from post-processing) were
failing to import because the metadata search scored low (66%) and the
AcoustID result was returned before the tag-preference code could run.
- Tag-based identification now returns 85% confidence when embedded
tags have an artist field, borrowing album art from weak metadata
- AcoustID search result only accepted at 80%+ confidence, otherwise
kept as fallback (doesn't short-circuit past tag preference)
- AcoustID None artist/title falls back to tag data via 'or' operator
- Stop retrying failed/unidentified items every scan cycle
Items with status needs_identification, failed, or rejected were not
in the skip list, causing them to be re-scanned and re-logged every
60 seconds indefinitely. Now skips all terminal statuses.
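The acceptance and fallback rules above can be sketched as follows; the dict shapes and helper name are assumptions, not the actual SoulSync structures:

```python
ACOUSTID_ACCEPT = 0.80  # "accepted at 80%+ confidence"

def pick_result(acoustid, tag_based):
    # Accept the AcoustID hit only when it is confident; otherwise the
    # tag-preference result (85%) wins and AcoustID stays a fallback.
    # Field-level None values fall back to tag data via `or`.
    if acoustid and acoustid["confidence"] >= ACOUSTID_ACCEPT:
        return {
            "artist": acoustid["artist"] or tag_based["artist"],
            "title": acoustid["title"] or tag_based["title"],
            "confidence": acoustid["confidence"],
        }
    return tag_based
```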
New 'soulsync' media server option manages the library directly from
the filesystem, bypassing Plex/Jellyfin/Navidrome entirely.
Two paths populate the library:
1. Downloads/imports write artist/album/track to DB immediately at
post-processing completion, with pre-populated enrichment IDs
(Spotify, Deezer, MusicBrainz) so workers skip re-discovery
2. soulsync_client.py scans Transfer folder for incremental/deep scan
via DatabaseUpdateWorker (same interface as server clients)
New files:
- core/soulsync_client.py: filesystem scanner implementing the same
interface as Plex/Jellyfin/Navidrome clients. Recursive folder scan,
Mutagen tag reading, artist/album/track grouping, hash-based stable
IDs, incremental scan by modification time.
Modified:
- web_server.py: _record_soulsync_library_entry() at post-processing
completion, client init, scan endpoint integration, status endpoint,
web_scan_manager media_clients dict, test-connection cache updates
- config/settings.py: accept 'soulsync' in set_active_media_server,
get_active_media_server_config, is_configured, validate_config
- core/web_scan_manager.py: add soulsync to server_client_map
Dedup: checks existing artist/album by name across ALL server sources
before inserting to avoid duplicates. Enrichment IDs only written when
the column is empty (won't overwrite existing data).
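The write-if-empty rule can be sketched as a single conditional UPDATE; the table, column names, and allowlist here are assumptions for illustration:

```python
import sqlite3

ALLOWED_COLUMNS = {"spotify_id", "deezer_id", "musicbrainz_id"}  # assumed

def backfill_enrichment_id(conn, artist_id, column, value):
    # Only write when the column is currently empty; never overwrite
    # existing enrichment data. Column names come from a fixed allowlist,
    # so the f-string interpolation cannot inject SQL.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown enrichment column: {column}")
    conn.execute(
        f"UPDATE artists SET {column} = ? "
        f"WHERE id = ? AND ({column} IS NULL OR {column} = '')",
        (value, artist_id),
    )
```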
Race condition: scanner re-scanned folders while post-processing was
still moving files, causing partial matches and ghost failures. Now
tracks in-progress paths and skips them on subsequent scans.
Coverage penalty fix: individual tracks that match at 80%+ confidence
now auto-import even when overall album coverage is low (e.g. 2 of 18
tracks present). Previously low coverage killed the entire import.
Import page: stats bar, filter pills, Scan Now, Approve All, Clear
History (clears imported + failed), live scan progress.
- Track numbers defaulted to 1 instead of using metadata source values
- Release dates not captured, causing missing year in path templates
- Cover art missing for Deezer (direct image_url not checked)
- Track names in expanded view showed Unknown (wrong JSON field name)
- Read year/date from embedded file tags as fallback
- Add Deezer get_album_metadata/get_album_tracks fallbacks
- Handle Deezer tracks.data response format
Loose audio files in the staging root are now picked up alongside album
folders. Singles are identified via embedded tags, filename parsing
(Artist - Title.ext), or AcoustID fingerprinting, then matched against
the configured metadata source. Confidence-gated processing applies
the same way as album folders (90%+ auto, 70-90% review, <70% manual).
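The confidence gate shared by singles and album folders reduces to a three-way branch; the thresholds come from the changelog, the constant and function names are illustrative:

```python
AUTO_THRESHOLD = 0.90    # 90%+ auto-processes silently
REVIEW_THRESHOLD = 0.70  # 70-90% queued as pending review

def route_by_confidence(confidence: float) -> str:
    # Same gate for loose singles as for album folders.
    if confidence >= AUTO_THRESHOLD:
        return "auto"
    if confidence >= REVIEW_THRESHOLD:
        return "review"
    return "manual"
```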
Soulseek results from "Various Artists", "VA", "Unknown Artist", and
"Unknown Album" folders are now rejected before scoring. These
compilation folders rarely contain properly tagged files for the target
artist.
Clearing the wishlist now also cancels any active wishlist download
batch and resets the auto-processing flag, so downloads don't keep
running after the source tracks are removed.
Priority 0 query (artist + album + title) was gated behind a download
mode check that excluded Soulseek, the source that benefits most from
it. Soulseek searches match against file paths where users organize as
Artist/Album/Track — without the album name, ambiguous artist names
could match wrong-artist results (e.g. "Bleakness" as an album folder
instead of an artist). Removed the mode gate so all sources get the
most specific query first.
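The most-specific-first ordering might look like the sketch below; the exact query string format used against Soulseek is an assumption:

```python
def build_queries(artist, album, title):
    # Priority 0 (artist + album + title) now runs for every source,
    # since the mode gate that excluded Soulseek was removed. Less
    # specific queries remain as fallbacks.
    queries = []
    if artist and album and title:
        queries.append(f"{artist} {album} {title}")  # priority 0
    if artist and title:
        queries.append(f"{artist} {title}")
    if title:
        queries.append(title)
    return queries
```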
Repair-worker album fills now generate explicit track IDs when copying rows, instead of relying on SQLite auto-assignment that no longer exists for TEXT primary keys. The unknown-artist fixer now does the same for new artists.
Also add a regression test for the album-fill copy branch and keep the AcoustID scanner resilient to legacy null-ID rows.
Full auto-import pipeline: background worker watches the staging folder,
identifies music using embedded tags → folder name parsing → AcoustID
fingerprinting, matches files to metadata source tracklists, and
processes high-confidence matches through the existing post-processing
pipeline automatically.
Worker: AutoImportWorker with start/stop/pause/resume, configurable
scan interval (default 60s), confidence threshold (default 90%), and
auto-process toggle. Processes one folder per cycle, alphabetical
order. Disc folder detection, stability checking, content hash dedup.
Confidence gate: 90%+ auto-processes silently, 70-90% queued as
pending review with approve/dismiss actions, <70% flagged for manual
identification. Track matching uses weighted algorithm (title 45%,
artist 15%, track number 30%, album tag 10%).
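The weighted combination above can be sketched as a dot product over per-field similarity scores; how each field score is computed (fuzzy title matching etc.) is out of scope for this sketch:

```python
WEIGHTS = {"title": 0.45, "artist": 0.15, "track_number": 0.30, "album": 0.10}

def match_score(field_scores: dict) -> float:
    # Each field score is expected in [0.0, 1.0]; missing fields count
    # as a zero-similarity match.
    return sum(WEIGHTS[f] * field_scores.get(f, 0.0) for f in WEIGHTS)
```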
Database: auto_import_history table tracks every scan result with
folder hash, match data JSON, confidence, status, timestamps.
API: 7 endpoints — status, toggle, settings (GET/POST), results
(filtered/paginated), approve, reject.
UI: Auto tab on Import page with enable toggle, confidence slider,
scan interval selector. Live result cards with album art, confidence
bar (green/yellow/red), status badges, match stats. 5-second polling.
Switch similar-artist backfill to the shared provider-priority flow instead of assuming iTunes as the fallback.
Reuse the generic metadata search helpers, keep a compatibility alias for the old helper name, and update the scanner tests to cover the new path.
Add a regression test that verifies backfill walks each available fallback provider and persists the resolved IDs per source.
Shift similar-artist lookup to the shared metadata provider priority flow.
Use generic provider clients for search and metadata extraction instead of
branching on Spotify/iTunes-specific paths.
Add a regression test that verifies MusicMap matching queries the provider
priority list and preserves canonical metadata from the best match.
Make discovery pool population and curated playlists follow the configured metadata source order. Keep Spotify strict where fallback would corrupt source-specific IDs, and trim fan-out with smaller similar-artist samples and page caps. Leave the remaining incremental path for follow-up.
Reduce request volume in the discovery helpers while keeping the source-priority model intact.
- make cache_discovery_recent_albums source-priority aware
- cap Spotify artist-album pagination in the discovery and incremental paths
- reduce the similar-artist sample size for the cache-refresh helper
- keep Spotify strict where fallback would contaminate source-specific IDs
- add regression coverage for source order, strict Spotify lookups, and pagination caps
Watchlist scanner: empty discography (no new releases in lookback) was
treated as API failure, causing "Failed to get artist discography" for
artists like Kendrick Lamar who simply had no recent releases. Now
distinguishes None (API failure → try next source) from [] (success,
no new tracks). Spotify backfill now uses the authenticated client
instance instead of creating a fresh unauthenticated one.
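The None-vs-empty distinction drives the fallback loop; a minimal sketch, where `sources` is an assumed list of per-provider fetch callables:

```python
def get_new_releases(sources, artist, lookback_days):
    # None means the API call failed: try the next source.
    # [] means the call succeeded and there are no new releases: stop.
    for fetch in sources:
        releases = fetch(artist, lookback_days)
        if releases is None:
            continue          # API failure, fall through to next source
        return releases       # [] is a valid, final answer
    return None               # every source failed
```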
Wishlist nebula: album remove now sends album_name (API updated to
accept album_name as fallback alongside album_id). Track remove
re-renders the nebula after deletion. Toned down processing pulse
animation.
Updated test to verify fallback triggers on API failure (None), not
on empty results.
Replaced track-count-only release selection with deterministic scoring
across 6 factors: track count match (40pts), release status (10pts),
country preference with US/worldwide bias (10pts), format preference
favoring Digital/CD over Vinyl/Cassette (10pts), barcode presence (3pts),
and date completeness (2pts). Same inputs always produce the same release.
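A sketch of the deterministic scorer using the point weights above; the release field names follow MusicBrainz-style conventions but are assumptions, as are the exact country/format/status value sets:

```python
def score_release(release: dict, expected_tracks: int) -> int:
    score = 0
    if release.get("track_count") == expected_tracks:
        score += 40  # track count match
    if release.get("status") == "Official":
        score += 10  # release status
    if release.get("country") in ("US", "XW"):  # XW = worldwide
        score += 10  # country preference
    if release.get("format") in ("Digital Media", "CD"):
        score += 10  # format preference over Vinyl/Cassette
    if release.get("barcode"):
        score += 3   # barcode presence
    if len(release.get("date", "")) == 10:  # full YYYY-MM-DD
        score += 2   # date completeness
    return score
```

Ties would still need a stable tie-breaker (e.g. release ID order) for the same-inputs-same-output guarantee to hold.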
Also fixed a critical bug: _embed_source_ids was missing the context
parameter, silently skipping ALL source ID tag embedding since the
MusicBrainz consistency commit. It now receives the context from the
caller.
Make discovery pool population respect provider priority while keeping Spotify strict, and reduce unnecessary request volume in the hot discovery paths.
- keep discovery fan-out source-priority aware
- preserve cache use where freshness is not required
- cap Spotify artist-album pagination in discovery and cache refresh paths
- keep incremental release checks to a single page, since they only need the newest releases
- add regression coverage for provider order, strict Spotify handling, and pagination caps
Rewrote the AcoustID scanner job to scan all library tracks (via DB file
paths resolved to disk) instead of only the Transfer folder. Checkpoints
by track ID for robust resume across restarts. Defaults changed to
enabled, 24h interval, batch size 200.
Added _fix_acoustid_mismatch handler with three actions:
- retag: update DB title/artist to match actual audio content
- redownload: add expected track to wishlist and delete wrong file
- delete: remove wrong file and DB record
This catches cases like a file tagged as "Dinosaur Bones" that is
actually "Helicopters" — the scanner fingerprints the audio, detects
the mismatch, and the user can fix it from Library Maintenance findings.
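The three repair actions reduce to a small dispatch; `library` here is an assumed facade over the DB, wishlist, and filesystem operations, not the actual SoulSync interface:

```python
def fix_acoustid_mismatch(action, track, library):
    if action == "retag":
        # update DB title/artist to match the actual audio content
        library.update_tags(track.id, track.actual_title, track.actual_artist)
    elif action == "redownload":
        # queue the expected track and remove the wrong file
        library.add_to_wishlist(track.expected_title, track.expected_artist)
        library.delete_file(track.path)
    elif action == "delete":
        # remove both the wrong file and its DB record
        library.delete_file(track.path)
        library.delete_record(track.id)
    else:
        raise ValueError(f"unknown action: {action}")
```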
Resolve Spotify artist matching through the exact Spotify client only, so watchlist ID backfill cannot drift to fallback-provider results. Remove the remaining preemptive provider availability check from the backfill loop.
Allow cached Spotify search results to return even when Spotify is rate-limited or temporarily unavailable, and remove redundant rate-limit gating after auth checks.
Drop the old active-provider artist lookup helpers from watchlist_scanner now that the web scan flow resolves sources through the shared metadata priority.
Keep the Spotify-specific feature toggles in place for discovery and sync paths that still use them.
Move the web watchlist scan core onto the shared metadata source priority so primary provider settings are respected during artist, album, and image resolution.
Add coverage for primary-source-first discography lookup and fallback to later providers when the primary source has no albums.
Bring placeholder tracklist skipping back into the shared watchlist scan path, and centralize the DB-only artist image backfill helper so both web scan entrypoints reuse the same logic.
Drop the legacy watchlist scan entrypoints that are no longer used by the web scan flow, and keep the live refresh path pointed at the shared scanner helper.
Move the shared watchlist scan loop into core/watchlist_scanner.py so web_server.py only handles triggers, locks, progress, and post-scan orchestration.
Manual and scheduled watchlist scans now share the same scanner-side core, while the web entrypoints keep profile selection and automation progress updates.
Two-layer detection: (1) check the Qobuz API response for sample=True
before downloading, and (2) validate actual file duration with mutagen
after download — if under 35 seconds, delete and return None. Qobuz
returns valid audio files for previews (~2-5MB FLAC) that pass the
existing 100KB size check, so duration is the reliable signal.
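The second layer might look like the sketch below, using mutagen's generic loader; the helper names are illustrative and the 35-second threshold comes from the changelog:

```python
import os

MIN_DURATION_SECONDS = 35  # previews run ~30s; real tracks are longer

def is_sample(duration_seconds) -> bool:
    # A file under 35s is treated as a preview even though it passed the
    # 100KB size check (Qobuz previews are valid 2-5MB FLACs).
    return duration_seconds is not None and duration_seconds < MIN_DURATION_SECONDS

def validate_download(path: str) -> bool:
    """Return False (and delete the file) when it is only a preview."""
    from mutagen import File as MutagenFile  # deferred, optional dependency
    audio = MutagenFile(path)
    if audio is not None and is_sample(audio.info.length):
        os.remove(path)
        return False
    return True
```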